Commit | Line | Data |
---|---|---|
1f673135 | 1 | \input texinfo @c -*- texinfo -*- |
debc7065 FB |
2 | @c %**start of header |
3 | @setfilename qemu-tech.info | |
e080e785 SW |
4 | |
5 | @documentlanguage en | |
6 | @documentencoding UTF-8 | |
7 | ||
debc7065 FB |
8 | @settitle QEMU Internals |
9 | @exampleindent 0 | |
10 | @paragraphindent 0 | |
11 | @c %**end of header | |
1f673135 | 12 | |
a1a32b05 SW |
13 | @ifinfo |
14 | @direntry | |
15 | * QEMU Internals: (qemu-tech). The QEMU Emulator Internals. | |
16 | @end direntry | |
17 | @end ifinfo | |
18 | ||
1f673135 | 19 | @iftex |
1f673135 FB |
20 | @titlepage |
21 | @sp 7 | |
22 | @center @titlefont{QEMU Internals} | |
23 | @sp 3 | |
24 | @end titlepage | |
25 | @end iftex | |
26 | ||
debc7065 FB |
27 | @ifnottex |
28 | @node Top | |
29 | @top | |
30 | ||
31 | @menu | |
32 | * Introduction:: | |
33 | * QEMU Internals:: | |
34 | * Regression Tests:: | |
debc7065 FB |
35 | @end menu |
36 | @end ifnottex | |
37 | ||
38 | @contents | |
39 | ||
40 | @node Introduction | |
1f673135 FB |
41 | @chapter Introduction |
42 | ||
debc7065 | 43 | @menu |
3aeaea65 MF |
44 | * intro_features:: Features |
45 | * intro_x86_emulation:: x86 and x86-64 emulation | |
46 | * intro_arm_emulation:: ARM emulation | |
47 | * intro_mips_emulation:: MIPS emulation | |
48 | * intro_ppc_emulation:: PowerPC emulation | |
49 | * intro_sparc_emulation:: Sparc32 and Sparc64 emulation | |
50 | * intro_xtensa_emulation:: Xtensa emulation | |
51 | * intro_other_emulation:: Other CPU emulation | |
debc7065 FB |
52 | @end menu |
53 | ||
54 | @node intro_features | |
1f673135 FB |
55 | @section Features |
56 | ||
57 | QEMU is a FAST! processor emulator using a portable dynamic | |
58 | translator. | |
59 | ||
60 | QEMU has two operating modes: | |
61 | ||
62 | @itemize @minus | |
63 | ||
5fafdf24 | 64 | @item |
998a0501 BS |
65 | Full system emulation. In this mode (full platform virtualization), |
66 | QEMU emulates a full system (usually a PC), including a processor and | |
67 | various peripherals. It can be used to launch several different | |
68 | Operating Systems at once without rebooting the host machine or to | |
69 | debug system code. | |
1f673135 | 70 | |
5fafdf24 | 71 | @item |
998a0501 BS |
72 | User mode emulation. In this mode (application level virtualization), |
73 | QEMU can launch processes compiled for one CPU on another CPU; however, | |
74 | the Operating Systems must match. This can be used, for example, to ease | |
75 | cross-compilation and cross-debugging. | |
1f673135 FB |
76 | @end itemize |
77 | ||
78 | As QEMU requires no host kernel driver to run, it is very safe and | |
79 | easy to use. | |
80 | ||
81 | QEMU generic features: | |
82 | ||
5fafdf24 | 83 | @itemize |
1f673135 FB |
84 | |
85 | @item User space only or full system emulation. | |
86 | ||
debc7065 | 87 | @item Using dynamic translation to native code for reasonable speed. |
1f673135 | 88 | |
998a0501 BS |
89 | @item |
90 | Working on x86, x86_64 and PowerPC32/64 hosts. Being tested on ARM, | |
d41f3c3c | 91 | S390x, Sparc32 and Sparc64. |
1f673135 FB |
92 | |
93 | @item Self-modifying code support. | |
94 | ||
95 | @item Precise exceptions support. | |
96 | ||
998a0501 BS |
97 | @item |
98 | Floating point library supporting both full software emulation and | |
99 | native host FPU instructions. | |
100 | ||
1f673135 FB |
101 | @end itemize |
102 | ||
103 | QEMU user mode emulation features: | |
5fafdf24 | 104 | @itemize |
1f673135 FB |
105 | @item Generic Linux system call converter, including most ioctls. |
106 | ||
107 | @item clone() emulation using native CPU clone() to use Linux scheduler for threads. | |
108 | ||
5fafdf24 | 109 | @item Accurate signal handling by remapping host signals to target signals. |
1f673135 | 110 | @end itemize |
1f673135 | 111 | |
998a0501 | 112 | The Linux user emulator (Linux host only) can be used to launch the Wine |
0adb1246 | 113 | Windows API emulator (@url{http://www.winehq.org}). A BSD user emulator for BSD |
998a0501 BS |
114 | hosts is under development. It would also be possible to develop a |
115 | similar user emulator for Solaris. | |
116 | ||
1f673135 | 117 | QEMU full system emulation features: |
5fafdf24 | 118 | @itemize |
998a0501 BS |
119 | @item |
120 | QEMU uses a full software MMU for maximum portability. | |
121 | ||
122 | @item | |
4a1418e0 AL |
123 | QEMU can optionally use an in-kernel accelerator, like kvm. The accelerators |
124 | execute some of the guest code natively, while | |
998a0501 BS |
125 | continuing to emulate the rest of the machine. |
126 | ||
127 | @item | |
128 | Various hardware devices can be emulated and in some cases, host | |
129 | devices (e.g. serial and parallel ports, USB, drives) can be used | |
130 | transparently by the guest Operating System. Host device passthrough | |
131 | can be used for talking to external physical peripherals (e.g. a | |
132 | webcam, modem or tape drive). | |
133 | ||
134 | @item | |
135 | Symmetric multiprocessing (SMP) even on a host with a single CPU. On an | |
136 | SMP host system, QEMU can use only one CPU fully due to the difficulty of | |
137 | implementing atomic memory accesses efficiently. | |
138 | ||
1f673135 FB |
139 | @end itemize |
140 | ||
debc7065 | 141 | @node intro_x86_emulation |
998a0501 | 142 | @section x86 and x86-64 emulation |
1f673135 FB |
143 | |
144 | QEMU x86 target features: | |
145 | ||
5fafdf24 | 146 | @itemize |
1f673135 | 147 | |
5fafdf24 | 148 | @item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation. |
998a0501 BS |
149 | LDT/GDT and IDT are emulated. VM86 mode is also supported to run |
150 | DOSEMU. There is some support for MMX/3DNow!, SSE, SSE2, SSE3, SSSE3, | |
151 | and SSE4 as well as x86-64 SVM. | |
1f673135 FB |
152 | |
153 | @item Support of host page sizes bigger than 4KB in user mode emulation. | |
154 | ||
155 | @item QEMU can emulate itself on x86. | |
156 | ||
5fafdf24 | 157 | @item An extensive Linux x86 CPU test program is included (@file{tests/test-i386}). |
1f673135 FB |
158 | It can be used to test other x86 virtual CPUs. |
159 | ||
160 | @end itemize | |
161 | ||
162 | Current QEMU limitations: | |
163 | ||
5fafdf24 | 164 | @itemize |
1f673135 | 165 | |
998a0501 | 166 | @item Limited x86-64 support. |
1f673135 FB |
167 | |
168 | @item IPC syscalls are missing. | |
169 | ||
5fafdf24 | 170 | @item The x86 segment limits and access rights are not tested at every |
1f673135 FB |
171 | memory access (yet). Fortunately, very few OSes seem to rely on that for |
172 | normal use. | |
173 | ||
1f673135 FB |
174 | @end itemize |
175 | ||
debc7065 | 176 | @node intro_arm_emulation |
1f673135 FB |
177 | @section ARM emulation |
178 | ||
179 | @itemize | |
180 | ||
181 | @item Full ARM 7 user emulation. | |
182 | ||
183 | @item NWFPE FPU support included in user Linux emulation. | |
184 | ||
185 | @item Can run most ARM Linux binaries. | |
186 | ||
187 | @end itemize | |
188 | ||
24d4de45 TS |
189 | @node intro_mips_emulation |
190 | @section MIPS emulation | |
191 | ||
192 | @itemize | |
193 | ||
194 | @item The system emulation allows full MIPS32/MIPS64 Release 2 emulation, | |
195 | including privileged instructions, FPU and MMU, in both little and big | |
196 | endian modes. | |
197 | ||
198 | @item The Linux userland emulation can run many 32 bit MIPS Linux binaries. | |
199 | ||
200 | @end itemize | |
201 | ||
202 | Current QEMU limitations: | |
203 | ||
204 | @itemize | |
205 | ||
206 | @item Self-modifying code is not always handled correctly. | |
207 | ||
208 | @item 64 bit userland emulation is not implemented. | |
209 | ||
210 | @item The system emulation is not complete enough to run real firmware. | |
211 | ||
b1f45238 TS |
212 | @item The watchpoint debug facility is not implemented. |
213 | ||
24d4de45 TS |
214 | @end itemize |
215 | ||
debc7065 | 216 | @node intro_ppc_emulation |
1f673135 FB |
217 | @section PowerPC emulation |
218 | ||
219 | @itemize | |
220 | ||
5fafdf24 | 221 | @item Full PowerPC 32 bit emulation, including privileged instructions, |
1f673135 FB |
222 | FPU and MMU. |
223 | ||
224 | @item Can run most PowerPC Linux binaries. | |
225 | ||
226 | @end itemize | |
227 | ||
debc7065 | 228 | @node intro_sparc_emulation |
998a0501 | 229 | @section Sparc32 and Sparc64 emulation |
1f673135 FB |
230 | |
231 | @itemize | |
232 | ||
f6b647cd | 233 | @item Full SPARC V8 emulation, including privileged |
3475187d | 234 | instructions, FPU and MMU. SPARC V9 emulation includes most privileged |
a785e42e | 235 | and VIS instructions, FPU and I/D MMU. Alignment is fully enforced. |
1f673135 | 236 | |
a785e42e BS |
237 | @item Can run most 32-bit SPARC Linux binaries, SPARC32PLUS Linux binaries and |
238 | some 64-bit SPARC Linux binaries. | |
3475187d FB |
239 | |
240 | @end itemize | |
241 | ||
242 | Current QEMU limitations: | |
243 | ||
5fafdf24 | 244 | @itemize |
3475187d | 245 | |
3475187d FB |
246 | @item IPC syscalls are missing. |
247 | ||
1f587329 | 248 | @item Floating point exception support is buggy. |
3475187d FB |
249 | |
250 | @item Atomic instructions are not correctly implemented. | |
251 | ||
998a0501 BS |
252 | @item There are still some problems with Sparc64 emulators. |
253 | ||
254 | @end itemize | |
255 | ||
3aeaea65 MF |
256 | @node intro_xtensa_emulation |
257 | @section Xtensa emulation | |
258 | ||
259 | @itemize | |
260 | ||
261 | @item Core Xtensa ISA emulation, including most options: code density, | |
262 | loop, extended L32R, 16- and 32-bit multiplication, 32-bit division, | |
044d003d MF |
263 | MAC16, miscellaneous operations, boolean, FP coprocessor, coprocessor |
264 | context, debug, multiprocessor synchronization, | |
3aeaea65 MF |
265 | conditional store, exceptions, relocatable vectors, unaligned exception, |
266 | interrupts (including high priority and timer), hardware alignment, | |
267 | region protection, region translation, MMU, windowed registers, thread | |
268 | pointer, processor ID. | |
269 | ||
044d003d MF |
270 | @item Not implemented options: data/instruction cache (including cache |
271 | prefetch and locking), XLMI, processor interface. Also options not | |
272 | covered by the core ISA (e.g. FLIX, wide branches) are not implemented. | |
3aeaea65 MF |
273 | |
274 | @item Can run most Xtensa Linux binaries. | |
275 | ||
276 | @item New core configuration that requires no additional instructions | |
277 | may be created from overlay with minimal amount of hand-written code. | |
278 | ||
279 | @end itemize | |
280 | ||
998a0501 BS |
281 | @node intro_other_emulation |
282 | @section Other CPU emulation | |
1f673135 | 283 | |
998a0501 BS |
284 | In addition to the above, QEMU supports emulation of other CPUs with |
285 | varying levels of success. These are: | |
286 | ||
287 | @itemize | |
288 | ||
289 | @item | |
290 | Alpha | |
291 | @item | |
292 | CRIS | |
293 | @item | |
294 | M68k | |
295 | @item | |
296 | SH4 | |
1f673135 FB |
297 | @end itemize |
298 | ||
debc7065 | 299 | @node QEMU Internals |
1f673135 FB |
300 | @chapter QEMU Internals |
301 | ||
debc7065 FB |
302 | @menu |
303 | * QEMU compared to other emulators:: | |
304 | * Portable dynamic translation:: | |
debc7065 FB |
305 | * Condition code optimisations:: |
306 | * CPU state optimisations:: | |
307 | * Translation cache:: | |
308 | * Direct block chaining:: | |
309 | * Self-modifying code and translated code invalidation:: | |
310 | * Exception support:: | |
311 | * MMU emulation:: | |
998a0501 | 312 | * Device emulation:: |
debc7065 FB |
313 | * Hardware interrupts:: |
314 | * User emulation specific details:: | |
315 | * Bibliography:: | |
316 | @end menu | |
317 | ||
318 | @node QEMU compared to other emulators | |
1f673135 FB |
319 | @section QEMU compared to other emulators |
320 | ||
8e9620a6 | 321 | Like bochs [1], QEMU emulates an x86 CPU. But QEMU is much faster than |
1f673135 FB |
322 | bochs as it uses dynamic compilation. Bochs is closely tied to x86 PC |
323 | emulation while QEMU can emulate several processors. | |
324 | ||
325 | Like Valgrind [2], QEMU does user space emulation and dynamic | |
326 | translation. Valgrind is mainly a memory debugger, while QEMU has no | |
327 | support for memory debugging (QEMU could be used to detect out-of-bounds memory | |
328 | accesses as Valgrind does, but it cannot track uninitialised data | |
329 | as Valgrind does). The Valgrind dynamic translator generates better code | |
330 | than QEMU (in particular it does register allocation) but it is closely | |
331 | tied to an x86 host and target and has no support for precise exceptions | |
332 | and system emulation. | |
333 | ||
8e9620a6 | 334 | EM86 [3] is the closest project to user space QEMU (and QEMU still uses |
1f673135 FB |
335 | some of its code, in particular the ELF file loader). EM86 was limited |
336 | to an alpha host and used a proprietary and slow interpreter (the | |
8e9620a6 | 337 | interpreter part of the FX!32 Digital Win32 code translator [4]). |
1f673135 | 338 | |
8e9620a6 TH |
339 | TWIN from Willows Software was a Windows API emulator like Wine. It is less |
340 | accurate than Wine but includes a protected mode x86 interpreter to launch | |
341 | x86 Windows executables. Such an approach has greater potential because most | |
342 | of the Windows API is executed natively but it is far more difficult to | |
343 | develop because all the data structures and function parameters exchanged | |
1f673135 FB |
344 | between the API and the x86 code must be converted. |
345 | ||
8e9620a6 | 346 | User mode Linux [5] was the only solution before QEMU to launch a |
1f673135 FB |
347 | Linux kernel as a process while not needing any host kernel |
348 | patches. However, user mode Linux requires heavy patches to the guest kernel while | |
349 | QEMU accepts unpatched Linux kernels. The price to pay is that QEMU is | |
350 | slower. | |
351 | ||
8e9620a6 | 352 | The Plex86 [6] PC virtualizer is done in the same spirit as the now |
998a0501 BS |
353 | obsolete qemu-fast system emulator. It requires a patched Linux kernel |
354 | to work (you cannot launch the same kernel on your PC), but the | |
355 | patches are really small. As it is a PC virtualizer (no emulation is | |
356 | done except for some privileged instructions), it has the potential of | |
357 | being faster than QEMU. The downside is that a complicated (and | |
358 | potentially unsafe) host kernel patch is needed. | |
1f673135 | 359 | |
8e9620a6 TH |
360 | The commercial PC Virtualizers (VMWare [7], VirtualPC [8]) are faster |
361 | than QEMU (without virtualization), but they all need specific, proprietary | |
1f673135 FB |
362 | and potentially unsafe host drivers. Moreover, they are unable to |
363 | provide cycle exact simulation as an emulator can. | |
364 | ||
8e9620a6 TH |
365 | VirtualBox [9], Xen [10] and KVM [11] are based on QEMU. QEMU-SystemC |
366 | [12] uses QEMU to simulate a system where some hardware devices are | |
998a0501 BS |
367 | developed in SystemC. |
368 | ||
debc7065 | 369 | @node Portable dynamic translation |
1f673135 FB |
370 | @section Portable dynamic translation |
371 | ||
372 | QEMU is a dynamic translator. When it first encounters a piece of code, | |
373 | it converts it to the host instruction set. Usually dynamic translators | |
374 | are very complicated and highly CPU dependent. QEMU uses some tricks | |
375 | which make it relatively easily portable and simple while achieving good | |
376 | performance. | |
377 | ||
998a0501 BS |
378 | After the release of version 0.9.1, QEMU switched to a new method of |
379 | generating code, Tiny Code Generator or TCG. TCG relaxes the | |
380 | dependency on the exact version of the compiler used. The basic idea | |
381 | is to split every target instruction into a couple of RISC-like TCG | |
382 | ops (see @code{target-i386/translate.c}). Some optimizations can be | |
383 | performed at this stage, including liveness analysis and trivial | |
384 | constant expression evaluation. TCG ops are then implemented in the | |
385 | host CPU back end, also known as TCG target (see | |
ce151109 | 386 | @code{tcg/i386/tcg-target.inc.c}). For more information, please take a |
998a0501 | 387 | look at @code{tcg/README}. |
1f673135 | 388 | |
debc7065 | 389 | @node Condition code optimisations |
1f673135 FB |
390 | @section Condition code optimisations |
391 | ||
998a0501 BS |
392 | Lazy evaluation of CPU condition codes (@code{EFLAGS} register on x86) |
393 | is important for CPUs where every instruction sets the condition | |
394 | codes. It tends to be less important on conventional RISC systems | |
f0f26a06 BS |
395 | where condition codes are only updated when explicitly requested. On |
396 | Sparc64, costly update of both 32 and 64 bit condition codes can be | |
397 | avoided with lazy evaluation. | |
998a0501 BS |
398 | |
399 | Instead of computing the condition codes after each x86 instruction, | |
400 | QEMU just stores one operand (called @code{CC_SRC}), the result | |
401 | (called @code{CC_DST}) and the type of operation (called | |
402 | @code{CC_OP}). When the condition codes are needed, the condition | |
403 | codes can be calculated using this information. In addition, an | |
404 | optimized calculation can be performed for some instruction types like | |
405 | conditional branches. | |
1f673135 | 406 | |
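The scheme above can be illustrated with a minimal sketch in C. It is not QEMU's implementation: the structure, the two operation tags and the helper names are hypothetical, and only the zero and carry flags of 32 bit additions and subtractions are modelled. The fields mirror @code{CC_SRC}, @code{CC_DST} and @code{CC_OP}.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical operation tags; QEMU's real CC_OP values differ. */
enum cc_op { CC_OP_ADD32, CC_OP_SUB32 };

/* Lazy flag state: one operand, the result and the operation type. */
struct lazy_cc {
    uint32_t cc_src;  /* second operand of the last flag-setting operation */
    uint32_t cc_dst;  /* result of the last flag-setting operation */
    enum cc_op cc_op; /* which operation produced cc_dst */
};

/* Emulate a 32 bit ADD: record the inputs, compute no flags yet. */
static uint32_t lazy_add32(struct lazy_cc *cc, uint32_t a, uint32_t b)
{
    cc->cc_src = b;
    cc->cc_dst = a + b;
    cc->cc_op = CC_OP_ADD32;
    return cc->cc_dst;
}

/* Emulate a 32 bit SUB in the same way. */
static uint32_t lazy_sub32(struct lazy_cc *cc, uint32_t a, uint32_t b)
{
    cc->cc_src = b;
    cc->cc_dst = a - b;
    cc->cc_op = CC_OP_SUB32;
    return cc->cc_dst;
}

/* Zero flag: depends only on the stored result. */
static bool lazy_zf(const struct lazy_cc *cc)
{
    return cc->cc_dst == 0;
}

/* Carry flag: reconstructed on demand from the stored operand and result. */
static bool lazy_cf(const struct lazy_cc *cc)
{
    switch (cc->cc_op) {
    case CC_OP_ADD32:
        /* The addition wrapped around iff the result is below an operand. */
        return cc->cc_dst < cc->cc_src;
    case CC_OP_SUB32:
        /* First operand is dst + src; a borrow occurred iff it is below src. */
        return (uint32_t)(cc->cc_dst + cc->cc_src) < cc->cc_src;
    }
    return false;
}
```

In this sketch a sequence of arithmetic instructions only pays for three stores each; @code{lazy_cf()} or @code{lazy_zf()} is called only when a later instruction actually consumes a flag.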
1235fc06 | 407 | @code{CC_OP} is almost never explicitly set in the generated code |
1f673135 FB |
408 | because it is known at translation time. |
409 | ||
f0f26a06 BS |
410 | The lazy condition code evaluation is used on x86, m68k, cris and |
411 | Sparc. ARM uses a simplified variant for the N and Z flags. | |
1f673135 | 412 | |
debc7065 | 413 | @node CPU state optimisations |
1f673135 FB |
414 | @section CPU state optimisations |
415 | ||
998a0501 BS |
416 | The target CPUs have many internal states which change the way they |
417 | evaluate instructions. In order to achieve good speed, the | |
418 | translation phase assumes that some state information of the virtual | |
419 | CPU cannot change within a translated block. The state is recorded in the Translation | |
420 | Block (TB). If the state changes (e.g. privilege level), a new TB will | |
421 | be generated and the previous TB won't be used anymore until the state | |
422 | matches the state recorded in the previous TB. For example, if the SS, | |
423 | DS and ES segments have a zero base, then the translator does not even | |
424 | generate an addition for the segment base. | |
1f673135 FB |
425 | |
426 | [The FPU stack pointer register is not handled that way yet]. | |
427 | ||
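As a rough illustration of how recorded state can take part in TB lookup, the sketch below keys each block on the program counter, the CS base and a flags word. All names, the flat array and the linear search are assumptions made for brevity; QEMU's real data structures are different.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified translation block descriptor. */
struct tb {
    uint64_t pc;      /* simulated PC of the first instruction */
    uint64_t cs_base; /* CS segment base assumed by the translation */
    uint32_t flags;   /* CPU state baked in at translation time */
};

#define MAX_TBS 64

static struct tb tbs[MAX_TBS];
static size_t nb_tbs;

/* Return a TB only if the PC and the recorded state both match. */
static struct tb *tb_find(uint64_t pc, uint64_t cs_base, uint32_t flags)
{
    for (size_t i = 0; i < nb_tbs; i++) {
        if (tbs[i].pc == pc && tbs[i].cs_base == cs_base &&
            tbs[i].flags == flags) {
            return &tbs[i];
        }
    }
    return NULL;
}

/* Look up a TB, "translating" a new one when the state differs. */
static struct tb *tb_find_or_translate(uint64_t pc, uint64_t cs_base,
                                       uint32_t flags)
{
    struct tb *tb = tb_find(pc, cs_base, flags);
    if (tb != NULL) {
        return tb;
    }
    if (nb_tbs == MAX_TBS) {
        nb_tbs = 0; /* table full: drop every block and start again */
    }
    tb = &tbs[nb_tbs++];
    tb->pc = pc;
    tb->cs_base = cs_base;
    tb->flags = flags;
    return tb;
}
```

Two blocks for the same PC but different flags coexist in the table, so the older block becomes usable again as soon as the CPU returns to the state it was translated for.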
debc7065 | 428 | @node Translation cache |
1f673135 FB |
429 | @section Translation cache |
430 | ||
27c8efcb | 431 | A 32 MByte cache holds the most recently used translations. For |
1f673135 FB |
432 | simplicity, it is completely flushed when it is full. A translation unit |
433 | contains just a single basic block (a block of x86 instructions | |
434 | terminated by a jump or by a virtual CPU state change which the | |
435 | translator cannot deduce statically). | |
436 | ||
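The flush-when-full policy can be sketched as a bump allocator over a fixed buffer. The buffer here is deliberately tiny and every name is hypothetical; the sketch only shows the policy, not how QEMU manages its code buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define CODE_BUF_SIZE 1024 /* tiny stand-in for the 32 MByte cache */

static uint8_t code_buf[CODE_BUF_SIZE];
static size_t code_used;     /* bytes handed out since the last flush */
static unsigned flush_count; /* number of complete flushes so far */

/* Discard every translation at once. */
static void code_cache_flush(void)
{
    code_used = 0;
    flush_count++;
}

/*
 * Reserve size bytes for a new translation. If the request does not
 * fit in the remaining space, the whole cache is flushed first.
 * Returns NULL only when the request can never fit.
 */
static uint8_t *code_cache_alloc(size_t size)
{
    if (size > CODE_BUF_SIZE) {
        return NULL;
    }
    if (size > CODE_BUF_SIZE - code_used) {
        code_cache_flush();
    }
    uint8_t *p = code_buf + code_used;
    code_used += size;
    return p;
}
```

A complete flush avoids any per-block eviction bookkeeping, at the price of retranslating code that is still in use after the flush.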
debc7065 | 437 | @node Direct block chaining |
1f673135 FB |
438 | @section Direct block chaining |
439 | ||
440 | After each translated basic block is executed, QEMU uses the simulated | |
d274e07c | 441 | Program Counter (PC) and other cpu state information (such as the CS |
1f673135 FB |
442 | segment base value) to find the next basic block. |
443 | ||
444 | In order to accelerate the most common cases where the new simulated PC | |
445 | is known, QEMU can patch a basic block so that it jumps directly to the | |
446 | next one. | |
447 | ||
448 | The most portable code uses an indirect jump. An indirect jump makes | |
449 | it easier to make the jump target modification atomic. On some host | |
450 | architectures (such as x86 or PowerPC), the @code{JUMP} opcode is | |
451 | directly patched so that the block chaining has no overhead. | |
452 | ||
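The effect of chaining can be modelled without generating any machine code. In the sketch below, which uses hypothetical names throughout, a pointer stored in each block stands in for the patched jump, and a counter shows how many slow lookups are avoided once the blocks are chained.

```c
#include <stddef.h>

/* Hypothetical block descriptor with a single static successor. */
struct tb {
    unsigned long pc;      /* simulated PC of the block */
    unsigned long next_pc; /* simulated PC the block jumps to */
    struct tb *chained;    /* patched link, NULL while not chained */
};

static unsigned lookups; /* number of slow lookups performed */

/* Slow path: search for the block that starts at pc. */
static struct tb *tb_lookup(struct tb *tbs, size_t n, unsigned long pc)
{
    lookups++;
    for (size_t i = 0; i < n; i++) {
        if (tbs[i].pc == pc) {
            return &tbs[i];
        }
    }
    return NULL;
}

/* Run at most max_blocks blocks from pc; return how many were run. */
static unsigned run_blocks(struct tb *tbs, size_t n, unsigned long pc,
                           unsigned max_blocks)
{
    unsigned executed = 0;
    struct tb *tb = tb_lookup(tbs, n, pc);

    while (tb != NULL && executed < max_blocks) {
        executed++; /* stands in for executing the translated code */
        if (tb->chained == NULL) {
            /* First exit from this block: look up and patch the link. */
            tb->chained = tb_lookup(tbs, n, tb->next_pc);
        }
        tb = tb->chained; /* later exits follow the link directly */
    }
    return executed;
}

/* Undo all chaining, forcing the slow path on the next exits. */
static void tb_unchain_all(struct tb *tbs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        tbs[i].chained = NULL;
    }
}
```

With two blocks that jump to each other, only the first pass through the loop needs lookups; every later iteration follows the stored links until @code{tb_unchain_all()} is called.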
debc7065 | 453 | @node Self-modifying code and translated code invalidation |
1f673135 FB |
454 | @section Self-modifying code and translated code invalidation |
455 | ||
456 | Self-modifying code is a special challenge in x86 emulation because no | |
457 | instruction cache invalidation is signaled by the application when code | |
458 | is modified. | |
459 | ||
460 | When translated code is generated for a basic block, the corresponding | |
998a0501 BS |
461 | host page is write protected if it is not already read-only. Then, if |
462 | a write access is done to the page, Linux raises a SEGV signal. QEMU | |
463 | then invalidates all the translated code in the page and enables write | |
464 | accesses to the page. | |
1f673135 FB |
465 | |
466 | Correct translated code invalidation is done efficiently by maintaining | |
467 | a linked list of every translated block contained in a given page. Other | |
5fafdf24 | 468 | linked lists are also maintained to undo direct block chaining. |
1f673135 | 469 | |
998a0501 BS |
470 | On RISC targets, correctly written software uses memory barriers and |
471 | cache flushes, so some of the protection above would not be | |
472 | necessary. However, QEMU still requires that the generated code always | |
473 | matches the target instructions in memory in order to handle | |
474 | exceptions correctly. | |
1f673135 | 475 | |
debc7065 | 476 | @node Exception support |
1f673135 FB |
477 | @section Exception support |
478 | ||
479 | longjmp() is used when an exception such as division by zero is | |
5fafdf24 | 480 | encountered. |
1f673135 FB |
481 | |
482 | The host SIGSEGV and SIGBUS signal handlers are used to catch invalid | |
998a0501 BS |
483 | memory accesses. The simulated program counter is found by |
484 | retranslating the corresponding basic block and by looking where the | |
485 | host program counter was at the exception point. | |
1f673135 FB |
486 | |
487 | The virtual CPU cannot retrieve the exact @code{EFLAGS} register because | |
488 | in some cases it is not computed because of condition code | |
489 | optimisations. It is not a big concern because the emulated code can | |
490 | still be restarted in any case. | |
491 | ||
debc7065 | 492 | @node MMU emulation |
1f673135 FB |
493 | @section MMU emulation |
494 | ||
998a0501 BS |
495 | For system emulation QEMU supports a soft MMU. In that mode, the MMU |
496 | virtual to physical address translation is done at every memory | |
497 | access. QEMU uses an address translation cache to speed up the | |
498 | translation. | |
1f673135 FB |
499 | |
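A minimal sketch of such an address translation cache is shown below, assuming 4KB pages and a direct-mapped table. The names and the fixed stand-in mapping used for the slow path are hypothetical; a real soft MMU walks the guest page tables instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)
#define PAGE_MASK (~(PAGE_SIZE - 1u))
#define TLB_SIZE 256

struct tlb_entry {
    uint32_t vpage; /* page-aligned virtual address */
    uint32_t ppage; /* page-aligned physical address */
    bool valid;
};

static struct tlb_entry tlb[TLB_SIZE];
static unsigned slow_walks; /* number of slow translations performed */

/* Stand-in for a page table walk: a fixed, hypothetical mapping. */
static uint32_t slow_translate_page(uint32_t vpage)
{
    slow_walks++;
    return vpage + 0x00100000u;
}

/* Translate a virtual address, consulting the cache first. */
static uint32_t translate(uint32_t vaddr)
{
    uint32_t vpage = vaddr & PAGE_MASK;
    struct tlb_entry *e = &tlb[(vaddr >> PAGE_BITS) % TLB_SIZE];

    if (!e->valid || e->vpage != vpage) {
        /* Miss: take the slow path and refill the entry. */
        e->vpage = vpage;
        e->ppage = slow_translate_page(vpage);
        e->valid = true;
    }
    return e->ppage | (vaddr & ~PAGE_MASK);
}

/* Invalidate every entry, e.g. when the mappings change. */
static void tlb_flush(void)
{
    for (unsigned i = 0; i < TLB_SIZE; i++) {
        tlb[i].valid = false;
    }
}
```

Repeated accesses to the same page cost one index computation and one comparison; the slow path runs only on the first access to a page or after a flush.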
500 | In order to avoid flushing the translated code each time the MMU | |
501 | mappings change, QEMU uses a physically indexed translation cache. It | |
5fafdf24 | 502 | means that each basic block is indexed with its physical address. |
1f673135 FB |
503 | |
504 | When MMU mappings change, only the chaining of the basic blocks is | |
505 | reset (i.e. a basic block can no longer jump directly to another one). | |
506 | ||
998a0501 BS |
507 | @node Device emulation |
508 | @section Device emulation | |
509 | ||
510 | Systems emulated by QEMU are organized by boards. At initialization | |
511 | phase, each board instantiates a number of CPUs, devices, RAM and | |
512 | ROM. Each device in turn can assign I/O ports or memory areas (for | |
513 | MMIO) to its handlers. When the emulation starts, an access to the | |
514 | ports or MMIO memory areas assigned to the device causes the | |
515 | corresponding handler to be called. | |
516 | ||
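The handler dispatch described above can be sketched as a table of address ranges with read and write callbacks. The registration function, the region table and the example device are all hypothetical and much simpler than QEMU's memory API.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*mmio_read_fn)(void *opaque, uint32_t offset);
typedef void (*mmio_write_fn)(void *opaque, uint32_t offset, uint32_t val);

/* One guest physical range and the device handlers assigned to it. */
struct mmio_region {
    uint32_t base;
    uint32_t size;
    mmio_read_fn read;
    mmio_write_fn write;
    void *opaque; /* device state passed back to the handlers */
};

#define MAX_REGIONS 8

static struct mmio_region regions[MAX_REGIONS];
static size_t nb_regions;

/* Called by a device at initialization; returns 0 on success. */
static int mmio_register(uint32_t base, uint32_t size, mmio_read_fn rd,
                         mmio_write_fn wr, void *opaque)
{
    if (nb_regions == MAX_REGIONS) {
        return -1;
    }
    regions[nb_regions++] = (struct mmio_region){ base, size, rd, wr, opaque };
    return 0;
}

static const struct mmio_region *mmio_find(uint32_t addr)
{
    for (size_t i = 0; i < nb_regions; i++) {
        const struct mmio_region *r = &regions[i];
        if (addr >= r->base && addr - r->base < r->size) {
            return r;
        }
    }
    return NULL;
}

/* Guest load: unassigned addresses read as all ones in this sketch. */
static uint32_t mmio_read(uint32_t addr)
{
    const struct mmio_region *r = mmio_find(addr);
    return r ? r->read(r->opaque, addr - r->base) : 0xFFFFFFFFu;
}

/* Guest store: writes to unassigned addresses are ignored. */
static void mmio_write(uint32_t addr, uint32_t val)
{
    const struct mmio_region *r = mmio_find(addr);
    if (r) {
        r->write(r->opaque, addr - r->base, val);
    }
}

/* Example device: a single 32 bit scratch register. */
struct scratch_dev { uint32_t reg; };

static uint32_t scratch_read(void *opaque, uint32_t offset)
{
    (void)offset;
    return ((struct scratch_dev *)opaque)->reg;
}

static void scratch_write(void *opaque, uint32_t offset, uint32_t val)
{
    (void)offset;
    ((struct scratch_dev *)opaque)->reg = val;
}
```

A board would call @code{mmio_register()} once per device range during initialization; afterwards every guest access in that range reaches the device through @code{mmio_read()} or @code{mmio_write()}.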
517 | RAM and ROM are handled more efficiently: only the offset to the host | |
518 | memory needs to be added to the guest address. | |
519 | ||
520 | The video RAM of VGA and other display cards is special: it can be | |
521 | read or written directly like RAM, but write accesses cause the memory | |
522 | to be marked with the VGA_DIRTY flag as well. | |
523 | ||
524 | QEMU supports some device classes like serial and parallel ports, USB, | |
525 | drives and network devices, by providing APIs for easier connection to | |
526 | the generic, higher level implementations. The API hides the | |
527 | implementation details from the devices, like native device use or | |
528 | advanced block device formats like QCOW. | |
529 | ||
530 | Usually the devices implement a reset method and register support for | |
531 | saving and loading of the device state. The devices can also use | |
532 | timers, especially together with the use of bottom halves (BHs). | |
533 | ||
debc7065 | 534 | @node Hardware interrupts |
1f673135 FB |
535 | @section Hardware interrupts |
536 | ||
e1b4382c | 537 | In order to be faster, QEMU does not check at every basic block if a |
e8dc0938 | 538 | hardware interrupt is pending. Instead, the user must asynchronously |
1f673135 FB |
539 | call a specific function to tell that an interrupt is pending. This |
540 | function resets the chaining of the currently executing basic | |
541 | block. It ensures that the execution will return soon in the main loop | |
542 | of the CPU emulator. Then the main loop can test if the interrupt is | |
543 | pending and handle it. | |
544 | ||
debc7065 | 545 | @node User emulation specific details |
1f673135 FB |
546 | @section User emulation specific details |
547 | ||
548 | @subsection Linux system call translation | |
549 | ||
550 | QEMU includes a generic system call translator for Linux. It means that | |
551 | the parameters of the system calls can be converted to fix the | |
552 | endianness and 32/64 bit issues. The IOCTLs are converted with a generic | |
553 | type description system (see @file{ioctls.h} and @file{thunk.c}). | |
554 | ||
555 | QEMU supports host CPUs which have pages bigger than 4KB. It records all | |
556 | the mappings the process does and tries to emulate the @code{mmap()} | |
557 | system calls in cases where the host @code{mmap()} call would fail | |
558 | because of bad page alignment. | |
559 | ||
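The alignment problem can be made concrete with a little arithmetic. The helpers below are hypothetical, and the policy they encode (a fixed-address request needs emulation when either end is off a host page boundary) is an assumption chosen for illustration, not a description of QEMU's actual rules.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round addr down to a host page boundary (host_page is a power of two). */
static uint64_t host_page_floor(uint64_t addr, uint64_t host_page)
{
    return addr & ~(host_page - 1);
}

/* Round addr up to a host page boundary (host_page is a power of two). */
static uint64_t host_page_ceil(uint64_t addr, uint64_t host_page)
{
    return (addr + host_page - 1) & ~(host_page - 1);
}

/*
 * Assumed policy: the host mmap() cannot be used directly when the
 * requested range does not start and end on host page boundaries.
 */
static bool needs_mmap_emulation(uint64_t start, uint64_t len,
                                 uint64_t host_page)
{
    uint64_t end = start + len;
    return host_page_floor(start, host_page) != start ||
           host_page_ceil(end, host_page) != end;
}
```

For example, with 16KB host pages a 4KB mapping requested at address 0x1000 is flagged by this check, while a 16KB mapping at 0x4000 is not.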
560 | @subsection Linux signals | |
561 | ||
562 | Normal and real-time signals are queued along with their information | |
563 | (@code{siginfo_t}) as it is done in the Linux kernel. Then an interrupt | |
564 | request is done to the virtual CPU. When it is interrupted, one queued | |
565 | signal is handled by generating a stack frame in the virtual CPU as the | |
566 | Linux kernel does. The @code{sigreturn()} system call is emulated to return | |
567 | from the virtual signal handler. | |
568 | ||
569 | Some signals (such as SIGALRM) directly come from the host. Other | |
e8dc0938 | 570 | signals are synthesized from the virtual CPU exceptions such as SIGFPE |
1f673135 FB |
571 | when a division by zero is done (see @code{main.c:cpu_loop()}). |
572 | ||
573 | The blocked signal mask is still handled by the host Linux kernel so | |
574 | that most signal system calls can be redirected directly to the host | |
575 | Linux kernel. Only the @code{sigaction()} and @code{sigreturn()} system | |
576 | calls need to be fully emulated (see @file{signal.c}). | |
577 | ||
578 | @subsection clone() system call and threads | |
579 | ||
580 | The Linux clone() system call is usually used to create a thread. QEMU | |
581 | uses the host clone() system call so that real host threads are created | |
582 | for each emulated thread. One virtual CPU instance is created for each | |
583 | thread. | |
584 | ||
585 | The virtual x86 CPU atomic operations are emulated with a global lock so | |
586 | that their semantics are preserved. | |
587 | ||
588 | Note that currently there are still some locking issues in QEMU. In | |
589 | particular, the translation cache flush is not protected yet against | |
590 | reentrancy. | |
591 | ||
592 | @subsection Self-virtualization | |
593 | ||
594 | QEMU was conceived so that ultimately it can emulate itself. Although | |
595 | it is not very useful, it is an important test to show the power of the | |
596 | emulator. | |
597 | ||
598 | Achieving self-virtualization is not easy because there may be address | |
998a0501 BS |
599 | space conflicts. QEMU user emulators solve this problem by being an |
600 | executable ELF shared object, like the ld-linux.so ELF interpreter. That | |
601 | way, it can be relocated at load time. | |
1f673135 | 602 | |
debc7065 | 603 | @node Bibliography |
1f673135 FB |
604 | @section Bibliography |
605 | ||
606 | @table @asis | |
607 | ||
5fafdf24 | 608 | @item [1] |
8e9620a6 TH |
609 | @url{http://bochs.sourceforge.net/}, the Bochs IA-32 Emulator Project, |
610 | by Kevin Lawton et al. | |
1f673135 FB |
611 | |
612 | @item [2] | |
8e9620a6 TH |
613 | @url{http://www.valgrind.org/}, Valgrind, an open-source memory debugger |
614 | for GNU/Linux. | |
1f673135 FB |
615 | |
616 | @item [3] | |
8e9620a6 TH |
617 | @url{http://ftp.dreamtime.org/pub/linux/Linux-Alpha/em86/v0.2/docs/em86.html}, |
618 | the EM86 x86 emulator on Alpha-Linux. | |
1f673135 FB |
619 | |
620 | @item [4] | |
debc7065 | 621 | @url{http://www.usenix.org/publications/library/proceedings/usenix-nt97/@/full_papers/chernoff/chernoff.pdf}, |
1f673135 FB |
622 | DIGITAL FX!32: Running 32-Bit x86 Applications on Alpha NT, by Anton |
623 | Chernoff and Ray Hookway. | |
624 | ||
8e9620a6 | 625 | @item [5] |
5fafdf24 | 626 | @url{http://user-mode-linux.sourceforge.net/}, |
1f673135 FB |
627 | The User-mode Linux Kernel. |
628 | ||
8e9620a6 | 629 | @item [6] |
5fafdf24 | 630 | @url{http://www.plex86.org/}, |
1f673135 FB |
631 | The new Plex86 project. |
632 | ||
8e9620a6 | 633 | @item [7] |
5fafdf24 | 634 | @url{http://www.vmware.com/}, |
1f673135 FB |
635 | The VMWare PC virtualizer. |
636 | ||
8e9620a6 TH |
637 | @item [8] |
638 | @url{https://www.microsoft.com/download/details.aspx?id=3702}, | |
1f673135 FB |
639 | The VirtualPC PC virtualizer. |
640 | ||
8e9620a6 | 641 | @item [9] |
998a0501 BS |
642 | @url{http://virtualbox.org/}, |
643 | The VirtualBox PC virtualizer. | |
644 | ||
8e9620a6 | 645 | @item [10] |
998a0501 BS |
646 | @url{http://www.xen.org/}, |
647 | The Xen hypervisor. | |
648 | ||
8e9620a6 TH |
649 | @item [11] |
650 | @url{http://www.linux-kvm.org/}, | |
998a0501 BS |
651 | Kernel Based Virtual Machine (KVM). |
652 | ||
8e9620a6 | 653 | @item [12] |
998a0501 BS |
654 | @url{http://www.greensocs.com/projects/QEMUSystemC}, |
655 | QEMU-SystemC, a hardware co-simulator. | |
656 | ||
1f673135 FB |
657 | @end table |
658 | ||
debc7065 | 659 | @node Regression Tests |
1f673135 FB |
660 | @chapter Regression Tests |
661 | ||
662 | In the directory @file{tests/}, various interesting testing programs | |
b1f45238 | 663 | are available. They are used for regression testing. |
1f673135 | 664 | |
debc7065 FB |
665 | @menu |
666 | * test-i386:: | |
667 | * linux-test:: | |
debc7065 FB |
668 | @end menu |
669 | ||
670 | @node test-i386 | |
1f673135 FB |
671 | @section @file{test-i386} |
672 | ||
673 | This program executes most of the 16 bit and 32 bit x86 instructions and | |
674 | generates a text output. It can be compared with the output obtained with | |
675 | a real CPU or another emulator. The target @code{make test} runs this | |
676 | program and a @code{diff} on the generated output. | |
677 | ||
678 | The Linux system call @code{modify_ldt()} is used to create x86 selectors | |
679 | to test some 16 bit addressing and 32 bit with segmentation cases. | |
680 | ||
681 | The Linux system call @code{vm86()} is used to test vm86 emulation. | |
682 | ||
683 | Various exceptions are raised to test most of the x86 user space | |
684 | exception reporting. | |
685 | ||
debc7065 | 686 | @node linux-test |
1f673135 FB |
687 | @section @file{linux-test} |
688 | ||
689 | This program tests various Linux system calls. It is used to verify | |
690 | that the system call parameters are correctly converted between target | |
691 | and host CPUs. | |
692 | ||
debc7065 | 693 | @bye |