The Definitive KVM (Kernel-based Virtual Machine) API Documentation
===================================================================

1. General description

The kvm API is a set of ioctls that are issued to control various aspects
of a virtual machine. The ioctls belong to three classes:

 - System ioctls: These query and set global attributes which affect the
   whole kvm subsystem. In addition a system ioctl is used to create
   virtual machines.

 - VM ioctls: These query and set attributes that affect an entire virtual
   machine, for example memory layout. In addition a VM ioctl is used to
   create virtual cpus (vcpus).

   Only run VM ioctls from the same process (address space) that was used
   to create the VM.

 - vcpu ioctls: These query and set attributes that control the operation
   of a single virtual cpu.

   Only run vcpu ioctls from the same thread that was used to create the
   vcpu.

2. File descriptors

The kvm API is centered around file descriptors. An initial
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
handle will create a VM file descriptor which can be used to issue VM
ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
and return a file descriptor pointing to it. Finally, ioctls on a vcpu
fd can be used to control the vcpu, including the important task of
actually running guest code.
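
The chain of file descriptors described above could be sketched in C
roughly as follows. This is a minimal illustration, not a reference
implementation: struct kvm_handles and kvm_open_handles() are hypothetical
names introduced here, the device path is supplied by the caller, and
vcpu id 0 is an arbitrary choice.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Hypothetical container for the three kinds of fd; not a kvm type. */
struct kvm_handles {
	int kvm_fd;	/* system fd, from open() on the kvm device */
	int vm_fd;	/* VM fd, from KVM_CREATE_VM */
	int vcpu_fd;	/* vcpu fd, from KVM_CREATE_VCPU */
};

/* Returns 0 on success; on failure returns -1 and leaves all fds at -1. */
int kvm_open_handles(const char *dev_path, struct kvm_handles *h)
{
	assert(dev_path != NULL && h != NULL);
	h->kvm_fd = h->vm_fd = h->vcpu_fd = -1;

	h->kvm_fd = open(dev_path, O_RDWR);
	if (h->kvm_fd < 0) {
		h->kvm_fd = -1;
		return -1;
	}

	/* System ioctl on the kvm fd yields a VM fd. */
	h->vm_fd = ioctl(h->kvm_fd, KVM_CREATE_VM, 0);
	if (h->vm_fd < 0)
		goto fail;

	/* VM ioctl on the VM fd yields a vcpu fd (vcpu id 0 assumed). */
	h->vcpu_fd = ioctl(h->vm_fd, KVM_CREATE_VCPU, 0);
	if (h->vcpu_fd < 0)
		goto fail;

	return 0;

fail:
	if (h->vm_fd >= 0)
		close(h->vm_fd);
	close(h->kvm_fd);
	h->kvm_fd = h->vm_fd = h->vcpu_fd = -1;
	return -1;
}
```

A caller would typically pass "/dev/kvm" as dev_path and keep the three
descriptors in the process and thread that created them.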

In general file descriptors can be migrated among processes by means
of fork() and the SCM_RIGHTS facility of Unix domain sockets. These
kinds of tricks are explicitly not supported by kvm. While they will
not cause harm to the host, their actual behavior is not guaranteed by
the API. The only supported use is one virtual machine per process,
and one vcpu per thread.

3. Extensions

As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
incompatible changes are allowed. However, there is an extension
facility that allows backward-compatible extensions to the API to be
queried and used.

The extension mechanism is not based on the Linux version number.
Instead, kvm defines extension identifiers and a facility to query
whether a particular extension identifier is available. If it is, a
set of ioctls is available for application use.

4. API description

This section describes ioctls that can be used to control kvm guests.
For each ioctl, the following information is provided along with a
description:

  Capability: which KVM extension provides this ioctl. Can be 'basic',
      which means that it will be provided by any kernel that supports
      API version 12 (see section 4.1), or a KVM_CAP_xyz constant, which
      means availability needs to be checked with KVM_CHECK_EXTENSION
      (see section 4.4).

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Type: system, vm, or vcpu.

  Parameters: what parameters are accepted by the ioctl.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.

4.1 KVM_GET_API_VERSION

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: the constant KVM_API_VERSION (=12)

This identifies the API version as the stable kvm API. It is not
expected that this number will change. However, Linux 2.6.20 and
2.6.21 report earlier versions; these are not documented and not
supported. Applications should refuse to run if KVM_GET_API_VERSION
returns a value other than 12. If this check passes, all ioctls
described as 'basic' will be available.
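
A start-up check along these lines might look like the sketch below. The
helper name kvm_api_is_stable() is hypothetical; the only kvm facilities
assumed are the KVM_GET_API_VERSION ioctl and the KVM_API_VERSION constant
from <linux/kvm.h>.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Returns 1 if kvm_fd reports the stable API version, 0 otherwise
 * (including when the ioctl itself fails and returns -1).
 */
int kvm_api_is_stable(int kvm_fd)
{
	int version;

	/* The text above fixes the stable version at 12. */
	assert(KVM_API_VERSION == 12);

	version = ioctl(kvm_fd, KVM_GET_API_VERSION, 0);
	return version == KVM_API_VERSION;
}
```

An application would call this once on the fd obtained from
open("/dev/kvm") and refuse to continue if it returns 0.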

4.2 KVM_CREATE_VM

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: a VM fd that can be used to control the new virtual machine.

The new VM has no virtual cpus and no memory. An mmap() of a VM fd
will access the virtual machine's physical address space; offset zero
corresponds to guest physical address zero. Use of mmap() on a VM fd
is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
available.

4.3 KVM_GET_MSR_INDEX_LIST

Capability: basic
Architectures: x86
Type: system ioctl
Parameters: struct kvm_msr_list (in/out)
Returns: 0 on success; -1 on error
Errors:
  E2BIG: the msr index list is too big to fit in the array specified by
         the user.

struct kvm_msr_list {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 indices[0];
};

This ioctl returns the guest msrs that are supported. The list varies
by kvm version and host processor, but does not change otherwise. The
user fills in the size of the indices array in nmsrs, and in return
kvm adjusts nmsrs to reflect the actual number of msrs and fills in
the indices array with their numbers.

Note: if kvm indicates support for MCE (KVM_CAP_MCE), then the MCE bank MSRs
are not returned in the MSR list, as different vcpus can have a different
number of banks, as set via the KVM_X86_SETUP_MCE ioctl.

4.4 KVM_CHECK_EXTENSION

Capability: basic
Architectures: all
Type: system ioctl
Parameters: extension identifier (KVM_CAP_*)
Returns: 0 if unsupported; 1 (or some other positive integer) if supported

The API allows the application to query about extensions to the core
kvm API. Userspace passes an extension identifier (an integer) and
receives an integer that describes the extension availability.
Generally 0 means no and 1 means yes, but some extensions may report
additional information in the integer return value.
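
One way to wrap this query is sketched below. The helper name
kvm_extension_value() is hypothetical, and folding ioctl errors into
"unsupported" is a design choice made for this example rather than
something the API requires.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Returns the positive value reported for the extension, or 0 if the
 * extension is unsupported or the ioctl fails.
 */
int kvm_extension_value(int kvm_fd, long cap)
{
	int ret;

	assert(cap >= 0);	/* extension identifiers are non-negative */

	ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap);
	return ret > 0 ? ret : 0;
}
```

For a plain yes/no extension the caller only tests for a non-zero result;
for an extension that reports a quantity, the returned integer carries it.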

4.5 KVM_GET_VCPU_MMAP_SIZE

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: size of vcpu mmap area, in bytes

The KVM_RUN ioctl (see section 4.10) communicates with userspace via a
shared memory region. This ioctl returns the size of that region. See the
KVM_RUN documentation for details.

4.6 KVM_SET_MEMORY_REGION

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: struct kvm_memory_region (in)
Returns: 0 on success, -1 on error

This ioctl is obsolete and has been removed.

4.7 KVM_CREATE_VCPU

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: vcpu id (apic id on x86)
Returns: vcpu fd on success, -1 on error

This API adds a vcpu to a virtual machine. The vcpu id is a small integer
in the range [0, max_vcpus). You can use the KVM_CAP_NR_VCPUS capability of
the KVM_CHECK_EXTENSION ioctl() to determine the value for max_vcpus at
run-time. If KVM_CAP_NR_VCPUS does not exist, you should assume that
max_vcpus is 4.

On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
threads in one or more virtual CPU cores. (This is because the
hardware requires all the hardware threads in a CPU core to be in the
same partition.) The KVM_CAP_PPC_SMT capability indicates the number
of vcpus per virtual core (vcore). The vcore id is obtained by
dividing the vcpu id by the number of vcpus per vcore. The vcpus in a
given vcore will always be in the same physical core as each other
(though that might be a different physical core from time to time).
Userspace can control the threading (SMT) mode of the guest by its
allocation of vcpu ids. For example, if userspace wants
single-threaded guest vcpus, it should make all vcpu ids be a multiple
of the number of vcpus per vcore.
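
The id arithmetic for book3s_hv can be illustrated with the small sketch
below. Both function names are hypothetical. vcore_of() follows the
division rule stated above; vcpu_id_for() is one possible allocation
scheme, and its generalisation to more than one guest thread per vcore is
an assumption of this example, not something the text above specifies.

```c
#include <assert.h>

/* vcore id = vcpu id / (vcpus per vcore), as described above. */
unsigned int vcore_of(unsigned int vcpu_id, unsigned int vcpus_per_vcore)
{
	assert(vcpus_per_vcore > 0);
	return vcpu_id / vcpus_per_vcore;
}

/*
 * vcpu id for the n-th guest cpu when the guest should see guest_threads
 * threads per core and the host reports vcpus_per_vcore via
 * KVM_CAP_PPC_SMT. With guest_threads == 1 every id is a multiple of
 * vcpus_per_vcore, which is the single-threaded case from the text.
 */
unsigned int vcpu_id_for(unsigned int n, unsigned int guest_threads,
			 unsigned int vcpus_per_vcore)
{
	assert(guest_threads > 0 && guest_threads <= vcpus_per_vcore);
	return (n / guest_threads) * vcpus_per_vcore + n % guest_threads;
}
```

With vcpus_per_vcore = 4 and guest_threads = 1, guest cpus 0, 1, 2 receive
ids 0, 4, 8, so each vcpu lands in its own vcore.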

4.8 KVM_GET_DIRTY_LOG (vm ioctl)

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_dirty_log (in/out)
Returns: 0 on success, -1 on error

/* for KVM_GET_DIRTY_LOG */
struct kvm_dirty_log {
	__u32 slot;
	__u32 padding1;
	union {
		void __user *dirty_bitmap; /* one bit per page */
		__u64 padding2;
	};
};

Given a memory slot, return a bitmap containing any pages dirtied
since the last call to this ioctl. Bit 0 is the first page in the
memory slot. Ensure the entire structure is cleared to avoid padding
issues.
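
Consuming the returned bitmap could look like the sketch below. All three
helper names are hypothetical. The sketch assumes that bit i of the bitmap
lives in byte i / 8 at bit position i % 8 (little-endian bit order), and
that the buffer is sized in whole 64-bit words; both are assumptions of
this example that should be checked against the kernel in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Buffer size for one bit per page, rounded up to 64-bit words (assumed). */
size_t dirty_bitmap_bytes(uint64_t slot_bytes, uint64_t page_size)
{
	uint64_t pages;

	assert(page_size > 0);
	pages = (slot_bytes + page_size - 1) / page_size;
	return (size_t)((pages + 63) / 64 * 8);
}

/* Returns 1 if the given page of the slot is marked dirty, else 0. */
int page_is_dirty(const uint8_t *bitmap, uint64_t page)
{
	return (bitmap[page / 8] >> (page % 8)) & 1;
}

/* Counts the dirty pages among the first npages pages of the slot. */
size_t count_dirty_pages(const uint8_t *bitmap, uint64_t npages)
{
	size_t count = 0;
	uint64_t i;

	for (i = 0; i < npages; i++)
		count += (size_t)page_is_dirty(bitmap, i);
	return count;
}
```

The caller would zero a struct kvm_dirty_log, set slot and dirty_bitmap to
a buffer of at least dirty_bitmap_bytes() bytes, issue the ioctl on the VM
fd, and then walk the buffer with the helpers above.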

4.9 KVM_SET_MEMORY_ALIAS

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_memory_alias (in)
Returns: 0 (success), -1 (error)

This ioctl is obsolete and has been removed.

4.10 KVM_RUN

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error
Errors:
  EINTR: an unmasked signal is pending

This ioctl is used to run a guest virtual cpu. While there are no
explicit parameters, there is an implicit parameter block that can be
obtained by mmap()ing the vcpu fd at offset 0, with the size given by
KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
kvm_run' (see below).
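
Mapping the implicit parameter block might be done as in the sketch below.
The helper name map_kvm_run() is hypothetical, and the choice of
PROT_READ | PROT_WRITE with MAP_SHARED is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

/*
 * Maps the kvm_run block of a vcpu. kvm_fd is the system fd (used for
 * KVM_GET_VCPU_MMAP_SIZE), vcpu_fd the vcpu fd being mapped at offset 0.
 * On success returns the mapping and stores its size in *size_out; on
 * failure returns NULL and leaves *size_out untouched.
 */
struct kvm_run *map_kvm_run(int kvm_fd, int vcpu_fd, size_t *size_out)
{
	int size;
	void *p;

	assert(size_out != NULL);

	size = ioctl(kvm_fd, KVM_GET_VCPU_MMAP_SIZE, 0);
	if (size <= 0)
		return NULL;

	p = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE, MAP_SHARED,
		 vcpu_fd, 0);
	if (p == MAP_FAILED)
		return NULL;

	*size_out = (size_t)size;
	return p;
}
```

After a successful mapping, the vcpu thread would call
ioctl(vcpu_fd, KVM_RUN, 0) and inspect the mapped block each time the ioctl
returns; a return of -1 with errno set to EINTR means a signal interrupted
the run.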

4.11 KVM_GET_REGS

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_regs (out)
Returns: 0 on success, -1 on error

Reads the general purpose registers from the vcpu.

/* x86 */
struct kvm_regs {
	/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
	__u64 rax, rbx, rcx, rdx;
	__u64 rsi, rdi, rsp, rbp;
	__u64 r8, r9, r10, r11;
	__u64 r12, r13, r14, r15;
	__u64 rip, rflags;
};

4.12 KVM_SET_REGS

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_regs (in)
Returns: 0 on success, -1 on error

Writes the general purpose registers into the vcpu.

See KVM_GET_REGS for the data structure.

4.13 KVM_GET_SREGS

Capability: basic
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_sregs (out)
Returns: 0 on success, -1 on error

Reads special registers from the vcpu.

/* x86 */
struct kvm_sregs {
	struct kvm_segment cs, ds, es, fs, gs, ss;
	struct kvm_segment tr, ldt;
	struct kvm_dtable gdt, idt;
	__u64 cr0, cr2, cr3, cr4, cr8;
	__u64 efer;
	__u64 apic_base;
	__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
};

/* ppc -- see arch/powerpc/include/asm/kvm.h */

interrupt_bitmap is a bitmap of pending external interrupts. At most
one bit may be set. This interrupt has been acknowledged by the APIC
but not yet injected into the cpu core.

4.14 KVM_SET_SREGS

Capability: basic
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_sregs (in)
Returns: 0 on success, -1 on error

Writes special registers into the vcpu. See KVM_GET_SREGS for the
data structures.

4.15 KVM_TRANSLATE

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_translation (in/out)
Returns: 0 on success, -1 on error

Translates a virtual address according to the vcpu's current address
translation mode.

struct kvm_translation {
	/* in */
	__u64 linear_address;

	/* out */
	__u64 physical_address;
	__u8 valid;
	__u8 writeable;
	__u8 usermode;
	__u8 pad[5];
};

4.16 KVM_INTERRUPT

Capability: basic
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_interrupt (in)
Returns: 0 on success, -1 on error

Queues a hardware interrupt vector to be injected. This is only
useful if in-kernel local APIC or equivalent is not used.

/* for KVM_INTERRUPT */
struct kvm_interrupt {
	/* in */
	__u32 irq;
};

X86:

Note 'irq' is an interrupt vector, not an interrupt pin or line.

PPC:

Queues an external interrupt to be injected. This ioctl is overloaded
with 3 different irq values:

a) KVM_INTERRUPT_SET

  This injects an edge type external interrupt into the guest once it's ready
  to receive interrupts. When injected, the interrupt is done.

b) KVM_INTERRUPT_UNSET

  This unsets any pending interrupt.

  Only available with KVM_CAP_PPC_UNSET_IRQ.

c) KVM_INTERRUPT_SET_LEVEL

  This injects a level type external interrupt into the guest context. The
  interrupt stays pending until a specific ioctl with KVM_INTERRUPT_UNSET
  is triggered.

  Only available with KVM_CAP_PPC_IRQ_LEVEL.

Note that any value for 'irq' other than the ones stated above is invalid
and incurs unexpected behavior.

4.17 KVM_DEBUG_GUEST

Capability: basic
Architectures: none
Type: vcpu ioctl
Parameters: none
Returns: -1 on error

Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead.

4.18 KVM_GET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in/out)
Returns: 0 on success, -1 on error

Reads model-specific registers from the vcpu. Supported msr indices can
be obtained using KVM_GET_MSR_INDEX_LIST.

struct kvm_msrs {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 pad;

	struct kvm_msr_entry entries[0];
};

struct kvm_msr_entry {
	__u32 index;
	__u32 reserved;
	__u64 data;
};

Application code should set the 'nmsrs' member (which indicates the
size of the entries array) and the 'index' member of each array entry.
kvm will fill in the 'data' member.

4.19 KVM_SET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in)
Returns: 0 on success, -1 on error

Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the
data structures.

Application code should set the 'nmsrs' member (which indicates the
size of the entries array), and the 'index' and 'data' members of each
array entry.

4.20 KVM_SET_CPUID

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_cpuid (in)
Returns: 0 on success, -1 on error

Defines the vcpu responses to the cpuid instruction. Applications
should use the KVM_SET_CPUID2 ioctl if available.

struct kvm_cpuid_entry {
	__u32 function;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding;
};

/* for KVM_SET_CPUID */
struct kvm_cpuid {
	__u32 nent;
	__u32 padding;
	struct kvm_cpuid_entry entries[0];
};

4.21 KVM_SET_SIGNAL_MASK

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_signal_mask (in)
Returns: 0 on success, -1 on error

Defines which signals are blocked during execution of KVM_RUN. This
signal mask temporarily overrides the thread's signal mask. Any
unblocked signal received (except SIGKILL and SIGSTOP, which retain
their traditional behaviour) will cause KVM_RUN to return with -EINTR.

Note the signal will only be delivered if not blocked by the original
signal mask.

/* for KVM_SET_SIGNAL_MASK */
struct kvm_signal_mask {
	__u32 len;
	__u8 sigset[0];
};

4.22 KVM_GET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (out)
Returns: 0 on success, -1 on error

Reads the floating point state from the vcpu.

/* for KVM_GET_FPU and KVM_SET_FPU */
struct kvm_fpu {
	__u8 fpr[8][16];
	__u16 fcw;
	__u16 fsw;
	__u8 ftwx; /* in fxsave format */
	__u8 pad1;
	__u16 last_opcode;
	__u64 last_ip;
	__u64 last_dp;
	__u8 xmm[16][16];
	__u32 mxcsr;
	__u32 pad2;
};

4.23 KVM_SET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (in)
Returns: 0 on success, -1 on error

Writes the floating point state to the vcpu.

/* for KVM_GET_FPU and KVM_SET_FPU */
struct kvm_fpu {
	__u8 fpr[8][16];
	__u16 fcw;
	__u16 fsw;
	__u8 ftwx; /* in fxsave format */
	__u8 pad1;
	__u16 last_opcode;
	__u64 last_ip;
	__u64 last_dp;
	__u8 xmm[16][16];
	__u32 mxcsr;
	__u32 pad2;
};

4.24 KVM_CREATE_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: none
Returns: 0 on success, -1 on error

Creates an interrupt controller model in the kernel. On x86, creates a virtual
ioapic, a virtual PIC (two PICs, nested), and sets up future vcpus to have a
local APIC. IRQ routing for GSIs 0-15 is set to both PIC and IOAPIC; GSI 16-23
only go to the IOAPIC. On ia64, an IOSAPIC is created.

4.25 KVM_IRQ_LINE

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: struct kvm_irq_level
Returns: 0 on success, -1 on error

Sets the level of a GSI input to the interrupt controller model in the kernel.
Requires that an interrupt controller model has been previously created with
KVM_CREATE_IRQCHIP. Note that edge-triggered interrupts require the level
to be set to 1 and then back to 0.

struct kvm_irq_level {
	union {
		__u32 irq; /* GSI */
		__s32 status; /* not used for KVM_IRQ_LEVEL */
	};
	__u32 level; /* 0 or 1 */
};
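
The "set to 1 and then back to 0" sequence for an edge-triggered interrupt
could be wrapped as in the sketch below. Both helper names are hypothetical,
and the sketch assumes an in-kernel interrupt controller has already been
created with KVM_CREATE_IRQCHIP on the same VM fd.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Drives one GSI input to the given level (0 or 1). */
int kvm_set_irq_level(int vm_fd, unsigned int gsi, unsigned int level)
{
	struct kvm_irq_level irq;

	assert(level <= 1);

	memset(&irq, 0, sizeof(irq));
	irq.irq = gsi;
	irq.level = level;
	return ioctl(vm_fd, KVM_IRQ_LINE, &irq);
}

/* Edge-triggered delivery: raise the line, then lower it again. */
int kvm_pulse_irq(int vm_fd, unsigned int gsi)
{
	if (kvm_set_irq_level(vm_fd, gsi, 1) < 0)
		return -1;
	return kvm_set_irq_level(vm_fd, gsi, 0);
}
```

A level-triggered source would instead call kvm_set_irq_level() with 1 when
the device asserts its line and with 0 when the device deasserts it.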

4.26 KVM_GET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: struct kvm_irqchip (in/out)
Returns: 0 on success, -1 on error

Reads the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP into a buffer provided by the caller.

struct kvm_irqchip {
	__u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
	__u32 pad;
	union {
		char dummy[512]; /* reserving space */
		struct kvm_pic_state pic;
		struct kvm_ioapic_state ioapic;
	} chip;
};

4.27 KVM_SET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: struct kvm_irqchip (in)
Returns: 0 on success, -1 on error

Sets the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP from a buffer provided by the caller.

struct kvm_irqchip {
	__u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
	__u32 pad;
	union {
		char dummy[512]; /* reserving space */
		struct kvm_pic_state pic;
		struct kvm_ioapic_state ioapic;
	} chip;
};

4.28 KVM_XEN_HVM_CONFIG

Capability: KVM_CAP_XEN_HVM
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_xen_hvm_config (in)
Returns: 0 on success, -1 on error

Sets the MSR that the Xen HVM guest uses to initialize its hypercall
page, and provides the starting address and size of the hypercall
blobs in userspace. When the guest writes the MSR, kvm copies one
page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
memory.

struct kvm_xen_hvm_config {
	__u32 flags;
	__u32 msr;
	__u64 blob_addr_32;
	__u64 blob_addr_64;
	__u8 blob_size_32;
	__u8 blob_size_64;
	__u8 pad2[30];
};

4.29 KVM_GET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (out)
Returns: 0 on success, -1 on error

Gets the current timestamp of kvmclock as seen by the current guest. In
conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity on scenarios
such as migration.

struct kvm_clock_data {
	__u64 clock; /* kvmclock current value */
	__u32 flags;
	__u32 pad[9];
};

4.30 KVM_SET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (in)
Returns: 0 on success, -1 on error

Sets the current timestamp of kvmclock to the value specified in its parameter.
In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity on scenarios
such as migration.

struct kvm_clock_data {
	__u64 clock; /* kvmclock current value */
	__u32 flags;
	__u32 pad[9];
};

4.31 KVM_GET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (out)
Returns: 0 on success, -1 on error

Gets currently pending exceptions, interrupts, and NMIs as well as related
states of the vcpu.

struct kvm_vcpu_events {
	struct {
		__u8 injected;
		__u8 nr;
		__u8 has_error_code;
		__u8 pad;
		__u32 error_code;
	} exception;
	struct {
		__u8 injected;
		__u8 nr;
		__u8 soft;
		__u8 shadow;
	} interrupt;
	struct {
		__u8 injected;
		__u8 pending;
		__u8 masked;
		__u8 pad;
	} nmi;
	__u32 sipi_vector;
	__u32 flags;
};

KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
interrupt.shadow contains a valid state. Otherwise, this field is undefined.

4.32 KVM_SET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (in)
Returns: 0 on success, -1 on error

Sets pending exceptions, interrupts, and NMIs as well as related states of the
vcpu.

See KVM_GET_VCPU_EVENTS for the data structure.

Fields that may be modified asynchronously by running VCPUs can be excluded
from the update. These fields are nmi.pending and sipi_vector. Keep the
corresponding bits in the flags field cleared to suppress overwriting the
current in-kernel state. The bits are:

KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector

If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
the flags field to signal that interrupt.shadow contains a valid state and
shall be written into the VCPU.

4.33 KVM_GET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (out)
Returns: 0 on success, -1 on error

Reads debug registers from the vcpu.

struct kvm_debugregs {
	__u64 db[4];
	__u64 dr6;
	__u64 dr7;
	__u64 flags;
	__u64 reserved[9];
};

4.34 KVM_SET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (in)
Returns: 0 on success, -1 on error

Writes debug registers into the vcpu.

See KVM_GET_DEBUGREGS for the data structure. The flags field is not yet
used and must be cleared on entry.

4.35 KVM_SET_USER_MEMORY_REGION

Capability: KVM_CAP_USER_MEMORY
Architectures: all
Type: vm ioctl
Parameters: struct kvm_userspace_memory_region (in)
Returns: 0 on success, -1 on error

struct kvm_userspace_memory_region {
	__u32 slot;
	__u32 flags;
	__u64 guest_phys_addr;
	__u64 memory_size; /* bytes */
	__u64 userspace_addr; /* start of the userspace allocated memory */
};

/* for kvm_memory_region::flags */
#define KVM_MEM_LOG_DIRTY_PAGES 1UL

This ioctl allows the user to create or modify a guest physical memory
slot. When changing an existing slot, it may be moved in the guest
physical memory space, or its flags may be modified. It may not be
resized. Slots may not overlap in guest physical address space.

Memory for the region is taken starting at the address denoted by the
field userspace_addr, which must point at user addressable memory for
the entire memory slot size. Any object may back this memory, including
anonymous memory, ordinary files, and hugetlbfs.

It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical. This allows large pages in the guest to be backed by large
pages in the host.
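
A sketch of registering a slot, together with a check of the 21-bit
recommendation above, is given below. The names LARGE_PAGE_MASK,
large_page_friendly() and kvm_set_slot() are hypothetical; how the backing
memory is allocated is left to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#define LARGE_PAGE_MASK ((1ULL << 21) - 1)	/* the lower 21 bits */

/* Returns 1 if both addresses agree in their lower 21 bits, else 0. */
int large_page_friendly(uint64_t guest_phys_addr, uint64_t userspace_addr)
{
	return ((guest_phys_addr ^ userspace_addr) & LARGE_PAGE_MASK) == 0;
}

/* Creates or modifies one memory slot; returns the ioctl result. */
int kvm_set_slot(int vm_fd, uint32_t slot, uint64_t guest_phys_addr,
		 void *host_mem, uint64_t size, uint32_t flags)
{
	struct kvm_userspace_memory_region region;

	/* Only KVM_MEM_LOG_DIRTY_PAGES is described in this section. */
	assert((flags & ~KVM_MEM_LOG_DIRTY_PAGES) == 0);

	memset(&region, 0, sizeof(region));
	region.slot = slot;
	region.flags = flags;
	region.guest_phys_addr = guest_phys_addr;
	region.memory_size = size;
	region.userspace_addr = (uint64_t)(uintptr_t)host_mem;
	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}
```

A caller that wants large-page backing could test large_page_friendly()
before registering the slot and adjust its allocation if the check fails.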

The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which
instructs kvm to keep track of writes to memory within the slot. See
the KVM_GET_DIRTY_LOG ioctl.

When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
mmap() that affects the region will be made visible immediately. Another
example is madvise(MADV_DROP).

It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
allocation and is deprecated.

4.36 KVM_SET_TSS_ADDR

Capability: KVM_CAP_SET_TSS_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long tss_address (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a three-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).

4.37 KVM_ENABLE_CAP

Capability: KVM_CAP_ENABLE_CAP
Architectures: ppc
Type: vcpu ioctl
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error

Not all extensions are enabled by default. Using this ioctl the application
can enable an extension, making it available to the guest.

On systems that do not support this ioctl, it always fails. On systems that
do support it, it only works for extensions that are supported for enablement.

To check if a capability can be enabled, the KVM_CHECK_EXTENSION ioctl should
be used.

struct kvm_enable_cap {
       /* in */
       __u32 cap;

The capability that is supposed to get enabled.

       __u32 flags;

A bitfield indicating future enhancements. Has to be 0 for now.

       __u64 args[4];

Arguments for enabling a feature. If a feature needs initial values to
function properly, this is the place to put them.

       __u8 pad[64];
};
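
Filling in the structure could be done as in the sketch below. The helper
name kvm_enable_cap() and its args/nargs parameters are hypothetical
conveniences introduced for this example; which capability values are
accepted depends on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Enables one capability on a vcpu. Up to four optional arguments are
 * copied into args[]; flags and pad are left at 0 as required above.
 */
int kvm_enable_cap(int vcpu_fd, uint32_t cap, const uint64_t *args,
		   size_t nargs)
{
	struct kvm_enable_cap ec;
	size_t i;

	assert(nargs <= 4);	/* args[] has four slots */

	memset(&ec, 0, sizeof(ec));
	ec.cap = cap;
	for (i = 0; i < nargs; i++)
		ec.args[i] = args[i];

	return ioctl(vcpu_fd, KVM_ENABLE_CAP, &ec);
}
```

An application would first confirm with KVM_CHECK_EXTENSION that the
capability can be enabled, then call the helper on the vcpu fd.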

4.38 KVM_GET_MP_STATE

Capability: KVM_CAP_MP_STATE
Architectures: x86, ia64
Type: vcpu ioctl
Parameters: struct kvm_mp_state (out)
Returns: 0 on success; -1 on error

struct kvm_mp_state {
	__u32 mp_state;
};

Returns the vcpu's current "multiprocessing state" (though also valid on
uniprocessor guests).

Possible values are:

 - KVM_MP_STATE_RUNNABLE:        the vcpu is currently running
 - KVM_MP_STATE_UNINITIALIZED:   the vcpu is an application processor (AP)
                                 which has not yet received an INIT signal
 - KVM_MP_STATE_INIT_RECEIVED:   the vcpu has received an INIT signal, and is
                                 now ready for a SIPI
 - KVM_MP_STATE_HALTED:          the vcpu has executed a HLT instruction and
                                 is waiting for an interrupt
 - KVM_MP_STATE_SIPI_RECEIVED:   the vcpu has just received a SIPI (vector
                                 accessible via KVM_GET_VCPU_EVENTS)

This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel
irqchip, the multiprocessing state must be maintained by userspace.

4.39 KVM_SET_MP_STATE

Capability: KVM_CAP_MP_STATE
Architectures: x86, ia64
Type: vcpu ioctl
Parameters: struct kvm_mp_state (in)
Returns: 0 on success; -1 on error

Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
arguments.

This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel
irqchip, the multiprocessing state must be maintained by userspace.

4.40 KVM_SET_IDENTITY_MAP_ADDR

Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long identity (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a one-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).

4.41 KVM_SET_BOOT_CPU_ID

Capability: KVM_CAP_SET_BOOT_CPU_ID
Architectures: x86, ia64
Type: vm ioctl
Parameters: unsigned long vcpu_id
Returns: 0 on success, -1 on error

Define which vcpu is the Bootstrap Processor (BSP). Values are the same
as the vcpu id in KVM_CREATE_VCPU. If this ioctl is not called, the default
is vcpu 0.

4.42 KVM_GET_XSAVE

Capability: KVM_CAP_XSAVE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xsave (out)
Returns: 0 on success, -1 on error

struct kvm_xsave {
	__u32 region[1024];
};

This ioctl copies the current vcpu's xsave struct to userspace.

4.43 KVM_SET_XSAVE

Capability: KVM_CAP_XSAVE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xsave (in)
Returns: 0 on success, -1 on error

struct kvm_xsave {
	__u32 region[1024];
};

This ioctl copies userspace's xsave struct to the kernel.

4.44 KVM_GET_XCRS

Capability: KVM_CAP_XCRS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xcrs (out)
Returns: 0 on success, -1 on error

struct kvm_xcr {
	__u32 xcr;
	__u32 reserved;
	__u64 value;
};

struct kvm_xcrs {
	__u32 nr_xcrs;
	__u32 flags;
	struct kvm_xcr xcrs[KVM_MAX_XCRS];
	__u64 padding[16];
};

This ioctl copies the current vcpu's xcrs to userspace.

4.45 KVM_SET_XCRS

Capability: KVM_CAP_XCRS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xcrs (in)
Returns: 0 on success, -1 on error

struct kvm_xcr {
	__u32 xcr;
	__u32 reserved;
	__u64 value;
};

struct kvm_xcrs {
	__u32 nr_xcrs;
	__u32 flags;
	struct kvm_xcr xcrs[KVM_MAX_XCRS];
	__u64 padding[16];
};

This ioctl sets the vcpu's xcrs to the values specified by userspace.

4.46 KVM_GET_SUPPORTED_CPUID

Capability: KVM_CAP_EXT_CPUID
Architectures: x86
Type: system ioctl
Parameters: struct kvm_cpuid2 (in/out)
Returns: 0 on success, -1 on error

struct kvm_cpuid2 {
	__u32 nent;
	__u32 padding;
	struct kvm_cpuid_entry2 entries[0];
};

#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX 1
#define KVM_CPUID_FLAG_STATEFUL_FUNC    2
#define KVM_CPUID_FLAG_STATE_READ_NEXT  4

struct kvm_cpuid_entry2 {
	__u32 function;
	__u32 index;
	__u32 flags;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding[3];
};

This ioctl returns x86 cpuid features which are supported by both the hardware
and kvm. Userspace can use the information returned by this ioctl to
construct cpuid information (for KVM_SET_CPUID2) that is consistent with
hardware, kernel, and userspace capabilities, and with user requirements (for
example, the user may wish to constrain cpuid to emulate older hardware,
or for feature consistency across a cluster).

Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
with the 'nent' field indicating the number of entries in the variable-size
array 'entries'. If the number of entries is too low to describe the cpu
capabilities, an error (E2BIG) is returned. If the number is too high,
the 'nent' field is adjusted and an error (ENOMEM) is returned. If the
number is just right, the 'nent' field is adjusted to the number of valid
entries in the 'entries' array, which is then filled.

The entries returned are the host cpuid as returned by the cpuid instruction,
with unknown or unsupported features masked out. Some features (for example,
x2apic), may not be present in the host cpu, but are exposed by kvm if it can
emulate them efficiently. The fields in each entry are defined as follows:

  function: the eax value used to obtain the entry
  index: the ecx value used to obtain the entry (for entries that are
         affected by ecx)
  flags: an OR of zero or more of the following:
        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
           if the index field is valid
        KVM_CPUID_FLAG_STATEFUL_FUNC:
1074 if cpuid for this function returns different values for successive
1075 invocations; there will be several entries with the same function,
1076 all with this flag set
1077 KVM_CPUID_FLAG_STATE_READ_NEXT:
1078 for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
1079 the first entry to be read by a cpu
1080 eax, ebx, ecx, edx: the values returned by the cpuid instruction for
1081 this function/index combination
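The variable-size layout and the meaning of the index flag can be sketched
as follows.  This is a minimal illustration using local mirrors of the
structures above; cpuid2_alloc() and cpuid2_find() are hypothetical helper
names.  A caller would pass the allocated buffer to the ioctl and, on
E2BIG, presumably retry with a larger 'nent'.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX 1

/* Local mirrors of the structures documented above. */
struct kvm_cpuid_entry2 {
        uint32_t function;
        uint32_t index;
        uint32_t flags;
        uint32_t eax, ebx, ecx, edx;
        uint32_t padding[3];
};

struct kvm_cpuid2 {
        uint32_t nent;
        uint32_t padding;
        struct kvm_cpuid_entry2 entries[];
};

/* Allocate a zeroed header followed by room for 'nent' entries. */
static struct kvm_cpuid2 *cpuid2_alloc(uint32_t nent)
{
        struct kvm_cpuid2 *c;

        c = calloc(1, sizeof(*c) + (size_t)nent * sizeof(c->entries[0]));
        if (c)
                c->nent = nent;
        return c;
}

/*
 * Find the entry for a function/index pair.  The index is compared only
 * for entries that carry KVM_CPUID_FLAG_SIGNIFCANT_INDEX.
 */
static const struct kvm_cpuid_entry2 *
cpuid2_find(const struct kvm_cpuid2 *c, uint32_t function, uint32_t index)
{
        for (uint32_t i = 0; i < c->nent; i++) {
                const struct kvm_cpuid_entry2 *e = &c->entries[i];

                if (e->function != function)
                        continue;
                if ((e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX) &&
                    e->index != index)
                        continue;
                return e;
        }
        return NULL;
}
```

The sketch does not handle KVM_CPUID_FLAG_STATEFUL_FUNC entries, which
need the successive-invocation ordering described above.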

4.47 KVM_PPC_GET_PVINFO

Capability: KVM_CAP_PPC_GET_PVINFO
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_ppc_pvinfo (out)
Returns: 0 on success, !0 on error

struct kvm_ppc_pvinfo {
        __u32 flags;
        __u32 hcall[4];
        __u8  pad[108];
};

This ioctl fetches PV specific information that needs to be passed to the
guest using the device tree or other means from vm context.

For now the only implemented piece of information distributed here is an array
of 4 instructions that make up a hypercall.

If any additional field gets added to this structure later on, a bit for that
additional piece of information will be set in the flags bitmap.

4.48 KVM_ASSIGN_PCI_DEVICE

Capability: KVM_CAP_DEVICE_ASSIGNMENT
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Assigns a host PCI device to the VM.

struct kvm_assigned_pci_dev {
        __u32 assigned_dev_id;
        __u32 busnr;
        __u32 devfn;
        __u32 flags;
        __u32 segnr;
        union {
                __u32 reserved[11];
        };
};

The PCI device is specified by the triple segnr, busnr, and devfn.
Identification in succeeding service requests is done via assigned_dev_id. The
following flags are specified:

/* Depends on KVM_CAP_IOMMU */
#define KVM_DEV_ASSIGN_ENABLE_IOMMU     (1 << 0)
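A minimal sketch of filling in the structure is shown below.  It assumes
the usual PCI encoding of devfn (slot number in bits 7:3, function number
in bits 2:0) and treats assigned_dev_id as an arbitrary handle chosen by
the caller; pci_devfn() and fill_assigned_dev() are hypothetical helpers.

```c
#include <stdint.h>
#include <string.h>

#define KVM_DEV_ASSIGN_ENABLE_IOMMU     (1 << 0)

/* Local mirror of the structure documented above. */
struct kvm_assigned_pci_dev {
        uint32_t assigned_dev_id;
        uint32_t busnr;
        uint32_t devfn;
        uint32_t flags;
        uint32_t segnr;
        union {
                uint32_t reserved[11];
        };
};

/* Usual PCI encoding: slot in bits 7:3, function in bits 2:0. */
static uint32_t pci_devfn(uint32_t slot, uint32_t func)
{
        return ((slot & 0x1f) << 3) | (func & 0x07);
}

/* Describe device seg:bus:slot.func; 'id' is a caller-chosen handle. */
static void fill_assigned_dev(struct kvm_assigned_pci_dev *d, uint32_t id,
                              uint32_t seg, uint32_t bus,
                              uint32_t slot, uint32_t func, int use_iommu)
{
        memset(d, 0, sizeof(*d));       /* also clears the reserved words */
        d->assigned_dev_id = id;
        d->segnr = seg;
        d->busnr = bus;
        d->devfn = pci_devfn(slot, func);
        if (use_iommu)
                d->flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
}
```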

4.49 KVM_DEASSIGN_PCI_DEVICE

Capability: KVM_CAP_DEVICE_DEASSIGNMENT
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Ends PCI device assignment, releasing all associated resources.

See KVM_ASSIGN_PCI_DEVICE for the data structure. Only assigned_dev_id is
used in kvm_assigned_pci_dev to identify the device.

4.50 KVM_ASSIGN_DEV_IRQ

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_irq (in)
Returns: 0 on success, -1 on error

Assigns an IRQ to a passed-through device.

struct kvm_assigned_irq {
        __u32 assigned_dev_id;
        __u32 host_irq; /* ignored (legacy field) */
        __u32 guest_irq;
        __u32 flags;
        union {
                __u32 reserved[12];
        };
};

The following flags are defined:

#define KVM_DEV_IRQ_HOST_INTX    (1 << 0)
#define KVM_DEV_IRQ_HOST_MSI     (1 << 1)
#define KVM_DEV_IRQ_HOST_MSIX    (1 << 2)

#define KVM_DEV_IRQ_GUEST_INTX   (1 << 8)
#define KVM_DEV_IRQ_GUEST_MSI    (1 << 9)
#define KVM_DEV_IRQ_GUEST_MSIX   (1 << 10)

It is not valid to specify multiple types per host or guest IRQ. However, the
IRQ type of host and guest can differ or can even be null.
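The rule above (at most one type on each side, either side may be empty)
could be checked in userspace before issuing the ioctl.  The following is
only a sketch; the HOST_TYPES and GUEST_TYPES masks and irq_flags_valid()
are names invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define KVM_DEV_IRQ_HOST_INTX    (1 << 0)
#define KVM_DEV_IRQ_HOST_MSI     (1 << 1)
#define KVM_DEV_IRQ_HOST_MSIX    (1 << 2)

#define KVM_DEV_IRQ_GUEST_INTX   (1 << 8)
#define KVM_DEV_IRQ_GUEST_MSI    (1 << 9)
#define KVM_DEV_IRQ_GUEST_MSIX   (1 << 10)

/* Hypothetical masks grouping the flags defined above. */
#define HOST_TYPES  (KVM_DEV_IRQ_HOST_INTX | KVM_DEV_IRQ_HOST_MSI | \
                     KVM_DEV_IRQ_HOST_MSIX)
#define GUEST_TYPES (KVM_DEV_IRQ_GUEST_INTX | KVM_DEV_IRQ_GUEST_MSI | \
                     KVM_DEV_IRQ_GUEST_MSIX)

/* True if zero or exactly one bit is set in v. */
static bool at_most_one_bit(uint32_t v)
{
        return (v & (v - 1)) == 0;
}

/* At most one host type and at most one guest type may be given. */
static bool irq_flags_valid(uint32_t flags)
{
        return at_most_one_bit(flags & HOST_TYPES) &&
               at_most_one_bit(flags & GUEST_TYPES);
}
```

For example, KVM_DEV_IRQ_HOST_MSI | KVM_DEV_IRQ_GUEST_INTX passes the
check, while KVM_DEV_IRQ_HOST_INTX | KVM_DEV_IRQ_HOST_MSI does not.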

4.51 KVM_DEASSIGN_DEV_IRQ

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_irq (in)
Returns: 0 on success, -1 on error

Ends an IRQ assignment to a passed-through device.

See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
by assigned_dev_id; flags must correspond to the IRQ type specified on
KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or guest IRQ is allowed.

4.52 KVM_SET_GSI_ROUTING

Capability: KVM_CAP_IRQ_ROUTING
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_irq_routing (in)
Returns: 0 on success, -1 on error

Sets the GSI routing table entries, overwriting any previously set entries.

struct kvm_irq_routing {
        __u32 nr;
        __u32 flags;
        struct kvm_irq_routing_entry entries[0];
};

No flags are specified so far; the corresponding field must be set to zero.

struct kvm_irq_routing_entry {
        __u32 gsi;
        __u32 type;
        __u32 flags;
        __u32 pad;
        union {
                struct kvm_irq_routing_irqchip irqchip;
                struct kvm_irq_routing_msi msi;
                __u32 pad[8];
        } u;
};

/* gsi routing entry types */
#define KVM_IRQ_ROUTING_IRQCHIP 1
#define KVM_IRQ_ROUTING_MSI 2

No flags are specified so far; the corresponding field must be set to zero.

struct kvm_irq_routing_irqchip {
        __u32 irqchip;
        __u32 pin;
};

struct kvm_irq_routing_msi {
        __u32 address_lo;
        __u32 address_hi;
        __u32 data;
        __u32 pad;
};
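Because each call replaces the whole table, userspace typically has to
build the complete set of entries in one buffer.  The sketch below shows
one way to do that with local mirrors of the structures above;
routing_alloc(), route_irqchip() and route_msi() are hypothetical helpers.
The buffer is zeroed so that both 'flags' fields stay zero as required.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define KVM_IRQ_ROUTING_IRQCHIP 1
#define KVM_IRQ_ROUTING_MSI 2

/* Local mirrors of the structures documented above. */
struct kvm_irq_routing_irqchip {
        uint32_t irqchip;
        uint32_t pin;
};

struct kvm_irq_routing_msi {
        uint32_t address_lo;
        uint32_t address_hi;
        uint32_t data;
        uint32_t pad;
};

struct kvm_irq_routing_entry {
        uint32_t gsi;
        uint32_t type;
        uint32_t flags;
        uint32_t pad;
        union {
                struct kvm_irq_routing_irqchip irqchip;
                struct kvm_irq_routing_msi msi;
                uint32_t pad[8];
        } u;
};

struct kvm_irq_routing {
        uint32_t nr;
        uint32_t flags;
        struct kvm_irq_routing_entry entries[];
};

/* Allocate a zeroed table with room for 'nr' entries. */
static struct kvm_irq_routing *routing_alloc(uint32_t nr)
{
        struct kvm_irq_routing *r;

        r = calloc(1, sizeof(*r) + (size_t)nr * sizeof(r->entries[0]));
        if (r)
                r->nr = nr;
        return r;
}

/* Route 'gsi' to pin 'pin' of in-kernel irqchip 'chip'. */
static void route_irqchip(struct kvm_irq_routing_entry *e, uint32_t gsi,
                          uint32_t chip, uint32_t pin)
{
        e->gsi = gsi;
        e->type = KVM_IRQ_ROUTING_IRQCHIP;
        e->u.irqchip.irqchip = chip;
        e->u.irqchip.pin = pin;
}

/* Route 'gsi' to an MSI message with a 64-bit address and 32-bit data. */
static void route_msi(struct kvm_irq_routing_entry *e, uint32_t gsi,
                      uint64_t addr, uint32_t data)
{
        e->gsi = gsi;
        e->type = KVM_IRQ_ROUTING_MSI;
        e->u.msi.address_lo = (uint32_t)addr;
        e->u.msi.address_hi = (uint32_t)(addr >> 32);
        e->u.msi.data = data;
}
```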

4.53 KVM_ASSIGN_SET_MSIX_NR

Capability: KVM_CAP_DEVICE_MSIX
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_msix_nr (in)
Returns: 0 on success, -1 on error

Set the number of MSI-X interrupts for an assigned device. The number is
reset again by terminating the MSI-X assignment of the device via
KVM_DEASSIGN_DEV_IRQ. Calling this service more than once at any earlier
point will fail.

struct kvm_assigned_msix_nr {
        __u32 assigned_dev_id;
        __u16 entry_nr;
        __u16 padding;
};

#define KVM_MAX_MSIX_PER_DEV            256

4.54 KVM_ASSIGN_SET_MSIX_ENTRY

Capability: KVM_CAP_DEVICE_MSIX
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_msix_entry (in)
Returns: 0 on success, -1 on error

Specifies the routing of an MSI-X assigned device interrupt to a GSI. Setting
the GSI vector to zero means disabling the interrupt.

struct kvm_assigned_msix_entry {
        __u32 assigned_dev_id;
        __u32 gsi;
        __u16 entry; /* The index of entry in the MSI-X table */
        __u16 padding[3];
};

4.54 KVM_SET_TSC_KHZ

Capability: KVM_CAP_TSC_CONTROL
Architectures: x86
Type: vcpu ioctl
Parameters: virtual tsc_khz
Returns: 0 on success, -1 on error

Specifies the tsc frequency for the virtual machine. The unit of the
frequency is kHz.

4.55 KVM_GET_TSC_KHZ

Capability: KVM_CAP_GET_TSC_KHZ
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: virtual tsc-khz on success, negative value on error

Returns the tsc frequency of the guest. The unit of the return value is
kHz. If the host has an unstable tsc, this ioctl returns -EIO as an
error instead.

4.56 KVM_GET_LAPIC

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_lapic_state (out)
Returns: 0 on success, -1 on error

#define KVM_APIC_REG_SIZE 0x400
struct kvm_lapic_state {
        char regs[KVM_APIC_REG_SIZE];
};

Reads the Local APIC registers and copies them into the input argument.  The
data format and layout are the same as documented in the architecture manual.
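Since the buffer follows the architectural register layout, a register can
be read at its documented byte offset.  The sketch below assumes the xAPIC
layout, where the Local APIC ID register lives at offset 0x20 and holds the
ID in bits 31:24, and assumes the buffer is in host byte order;
lapic_read32() and lapic_xapic_id() are hypothetical helpers.

```c
#include <stdint.h>
#include <string.h>

#define KVM_APIC_REG_SIZE 0x400

/* Local mirror of the structure documented above. */
struct kvm_lapic_state {
        char regs[KVM_APIC_REG_SIZE];
};

/* Assumption: xAPIC layout, Local APIC ID register at byte offset 0x20. */
#define APIC_ID_OFFSET 0x20

/* Read the 32-bit register stored at byte offset 'offset'. */
static uint32_t lapic_read32(const struct kvm_lapic_state *s, uint32_t offset)
{
        uint32_t v;

        memcpy(&v, s->regs + offset, sizeof(v));  /* avoids unaligned access */
        return v;
}

/* In xAPIC mode the APIC ID occupies bits 31:24 of the ID register. */
static uint32_t lapic_xapic_id(const struct kvm_lapic_state *s)
{
        return lapic_read32(s, APIC_ID_OFFSET) >> 24;
}
```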

4.57 KVM_SET_LAPIC

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_lapic_state (in)
Returns: 0 on success, -1 on error

#define KVM_APIC_REG_SIZE 0x400
struct kvm_lapic_state {
        char regs[KVM_APIC_REG_SIZE];
};

Copies the input argument into the Local APIC registers.  The data format
and layout are the same as documented in the architecture manual.

4.58 KVM_IOEVENTFD

Capability: KVM_CAP_IOEVENTFD
Architectures: all
Type: vm ioctl
Parameters: struct kvm_ioeventfd (in)
Returns: 0 on success, !0 on error

This ioctl attaches or detaches an ioeventfd to a legal pio/mmio address
within the guest.  A guest write to the registered address will signal the
provided event instead of triggering an exit.

struct kvm_ioeventfd {
        __u64 datamatch;
        __u64 addr;        /* legal pio/mmio address */
        __u32 len;         /* 1, 2, 4, or 8 bytes    */
        __s32 fd;
        __u32 flags;
        __u8  pad[36];
};

The following flags are defined:

#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
#define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
#define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)

If the datamatch flag is set, the event will be signaled only if the value
written to the registered address is equal to datamatch in struct
kvm_ioeventfd.
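The matching rule can be modelled in a few lines of userspace code.  This
is only an illustrative model of the behaviour described above, not the
kernel's implementation; the enum values (0, 1, 2) are assumed to match the
kernel headers, and ioeventfd_would_fire() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit numbers as in the kernel headers; use <linux/kvm.h>. */
enum {
        kvm_ioeventfd_flag_nr_datamatch,
        kvm_ioeventfd_flag_nr_pio,
        kvm_ioeventfd_flag_nr_deassign,
};

#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
#define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
#define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)

/* Local mirror of the structure documented above. */
struct kvm_ioeventfd {
        uint64_t datamatch;
        uint64_t addr;
        uint32_t len;
        int32_t  fd;
        uint32_t flags;
        uint8_t  pad[36];
};

/*
 * Model: would a guest write of 'len' bytes holding 'value' to 'addr'
 * (port I/O if is_pio, MMIO otherwise) signal the registration 'ev'?
 */
static bool ioeventfd_would_fire(const struct kvm_ioeventfd *ev, bool is_pio,
                                 uint64_t addr, uint32_t len, uint64_t value)
{
        bool ev_is_pio = (ev->flags & KVM_IOEVENTFD_FLAG_PIO) != 0;

        if (ev_is_pio != is_pio || ev->addr != addr || ev->len != len)
                return false;
        if (ev->flags & KVM_IOEVENTFD_FLAG_DATAMATCH)
                return ev->datamatch == value;
        return true;    /* no datamatch: any written value signals */
}
```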

4.62 KVM_CREATE_SPAPR_TCE

Capability: KVM_CAP_SPAPR_TCE
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_create_spapr_tce (in)
Returns: file descriptor for manipulating the created TCE table

This creates a virtual TCE (translation control entry) table, which
is an IOMMU for PAPR-style virtual I/O.  It is used to translate
logical addresses used in virtual I/O into guest physical addresses,
and provides a scatter/gather capability for PAPR virtual I/O.

/* for KVM_CAP_SPAPR_TCE */
struct kvm_create_spapr_tce {
        __u64 liobn;
        __u32 window_size;
};

The liobn field gives the logical IO bus number for which to create a
TCE table.  The window_size field specifies the size of the DMA window
which this TCE table will translate - the table will contain one 64-bit
TCE entry for every 4 KiB of the DMA window.
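The sizing rule above works out as follows; this is plain arithmetic, with
tce_entries() and tce_table_bytes() as hypothetical helper names, and it
assumes window_size is a multiple of 4 KiB.

```c
#include <stdint.h>

/* One TCE entry per 4 KiB (1 << 12 bytes) of DMA window. */
static uint64_t tce_entries(uint32_t window_size)
{
        return (uint64_t)window_size >> 12;
}

/* Each TCE entry is 64 bits (8 bytes) wide. */
static uint64_t tce_table_bytes(uint32_t window_size)
{
        return tce_entries(window_size) * sizeof(uint64_t);
}
```

For a 1 GiB window this gives 262144 entries, i.e. a 2 MiB table.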

When the guest issues an H_PUT_TCE hcall on a liobn for which a TCE
table has been created using this ioctl(), the kernel will handle it
in real mode, updating the TCE table.  H_PUT_TCE calls for other
liobns will cause a vm exit and must be handled by userspace.

The return value is a file descriptor which can be passed to mmap(2)
to map the created TCE table into userspace.  This lets userspace read
the entries written by kernel-handled H_PUT_TCE calls, and also lets
userspace update the TCE table directly which is useful in some
circumstances.

5. The kvm_run structure

Application code obtains a pointer to the kvm_run structure by
mmap()ing a vcpu fd.  From that point, application code can control
execution by changing fields in kvm_run prior to calling the KVM_RUN
ioctl, and obtain information about the reason KVM_RUN returned by
looking up structure members.

struct kvm_run {
        /* in */
        __u8 request_interrupt_window;

Request that KVM_RUN return when it becomes possible to inject external
interrupts into the guest.  Useful in conjunction with KVM_INTERRUPT.

        __u8 padding1[7];

        /* out */
        __u32 exit_reason;

When KVM_RUN has returned successfully (return value 0), this informs
application code why KVM_RUN has returned.  Allowable values for this
field are detailed below.

        __u8 ready_for_interrupt_injection;

If request_interrupt_window has been specified, this field indicates
an interrupt can be injected now with KVM_INTERRUPT.

        __u8 if_flag;

The value of the current interrupt flag.  Only valid if in-kernel
local APIC is not used.

        __u8 padding2[2];

        /* in (pre_kvm_run), out (post_kvm_run) */
        __u64 cr8;

The value of the cr8 register.  Only valid if in-kernel local APIC is
not used.  Both input and output.

        __u64 apic_base;

The value of the APIC BASE msr.  Only valid if in-kernel local
APIC is not used.  Both input and output.

        union {
                /* KVM_EXIT_UNKNOWN */
                struct {
                        __u64 hardware_exit_reason;
                } hw;

If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
reasons.  Further architecture-specific information is available in
hardware_exit_reason.

                /* KVM_EXIT_FAIL_ENTRY */
                struct {
                        __u64 hardware_entry_failure_reason;
                } fail_entry;

If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
to unknown reasons.  Further architecture-specific information is
available in hardware_entry_failure_reason.

                /* KVM_EXIT_EXCEPTION */
                struct {
                        __u32 exception;
                        __u32 error_code;
                } ex;

Unused.

                /* KVM_EXIT_IO */
                struct {
#define KVM_EXIT_IO_IN  0
#define KVM_EXIT_IO_OUT 1
                        __u8 direction;
                        __u8 size; /* bytes */
                        __u16 port;
                        __u32 count;
                        __u64 data_offset; /* relative to kvm_run start */
                } io;

If exit_reason is KVM_EXIT_IO, then the vcpu has
executed a port I/O instruction which could not be satisfied by kvm.
data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
where kvm expects application code to place the data for the next
KVM_RUN invocation (KVM_EXIT_IO_IN).  Data format is a packed array.

                struct {
                        struct kvm_debug_exit_arch arch;
                } debug;

Unused.

                /* KVM_EXIT_MMIO */
                struct {
                        __u64 phys_addr;
                        __u8  data[8];
                        __u32 len;
                        __u8  is_write;
                } mmio;

If exit_reason is KVM_EXIT_MMIO, then the vcpu has
executed a memory-mapped I/O instruction which could not be satisfied
by kvm.  The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.

NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the corresponding
operations are complete (and guest state is consistent) only after userspace
has re-entered the kernel with KVM_RUN.  The kernel side will first finish
incomplete operations and then check for pending signals.  Userspace
can re-enter the guest with an unmasked signal pending to complete
pending operations.

                /* KVM_EXIT_HYPERCALL */
                struct {
                        __u64 nr;
                        __u64 args[6];
                        __u64 ret;
                        __u32 longmode;
                        __u32 pad;
                } hypercall;

Unused.  This was once used for 'hypercall to userspace'.  To implement
such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
Note KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.

                /* KVM_EXIT_TPR_ACCESS */
                struct {
                        __u64 rip;
                        __u32 is_write;
                        __u32 pad;
                } tpr_access;

To be documented (KVM_TPR_ACCESS_REPORTING).

                /* KVM_EXIT_S390_SIEIC */
                struct {
                        __u8 icptcode;
                        __u64 mask; /* psw upper half */
                        __u64 addr; /* psw lower half */
                        __u16 ipa;
                        __u32 ipb;
                } s390_sieic;

s390 specific.

                /* KVM_EXIT_S390_RESET */
#define KVM_S390_RESET_POR       1
#define KVM_S390_RESET_CLEAR     2
#define KVM_S390_RESET_SUBSYSTEM 4
#define KVM_S390_RESET_CPU_INIT  8
#define KVM_S390_RESET_IPL       16
                __u64 s390_reset_flags;

s390 specific.

                /* KVM_EXIT_DCR */
                struct {
                        __u32 dcrn;
                        __u32 data;
                        __u8  is_write;
                } dcr;

powerpc specific.

                /* KVM_EXIT_OSI */
                struct {
                        __u64 gprs[32];
                } osi;

MOL uses a special hypercall interface it calls 'OSI'.  To enable it, we catch
hypercalls and exit with this exit struct that contains all the guest gprs.

If exit_reason is KVM_EXIT_OSI, then the vcpu has triggered such a hypercall.
Userspace can now handle the hypercall and when it's done modify the gprs as
necessary.  Upon guest entry all guest GPRs will then be replaced by the values
in this struct.

                /* KVM_EXIT_PAPR_HCALL */
                struct {
                        __u64 nr;
                        __u64 ret;
                        __u64 args[9];
                } papr_hcall;

This is used on 64-bit PowerPC when emulating a pSeries partition,
e.g. with the 'pseries' machine type in qemu.  It occurs when the
guest does a hypercall using the 'sc 1' instruction.  The 'nr' field
contains the hypercall number (from the guest R3), and 'args' contains
the arguments (from the guest R4 - R12).  Userspace should put the
return code in 'ret' and any extra returned values in args[].
The possible hypercalls are defined in the Power Architecture Platform
Requirements (PAPR) document available from www.power.org (free
developer registration required to access it).

                /* Fix the size of the union. */
                char padding[256];
        };
};
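To illustrate how data_offset is used for KVM_EXIT_IO, the sketch below
locates the packed data array relative to the start of the mmap()ed
kvm_run area and services one exit.  It mirrors only the 'io' member
rather than the whole structure; struct io_exit, io_data(), io_data_len()
and handle_io() are hypothetical names, and the "device" (IN reads return
0xff, OUT bytes are copied to a sink buffer) is invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KVM_EXIT_IO_IN  0
#define KVM_EXIT_IO_OUT 1

/* Mirror of the 'io' member of kvm_run documented above. */
struct io_exit {
        uint8_t  direction;
        uint8_t  size;          /* bytes per element */
        uint16_t port;
        uint32_t count;         /* number of elements */
        uint64_t data_offset;   /* relative to kvm_run start */
};

/* The data lives data_offset bytes past the start of the kvm_run mapping. */
static uint8_t *io_data(void *run, const struct io_exit *io)
{
        return (uint8_t *)run + io->data_offset;
}

/* Packed array: 'count' elements of 'size' bytes each. */
static size_t io_data_len(const struct io_exit *io)
{
        return (size_t)io->size * io->count;
}

/* Toy device model: OUT data is copied to 'sink', IN data reads as 0xff. */
static void handle_io(void *run, const struct io_exit *io,
                      uint8_t *sink, size_t sink_len)
{
        uint8_t *data = io_data(run, io);
        size_t len = io_data_len(io);

        if (io->direction == KVM_EXIT_IO_OUT)
                memcpy(sink, data, len < sink_len ? len : sink_len);
        else
                memset(data, 0xff, len);  /* consumed by the next KVM_RUN */
}
```

In a real run loop 'run' would be the pointer returned by mmap() on the
vcpu fd and 'io' would be &run->io; as noted above, an IN operation only
completes once userspace re-enters the kernel with KVM_RUN.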