.\" Written by Mike Frysinger <vapier@gentoo.org>
.\"
.\" %%%LICENSE_START(PUBLIC_DOMAIN)
.\" This page is in the public domain.
.\" %%%LICENSE_END
.\"
.\" Useful background:
.\"   http://articles.manugarg.com/systemcallinlinux2_6.html
.\"   https://lwn.net/Articles/446528/
.\"   http://www.linuxjournal.com/content/creating-vdso-colonels-other-chicken
.\"   http://www.trilithium.com/johan/2005/08/linux-gate/
.\"
.TH VDSO 7 2019-08-02 "Linux" "Linux Programmer's Manual"
.SH NAME
vdso \- overview of the virtual ELF dynamic shared object
.SH SYNOPSIS
.B #include <sys/auxv.h>
.PP
.B void *vdso = (void *) getauxval(AT_SYSINFO_EHDR);
.SH DESCRIPTION
The "vDSO" (virtual dynamic shared object) is a small shared library that
the kernel automatically maps into the
address space of all user-space applications.
Applications usually do not need to concern themselves with these details
as the vDSO is most commonly called by the C library.
This way you can code in the normal way using standard functions
and the C library will take care
of using any functionality that is available via the vDSO.
.PP
Why does the vDSO exist at all?
There are some system calls the kernel provides that
user-space code ends up using frequently,
to the point that such calls can dominate overall performance.
This is due both to the frequency of the call and to the
context-switch overhead that results
from exiting user space and entering the kernel.
.PP
The rest of this documentation is geared toward the curious and/or
C library writers rather than general developers.
If you're trying to call the vDSO in your own application rather than using
the C library, you're most likely doing it wrong.
.SS Example background
Making system calls can be slow.
In x86 32-bit systems, you can trigger a software interrupt
.RI ( "int $0x80" )
to tell the kernel you wish to make a system call.
However, this instruction is expensive: it goes through
the full interrupt-handling paths
in the processor's microcode as well as in the kernel.
Newer processors have faster (but backward incompatible) instructions to
initiate system calls.
Rather than require the C library to figure out if this functionality is
available at run time,
the C library can use functions provided by the kernel in
the vDSO.
.PP
Note that the terminology can be confusing.
On x86 systems, the vDSO function
used to determine the preferred method of making a system call is
named "__kernel_vsyscall", but on x86-64,
the term "vsyscall" also refers to an obsolete way to ask the kernel
what time it is or what CPU the caller is on.
.PP
One frequently used system call is
.BR gettimeofday (2).
This system call is called both directly by user-space applications
and indirectly by
the C library.
Think timestamps or timing loops or polling\(emall of these
frequently need to know what time it is right now.
This information is also not secret\(emany application in any
privilege mode (root or any unprivileged user) will get the same answer.
Thus the kernel arranges for the information required to answer
this question to be placed in memory the process can access.
Now a call to
.BR gettimeofday (2)
changes from a system call to a normal function
call and a few memory accesses.
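.PP
The difference described above can be observed with a rough timing sketch.
The helper names time_libc_calls() and time_raw_syscalls() are hypothetical,
and the sketch assumes an architecture that still provides SYS_gettimeofday
and a C library whose gettimeofday() wrapper uses the vDSO when available.
It only illustrates the idea; it is not a rigorous benchmark.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Current CLOCK_MONOTONIC time in nanoseconds (used only for measuring). */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/* Time `iters` calls through the C library wrapper, which may be
   satisfied by the vDSO without entering the kernel. */
int64_t time_libc_calls(int iters)
{
    struct timeval tv;
    int64_t start = now_ns();
    for (int i = 0; i < iters; i++)
        gettimeofday(&tv, NULL);
    return now_ns() - start;
}

/* Time `iters` calls that bypass the wrapper and always enter
   the kernel through syscall(2). */
int64_t time_raw_syscalls(int iters)
{
    struct timeval tv;
    int64_t start = now_ns();
    for (int i = 0; i < iters; i++)
        syscall(SYS_gettimeofday, &tv, NULL);
    return now_ns() - start;
}
```

Calling both helpers with the same iteration count and comparing the results
gives a feel for the saving; the actual ratio depends on the CPU, the kernel,
and the clock source in use.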
.SS Finding the vDSO
The base address of the vDSO (if one exists) is passed by the kernel to
each program in the initial auxiliary vector (see
.BR getauxval (3)),
via the
.B AT_SYSINFO_EHDR
tag.
.PP
You must not assume the vDSO is mapped at any particular location in the
user's memory map.
The base address will usually be randomized at run time every time a new
process image is created (at
.BR execve (2)
time).
This is done for security reasons,
to make "return-to-libc" attacks harder.
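.PP
As a minimal sketch of the lookup described above, the following helpers
fetch the base address from the auxiliary vector and check that it points
at an ELF header.
The names vdso_base() and looks_like_elf() are hypothetical and exist only
for illustration.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>

/* Return the vDSO base address, or NULL if the kernel did not supply one
   (getauxval() returns 0 when the tag is absent). */
const void *vdso_base(void)
{
    return (const void *) (uintptr_t) getauxval(AT_SYSINFO_EHDR);
}

/* Return 1 if `base` starts with the ELF magic bytes, 0 otherwise. */
int looks_like_elf(const void *base)
{
    return base != NULL && memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

Because the address is randomized, it should be queried at run time in each
process rather than cached across
.BR execve (2).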
.PP
For some architectures, there is also an
.B AT_SYSINFO
tag.
This is used only for locating the vsyscall entry point and is frequently
omitted or set to 0 (meaning it's not available).
This tag is a throwback to the initial vDSO work (see
.IR History
below) and its use should be avoided.
.SS File format
Since the vDSO is a fully formed ELF image, you can do symbol lookups on it.
This allows new symbols to be added with newer kernel releases,
and allows the C library to detect available functionality at
run time when running under different kernel versions.
Oftentimes the C library will do detection with the first call and then
cache the result for subsequent calls.
.PP
All symbols are also versioned (using the GNU version format).
This allows the kernel to update the function signature without breaking
backward compatibility.
That is, a new version of a symbol may change the arguments that
the function accepts as well as its return value.
Thus, when looking up a symbol in the vDSO,
you must always include the version
to match the ABI you expect.
.PP
Typically the vDSO follows the naming convention of prefixing
all symbols with "__vdso_" or "__kernel_"
so as to distinguish them from other standard symbols.
For example, the "gettimeofday" function is named "__vdso_gettimeofday".
.PP
You use the standard C calling conventions when calling
any of these functions.
No need to worry about weird register or stack behavior.
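.PP
The symbol lookup described above can be sketched roughly as follows.
The helper vdso_lookup() is hypothetical.
It assumes that the vDSO carries a SysV hash table (DT_HASH) with 32-bit
entries, which is not true on every architecture or kernel, and it
deliberately skips the version check that real code must perform.
Treat it as an illustration of walking the ELF image, not as a complete
implementation.

```c
#define _GNU_SOURCE
#include <elf.h>
#include <link.h>       /* ElfW() */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>

/* Find `name` in the vDSO's dynamic symbol table.
   Returns NULL if there is no vDSO, no DT_HASH table, or no such symbol.
   Symbol versions are NOT checked in this sketch. */
void *vdso_lookup(const char *name)
{
    uintptr_t base = (uintptr_t) getauxval(AT_SYSINFO_EHDR);
    if (base == 0)
        return NULL;

    const ElfW(Ehdr) *ehdr = (const ElfW(Ehdr) *) base;
    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
        return NULL;

    /* Walk the program headers for the load bias and the dynamic section. */
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *) (base + ehdr->e_phoff);
    const ElfW(Dyn) *dyn = NULL;
    uintptr_t bias = 0;
    int have_load = 0;
    for (int i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type == PT_LOAD && !have_load) {
            bias = base + phdr[i].p_offset - phdr[i].p_vaddr;
            have_load = 1;
        } else if (phdr[i].p_type == PT_DYNAMIC) {
            dyn = (const ElfW(Dyn) *) (base + phdr[i].p_offset);
        }
    }
    if (!have_load || dyn == NULL)
        return NULL;

    /* Collect the tables needed for a linear scan of the symbols. */
    const ElfW(Sym) *symtab = NULL;
    const char *strtab = NULL;
    const ElfW(Word) *hash = NULL;
    for (; dyn->d_tag != DT_NULL; dyn++) {
        uintptr_t addr = bias + dyn->d_un.d_ptr;
        if (dyn->d_tag == DT_SYMTAB)
            symtab = (const ElfW(Sym) *) addr;
        else if (dyn->d_tag == DT_STRTAB)
            strtab = (const char *) addr;
        else if (dyn->d_tag == DT_HASH)
            hash = (const ElfW(Word) *) addr;
    }
    if (symtab == NULL || strtab == NULL || hash == NULL)
        return NULL;

    /* hash[1] is nchain, the number of entries in the symbol table. */
    for (ElfW(Word) i = 0; i < hash[1]; i++) {
        const ElfW(Sym) *sym = &symtab[i];
        if (sym->st_shndx == SHN_UNDEF)
            continue;
        if (strcmp(strtab + sym->st_name, name) == 0)
            return (void *) (bias + sym->st_value);
    }
    return NULL;
}
```

A caller would cast the result to the expected function pointer type and
fall back to the ordinary C library function when the result is NULL.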
.SH NOTES
.SS Source
When you compile the kernel,
it will automatically compile and link the vDSO code for you.
You will frequently find it under the architecture-specific directory:
.PP
    find arch/$ARCH/ \-name \(aq*vdso*.so*\(aq \-o \-name \(aq*gate*.so*\(aq
.\"
.SS vDSO names
The name of the vDSO varies across architectures.
It will often show up in things like glibc's
.BR ldd (1)
output.
The exact name should not matter to any code, so do not hardcode it.
.if t \{\
.ft CW
\}
.TS
l l.
user ABI	vDSO name
_
aarch64	linux\-vdso.so.1
arm	linux\-vdso.so.1
ia64	linux\-gate.so.1
mips	linux\-vdso.so.1
ppc/32	linux\-vdso32.so.1
ppc/64	linux\-vdso64.so.1
riscv	linux\-vdso.so.1
s390	linux\-vdso32.so.1
s390x	linux\-vdso64.so.1
sh	linux\-gate.so.1
i386	linux\-gate.so.1
x86-64	linux\-vdso.so.1
x86/x32	linux\-vdso.so.1
.TE
.if t \{\
.in
.ft P
\}
.SS strace(1), seccomp(2), and the vDSO
When tracing system calls with
.BR strace (1),
symbols (system calls) that are exported by the vDSO will
.I not
appear in the trace output.
Those system calls will likewise not be visible to
.BR seccomp (2)
filters.
.SH ARCHITECTURE-SPECIFIC NOTES
The subsections below provide architecture-specific notes
on the vDSO.
.PP
Note that the vDSO that is used is based on the ABI of your user-space code
and not the ABI of the kernel.
Thus, for example,
when you run an i386 32-bit ELF binary,
you'll get the same vDSO regardless of whether you run it under
an i386 32-bit kernel or under an x86-64 64-bit kernel.
Therefore, the name of the user-space ABI should be used to determine
which of the sections below is relevant.
.SS ARM functions
.\" See linux/arch/arm/vdso/vdso.lds.S
.\" Commit: 8512287a8165592466cb9cb347ba94892e9c56a5
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__vdso_gettimeofday	LINUX_2.6 (exported since Linux 4.1)
__vdso_clock_gettime	LINUX_2.6 (exported since Linux 4.1)
.TE
.if t \{\
.in
.ft P
\}
.PP
.\" See linux/arch/arm/kernel/entry-armv.S
.\" See linux/Documentation/arm/kernel_user_helpers.txt
Additionally, the ARM port has a code page full of utility functions.
Since it's just a raw page of code, there is no ELF information for doing
symbol lookups or versioning.
It does provide support for different versions though.
.PP
For information on this code page,
it's best to refer to the kernel documentation
as it's extremely detailed and covers everything you need to know:
.IR Documentation/arm/kernel_user_helpers.txt .
.SS aarch64 functions
.\" See linux/arch/arm64/kernel/vdso/vdso.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_rt_sigreturn	LINUX_2.6.39
__kernel_gettimeofday	LINUX_2.6.39
__kernel_clock_gettime	LINUX_2.6.39
__kernel_clock_getres	LINUX_2.6.39
.TE
.if t \{\
.in
.ft P
\}
.SS bfin (Blackfin) functions (port removed in Linux 4.17)
.\" See linux/arch/blackfin/kernel/fixed_code.S
.\" See http://docs.blackfin.uclinux.org/doku.php?id=linux-kernel:fixed-code
As this CPU lacks a memory management unit (MMU),
it doesn't set up a vDSO in the normal sense.
Instead, it maps at boot time a few raw functions into
a fixed location in memory.
User-space applications then call directly into that region.
There is no provision for backward compatibility
beyond sniffing raw opcodes,
but as this is an embedded CPU, it can get away with things\(emsome of the
object formats it runs aren't even ELF based (they're bFLT/FLAT).
.PP
For information on this code page,
it's best to refer to the public documentation:
.br
http://docs.blackfin.uclinux.org/doku.php?id=linux\-kernel:fixed\-code
.SS mips functions
.\" See linux/arch/mips/vdso/vdso.ld.S
.PP
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_gettimeofday	LINUX_2.6 (exported since Linux 4.4)
__kernel_clock_gettime	LINUX_2.6 (exported since Linux 4.4)
.TE
.if t \{\
.in
.ft P
\}
.SS ia64 (Itanium) functions
.\" See linux/arch/ia64/kernel/gate.lds.S
.\" Also linux/arch/ia64/kernel/fsys.S and linux/Documentation/ia64/fsys.txt
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_sigtramp	LINUX_2.5
__kernel_syscall_via_break	LINUX_2.5
__kernel_syscall_via_epc	LINUX_2.5
.TE
.if t \{\
.in
.ft P
\}
.PP
The Itanium port is somewhat tricky.
In addition to the vDSO above, it also has "light-weight system calls"
(also known as "fast syscalls" or "fsys").
You can invoke these via the
.I __kernel_syscall_via_epc
vDSO helper.
The system calls listed here have the same semantics as if you called them
directly via
.BR syscall (2),
so refer to the relevant
documentation for each.
The table below lists the functions available via this mechanism.
.if t \{\
.ft CW
\}
.TS
l.
function
_
clock_gettime
getcpu
getpid
getppid
gettimeofday
set_tid_address
.TE
.if t \{\
.in
.ft P
\}
.SS parisc (hppa) functions
.\" See linux/arch/parisc/kernel/syscall.S
.\" See linux/Documentation/parisc/registers
The parisc port has a code page with utility functions
called a gateway page.
Rather than use the normal ELF auxiliary vector approach,
it passes the address of
the page to the process via the SR2 register.
The permissions on the page are such that merely executing those addresses
automatically executes with kernel privileges and not in user space.
This is done to match the way HP-UX works.
.PP
Since it's just a raw page of code, there is no ELF information for doing
symbol lookups or versioning.
Simply call into the appropriate offset via the branch instruction,
for example:
.PP
    ble <offset>(%sr2, %r0)
.if t \{\
.ft CW
\}
.TS
l l.
offset	function
_
00b0	lws_entry (CAS operations)
00e0	set_thread_pointer (used by glibc)
0100	linux_gateway_entry (syscall)
.TE
.if t \{\
.in
.ft P
\}
.SS ppc/32 functions
.\" See linux/arch/powerpc/kernel/vdso32/vdso32.lds.S
The table below lists the symbols exported by the vDSO.
The functions marked with a
.I *
are available only when the kernel is
a PowerPC64 (64-bit) kernel.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_clock_getres	LINUX_2.6.15
__kernel_clock_gettime	LINUX_2.6.15
__kernel_datapage_offset	LINUX_2.6.15
__kernel_get_syscall_map	LINUX_2.6.15
__kernel_get_tbfreq	LINUX_2.6.15
__kernel_getcpu \fI*\fR	LINUX_2.6.15
__kernel_gettimeofday	LINUX_2.6.15
__kernel_sigtramp_rt32	LINUX_2.6.15
__kernel_sigtramp32	LINUX_2.6.15
__kernel_sync_dicache	LINUX_2.6.15
__kernel_sync_dicache_p5	LINUX_2.6.15
.TE
.if t \{\
.in
.ft P
\}
.PP
The
.B CLOCK_REALTIME_COARSE
and
.B CLOCK_MONOTONIC_COARSE
clocks are
.I not
supported by the
.I __kernel_clock_getres
and
.I __kernel_clock_gettime
interfaces;
the kernel falls back to the real system call.
.SS ppc/64 functions
.\" See linux/arch/powerpc/kernel/vdso64/vdso64.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_clock_getres	LINUX_2.6.15
__kernel_clock_gettime	LINUX_2.6.15
__kernel_datapage_offset	LINUX_2.6.15
__kernel_get_syscall_map	LINUX_2.6.15
__kernel_get_tbfreq	LINUX_2.6.15
__kernel_getcpu	LINUX_2.6.15
__kernel_gettimeofday	LINUX_2.6.15
__kernel_sigtramp_rt64	LINUX_2.6.15
__kernel_sync_dicache	LINUX_2.6.15
__kernel_sync_dicache_p5	LINUX_2.6.15
.TE
.if t \{\
.in
.ft P
\}
.PP
The
.B CLOCK_REALTIME_COARSE
and
.B CLOCK_MONOTONIC_COARSE
clocks are
.I not
supported by the
.I __kernel_clock_getres
and
.I __kernel_clock_gettime
interfaces;
the kernel falls back to the real system call.
.SS riscv functions
.\" See linux/arch/riscv/kernel/vdso/vdso.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_rt_sigreturn	LINUX_4.15
__kernel_gettimeofday	LINUX_4.15
__kernel_clock_gettime	LINUX_4.15
__kernel_clock_getres	LINUX_4.15
__kernel_getcpu	LINUX_4.15
__kernel_flush_icache	LINUX_4.15
.TE
.if t \{\
.in
.ft P
\}
.SS s390 functions
.\" See linux/arch/s390/kernel/vdso32/vdso32.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_clock_getres	LINUX_2.6.29
__kernel_clock_gettime	LINUX_2.6.29
__kernel_gettimeofday	LINUX_2.6.29
.TE
.if t \{\
.in
.ft P
\}
.SS s390x functions
.\" See linux/arch/s390/kernel/vdso64/vdso64.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_clock_getres	LINUX_2.6.29
__kernel_clock_gettime	LINUX_2.6.29
__kernel_gettimeofday	LINUX_2.6.29
.TE
.if t \{\
.in
.ft P
\}
.SS sh (SuperH) functions
.\" See linux/arch/sh/kernel/vsyscall/vsyscall.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_rt_sigreturn	LINUX_2.6
__kernel_sigreturn	LINUX_2.6
__kernel_vsyscall	LINUX_2.6
.TE
.if t \{\
.in
.ft P
\}
.SS i386 functions
.\" See linux/arch/x86/vdso/vdso32/vdso32.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_sigreturn	LINUX_2.5
__kernel_rt_sigreturn	LINUX_2.5
__kernel_vsyscall	LINUX_2.5
.\" Added in 7a59ed415f5b57469e22e41fc4188d5399e0b194 and updated
.\" in 37c975545ec63320789962bf307f000f08fabd48.
__vdso_clock_gettime	LINUX_2.6 (exported since Linux 3.15)
__vdso_gettimeofday	LINUX_2.6 (exported since Linux 3.15)
__vdso_time	LINUX_2.6 (exported since Linux 3.15)
.TE
.if t \{\
.in
.ft P
\}
.SS x86-64 functions
.\" See linux/arch/x86/vdso/vdso.lds.S
The table below lists the symbols exported by the vDSO.
All of these symbols are also available without the "__vdso_" prefix, but
you should ignore those and stick to the names below.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__vdso_clock_gettime	LINUX_2.6
__vdso_getcpu	LINUX_2.6
__vdso_gettimeofday	LINUX_2.6
__vdso_time	LINUX_2.6
.TE
.if t \{\
.in
.ft P
\}
.SS x86/x32 functions
.\" See linux/arch/x86/vdso/vdso32.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__vdso_clock_gettime	LINUX_2.6
__vdso_getcpu	LINUX_2.6
__vdso_gettimeofday	LINUX_2.6
__vdso_time	LINUX_2.6
.TE
.if t \{\
.in
.ft P
\}
.SS History
The vDSO was originally just a single function\(emthe vsyscall.
In older kernels, you might see that name
in a process's memory map rather than "vdso".
Over time, people realized that this mechanism
was a great way to pass more functionality
to user space, so it was reconceived as a vDSO in the current format.
.SH SEE ALSO
.BR syscalls (2),
.BR getauxval (3),
.BR proc (5)
.PP
The documents, examples, and source code in the Linux source code tree:
.PP
.in +4n
.EX
Documentation/ABI/stable/vdso
Documentation/ia64/fsys.txt
Documentation/vDSO/* (includes examples of using the vDSO)

find arch/ \-iname \(aq*vdso*\(aq \-o \-iname \(aq*gate*\(aq
.EE
.in