'\" t
.\" Written by Mike Frysinger <vapier@gentoo.org>
.\"
.\" %%%LICENSE_START(PUBLIC_DOMAIN)
.\" This page is in the public domain.
.\" %%%LICENSE_END
.\"
.\" Useful background:
.\" http://articles.manugarg.com/systemcallinlinux2_6.html
.\" https://lwn.net/Articles/446528/
.\" http://www.linuxjournal.com/content/creating-vdso-colonels-other-chicken
.\" http://www.trilithium.com/johan/2005/08/linux-gate/
.\"
.TH vDSO 7 (date) "Linux man-pages (unreleased)"
.SH NAME
vdso \- overview of the virtual ELF dynamic shared object
.SH SYNOPSIS
.nf
.B #include <sys/auxv.h>
.P
.B void *vdso = (void *) getauxval(AT_SYSINFO_EHDR);
.fi
.SH DESCRIPTION
The "vDSO" (virtual dynamic shared object) is a small shared library that
the kernel automatically maps into the
address space of all user-space applications.
Applications usually do not need to concern themselves with these details
as the vDSO is most commonly called by the C library.
You can therefore write code as usual, using standard functions,
and the C library will take care
of using any functionality that is available via the vDSO.
.P
Why does the vDSO exist at all?
There are some system calls the kernel provides that
user-space code ends up using frequently,
to the point that such calls can dominate overall performance.
This is due both to the frequency of the call and to the
context-switch overhead that results
from exiting user space and entering the kernel.
.P
The rest of this documentation is geared toward the curious and/or
C library writers rather than general developers.
If you're trying to call the vDSO in your own application rather than using
the C library, you're most likely doing it wrong.
.SS Example background
Making system calls can be slow.
On 32-bit x86 systems, you can trigger a software interrupt
.RI ( "int $0x80" )
to tell the kernel you wish to make a system call.
However, this instruction is expensive: it goes through
the full interrupt-handling paths
in the processor's microcode as well as in the kernel.
Newer processors have faster (but backward incompatible) instructions to
initiate system calls.
Rather than require the C library to figure out if this functionality is
available at run time,
the C library can use functions provided by the kernel in
the vDSO.
.P
Note that the terminology can be confusing.
On x86 systems, the vDSO function
used to determine the preferred method of making a system call is
named "__kernel_vsyscall", but on x86-64,
the term "vsyscall" also refers to an obsolete way to ask the kernel
what time it is or what CPU the caller is on.
.P
One frequently used system call is
.BR gettimeofday (2).
This system call is called both directly by user-space applications
and indirectly by
the C library.
Think timestamps or timing loops or polling\[em]all of these
frequently need to know what time it is right now.
This information is also not secret\[em]any application in any
privilege mode (root or any unprivileged user) will get the same answer.
Thus the kernel arranges for the information required to answer
this question to be placed in memory the process can access.
Now a call to
.BR gettimeofday (2)
changes from a system call to a normal function
call and a few memory accesses.
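.P
As a rough illustration of the point above,
the sketch below reads the wall-clock time through the ordinary
C library interface.
The helper name wall_clock_usec() is made up for this example.
Whether the call is actually served from the vDSO depends on the
architecture, the kernel, and the C library in use,
so treat that as an assumption rather than a guarantee.

```c
#include <stddef.h>
#include <sys/time.h>

/*
 * Hypothetical helper: current wall-clock time in microseconds since
 * the Epoch, or -1 on failure.  The call goes through the C library,
 * which may (or may not) route it to the vDSO instead of the kernel.
 */
static long long wall_clock_usec(void)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) == -1)
        return -1;
    return (long long) tv.tv_sec * 1000000LL + tv.tv_usec;
}
```

.P
The caller does nothing vDSO-specific here:
any speedup comes from the C library, not from the application code.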
.SS Finding the vDSO
The base address of the vDSO (if one exists) is passed by the kernel to
each program in the initial auxiliary vector (see
.BR getauxval (3)),
via the
.B AT_SYSINFO_EHDR
tag.
.P
You must not assume the vDSO is mapped at any particular location in the
user's memory map.
The base address will usually be randomized at run time every time a new
process image is created (at
.BR execve (2)
time).
This is done for security reasons,
to make "return-to-libc" attacks harder.
.P
For some architectures, there is also an
.B AT_SYSINFO
tag.
This is used only for locating the vsyscall entry point and is frequently
omitted or set to 0 (meaning it's not available).
This tag is a throwback to the initial vDSO work (see
.I History
below) and its use should be avoided.
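.P
The lookup described above can be sketched as follows.
The helper names vdso_base() and vdso_is_elf() are hypothetical
and exist only for this example;
the sketch assumes a C library that provides
.BR getauxval (3).

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>

/*
 * Hypothetical helper: return the vDSO base address taken from the
 * auxiliary vector, or NULL if the kernel did not provide one.
 */
static const void *vdso_base(void)
{
    return (const void *) (uintptr_t) getauxval(AT_SYSINFO_EHDR);
}

/*
 * Hypothetical helper: return 1 if the mapping at 'base' starts with
 * the ELF magic bytes, 0 otherwise (including when base is NULL).
 */
static int vdso_is_elf(const void *base)
{
    return base != NULL && memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

.P
A NULL result must be handled,
since not every configuration provides a vDSO.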
.SS File format
Since the vDSO is a fully formed ELF image, you can do symbol lookups on it.
This allows new symbols to be added with newer kernel releases,
and allows the C library to detect available functionality at
run time when running under different kernel versions.
Oftentimes the C library will do detection with the first call and then
cache the result for subsequent calls.
.P
All symbols are also versioned (using the GNU version format).
This allows the kernel to update the function signature without breaking
backward compatibility.
This means changing the arguments that the function accepts as well as the
return value.
Thus, when looking up a symbol in the vDSO,
you must always include the version
to match the ABI you expect.
.P
Typically the vDSO follows the naming convention of prefixing
all symbols with "__vdso_" or "__kernel_"
so as to distinguish them from other standard symbols.
For example, the "gettimeofday" function is named "__vdso_gettimeofday".
.P
You use the standard C calling conventions when calling
any of these functions.
No need to worry about weird register or stack behavior.
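.P
To make the idea of a symbol lookup more concrete,
here is a deliberately simplified sketch that walks the vDSO's
dynamic symbol table by hand.
The function name vdso_sym() is hypothetical.
The sketch makes several assumptions that real code should not:
it ignores symbol versions entirely,
it relies on a DT_HASH table being present
(some vDSO images may carry only a GNU-style hash table),
and it assumes 32-bit hash table entries,
which does not hold on every ABI.

```c
#include <elf.h>
#include <link.h>       /* ElfW() picks Elf32_* or Elf64_* for this ABI */
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>

/*
 * Hypothetical helper: find 'name' in the vDSO's dynamic symbol table.
 * Returns the symbol's address, or NULL if there is no vDSO, no DT_HASH
 * table, or no such symbol.  Version checking is deliberately omitted.
 */
static void *vdso_sym(const char *name)
{
    uintptr_t base = (uintptr_t) getauxval(AT_SYSINFO_EHDR);
    if (base == 0)
        return NULL;

    const ElfW(Ehdr) *eh = (const ElfW(Ehdr) *) base;
    const ElfW(Phdr) *ph = (const ElfW(Phdr) *) (base + eh->e_phoff);
    const ElfW(Dyn) *dyn = NULL;
    uintptr_t bias = 0;
    int have_load = 0;

    for (int i = 0; i < eh->e_phnum; i++) {
        if (ph[i].p_type == PT_LOAD && !have_load) {
            /* Load bias: where virtual address 0 of the image ended up. */
            bias = base + ph[i].p_offset - ph[i].p_vaddr;
            have_load = 1;
        } else if (ph[i].p_type == PT_DYNAMIC) {
            dyn = (const ElfW(Dyn) *) (base + ph[i].p_offset);
        }
    }
    if (!have_load || dyn == NULL)
        return NULL;

    const ElfW(Sym) *symtab = NULL;
    const char *strtab = NULL;
    const ElfW(Word) *hash = NULL;   /* assumes 32-bit hash entries */

    /* The dynamic entries hold unrelocated addresses; add the bias. */
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_SYMTAB)
            symtab = (const ElfW(Sym) *) (bias + dyn->d_un.d_ptr);
        else if (dyn->d_tag == DT_STRTAB)
            strtab = (const char *) (bias + dyn->d_un.d_ptr);
        else if (dyn->d_tag == DT_HASH)
            hash = (const ElfW(Word) *) (bias + dyn->d_un.d_ptr);
    }
    if (symtab == NULL || strtab == NULL || hash == NULL)
        return NULL;

    /* hash[1] is nchain, the number of entries in the symbol table. */
    for (ElfW(Word) i = 0; i < hash[1]; i++) {
        if (symtab[i].st_shndx != SHN_UNDEF &&
            strcmp(strtab + symtab[i].st_name, name) == 0)
            return (void *) (bias + symtab[i].st_value);
    }
    return NULL;
}
```

.P
A linear scan keeps the example short;
a production lookup would use the hash buckets
and, as stated above, would also match the symbol version.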
.SH NOTES
.SS Source
When you compile the kernel,
it will automatically compile and link the vDSO code for you.
You will frequently find it under the architecture-specific directory:
.P
.in +4n
.EX
find arch/$ARCH/ \-name \[aq]*vdso*.so*\[aq] \-o \-name \[aq]*gate*.so*\[aq]
.EE
.in
.\"
.SS vDSO names
The name of the vDSO varies across architectures.
It will often show up in things like glibc's
.BR ldd (1)
output.
The exact name should not matter to any code, so do not hardcode it.
.if t \{\
.ft CW
\}
.TS
l l.
user ABI	vDSO name
_
aarch64	linux\-vdso.so.1
arm	linux\-vdso.so.1
ia64	linux\-gate.so.1
mips	linux\-vdso.so.1
ppc/32	linux\-vdso32.so.1
ppc/64	linux\-vdso64.so.1
riscv	linux\-vdso.so.1
s390	linux\-vdso32.so.1
s390x	linux\-vdso64.so.1
sh	linux\-gate.so.1
i386	linux\-gate.so.1
x86-64	linux\-vdso.so.1
x86/x32	linux\-vdso.so.1
.TE
.if t \{\
.in
.ft P
\}
.SS strace(1), seccomp(2), and the vDSO
When tracing system calls with
.BR strace (1),
symbols (system calls) that are exported by the vDSO will
.I not
appear in the trace output.
Those system calls will likewise not be visible to
.BR seccomp (2)
filters.
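.P
One way to make such a call visible to a tracer,
sketched here under assumptions,
is to bypass the C library wrapper and issue the system call directly with
.BR syscall (2).
The helper name clock_gettime_raw() is hypothetical.
The sketch assumes that SYS_clock_gettime is defined for the ABI in use
and that the kernel's timespec layout matches the C library's,
which may not hold on 32-bit ABIs with a 64-bit time_t.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a clock through the raw system call,
 * bypassing the C library wrapper and therefore the vDSO.
 * Returns 0 on success, or -1 with errno set on failure.
 */
static int clock_gettime_raw(clockid_t clk, struct timespec *ts)
{
    return (int) syscall(SYS_clock_gettime, clk, ts);
}
```

.P
This gives up the performance benefit of the vDSO,
so it is suited to debugging rather than to hot paths.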
.SH ARCHITECTURE-SPECIFIC NOTES
The subsections below provide architecture-specific notes
on the vDSO.
.P
Note that the vDSO that is used is based on the ABI of your user-space code
and not the ABI of the kernel.
Thus, for example,
when you run an i386 32-bit ELF binary,
you'll get the same vDSO regardless of whether you run it under
an i386 32-bit kernel or under an x86-64 64-bit kernel.
Therefore, the name of the user-space ABI should be used to determine
which of the sections below is relevant.
.SS ARM functions
.\" See linux/arch/arm/vdso/vdso.lds.S
.\" Commit: 8512287a8165592466cb9cb347ba94892e9c56a5
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__vdso_gettimeofday	LINUX_2.6 (exported since Linux 4.1)
__vdso_clock_gettime	LINUX_2.6 (exported since Linux 4.1)
.TE
.if t \{\
.in
.ft P
\}
.P
.\" See linux/arch/arm/kernel/entry-armv.S
.\" See linux/Documentation/arm/kernel_user_helpers.rst
Additionally, the ARM port has a code page full of utility functions.
Since it's just a raw page of code, there is no ELF information for doing
symbol lookups or versioning.
It does provide support for different versions though.
.P
For information on this code page,
it's best to refer to the kernel documentation
as it's extremely detailed and covers everything you need to know:
.IR Documentation/arm/kernel_user_helpers.rst .
.SS aarch64 functions
.\" See linux/arch/arm64/kernel/vdso/vdso.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_rt_sigreturn	LINUX_2.6.39
__kernel_gettimeofday	LINUX_2.6.39
__kernel_clock_gettime	LINUX_2.6.39
__kernel_clock_getres	LINUX_2.6.39
.TE
.if t \{\
.in
.ft P
\}
.SS bfin (Blackfin) functions (port removed in Linux 4.17)
.\" See linux/arch/blackfin/kernel/fixed_code.S
.\" See http://docs.blackfin.uclinux.org/doku.php?id=linux-kernel:fixed-code
As this CPU lacks a memory management unit (MMU),
it doesn't set up a vDSO in the normal sense.
Instead, it maps at boot time a few raw functions into
a fixed location in memory.
User-space applications then call directly into that region.
There is no provision for backward compatibility
beyond sniffing raw opcodes,
but as this is an embedded CPU, it can get away with things\[em]some of the
object formats it runs aren't even ELF based (they're bFLT/FLAT).
.P
For information on this code page,
it's best to refer to the public documentation:
.br
http://docs.blackfin.uclinux.org/doku.php?id=linux\-kernel:fixed\-code
.SS mips functions
.\" See linux/arch/mips/vdso/vdso.ld.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_gettimeofday	LINUX_2.6 (exported since Linux 4.4)
__kernel_clock_gettime	LINUX_2.6 (exported since Linux 4.4)
.TE
.if t \{\
.in
.ft P
\}
.SS ia64 (Itanium) functions
.\" See linux/arch/ia64/kernel/gate.lds.S
.\" Also linux/arch/ia64/kernel/fsys.S and linux/Documentation/ia64/fsys.rst
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_sigtramp	LINUX_2.5
__kernel_syscall_via_break	LINUX_2.5
__kernel_syscall_via_epc	LINUX_2.5
.TE
.if t \{\
.in
.ft P
\}
.P
The Itanium port is somewhat tricky.
In addition to the vDSO above, it also has "light-weight system calls"
(also known as "fast syscalls" or "fsys").
You can invoke these via the
.I __kernel_syscall_via_epc
vDSO helper.
The system calls listed here have the same semantics as if you called them
directly via
.BR syscall (2),
so refer to the relevant
documentation for each.
The table below lists the functions available via this mechanism.
.if t \{\
.ft CW
\}
.TS
l.
function
_
clock_gettime
getcpu
getpid
getppid
gettimeofday
set_tid_address
.TE
.if t \{\
.in
.ft P
\}
.SS parisc (hppa) functions
.\" See linux/arch/parisc/kernel/syscall.S
.\" See linux/Documentation/parisc/registers.rst
The parisc port has a code page with utility functions
called a gateway page.
Rather than use the normal ELF auxiliary vector approach,
it passes the address of
the page to the process via the SR2 register.
The permissions on the page are such that merely executing those addresses
automatically executes with kernel privileges and not in user space.
This is done to match the way HP-UX works.
.P
Since it's just a raw page of code, there is no ELF information for doing
symbol lookups or versioning.
Simply call into the appropriate offset via the branch instruction,
for example:
.P
.in +4n
.EX
ble <offset>(%sr2, %r0)
.EE
.in
.if t \{\
.ft CW
\}
.TS
l l.
offset	function
_
00b0	lws_entry (CAS operations)
00e0	set_thread_pointer (used by glibc)
0100	linux_gateway_entry (syscall)
.TE
.if t \{\
.in
.ft P
\}
.SS ppc/32 functions
.\" See linux/arch/powerpc/kernel/vdso32/vdso32.lds.S
The table below lists the symbols exported by the vDSO.
The functions marked with a
.I *
are available only when the kernel is
a PowerPC64 (64-bit) kernel.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_clock_getres	LINUX_2.6.15
__kernel_clock_gettime	LINUX_2.6.15
__kernel_clock_gettime64	LINUX_5.11
__kernel_datapage_offset	LINUX_2.6.15
__kernel_get_syscall_map	LINUX_2.6.15
__kernel_get_tbfreq	LINUX_2.6.15
__kernel_getcpu \fI*\fR	LINUX_2.6.15
__kernel_gettimeofday	LINUX_2.6.15
__kernel_sigtramp_rt32	LINUX_2.6.15
__kernel_sigtramp32	LINUX_2.6.15
__kernel_sync_dicache	LINUX_2.6.15
__kernel_sync_dicache_p5	LINUX_2.6.15
.TE
.if t \{\
.in
.ft P
\}
.P
Before Linux 5.6,
.\" commit 654abc69ef2e69712e6d4e8a6cb9292b97a4aa39
the
.B CLOCK_REALTIME_COARSE
and
.B CLOCK_MONOTONIC_COARSE
clocks are
.I not
supported by the
.I __kernel_clock_getres
and
.I __kernel_clock_gettime
interfaces;
the kernel falls back to the real system call.
.SS ppc/64 functions
.\" See linux/arch/powerpc/kernel/vdso64/vdso64.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_clock_getres	LINUX_2.6.15
__kernel_clock_gettime	LINUX_2.6.15
__kernel_datapage_offset	LINUX_2.6.15
__kernel_get_syscall_map	LINUX_2.6.15
__kernel_get_tbfreq	LINUX_2.6.15
__kernel_getcpu	LINUX_2.6.15
__kernel_gettimeofday	LINUX_2.6.15
__kernel_sigtramp_rt64	LINUX_2.6.15
__kernel_sync_dicache	LINUX_2.6.15
__kernel_sync_dicache_p5	LINUX_2.6.15
.TE
.if t \{\
.in
.ft P
\}
.P
Before Linux 4.16,
.\" commit 5c929885f1bb4b77f85b1769c49405a0e0f154a1
the
.B CLOCK_REALTIME_COARSE
and
.B CLOCK_MONOTONIC_COARSE
clocks are
.I not
supported by the
.I __kernel_clock_getres
and
.I __kernel_clock_gettime
interfaces;
the kernel falls back to the real system call.
.SS riscv functions
.\" See linux/arch/riscv/kernel/vdso/vdso.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__vdso_rt_sigreturn	LINUX_4.15
__vdso_gettimeofday	LINUX_4.15
__vdso_clock_gettime	LINUX_4.15
__vdso_clock_getres	LINUX_4.15
__vdso_getcpu	LINUX_4.15
__vdso_flush_icache	LINUX_4.15
.TE
.if t \{\
.in
.ft P
\}
.SS s390 functions
.\" See linux/arch/s390/kernel/vdso32/vdso32.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_clock_getres	LINUX_2.6.29
__kernel_clock_gettime	LINUX_2.6.29
__kernel_gettimeofday	LINUX_2.6.29
.TE
.if t \{\
.in
.ft P
\}
.SS s390x functions
.\" See linux/arch/s390/kernel/vdso64/vdso64.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_clock_getres	LINUX_2.6.29
__kernel_clock_gettime	LINUX_2.6.29
__kernel_gettimeofday	LINUX_2.6.29
.TE
.if t \{\
.in
.ft P
\}
.SS sh (SuperH) functions
.\" See linux/arch/sh/kernel/vsyscall/vsyscall.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_rt_sigreturn	LINUX_2.6
__kernel_sigreturn	LINUX_2.6
__kernel_vsyscall	LINUX_2.6
.TE
.if t \{\
.in
.ft P
\}
.SS i386 functions
.\" See linux/arch/x86/vdso/vdso32/vdso32.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__kernel_sigreturn	LINUX_2.5
__kernel_rt_sigreturn	LINUX_2.5
__kernel_vsyscall	LINUX_2.5
.\" Added in 7a59ed415f5b57469e22e41fc4188d5399e0b194 and updated
.\" in 37c975545ec63320789962bf307f000f08fabd48.
__vdso_clock_gettime	LINUX_2.6 (exported since Linux 3.15)
__vdso_gettimeofday	LINUX_2.6 (exported since Linux 3.15)
__vdso_time	LINUX_2.6 (exported since Linux 3.15)
.TE
.if t \{\
.in
.ft P
\}
.SS x86-64 functions
.\" See linux/arch/x86/vdso/vdso.lds.S
The table below lists the symbols exported by the vDSO.
All of these symbols are also available without the "__vdso_" prefix, but
you should ignore those and stick to the names below.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__vdso_clock_gettime	LINUX_2.6
__vdso_getcpu	LINUX_2.6
__vdso_gettimeofday	LINUX_2.6
__vdso_time	LINUX_2.6
.TE
.if t \{\
.in
.ft P
\}
.SS x86/x32 functions
.\" See linux/arch/x86/vdso/vdso32.lds.S
The table below lists the symbols exported by the vDSO.
.if t \{\
.ft CW
\}
.TS
l l.
symbol	version
_
__vdso_clock_gettime	LINUX_2.6
__vdso_getcpu	LINUX_2.6
__vdso_gettimeofday	LINUX_2.6
__vdso_time	LINUX_2.6
.TE
.if t \{\
.in
.ft P
\}
.SS History
The vDSO was originally just a single function\[em]the vsyscall.
In older kernels, you might see that name
in a process's memory map rather than "vdso".
Over time, people realized that this mechanism
was a great way to pass more functionality
to user space, so it was reconceived as a vDSO in the current format.
.SH SEE ALSO
.BR syscalls (2),
.BR getauxval (3),
.BR proc (5)
.P
The documents, examples, and source code in the Linux source code tree:
.P
.in +4n
.EX
Documentation/ABI/stable/vdso
Documentation/ia64/fsys.rst
Documentation/vDSO/* (includes examples of using the vDSO)
.P
find arch/ \-iname \[aq]*vdso*\[aq] \-o \-iname \[aq]*gate*\[aq]
.EE
.in