---
title: Native Journal Protocol
category: Interfaces
layout: default
SPDX-License-Identifier: LGPL-2.1-or-later
---

# Native Journal Protocol

`systemd-journald.service` accepts log data via various protocols:

* Classic RFC3164 BSD syslog via the `/dev/log` socket
* STDOUT/STDERR of programs via `StandardOutput=journal` + `StandardError=journal` in service files (both of which are default settings)
* Kernel log messages via the `/dev/kmsg` device node
* Audit records via the kernel's audit subsystem
* Structured log messages via `journald`'s native protocol

The last of these is what this document is about: if you are developing a program and
want to pass structured log data to `journald`, it's the Journal's native
protocol that you want to use. The systemd project provides the
[`sd_journal_print(3)`](https://www.freedesktop.org/software/systemd/man/sd_journal_print.html)
API that implements the client side of this protocol. This document explains
what this interface does behind the scenes, in case you'd like to implement a
client for it yourself, without linking to `libsystemd`, for example because
you work in a programming language other than C or otherwise want to avoid the
dependency.

## Basics

The native protocol of `journald` is spoken on the
`/run/systemd/journal/socket` `AF_UNIX`/`SOCK_DGRAM` socket on which
`systemd-journald.service` listens. Each datagram sent to this socket
encapsulates one journal entry that shall be written. Since datagrams are
subject to a size limit and we want to allow large journal entries, datagrams
sent over this socket may come in one of two formats:

* A datagram with the literal journal entry data as payload, without
  any file descriptors attached.

* A datagram with an empty payload, but with a single
  [`memfd`](https://man7.org/linux/man-pages/man2/memfd_create.2.html)
  file descriptor that contains the literal journal entry data.

Other combinations are not permitted, i.e. datagrams with both payload and file
descriptors, or datagrams with neither, or more than one file descriptor. Such
datagrams are ignored. The `memfd` file descriptor should be fully sealed. The
binary format in the datagram payload and in the `memfd` memory is
identical. Typically a client would attempt to first send the data as datagram
payload, but if this fails with an `EMSGSIZE` error it would immediately retry
via the `memfd` logic.

A client probably should bump up the `SO_SNDBUF` socket option of its `AF_UNIX`
socket towards `journald` in order to delay blocking I/O as much as possible.
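
The send-then-fall-back logic described above can be sketched in Python roughly as follows. This is a minimal illustration, not the `libsystemd` implementation: the helper names `open_journal_socket()` and `send_entry()`, the memfd name `journal-entry` and the 8 MiB `SO_SNDBUF` value are assumptions made for this example. The `memfd` path requires Linux and Python 3.8 or later.

```python
import errno
import fcntl
import os
import socket
import struct

JOURNAL_SOCKET = "/run/systemd/journal/socket"


def open_journal_socket(sndbuf=8 * 1024 * 1024):
    """Create an unconnected AF_UNIX/SOCK_DGRAM socket with a larger send buffer."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    # The size is an arbitrary example value; the kernel may clamp it.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    return sock


def send_entry(sock, payload, address=JOURNAL_SOCKET):
    """Send one serialized entry and report which of the two formats was used."""
    try:
        # First format: the entry data is the datagram payload itself.
        sock.sendto(payload, address)
        return "datagram"
    except OSError as e:
        if e.errno != errno.EMSGSIZE:
            raise

    # Second format: empty payload plus a single sealed memfd with the data.
    fd = os.memfd_create("journal-entry", os.MFD_CLOEXEC | os.MFD_ALLOW_SEALING)
    try:
        with open(fd, "wb", closefd=False) as f:
            f.write(payload)
        seals = (fcntl.F_SEAL_SHRINK | fcntl.F_SEAL_GROW
                 | fcntl.F_SEAL_WRITE | fcntl.F_SEAL_SEAL)
        fcntl.fcntl(fd, fcntl.F_ADD_SEALS, seals)
        ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, struct.pack("i", fd))]
        sock.sendmsg([], ancillary, 0, address)
    finally:
        # The receiver gets its own duplicate of the descriptor.
        os.close(fd)
    return "memfd"
```

A caller would typically create the socket once with `open_journal_socket()` and reuse it for every entry. The `address` parameter exists only so the sketch can be pointed at a test socket.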

## Data Format

Each datagram should consist of a number of environment-like key/value
assignments. Unlike environment variable assignments, however, the value may
contain NUL bytes as well as any other binary data. Keys may not include the `=`
or newline characters (or any other control characters or non-ASCII characters)
and may not be empty.

Serialization into the datagram payload or `memfd` is straightforward: each
key/value pair is serialized via one of two methods:

* The first method inserts a `=` character between key and value, and suffixes
  the result with `\n` (i.e. the newline character, ASCII code 10). Example: a
  key `FOO` with a value `BAR` is serialized `F`, `O`, `O`, `=`, `B`, `A`, `R`,
  `\n`.

* The second method should be used if the value of a field contains a `\n`
  byte. In this case, the key name is serialized as is, followed by a `\n`
  character, followed by a (non-aligned) little-endian unsigned 64-bit integer
  encoding the size of the value, followed by the literal value data, followed by
  `\n`. Example: a key `FOO` with a value `BAR` may be serialized using this
  second method as: `F`, `O`, `O`, `\n`, `\003`, `\000`, `\000`, `\000`, `\000`,
  `\000`, `\000`, `\000`, `B`, `A`, `R`, `\n`.

If the value of a key/value pair contains a newline character (`\n`), it *must*
be serialized using the second method. If it does not, either method is
permitted. However, it is generally recommended to use the first method
wherever it is applicable, since the generated datagrams
are easily recognized and understood by the human eye this way, without any
manual binary decoding. This improves the debugging experience a lot, in
particular with tools such as `strace` that can show datagram content as a text
dump. After all, log messages are highly relevant for debugging programs, hence
optimizing log traffic for readability without special tools is generally
desirable.
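
The two serialization methods can be expressed as a short Python sketch. The function name `serialize_field()` is made up for this example. It prefers the first method and switches to the second only when the value contains a newline, and its key validation is a simplified reading of the rules stated above.

```python
import struct


def serialize_field(key: str, value: bytes) -> bytes:
    """Serialize one key/value pair, using the first method when possible."""
    name = key.encode("ascii")  # raises UnicodeEncodeError for non-ASCII keys
    # Reject empty keys, "=" and control characters (which include "\n").
    if not name or b"=" in name or any(c < 0x20 or c == 0x7F for c in name):
        raise ValueError(f"invalid field name: {key!r}")
    if b"\n" in value:
        # Second method: key, "\n", little-endian 64-bit size, data, "\n".
        return name + b"\n" + struct.pack("<Q", len(value)) + value + b"\n"
    # First method: KEY=value followed by "\n".
    return name + b"=" + value + b"\n"
```

With this sketch, `serialize_field("FOO", b"BAR")` yields `b"FOO=BAR\n"`, while a value such as `b"B\nR"` is emitted with the 8-byte size prefix.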

Note that keys that begin with `_` have special semantics in `journald`: they
are *trusted* and implicitly appended by `journald` on the receiving
side. Clients should not send them; if they do anyway, these fields will be ignored.

The most important key/value pair to send is `MESSAGE=`, as that contains the
actual log message text. Other relevant keys a client should send in most cases
are `PRIORITY=`, `CODE_FILE=`, `CODE_LINE=`, `CODE_FUNC=` and `ERRNO=`. It's
recommended to generate these fields implicitly on the client side. For further
information see the [relevant documentation of these
fields](https://www.freedesktop.org/software/systemd/man/systemd.journal-fields.html).

The order in which the fields are serialized within one datagram is undefined
and may be freely chosen by the client. The server side might or might not
retain or reorder it when writing it to the Journal.

Some programs might generate multi-line log messages (e.g. a stack unwinder
generating log output about a stack trace, with one line for each stack
frame). It's highly recommended to send these as a single datagram, using a
single `MESSAGE=` field with embedded newline characters between the lines (the
second serialization method described above must hence be used for this
field). If possible do not split up individual events into multiple Journal
events that might then be processed and written into the Journal as separate
entries. The Journal toolchain is capable of handling multi-line log entries
just fine, and it's generally preferred to have a single set of metadata fields
associated with each multi-line message.

Note that the same keys may be used multiple times within the same datagram,
with different values. The Journal supports this and will write such entries to
disk without complaining. This is useful for associating a single log entry
with multiple suitable objects of the same type at once. This should only be
used for specific Journal fields however, where this is expected. Do not use
this for Journal fields where this is not expected and where code reasonably
assumes per-event uniqueness of the keys. In most cases code that consumes and
displays log entries is likely to ignore such non-unique fields or only
consider the first of the specified values. Specifically, if a Journal entry
contains multiple `MESSAGE=` fields, likely only the first one is
displayed. Note that a well-written logging client library thus will not use a
plain dictionary for accepting structured log metadata, but rather a data
structure that allows non-unique keys, for example an array, or a dictionary
that optionally maps to a set of values instead of a single value.
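
As an illustration of both points, a multi-line `MESSAGE=` and repeated keys, here is a sketch of an entry serializer that accepts a list of pairs rather than a dictionary. The function name `serialize_entry()` and the field name `EXAMPLE_OBJECT` are hypothetical and chosen only for this example. Key validation is omitted for brevity.

```python
import struct


def serialize_entry(fields):
    """Serialize a list of (key, value) pairs into one datagram payload.

    A list of pairs is used so that the same key may appear more than once.
    Values may be str (encoded as UTF-8) or bytes.
    """
    out = bytearray()
    for key, value in fields:
        data = value if isinstance(value, bytes) else str(value).encode("utf-8")
        name = key.encode("ascii")
        if b"\n" in data:
            # Multi-line values need the size-prefixed second method.
            out += name + b"\n" + struct.pack("<Q", len(data)) + data + b"\n"
        else:
            out += name + b"=" + data + b"\n"
    return bytes(out)


# One entry: a two-line message plus a repeated, hypothetical field.
payload = serialize_entry([
    ("MESSAGE", "frame 1\nframe 2"),
    ("PRIORITY", 3),
    ("EXAMPLE_OBJECT", "a"),
    ("EXAMPLE_OBJECT", "b"),
])
```

The whole stack trace travels in one datagram and therefore shares one set of metadata fields, and both `EXAMPLE_OBJECT` values survive because nothing collapses duplicate keys.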

## Example Datagram

Here's an encoded message, with various common fields, all encoded according to
the first serialization method, with the exception of one, where the value
contains a newline character, and thus the second method must be used.

```
PRIORITY=3\n
SYSLOG_FACILITY=3\n
CODE_FILE=src/foobar.c\n
CODE_LINE=77\n
BINARY_BLOB\n
\004\000\000\000\000\000\000\000xx\nx\n
CODE_FUNC=some_func\n
SYSLOG_IDENTIFIER=footool\n
MESSAGE=Something happened.\n
```

(Lines are broken here after each `\n` to make things more readable. C-style
backslash escaping is used.)

## Automatic Protocol Upgrading

It might be wise to automatically upgrade to logging via the Journal's native
protocol in clients that previously used the BSD syslog protocol. Behaviour in
this case should be pretty obvious: try connecting a socket to
`/run/systemd/journal/socket` first (on success use the native Journal
protocol), and if that fails fall back to `/dev/log` (and use the BSD syslog
protocol).
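
A minimal Python sketch of this fallback might look as follows. The function name `open_log_socket()` and its return convention are assumptions for illustration. The sketch also assumes `/dev/log` is a datagram socket, which may not hold on every system.

```python
import socket


def open_log_socket(journal_path="/run/systemd/journal/socket",
                    syslog_path="/dev/log"):
    """Return (socket, protocol), where protocol is "native" or "syslog"."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # Preferred: the Journal's native protocol.
        sock.connect(journal_path)
        return sock, "native"
    except OSError:
        pass
    try:
        # Fallback: classic BSD syslog.
        sock.connect(syslog_path)
        return sock, "syslog"
    except OSError:
        sock.close()
        raise
```

The caller then picks its message encoder based on the returned protocol name.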

Programs normally logging to STDERR might also choose to upgrade to native
Journal logging in case they are invoked via systemd's service logic, where
STDOUT and STDERR are going to the Journal anyway. By preferring the native
protocol over STDERR-based logging, structured metadata can be passed along,
including priority information and more, none of which is available with
STDERR-based logging. If a program wants to detect automatically whether its STDERR is
connected to the Journal's stream transport, it should look for the `$JOURNAL_STREAM`
environment variable. The systemd service logic sets this variable to a
colon-separated pair of device and inode number (formatted in decimal ASCII) of
the STDERR file descriptor. If the `.st_dev` and `.st_ino` fields of the
`struct stat` data returned by `fstat(STDERR_FILENO, …)` match these values a
program can be sure its STDERR is connected to the Journal, and may then opt to
upgrade to the native Journal protocol via an `AF_UNIX` socket of its own, and
cease to use STDERR.

Why bother with this environment variable check? A service program invoked by
systemd might employ shell-style I/O redirection on invoked subprograms, and
those should likely not upgrade to the native Journal protocol, but instead
continue to use the redirected file descriptors passed to them. Thus, by
comparing the device and inode number of the actual STDERR file descriptor with
the one the service manager passed, one can make sure that no I/O redirection
took place for the current program.
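
The `$JOURNAL_STREAM` check can be sketched in Python like this. The function name `stderr_is_journal_stream()` is made up for the example, and the `environ` and `fd` parameters exist only to make the sketch easy to exercise outside a service.

```python
import os


def stderr_is_journal_stream(environ=os.environ, fd=2):
    """Return True if fd matches the device:inode pair in $JOURNAL_STREAM."""
    value = environ.get("JOURNAL_STREAM")
    if not value:
        return False
    dev_text, sep, ino_text = value.partition(":")
    if not sep:
        return False
    try:
        # Both numbers are formatted in decimal ASCII.
        dev, ino = int(dev_text), int(ino_text)
        st = os.fstat(fd)
    except (ValueError, OSError):
        return False
    # A mismatch means STDERR was redirected after the service manager set it up.
    return (st.st_dev, st.st_ino) == (dev, ino)
```

A program would call this once at startup and switch to the native protocol only if it returns `True`.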

## Alternative Implementations

If you are looking for alternative implementations of this protocol (besides
systemd's own in `sd_journal_print()`), consider
[GLib's](https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/gmessages.c) or
[`dbus-broker`'s](https://github.com/bus1/dbus-broker/blob/main/src/util/log.c).

And that's already all there is to it.