---
title: Journal Export Formats
category: Interfaces
layout: default
SPDX-License-Identifier: LGPL-2.1-or-later
---

# Journal Export Formats

## Journal Export Format

_Note that this document describes the binary serialization format of journals only, as used for transfer across the network.
For interfacing with web technologies there's the Journal JSON Format, described below.
The binary format on disk is documented as the [Journal File Format](JOURNAL_FILE_FORMAT)._

_Before reading on, please make sure you are aware of the [basic properties of journal entries](https://www.freedesktop.org/software/systemd/man/systemd.journal-fields.html), in particular that they may include binary non-text data (though they usually don't), and that the same field might have multiple values assigned within the same entry (though it usually doesn't)._

When exporting journal data for other uses or transferring it via the network/local IPC, the _journal export format_ is used. It's a simple serialization of journal entries that is easy to read without any special tools, but still binary safe where necessary. The format is like this:

* Two journal entries that follow each other are separated by a double newline.
* Journal fields consisting only of valid non-control UTF-8 codepoints are serialized as they are (i.e. the field name, followed by '=', followed by the field data), followed by a newline as separator to the next field. Note that fields containing newlines cannot be formatted like this. Non-control UTF-8 codepoints are the codepoints with a value at or above 32 (' '), or equal to 9 (TAB).
* Other journal fields are serialized in a special binary-safe way: the field name, followed by a newline, followed by a binary 64-bit little-endian size value, followed by the binary field data, followed by a newline as separator to the next field.
* Entry metadata that is not actually a field is serialized as if it were a field, but beginning with two underscores. More specifically, `__CURSOR=`, `__REALTIME_TIMESTAMP=`, `__MONOTONIC_TIMESTAMP=`, `__SEQNUM=`, `__SEQNUM_ID=` are introduced this way. Note that these meta-fields are only generated when actual journal files are serialized. They are omitted for entries that do not originate from a journal file (for example because they are transferred for the first time to be stored in one). In other words: if you are generating this format you shouldn't care about these special double-underscore fields, but you might find them useful when you deserialize the format generated by us. Additional fields prefixed with two underscores might be added later on; your parser should skip over the fields it does not know.
* The order in which fields appear in an entry is undefined and might be different for each entry that is serialized.

And that's already it.

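As a rough illustration of the rules above, a reader for this format might look like the following Python sketch. The name `parse_export` and the choice to keep field names and values as `bytes` (with values collected in lists, since a field may repeat) are assumptions made for this example; it is not a description of how systemd itself parses the format. It could be fed with `sys.stdin.buffer` when piping in the output of `journalctl -o export`.

```python
import struct


def parse_export(stream):
    """Yield one dict per entry, mapping field name (bytes) to a list of values (bytes).

    `stream` is a binary file-like object positioned at the start of an entry.
    """
    entry = {}
    while True:
        line = stream.readline()
        if not line:                      # end of stream
            break
        if line == b"\n":                 # empty line: the current entry is complete
            if entry:
                yield entry
                entry = {}
            continue
        if line.endswith(b"\n"):
            line = line[:-1]
        name, sep, value = line.partition(b"=")
        if not sep:
            # Binary-safe form: the line holds only the field name. It is followed
            # by a 64-bit little-endian size, the raw data, and a newline.
            raw_size = stream.read(8)
            if len(raw_size) != 8:
                raise ValueError("truncated size value")
            (size,) = struct.unpack("<Q", raw_size)
            value = stream.read(size)
            if len(value) != size:
                raise ValueError("truncated field data")
            stream.read(1)                # newline separating this field from the next
        entry.setdefault(name, []).append(value)
    if entry:                             # stream ended without a final empty line
        yield entry
```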
This format can be generated via `journalctl -o export`.

Here's an example for two serialized entries which consist only of text data:

```
__CURSOR=s=739ad463348b4ceca5a9e69c95a3c93f;i=4ece7;b=6c7c6013a26343b29e964691ff25d04c;m=4fc72436e;t=4c508a72423d9;x=d3e5610681098c10;p=system.journal
__REALTIME_TIMESTAMP=1342540861416409
__MONOTONIC_TIMESTAMP=21415215982
_BOOT_ID=6c7c6013a26343b29e964691ff25d04c
_TRANSPORT=syslog
PRIORITY=4
SYSLOG_FACILITY=3
SYSLOG_IDENTIFIER=gdm-password]
SYSLOG_PID=587
MESSAGE=AccountsService-DEBUG(+): ActUserManager: ignoring unspecified session '8' since it's not graphical: Success
_PID=587
_UID=0
_GID=500
_COMM=gdm-session-wor
_EXE=/usr/libexec/gdm-session-worker
_CMDLINE=gdm-session-worker [pam/gdm-password]
_AUDIT_SESSION=2
_AUDIT_LOGINUID=500
_SYSTEMD_CGROUP=/user/lennart/2
_SYSTEMD_SESSION=2
_SELINUX_CONTEXT=system_u:system_r:xdm_t:s0-s0:c0.c1023
_SOURCE_REALTIME_TIMESTAMP=1342540861413961
_MACHINE_ID=a91663387a90b89f185d4e860000001a
_HOSTNAME=epsilon

__CURSOR=s=739ad463348b4ceca5a9e69c95a3c93f;i=4ece8;b=6c7c6013a26343b29e964691ff25d04c;m=4fc72572f;t=4c508a7243799;x=68597058a89b7246;p=system.journal
__REALTIME_TIMESTAMP=1342540861421465
__MONOTONIC_TIMESTAMP=21415221039
_BOOT_ID=6c7c6013a26343b29e964691ff25d04c
_TRANSPORT=syslog
PRIORITY=6
SYSLOG_FACILITY=9
SYSLOG_IDENTIFIER=/USR/SBIN/CROND
SYSLOG_PID=8278
MESSAGE=(root) CMD (run-parts /etc/cron.hourly)
_PID=8278
_UID=0
_GID=0
_COMM=run-parts
_EXE=/usr/bin/bash
_CMDLINE=/bin/bash /bin/run-parts /etc/cron.hourly
_AUDIT_SESSION=8
_AUDIT_LOGINUID=0
_SYSTEMD_CGROUP=/user/root/8
_SYSTEMD_SESSION=8
_SELINUX_CONTEXT=system_u:system_r:crond_t:s0-s0:c0.c1023
_SOURCE_REALTIME_TIMESTAMP=1342540861416351
_MACHINE_ID=a91663387a90b89f185d4e860000001a
_HOSTNAME=epsilon

```

A message with a binary field produced by
```bash
python3 -c 'from systemd import journal; journal.send("foo\nbar")'
journalctl -n1 -o export
```

```
__CURSOR=s=bcce4fb8ffcb40e9a6e05eee8b7831bf;i=5ef603;b=ec25d6795f0645619ddac9afdef453ee;m=545242e7049;t=50f1202
__REALTIME_TIMESTAMP=1423944916375353
__MONOTONIC_TIMESTAMP=5794517905481
_BOOT_ID=ec25d6795f0645619ddac9afdef453ee
_TRANSPORT=journal
_UID=1001
_GID=1001
_CAP_EFFECTIVE=0
_SYSTEMD_OWNER_UID=1001
_SYSTEMD_SLICE=user-1001.slice
_MACHINE_ID=5833158886a8445e801d437313d25eff
_HOSTNAME=bupkis
_AUDIT_LOGINUID=1001
_SELINUX_CONTEXT=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
CODE_LINE=1
CODE_FUNC=<module>
SYSLOG_IDENTIFIER=python3
_COMM=python3
_EXE=/usr/bin/python3.4
_AUDIT_SESSION=35898
_SYSTEMD_CGROUP=/user.slice/user-1001.slice/session-35898.scope
_SYSTEMD_SESSION=35898
_SYSTEMD_UNIT=session-35898.scope
MESSAGE
^G^@^@^@^@^@^@^@foo
bar
CODE_FILE=<string>
_PID=16853
_CMDLINE=python3 -c from systemd import journal; journal.send("foo\nbar")
_SOURCE_REALTIME_TIMESTAMP=1423944916372858
```

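In the dump above, `^G^@^@^@^@^@^@^@` is the terminal's rendering of the 64-bit little-endian size: the byte 0x07 followed by seven zero bytes, since `foo\nbar` is 7 bytes long. The following Python sketch shows how a writer might choose between the two field encodings. The helper names (`is_plain_text`, `serialize_field`, `serialize_entry`) are hypothetical and only illustrate the rules in this document; they are not taken from systemd.

```python
import struct


def is_plain_text(value: bytes) -> bool:
    """True if value is valid UTF-8 made only of codepoints >= 32 or TAB (9)."""
    try:
        text = value.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return all(ord(ch) >= 32 or ch == "\t" for ch in text)


def serialize_field(name: bytes, value: bytes) -> bytes:
    """Serialize one field, using the binary-safe form only where it is needed."""
    if is_plain_text(value):
        return name + b"=" + value + b"\n"
    # Field name, newline, 64-bit little-endian size, raw data, newline.
    return name + b"\n" + struct.pack("<Q", len(value)) + value + b"\n"


def serialize_entry(fields) -> bytes:
    """Serialize an iterable of (name, value) pairs; a name may appear more than once.

    The trailing empty line separates this entry from the next one.
    """
    return b"".join(serialize_field(n, v) for n, v in fields) + b"\n"
```
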
## Journal JSON Format

_Note that this section describes the JSON serialization format of the journal only, as used for interfacing with web technologies.
For binary transfer of journal data across the network there's the Journal Export Format described above.
The binary format on disk is documented as the [Journal File Format](JOURNAL_FILE_FORMAT)._

_Before reading on, please make sure you are aware of the [basic properties of journal entries](https://www.freedesktop.org/software/systemd/man/systemd.journal-fields.html), in particular that they may include binary non-text data (though they usually don't), and that the same field might have multiple values assigned within the same entry (though it usually doesn't)._

In most cases the Journal JSON serialization is the obvious mapping of the entry field names (as JSON strings) to the entry field values (also as JSON strings), encapsulated in one JSON object. However, there are a few special cases to handle:

* A field that contains non-printable or non-UTF-8 data is serialized as a number array instead. This is necessary to handle binary data in a safe way without losing data, since JSON cannot embed binary data natively. Each byte of the binary field will be mapped to its numeric value in the range 0…255.
* The JSON serializer can optionally skip huge data fields (i.e. those larger than a specific threshold) from the JSON object. If that is enabled and a data field is too large, the field name is still included in the JSON object but assigned _null_.
* Within the same entry, journal fields may have multiple values assigned. This is not allowed in JSON. The serializer will hence create a single JSON field only for these cases, and assign it an array of values (which can be strings, _null_ or number arrays, see above).
* If the JSON data originates from a journal file, it may include the special addressing fields `__CURSOR`, `__REALTIME_TIMESTAMP`, `__MONOTONIC_TIMESTAMP`, `__SEQNUM`, `__SEQNUM_ID`, which contain the cursor string of this entry as a string, the realtime/monotonic timestamps of this entry as formatted numeric strings of usec since the respective epoch, and the sequence number and associated sequence number ID, both formatted as strings.

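For consumers, the special cases above mean that a field value can take several shapes. The following Python sketch normalizes them: every field becomes a list of values, each either `bytes` or `None` (for a skipped field). The function names are hypothetical, and the rule used to tell a number array from a multi-value array (an array whose elements are all integers is treated as binary data) is an assumption of this example, not something the format spells out.

```python
import json
from datetime import datetime, timedelta, timezone


def decode_value(value):
    """Map one JSON value back to bytes, or None for a skipped (null) field."""
    if value is None:
        return None
    if isinstance(value, str):
        return value.encode("utf-8")
    return bytes(value)                   # number array: one integer per byte


def decode_field(value):
    """Return the list of values assigned to a field."""
    is_multi = isinstance(value, list) and any(not isinstance(v, int) for v in value)
    if is_multi:
        return [decode_value(v) for v in value]
    return [decode_value(value)]


def decode_entry(text: str) -> dict:
    """Parse one JSON object and normalize every field to a list of values."""
    return {name: decode_field(value) for name, value in json.loads(text).items()}


def realtime_timestamp(entry: dict) -> datetime:
    """Convert __REALTIME_TIMESTAMP (usec since the UNIX epoch, as a string) to UTC."""
    usec = int(entry["__REALTIME_TIMESTAMP"][0].decode("ascii"))
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=usec)
```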
Here's an example, illustrating all cases mentioned above. Consider this entry:

```
MESSAGE=Hello World
_UDEV_DEVNODE=/dev/waldo
_UDEV_DEVLINK=/dev/alias1
_UDEV_DEVLINK=/dev/alias2
BINARY=this is a binary value \a
LARGE=this is a super large value (let's pretend at least, for the sake of this example)
```

This translates into the following JSON Object:
```json
{
  "MESSAGE" : "Hello World",
  "_UDEV_DEVNODE" : "/dev/waldo",
  "_UDEV_DEVLINK" : [ "/dev/alias1", "/dev/alias2" ],
  "BINARY" : [ 116, 104, 105, 115, 32, 105, 115, 32, 97, 32, 98, 105, 110, 97, 114, 121, 32, 118, 97, 108, 117, 101, 32, 7 ],
  "LARGE" : null
}
```
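
The mapping used in this example can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `json_value` and `entry_to_object` and the `max_size` parameter are hypothetical, and "non-printable" is approximated here as any codepoint below 32 other than TAB, which may differ from the check the real serializer performs.

```python
import json


def json_value(value: bytes, max_size=None):
    """Map one field value to a JSON string, a number array, or None."""
    if max_size is not None and len(value) > max_size:
        return None                       # field too large: keep the name, drop the data
    try:
        text = value.decode("utf-8")
    except UnicodeDecodeError:
        return list(value)                # not UTF-8: one number (0..255) per byte
    if any(ord(ch) < 32 and ch != "\t" for ch in text):
        return list(value)                # assumed definition of "non-printable"
    return text


def entry_to_object(fields, max_size=None) -> dict:
    """Build the JSON object for an iterable of (name, value) pairs."""
    obj, multi = {}, set()
    for name, value in fields:
        converted = json_value(value, max_size)
        if name not in obj:
            obj[name] = converted
        elif name in multi:
            obj[name].append(converted)
        else:
            # Second value for this name: switch to an array of values.
            obj[name] = [obj[name], converted]
            multi.add(name)
    return obj


example = [
    ("MESSAGE", b"Hello World"),
    ("_UDEV_DEVNODE", b"/dev/waldo"),
    ("_UDEV_DEVLINK", b"/dev/alias1"),
    ("_UDEV_DEVLINK", b"/dev/alias2"),
    ("BINARY", b"this is a binary value \x07"),
    ("LARGE", b"this is a super large value (let's pretend at least, for the sake of this example)"),
]
# The 64-byte threshold is arbitrary, picked so that only LARGE exceeds it.
print(json.dumps(entry_to_object(example, max_size=64), indent=2))
```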