1A few hints on supporting kdbus as backend in your favourite D-Bus library.
2
3~~~
4
Before you read this, have a look at the DIFFERENCES and
GVARIANT_SERIALIZATION texts, which you will find in the same
directory as this file.
8
We invite you to port your favourite D-Bus protocol implementation
over to kdbus. However, there are a couple of complexities
involved. On kdbus we only speak GVariant marshalling, kdbus clients
ignore traffic in dbus1 marshalling. Thus, you need to add a second,
GVariant compatible marshaller to your library first.
14
After you have done that, here is the basic principle of how kdbus works:
16
17You connect to a bus by opening its bus node in /dev/kdbus/. All
18busses have a device node there, that starts with a numeric UID of the
19owner of the bus, followed by a dash and a string identifying the
20bus. The system bus is thus called /dev/kdbus/0-system, and for user
21busses the device node is /dev/kdbus/1000-user (if 1000 is your user
22id).
23
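The naming scheme can be sketched in a few lines of C. This is only
an illustration: the helper name kdbus_node_path() is made up for
this example and is not part of libsystemd-bus or of kdbus.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Format the device node path of a bus owned by `uid` and identified
 * by `name`, following the "<uid>-<name>" scheme described above.
 * Returns 0 on success, -1 if the buffer is too small.
 * (Hypothetical helper, for illustration only.) */
int kdbus_node_path(char *buf, size_t size, uid_t uid, const char *name) {
        int n;

        assert(buf && name);

        n = snprintf(buf, size, "/dev/kdbus/%lu-%s", (unsigned long) uid, name);
        return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

Calling it with uid 0 and "system" gives the system bus node, calling
it with your own uid and "user" gives your user bus node.
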
24(Before we proceed, please always keep a copy of libsystemd-bus next
25to you, ultimately that's where the details are, this document simply
26is a rough overview to help you grok things.)
27
28CONNECTING
29
30To connect to a bus, simply open() its device node, and issue the
31KDBUS_CMD_HELLO call. That's it. Now you are connected. Do not send
32Hello messages or so (as you would on dbus1), that does not exist for
33kdbus.
34
35The structure you pass to the ioctl will contain a couple of
36parameters that you need to know to operate on the bus.
37
38There are two flags fields, one indicating features of the kdbus
39kernel side ("conn_flags"), the other one ("bus_flags") indicating
40features of the bus owner (i.e. systemd). Both flags fields are 64bit
41in width.
42
43When calling into the ioctl, you need to place your own supported
44feature bits into these fields. This tells the kernel about the
45features you support. When the ioctl returns it will contain the
46features the kernel supports.
47
48If any of the higher 32bit are set on the two flags fields and your
49client does not know what they mean, it must disconnect. The upper
5032bit are used to indicate "incompatible" feature additions on the bus
51system, the lower 32bit indicate "compatible" feature additions. A
52client that does not support a "compatible" feature addition can go on
53communicating with the bus, however a client that does not support an
54"incompatible" feature must not proceed with the connection.
55
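The compatible/incompatible rule boils down to a simple bit test. A
minimal sketch, assuming you keep the set of feature bits your library
understands in a 64bit mask (the names flags_acceptable() and
FLAGS_INCOMPATIBLE_MASK are invented for this example):

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper 32 bits carry "incompatible" feature additions, lower 32
 * bits carry "compatible" ones. */
#define FLAGS_INCOMPATIBLE_MASK UINT64_C(0xFFFFFFFF00000000)

/* Returns true if the connection may proceed: every incompatible bit
 * in `returned` must also be set in `known`. Unknown bits in the
 * lower half are tolerated. (Hypothetical helper.) */
bool flags_acceptable(uint64_t returned, uint64_t known) {
        uint64_t unknown = returned & ~known;

        return (unknown & FLAGS_INCOMPATIBLE_MASK) == 0;
}
```

You would run this check on both flags fields after the hello ioctl
returns, and disconnect if either check fails.
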
The hello structure also contains another flags field "attach_flags"
which indicates the metadata that is optionally attached to all
incoming messages. You probably want to set KDBUS_ATTACH_NAMES
unconditionally in it. This has the effect that all well-known names
of a sender are attached to all incoming messages. You need this
information to implement matches that match on a message sender name
correctly. Of course, you should only request attachment of as few
metadata fields as you need.
64
65The kernel will return in the "id" field your unique id. This is a
66simple numeric value. For compatibility with classic dbus1 simply
67format this as string and prefix ":0.".
68
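Formatting the numeric id as a dbus1 style unique name could look
roughly like this (unique_name_from_id() is a name chosen for this
example only):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Turn the numeric kdbus id into a dbus1 style unique name by
 * prefixing ":0.". Returns 0 on success, -1 if the buffer is too
 * small. (Hypothetical helper.) */
int unique_name_from_id(char *buf, size_t size, uint64_t id) {
        int n;

        assert(buf);

        n = snprintf(buf, size, ":0.%" PRIu64, id);
        return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```
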
69The kernel will also return the bloom filter size used for the signal
70broadcast bloom filter (see below).
71
The kernel will also return the bus ID of the bus in a 128bit field.
73
74The pool size field returned by the kernel indicates the size of the
75memory mapped buffer.
76
After calling the hello ioctl, you should memory map the kdbus
fd. Use the pool size returned by the hello ioctl as map size. In this
memory mapped region the kernel will place all your incoming messages.
80
81SENDING MESSAGES
82
83Use the MSG_SEND ioctl to send a message to another peer. The ioctl
84takes a structure that contains a variety of fields:
85
86The flags field corresponds closely to the old dbus1 message header
87flags field, though the DONT_EXPECT_REPLY field got inverted into
88EXPECT_REPLY.
89
The dst_id/src_id field contains the unique id of the destination and
the sender. The sender field is usually overridden by the kernel, hence
you shouldn't fill it in. The destination field can also take the
special value KDBUS_DST_ID_BROADCAST for broadcast messages. For
messages intended for a well-known name set the field to
KDBUS_DST_ID_NAME, and attach the name in a special "items" entry to
the message (see below).
97
98The payload field indicates the payload. For all dbus traffic it
99should carry the value 0x4442757344427573ULL. (Which encodes
100'DBusDBus').
101
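To see that the constant really spells 'DBusDBus', read its eight
bytes from the most significant one downwards. The sketch below does
just that; PAYLOAD_TYPE_DBUS and u64_to_ascii() are names made up for
this example.

```c
#include <assert.h>
#include <stdint.h>

/* The payload value used for all dbus traffic (name is illustrative). */
#define PAYLOAD_TYPE_DBUS UINT64_C(0x4442757344427573)

/* Write the eight bytes of `v`, most significant first, into `out`
 * and NUL-terminate it. `out` must hold at least 9 bytes. */
void u64_to_ascii(uint64_t v, char out[9]) {
        assert(out);

        for (int i = 0; i < 8; i++)
                out[i] = (char) ((v >> (56 - 8 * i)) & 0xFF);
        out[8] = '\0';
}
```
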
102The cookie field corresponds with the "serial" field of classic
103dbus1. We simply renamed it here (and extended it to 64bit) since we
104didn't want to imply the monotonicity of the assignment the way the
105word "serial" indicates it.
106
When sending a message that expects a reply, you need to set the
EXPECT_REPLY flag in the message flag field. In this case you should
also fill out the "timeout_ns" value which indicates the timeout in
nsec for this call. If the peer does not respond in this time you will
get a notification of a timeout. Note that this is also used for
security purposes: a single reply message is only allowed through the
bus as long as the timeout has not ended. With this timeout value you
hence "open a time window" in which the peer might respond to your
request and the policy allows the response to go through.
116
When sending a message that is a reply, you need to fill in the
cookie_reply field, which is similar to the reply_serial field of
dbus1. Note that a message cannot have EXPECT_REPLY and a cookie_reply
at the same time!
121
122This pretty much explains the ioctl header. The actual payload of the
123data is now referenced in additional items that are attached to this
124ioctl header structure at the end. When sending a message, you attach
125items of the type PAYLOAD_VEC, PAYLOAD_MEMFD, FDS, BLOOM, DST_NAME to
126it:
127
128 KDBUS_ITEM_PAYLOAD_VEC: contains a pointer + length pair for
129 referencing arbitrary user memory. This is how you reference most
130 of your data. It's a lot like the good old iovec structure of glibc.
131
 KDBUS_ITEM_PAYLOAD_MEMFD: for large data blocks it is preferable
 to send prepared "memfds" (see below) over. This item contains an
 fd for a memfd plus a size.
135
136 KDBUS_ITEM_PAYLOAD_FDS: for sending over fds attach an item of this
137 type with an array of fds.
138
 KDBUS_ITEM_BLOOM: the calculated bloom filter of this message, only
 for undirected (broadcast) messages.
141
 KDBUS_ITEM_DST_NAME: for messages that are directed to a well-known
 name (instead of a unique name), this item contains the well-known
 name field.
145
A single message may consist of zero, one or more payload items of type
PAYLOAD_VEC or PAYLOAD_MEMFD. D-Bus protocol implementations should
treat them as a single block that just happens to be split up into
multiple items. Some restrictions apply however:
150
151 The message header in its entirety must be contained in a single
152 PAYLOAD_VEC item
153
 You may only split your message up right in front of each GVariant
 contained in the payload, as well as immediately before the framing
 of a GVariant, as well as after any padding bytes if there are any.
 The padding bytes must be wholly contained in the preceding
 PAYLOAD_VEC/PAYLOAD_MEMFD item. You may not split up simple types
 nor arrays of trivial types. The latter is necessary to allow APIs
 to return direct pointers to linear chunks of fixed size trivial
 arrays. Examples: The simple types "u", "s", "t" have to be in the
 same payload item. The arrays of simple types "ay", "ai" have to be
 fully contained in the same payload item. For an array "as" or
 "a(si)" the only restriction however is to keep each string
 individually in an uninterrupted item, and to keep the framing of
 each element and the array in a single uninterrupted item; the
 various strings might end up in different items.
168
Note again that splitting up messages into separate items is up to the
implementation. Also note that the kdbus kernel side might merge
separate items if it deems this to be useful. However, the order in
which items are contained in the message is left untouched.
173
PAYLOAD_MEMFD items allow zero-copy data transfer (see below regarding
the memfd concept). Note however that the overhead of mapping these
makes them relatively expensive, and only worth the trouble for memory
blocks > 128K (this value appears to be quite universal across
architectures, as we tested). Thus we recommend sending PAYLOAD_VEC
items over for small messages and resorting to PAYLOAD_MEMFD items for
messages > 128K. Since while building up the message you might not
know yet whether it will grow beyond this boundary, a good approach is
to simply build the message unconditionally in a memfd
object. However, when the message is sealed to be sent away, check for
the size limit. If the size of the message is < 128K, then simply send
the data as PAYLOAD_VEC and reuse the memfd. If it is >= 128K, seal
the memfd and send it as PAYLOAD_MEMFD, and allocate a new memfd for
the next message.
188
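The decision made at sealing time is a plain size comparison. A
minimal sketch (MEMFD_THRESHOLD and send_as_memfd() are names invented
here, the 128K value is the one quoted above):

```c
#include <stdbool.h>
#include <stddef.h>

/* Size from which on a memfd is worth the mapping overhead. */
#define MEMFD_THRESHOLD ((size_t) 128 * 1024)

/* Returns true if a message of the given size should be sealed and
 * sent as PAYLOAD_MEMFD, false if it should be sent as PAYLOAD_VEC
 * (in which case the memfd it was built in can be reused). */
bool send_as_memfd(size_t message_size) {
        return message_size >= MEMFD_THRESHOLD;
}
```
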
189RECEIVING MESSAGES
190
191Use the MSG_RECV ioctl to read a message from kdbus. This will return
192an offset into the pool memory map, relative to its beginning.
193
194The received message structure more or less follows the structure of
195the message originally sent. However, certain changes have been
196made. In the header the src_id field will be filled in.
197
198The payload items might have gotten merged and PAYLOAD_VEC items are
199not used. Instead you will only find PAYLOAD_OFF and PAYLOAD_MEMFD
200items. The former contain an offset and size into your memory mapped
201pool where you find the payload.
202
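Turning such an offset/size pair into a pointer is simple pointer
arithmetic on the base address of your mapping, but it is worth
checking the range first. A sketch, with pool_slice() being a
hypothetical helper and not a kdbus or libsystemd-bus call:

```c
#include <stddef.h>
#include <stdint.h>

/* Resolve an offset/size pair against the memory mapped pool.
 * Returns a pointer into the pool, or NULL if the range does not lie
 * entirely within the pool. (Hypothetical helper.) */
const void *pool_slice(const void *pool, uint64_t pool_size,
                       uint64_t offset, uint64_t size) {
        if (!pool || offset > pool_size || size > pool_size - offset)
                return NULL;

        return (const uint8_t *) pool + offset;
}
```

The same helper works for the offset returned by the receive ioctl
itself, since that one is relative to the beginning of the pool, too.
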
203If during the HELLO ioctl you asked for getting meta data attached to
204your message you will find additional KDBUS_ITEM_CREDS,
205KDBUS_ITEM_PID_COMM, KDBUS_ITEM_TID_COMM, KDBUS_ITEM_TIMESTAMP,
206KDBUS_ITEM_EXE, KDBUS_ITEM_CMDLINE, KDBUS_ITEM_CGROUP,
207KDBUS_ITEM_CAPS, KDBUS_ITEM_SECLABEL, KDBUS_ITEM_AUDIT items that
208contain this metadata. This metadata will be for the sender at the
209point in time it sent the message. This information is hence uncached,
210and since it is appended by the kernel trustable. The
211KDBUS_ITEM_SECLABEL item usually contains the SELinux security label
212if it is used.
213
214After processing the message you need to call the KDBUS_CMD_FREE
215ioctl, which releases the message from the pool, and allows the kernel
216to store another message there. Note that the memory used by the pool
217is normal anonymous, swappable memory that is backed by tmpfs. Hence
218there is no need to copy the message out of it quickly, instead you
219can just leave it there as long as you need it and release it via the
220FREE ioctl only after that's done.
221
222BLOOM FILTERS
223
The kernel does not understand dbus marshalling, it will not look into
the message payload. To allow clients to subscribe to specific subsets
of the broadcast messages we employ bloom filters.
227
When broadcasting messages a bloom filter needs to be attached to the
message in a KDBUS_ITEM_BLOOM item (and only for broadcasting
messages!). If you don't know what bloom filters are, read up now on
Wikipedia. In short: they are a very efficient way to
probabilistically check whether a certain word is contained in a
vocabulary. A bloom filter knows no false negatives, but it does know
false positives.
235
The bloom filter that needs to be included has the parameters m=512
(bits in the filter), k=8 (nr of hash functions). The underlying hash
function is SipHash-2-4. We calculate two hash values for an input
string, one with the hash key b9660bf0467047c18875c49c54b9bd15 (this
is supposed to be read as a series of 16 hexadecimally formatted
bytes), and one with the hash key
aaa154a2e0714b39bfe1dd2e9fc54a3b. This results in two 64bit hash
values, A and B. The 8 hash functions for the bloom filter require a 9
bit output each (since m=512=2^9). To generate these we XOR combine
the first 8 bit of A shifted to the left by 1, with the first 8 bit of
B. Then, for the next hash function we use the second 8 bit pair, and
so on.
248
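The combination step can be sketched as follows. The sketch takes A
and B as inputs and does not implement SipHash-2-4 itself. Two things
are assumptions of this example and not taken from the text above:
that the "first 8 bit" are the least significant byte, and that bit n
of the filter lives in 64bit word n/64 at position n%64. Check
libsystemd-bus for the authoritative layout. All names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define BLOOM_BITS 512   /* m */
#define BLOOM_HASHES 8   /* k */

/* 9 bit output of hash function i (0..7): byte i of A shifted left
 * by one, XORed with byte i of B. Byte 0 is assumed to be the least
 * significant byte. */
unsigned bloom_index(uint64_t a, uint64_t b, unsigned i) {
        unsigned byte_a, byte_b;

        assert(i < BLOOM_HASHES);

        byte_a = (unsigned) ((a >> (8 * i)) & 0xFF);
        byte_b = (unsigned) ((b >> (8 * i)) & 0xFF);
        return ((byte_a << 1) ^ byte_b) & (BLOOM_BITS - 1);
}

/* Set the k bits derived from A and B in the filter. The word and
 * bit order used here is an assumption of this sketch. */
void bloom_add_hashes(uint64_t filter[BLOOM_BITS / 64], uint64_t a, uint64_t b) {
        for (unsigned i = 0; i < BLOOM_HASHES; i++) {
                unsigned bit = bloom_index(a, b, i);

                filter[bit / 64] |= UINT64_C(1) << (bit % 64);
        }
}
```
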
249For each message to send across the bus we populate the bloom filter
250with all possible matchable strings. If a client then wants to
251subscribe to messages of this type it simply tells the kernel to test
252its own calculated bit mask against the bloom filter of each message.
253
254More specifically the following strings are added to the bloom filter
255of each message that is broadcast:
256
257 The string "interface:" suffixed by the interface name
258
259 The string "member:" suffixed by the member name
260
261 The string "path:" suffixed by the path name
262
263 The string "path-slash-prefix:" suffixed with the path name, and
264 also all prefixes of the path name (cut off at "/"), also prefixed
265 with "path-slash-prefix".
266
267 The string "message-type:" suffixed with the strings "signal",
268 "method_call", "error" or "method_return" for the respective message
269 type of the message.
270
271 If the first argument of the message is a string, "arg0:" suffixed
272 with the first argument.
273
274 If the first argument of the message is a string, "arg0-dot-prefix"
275 suffixed with the first argument, and also all prefixes of the
276 argument (cut off at "."), also prefixed with "arg0-dot-prefix".
277
278 If the first argument of the message is a string,
279 "arg0-slash-prefix" suffixed with the first argument, and also all
280 prefixes of the argument (cut off at "/"), also prefixed with
281 "arg0-slash-prefix".
282
283 Similar for all further arguments that are strings up to 63, for the
284 arguments and their "dot" and "slash" prefixes. On the first
285 argument that is not a string addition to the bloom filter should be
286 stopped however.
287
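Generating the prefix strings from the list above is mostly string
chopping. The following sketch collects the full string plus all its
prefixes into an array. It is an illustration only: the function
name, the fixed buffer sizes, the ":" separator for all keys, and the
choice to stop before an empty prefix are assumptions of this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PREFIX_LEN 256

/* Write "<key>:<value>" into out[0], then the same string for each
 * prefix of value obtained by repeatedly cutting at the last `sep`.
 * Stops at `max` entries or before the prefix becomes empty.
 * Returns the number of strings written, 0 if the input is too long. */
size_t collect_prefixes(const char *key, const char *value, char sep,
                        char out[][PREFIX_LEN], size_t max) {
        char buf[PREFIX_LEN];
        size_t count = 0;
        char *v;
        int n;

        assert(key && value && out && sep != '\0');

        n = snprintf(buf, sizeof(buf), "%s:%s", key, value);
        if (n < 0 || (size_t) n >= sizeof(buf))
                return 0;

        v = buf + strlen(key) + 1; /* start of the value inside buf */
        while (count < max) {
                char *e;

                strcpy(out[count++], buf);
                e = strrchr(v, sep);
                if (!e || e == v)
                        break;
                *e = '\0';
        }
        return count;
}
```

Each string collected this way would then be hashed and added to the
bloom filter of the message.
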
(Note that the bloom filter does not contain sender nor receiver
names!)
290
When a client wants to subscribe to messages matching a certain
expression it should calculate the bloom mask following the same
algorithm. The kernel will then simply test the mask against the
attached bloom filters.
295
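The usual bloom filter test is "every bit set in the mask is also set
in the filter". A sketch of that test is shown here for clarity; that
the kernel uses exactly this comparison is an assumption of this
example, and bloom_matches() is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if all bits of `mask` are set in `filter`. Both
 * arrays consist of n_words 64bit words (8 words for m=512). */
bool bloom_matches(const uint64_t *filter, const uint64_t *mask, size_t n_words) {
        assert(filter && mask);

        for (size_t i = 0; i < n_words; i++)
                if ((filter[i] & mask[i]) != mask[i])
                        return false;

        return true;
}
```
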
Note that bloom filters are probabilistic, which means that clients
might get messages they did not expect. Your bus protocol
implementation must be capable of dealing with these unexpected
messages (which it needs to anyway, given that transfers are
relatively unrestricted on kdbus and people can send you all kinds of
nonsense).
302
303INSTALLING MATCHES
304
To install matches for broadcast messages use the KDBUS_CMD_ADD_MATCH
ioctl. It takes a structure that contains an encoded match expression,
and that is followed by one or more items, which are combined in an
AND way. (Meaning: a message is matched exactly when all items
attached to the original ioctl struct match).
310
311To match against other user messages add a KDBUS_ITEM_BLOOM item in
312the match (see above). Note that the bloom filter does not include
313matches to the sender names. To additionally check against sender
314names, use the KDBUS_ITEM_ID (for unique id matches) and
315KDBUS_ITEM_NAME (for well-known name matches) item types.
316
To match against kernel generated messages (see below) you should add
items of the same type as the kernel messages include,
i.e. KDBUS_ITEM_NAME_ADD, KDBUS_ITEM_NAME_REMOVE,
KDBUS_ITEM_NAME_CHANGE, KDBUS_ITEM_ID_ADD, KDBUS_ITEM_ID_REMOVE and
fill them out. Note however, that you have some wildcards in this
case, for example the .id field of KDBUS_ITEM_ID_ADD/KDBUS_ITEM_ID_REMOVE
structures may be set to 0 to match against any id addition/removal.
324
Note that dbus match strings do not map 1:1 to these ioctl() calls. In
many cases (where the match string is "underspecified") you might need
to issue up to six different ioctl() calls for the same match. For
example, the empty match (which matches against all messages), would
translate into one KDBUS_ITEM_BLOOM ioctl, one KDBUS_ITEM_NAME_ADD,
one KDBUS_ITEM_NAME_CHANGE, one KDBUS_ITEM_NAME_REMOVE, one
KDBUS_ITEM_ID_ADD and one KDBUS_ITEM_ID_REMOVE.
332
When creating a match you may attach a "cookie" value to it, which
is used for deleting the match again. The cookie can be selected freely
by the client. When issuing KDBUS_CMD_REMOVE_MATCH simply pass the
same cookie as before and all matches matching the same "cookie" value
will be removed. This is particularly handy for the case where multiple
ioctl()s are issued for a single match string.
339
340MEMFDS
341
342The "memfd" concept is used for zero-copy data transfers (see
343above). memfds are file descriptors to memory chunks of arbitrary
344sizes. If you have a memfd you can mmap() it to get access to the data
345it contains or write to it. They are comparable to file descriptors to
346unlinked files on a tmpfs, or to anonymous memory that one may refer
347to with an fd. They have one particular property: they can be
348"sealed". A memfd that is "sealed" is protected from alteration. Only
349memfds that are currently not mapped and to which a single fd refers
350may be sealed (they may also be unsealed in that case).
351
The concept of "sealing" makes memfds useful as a transport for kdbus
messages: only when the receiver knows that the message it received
cannot change while it is looking at it can it safely parse it
without having to copy it to a safe memory area. memfds can also
be reused in multiple messages. A sender may send the same memfd to
multiple peers, and since it is sealed it can rely on the receivers
not being able to modify it. "Sealing" hence provides both sides of
a transaction with the guarantee that the data stays constant and is
reusable.
361
memfds are a generic concept that can be used outside of the immediate
kdbus usecase. You can send them across AF_UNIX sockets too, sealed or
unsealed. In kdbus itself they can be used to send zero-copy
payloads, but may also be sent as normal fds.
366
memfds are allocated with the KDBUS_CMD_MEMFD_NEW ioctl. After
allocation simply memory map them and write to them. To set their size
use KDBUS_CMD_MEMFD_SIZE_SET. Note that memfds will be increased in
size automatically if you touch previously unallocated pages. However,
the size will only be increased in multiples of the page size in that
case. Thus, in almost all cases, an explicit KDBUS_CMD_MEMFD_SIZE_SET
is necessary, since it allows setting memfd sizes in finer
granularity. To seal a memfd use the KDBUS_CMD_MEMFD_SEAL_SET ioctl
call. It will only succeed if the caller has the only fd reference to
the memfd open, and if the memfd is currently unmapped.
377
memfds may be sent across kdbus via KDBUS_ITEM_PAYLOAD_MEMFD items
attached to messages. If this is done the data included in the memfd
is considered part of the payload stream of a message, and is treated
the same way as KDBUS_ITEM_PAYLOAD_VEC by the receiving side. It is
possible to interleave KDBUS_ITEM_PAYLOAD_MEMFD and
KDBUS_ITEM_PAYLOAD_VEC items freely; the reader will consider them
a single stream of bytes in the order these items appear in
the message, that just happens to be split up at various places
(regarding rules how they may be split up, see above). The kernel will
refuse taking KDBUS_ITEM_PAYLOAD_MEMFD items that refer to memfds that
are not sealed.
389
Note that sealed memfds may be unsealed again if they are not mapped
and you have the only fd reference to them.
392
As an alternative to sending memfds as KDBUS_ITEM_PAYLOAD_MEMFD items
(where they just form part of the payload stream of a message) you can
also simply attach their fds to a message using
KDBUS_ITEM_PAYLOAD_FDS. In this case the memfd contents are not
considered part of the payload stream of the message; the memfds are
simply fds like any other that happen to be attached to the message.
399
400MESSAGES FROM THE KERNEL
401
A couple of messages previously generated by the dbus1 bus driver are
now generated by the kernel. Since the kernel does not understand the
payload marshalling they are shipped in a different format
though. This is indicated with the "payload type" field of the
messages set to 0. Library implementations should take these messages
and synthesize traditional driver messages for them on reception.
408
409More specifically:
410
 Instead of the NameOwnerChanged, NameLost, NameAcquired signals
 kernel messages containing KDBUS_ITEM_NAME_ADD,
 KDBUS_ITEM_NAME_REMOVE, KDBUS_ITEM_NAME_CHANGE, KDBUS_ITEM_ID_ADD,
 KDBUS_ITEM_ID_REMOVE items are generated (each message will contain
 exactly one of these items). Note that in libsystemd-bus we have
 obsoleted NameLost/NameAcquired messages, since they are entirely
 redundant to NameOwnerChanged. This library will hence only
 synthesize NameOwnerChanged messages from these kernel messages,
 and never generate NameLost/NameAcquired. If your library needs to
 stay compatible to the old dbus1 userspace, you possibly might need
 to synthesize both a NameOwnerChanged and NameLost/NameAcquired
 message from the same kernel message.
423
 When a method call times out a KDBUS_ITEM_REPLY_TIMEOUT message is
 generated. This should be synthesized into a method error reply
 message to the original call.

 When a method call fails because the peer terminated the connection
 before responding a KDBUS_ITEM_REPLY_DEAD message is
 generated. Similarly, it should be synthesized into a method error
 reply message.
432
433For synthesized messages we recommend setting the cookie field to
434(uint32_t) -1 (and not (uint64_t) -1!), so that the cookie is not 0
435(which the dbus1 spec does not allow), but clearly recognizable as
436synthetic.
437
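The difference between the two casts is easy to miss, so here it is
spelled out in C. COOKIE_SYNTHETIC and cookie_is_synthetic() are
names invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Recommended cookie for synthesized messages: the lower 32 bits are
 * all set, the upper 32 bits are clear, i.e. 0x00000000FFFFFFFF.
 * Casting -1 directly to uint64_t would set all 64 bits instead. */
#define COOKIE_SYNTHETIC ((uint64_t) (uint32_t) -1)

bool cookie_is_synthetic(uint64_t cookie) {
        return cookie == COOKIE_SYNTHETIC;
}
```
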
Note that the KDBUS_ITEM_NAME_XYZ messages will actually inform you
about all kinds of names, including activatable ones. Classic dbus1
NameOwnerChanged messages OTOH are only generated when a name is
really acquired on the bus and not just simply activatable. This means
you must explicitly check for the case where an activatable name
becomes acquired or an acquired name is lost and returns to being
activatable.
445
446NAME REGISTRY
447
448To acquire names on the bus use the KDBUS_CMD_NAME_ACQUIRE ioctl(). It
449takes a flags field similar to dbus1's RequestName() bus driver call,
450however the NO_QUEUE flag got inverted into a QUEUE flag instead.
451
To release a previously acquired name use the KDBUS_CMD_NAME_RELEASE
ioctl().
454
To list acquired names use the KDBUS_CMD_CONN_INFO ioctl. It may be
used to list unique names, well known names as well as activatable
names and clients currently queueing for ownership of a well-known
name. The ioctl will return an offset into the memory pool. After
reading all the data you need, you have to release this via the
KDBUS_CMD_FREE ioctl(), similar to how you release a received message.
461
462CREDENTIALS
463
kdbus can optionally attach all kinds of metadata about the sender at
the point of time of sending ("credentials") to messages, on request
of the receiver. This is supported on both directed and undirected
(broadcast) messages. The metadata to attach is selected at time of
the HELLO ioctl of the receiver via a flags field (see above). Note
that clients must be able to handle messages that contain more
metadata than they asked for themselves; this simplifies the
implementation of broadcasting in the kernel. The receiver should not
rely on this data to be around though, even though it will be correct
if it happens to be attached. In order to avoid programming errors in
applications we'd recommend though not to pass this data on to clients
that did not explicitly ask for it.
476
477Credentials may also be queried for a well-known or unique name. Use
478the KDBUS_CMD_CONN_INFO for this. It will return an offset to the pool
479area again, which will contain the same credential items as messages
480have attached. Note that when issuing the ioctl you can select a
481different set of credentials to gather than was originally requested
482for being attached to incoming messages.
483
Credentials are always specific to the sender namespace that was
current at the time of sending, and of the process that opened the
bus connection at the time of opening it. Note that this latter data
is cached!
488
489POLICY
490
491The kernel enforces only very limited policy on names. It will not do
492access filtering by userspace payload, and thus not by interface or
493method name.
494
This ultimately means that most fine-grained policy enforcement needs
to be done by the receiving process. We recommend using PolicyKit for
any more complex checks. However, libraries should make simple static
policy decisions regarding privileged/unprivileged method calls
easy. We recommend doing this by enabling KDBUS_ATTACH_CAPS and
KDBUS_ATTACH_CREDS for incoming messages, and then discerning client
access by some capability or if sender and receiver UIDs match.
502
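Such a static check could look roughly like the sketch below. The
struct sender_creds and the function are hypothetical; they stand in
for whatever your library extracts from the credential and capability
items attached to a message. The capability is passed as its bit
number (for example 21 for CAP_SYS_ADMIN on Linux).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical, simplified view of the sender metadata of a message. */
struct sender_creds {
        uid_t uid;               /* sender UID */
        uint64_t effective_caps; /* effective capability set as bit mask */
};

/* Grant privileged access if the sender holds the given capability,
 * or if sender and receiver run under the same UID. */
bool sender_is_privileged(const struct sender_creds *c, uid_t own_uid, unsigned cap) {
        assert(c && cap < 64);

        if (c->effective_caps & (UINT64_C(1) << cap))
                return true;

        return c->uid == own_uid;
}
```
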
503BUS ADDRESSES
504
505When connecting to kdbus use the "kernel:" protocol prefix in DBus
506address strings. The device node path is encoded in its "path="
507parameter.
508
509Client libraries should use the following connection string when
510connecting to the system bus:
511
512 kernel:path=/dev/kdbus/0-system/bus;unix:path=/run/dbus/system_bus_socket
513
514This will ensure that kdbus is preferred over the legacy AF_UNIX
515socket, but compatibility is kept. For the user bus use:
516
 kernel:path=/dev/kdbus/$UID-user/bus;unix:path=$XDG_RUNTIME_DIR/bus

With $UID replaced by the caller's numeric user ID, and $XDG_RUNTIME_DIR
following the XDG basedir spec.
521
522Of course the $DBUS_SYSTEM_BUS_ADDRESS and $DBUS_SESSION_BUS_ADDRESS
523variables should still take precedence.
524
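Putting the precedence rule and the default string together, picking
the user bus address might be sketched like this. The function name is
made up, the caller is expected to pass in the value of
$DBUS_SESSION_BUS_ADDRESS (or NULL if unset), and the sketch assumes
the user bus node is named "$UID-user" as described in the
introduction.

```c
#include <assert.h>
#include <stdio.h>

/* Pick the user bus address: an explicitly configured address wins,
 * otherwise kdbus is tried first with the AF_UNIX socket as fallback.
 * Returns 0 on success, -1 if the buffer is too small.
 * (Hypothetical helper, for illustration only.) */
int user_bus_address(char *buf, size_t size, const char *env_address,
                     unsigned long uid, const char *runtime_dir) {
        int n;

        assert(buf);

        if (env_address && *env_address)
                n = snprintf(buf, size, "%s", env_address);
        else {
                assert(runtime_dir);
                n = snprintf(buf, size,
                             "kernel:path=/dev/kdbus/%lu-user/bus;unix:path=%s/bus",
                             uid, runtime_dir);
        }

        return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```
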
525DBUS SERVICE FILES
526
527Activatable services for kdbus may not use classic dbus1 service
528activation files. Instead, programs should drop in native systemd
529.service and .busname unit files, so that they are treated uniformly
530with other types of units and activation of the system.
531
532Note that this results in a major difference to classic dbus1:
533activatable bus names can be established at any time in the boot. This
534is unlike dbus1 where activatable names are unconditionally available
535as long as dbus-daemon is running. Being able to control when
536activatable names are established is essential to allow usage of kdbus
537during early boot and in initrds, without the risk of triggering
538services too early.
539
540DISCLAIMER
541
542This all is just the status quo. We are putting this together, because
543we are quite confident that further API changes will be smaller, but
544to make this very clear: this is all subject to change, still!
545
546We invite you to port over your favourite dbus library to this new
547scheme, but please be prepared to make minor changes when we still
548change these interfaces!