Implementing Opportunistic Encryption

Henry Spencer & D. Hugh Redelmeier

Version 4+, 15 Dec 2000



Updates

Major changes since last version: "Negotiation Issues" section discussing
some interoperability matters, plus some wording cleanup. Some issues
arising from discussions at OLS are not yet resolved, so there will almost
certainly be another version soon.

xxx incoming could be opportunistic or RW. xxx any way of saving unaware
implementations??? xxx compression needs mention.



Introduction

A major long-term goal of the FreeS/WAN project is opportunistic
encryption: a security gateway intercepts an outgoing packet aimed at a
new remote host, and quickly attempts to negotiate an IPsec tunnel to that
host's security gateway, so that traffic can be encrypted and
authenticated without changes to the host software. (This generalizes
trivially to the end-to-end case where host and security gateway are one
and the same.) If the attempt fails, the packet (or a retry thereof)
passes through in clear or is dropped, depending on local policy.
Prearranged tunnels bypass all this, so static VPNs can coexist with
opportunistic encryption.

xxx here Although significant intelligence about all this is necessary at the
initiator end, it's highly desirable for little or no special machinery
to be needed at the responder end. In particular, if none were needed,
then a security gateway which knows nothing about opportunistic encryption
could nevertheless participate in some opportunistic connections.

IPSEC gives us the low-level mechanisms, and the key-exchange machinery,
but there are some vague spots (to put it mildly) at higher levels.

One constraint which deserves comment is that the process of tunnel setup
should be quick. Moreover, the decision that no tunnel can be created
should also be quick, since that will be a common case, at least in the
beginning. People will be reluctant to use opportunistic encryption if it
causes gross startup delays on every connection, even connections which see
no benefit from it. Win or lose, the process must be rapid.

There's nothing much we can do to speed up the key exchange itself. (The
one thing which conceivably might be done is to use Aggressive Mode, which
involves fewer round trips, but it has limitations and possible security
problems, and we're reluctant to touch it.) What we can do is make the
other parts of the setup process as quick as possible. This desire will
come back to haunt us below. :-)

A further note is that we must consider the processing at the responder
end as well as the initiator end.

Several pieces of new machinery are needed to make this work. Here's a
brief list, with details considered below.

+ Outgoing Packet Interception. KLIPS needs to intercept packets which
likely would benefit from tunnel setup, and bring them to Pluto's
attention. There needs to be enough memory in the process that the same
tunnel doesn't get proposed too often (win or lose).

+ Smart Connection Management. Not only do we need to establish tunnels
on request, but once a tunnel is set up, it needs to be torn down eventually
if it's not in use. It's also highly desirable to detect the fact that it
has stopped working, and do something useful. Status changes should be
coordinated between the two security gateways unless one has crashed,
and even then, they should get back into sync eventually.

+ Security Gateway Discovery. Given a packet destination, we must decide
who to attempt to negotiate a tunnel with. This must be done quickly, win
or lose, and reliably even in the presence of diverse network setups.

+ Authentication Without Prearrangement. We need to be sure we're really
talking to the intended security gateway, without being able to prearrange
any shared information. He needs the same assurance about us.

+ More Flexible Policy. In particular, the responding Pluto needs a way
to figure out whether the connection it is being asked to make is okay.
This isn't as simple as just searching our existing conn database -- we
probably have to specify *classes* of legitimate connections.

Conveniently, we have a three-letter acronym for each of these. :-)

Note on philosophy: we have deliberately avoided providing six different
ways to do each step, in favor of specifying one good one. Choices are
provided only when they appear to be necessary. (Or when we are not yet
quite sure how best to do something...)



OPI, SCM

Smart Connection Management would be quite useful even by itself,
requiring manual triggering. (Right now, we do the manual triggering, but
not the other parts of SCM.) Outgoing Packet Interception fits together
with SCM quite well, and improves its usefulness further. Going through a
connection's life cycle from the start...

OPI itself is relatively straightforward, aside from the nagging question
of whether the intercepted packet is put on hold and then released, or
dropped. Putting it on hold is preferable; the alternative is to rely on
the application or the transport layer re-trying. The downside of packet
hold is extra resources; the downside of packet dropping is that IPSEC
knows *when* the packet can finally go out, and the higher layers don't.
Either way, life gets a little tricky because a quickly-retrying
application may try more than once before we know for sure whether a
tunnel can be set up, and something has to detect and filter out the
duplications. Some ARP implementations use the approach of keeping one
packet for an as-yet-unresolved address, and throwing away any more that
appear; that seems a reasonable choice.
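
To make the hold-one-packet idea concrete, here is a rough sketch in
Python (all names are illustrative, not an actual KLIPS/Pluto interface;
the real logic would live in the kernel):

    pending = {}  # (src, dst) -> the single packet held during negotiation

    def ask_pluto_to_negotiate(src, dst):
        print("would signal Pluto: tunnel", src, "->", dst)  # stand-in

    def release(packet):
        print("would release held packet", packet)  # stand-in

    def intercept(packet, src, dst):
        if (src, dst) in pending:
            return  # quickly-retrying application: discard the duplicate
        pending[(src, dst)] = packet  # hold exactly one packet, ARP-style
        ask_pluto_to_negotiate(src, dst)

    def negotiation_done(src, dst, tunnel_up):
        packet = pending.pop((src, dst), None)
        if packet is not None and tunnel_up:
            release(packet)  # it finally goes out, through the new tunnel
        # otherwise local policy decides: bypass (in clear) or block (drop)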

(Is it worth intercepting *incoming* packets, from the outside world, and
attempting tunnel setup based on them? Perhaps... if, and only if, we
organize AWP so that non-opportunistic SGs can do it somehow. Otherwise,
if the other end has not initiated tunnel setup itself, it will not be
prepared to do so at our request.)

Once a tunnel is up, packets going into it naturally are not intercepted
by OPI. However, we need to do something about the flip side of this too:
after deciding that we *cannot* set up a tunnel, either because we don't
have enough information or because the other security gateway is
uncooperative, we have to remember that for a while, so we don't keep
knocking on the same locked door. One plausible way of doing that is to
set up a bypass "tunnel" -- the equivalent of our current %passthrough
connection -- and have it managed like a real SCM tunnel (finite lifespan
etc.). This sounds a bit heavyweight, but in practice, the alternatives
all end up doing something very similar when examined closely. Note that
we need an extra variant of this, a block rather than a bypass, to cover
the case where local policy dictates that packets *not* be passed through;
we still have to remember the fact that we can't set up a real tunnel.
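
A sketch of the remember-the-failure bookkeeping (illustrative names;
the lifespans themselves are discussed further below):

    import time

    negative = {}  # dst -> ("bypass" or "block", expiry time)

    def remember_failure(dst, kind, lifespan):
        negative[dst] = (kind, time.time() + lifespan)

    def recent_failure(dst):
        entry = negative.get(dst)
        if entry is None or entry[1] < time.time():
            negative.pop(dst, None)  # expired or absent: worth trying again
            return None
        return entry[0]  # still fresh: don't knock on the same locked door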

When to tear tunnels down is a bit problematic, but if we're setting up a
potentially unbounded number of them, we have to tear them down *somehow*
*sometime*. It seems fairly obvious that we set a tentative lifespan,
probably fairly short (say 1min), and when it expires, we look to see if
the tunnel is still in use (say, has had traffic in the last half of the
lifespan). If so, we assign it a somewhat longer lifespan (say 10min),
after which we look again. If not, we close it down. (This lifespan is
independent of key lifetime; it is just the time when the tunnel's future
is next considered. This should happen reasonably frequently, unlike
rekeying, which is costly and shouldn't be too frequent.) Multi-step
backoff algorithms probably are not worth the trouble; looking every
10min doesn't seem onerous.
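
The decision at each lifespan end might look like this (a sketch; the
timestamp is the last-traffic one KLIPS already keeps, and the numbers
are the tentative ones above):

    import time

    INITIAL_LIFESPAN = 60    # tentative: say 1min
    RENEWED_LIFESPAN = 600   # somewhat longer: say 10min

    class Tunnel:
        def __init__(self):
            self.lifespan = INITIAL_LIFESPAN
            self.last_traffic = time.time()  # KLIPS's idle timestamp

    def at_lifespan_end(tunnel):
        # "still in use" = traffic in the last half of the expiring lifespan
        if time.time() - tunnel.last_traffic <= tunnel.lifespan / 2:
            tunnel.lifespan = RENEWED_LIFESPAN  # look again in 10min
            return "keep"
        return "close"  # idle: tear it down, coordinating with the peer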

For the tunnel-expiry decision, we need to know how long it has been since
the last traffic went through. A more detailed history of the traffic
does not seem very useful; a simple idle timer (or last-traffic timestamp)
is both necessary and sufficient. And KLIPS already has this.

As noted, default initial lifespan should be short. However, Pluto should
keep a history of recently-closed tunnels, to detect cases where a tunnel
is being repeatedly re-established and should be given a longer lifespan.
(Not only is tunnel setup costly, but it adds user-visible delay, so
keeping a tunnel alive is preferable if we have reason to suspect more
traffic soon.) Any tunnel re-established within 10min of dying should have
10min added to its initial lifespan. (Just leaving all tunnels open longer
is unappealing -- adaptive lifetimes which are sensitive to the behavior
of a particular tunnel are wanted. Tunnels are relatively cheap entities
for us, but that is not necessarily true of all implementations, and there
may also be administrative problems in sorting through large accumulations
of idle tunnels.)
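
And the history-based bonus, with the same caveats as the sketch above:

    import time

    INITIAL_LIFESPAN = 60  # as above
    recently_closed = {}   # (src, dst) -> when the previous tunnel died

    def starting_lifespan(src, dst):
        died = recently_closed.get((src, dst))
        if died is not None and time.time() - died <= 600:  # within 10min
            return INITIAL_LIFESPAN + 600  # add 10min: suspect more traffic
        return INITIAL_LIFESPAN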

It might be desirable to have detailed information about the initial
packet when determining lifespans. HTTP connections in particular are
notoriously bursty and repetitive.

Arguably it would be nice to monitor TCP connection status. A still-open
TCP connection is almost a guarantee that more traffic is coming, while
the closing of the only TCP connection through a tunnel is a good hint
that none is. But the monitoring is complex, and it doesn't seem worth
the trouble.

IKE connections likewise should be torn down when it appears the need has
passed. They should linger longer than the last tunnel they administer,
just in case they are needed again; the cost of retaining them is low. An
SG with only a modest number of them open might want to simply retain each
until rekeying time, with more aggressive management cutting in only when
the number gets large. (They should be torn down eventually, if only to
minimize the length of a status report, but rekeying is the only expensive
event for them.)

It's worth remembering that tunnels sometimes go down because the other
end crashes, or disconnects, or has a network link break, and we don't get
any notice of this in the general case. (Even in the event of a crash and
successful reboot, we won't hear about it unless the other end has
specific reason to talk IKE to us immediately.) Of course, we have to
guard against being too quick to respond to temporary network outages,
but it's not quite the same issue for us as for TCP, because we can tear
down and then re-establish a tunnel without any user-visible effect except
a pause in traffic. And if the other end does go down and come back up,
we and it can't communicate *at all* (except via IKE) until we tear down
our tunnel.

So... we need some kind of heartbeat mechanism. Currently there is none
in IKE, but there is discussion of changing that, and this seems like the
best approach. Doing a heartbeat at the IP level will not tell us about a
crash/reboot event, and sending heartbeat packets through tunnels has
various complications (they should stop at the far mouth of the tunnel
instead of going on to a subnet; they should not count against idle
timers; etc.). Heartbeat exchanges obviously should be done only when
there are tunnels established *and* there has been no recent incoming
traffic through them. It seems reasonable to do them at lifespan ends,
subject to appropriate rate limiting when more than one tunnel goes to the
same other SG. When all traffic between the two ends is supposed to go
via the tunnel, it might be reasonable to do a heartbeat -- subject to a
rate limiter to avoid DOS attacks -- if the kernel sees a non-tunnel
non-IKE packet from the other end.

If a heartbeat gets no response, try a few (say 3) pings to check IP
connectivity; if one comes back, try another heartbeat; if it gets no
response, the other end has rebooted, or otherwise been re-initialized,
and its tunnels should be torn down. If there's no response to the pings,
note the fact and try the sequence again at the next lifespan end; if
there's nothing then either, declare the tunnels dead.
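
As a sketch, with stand-in heartbeat/ping primitives (remember, IKE has
no heartbeat yet, so both are placeholders):

    class Peer:
        failed_before = False  # no response at the previous lifespan end?

    def heartbeat(peer):
        return False  # stand-in for an IKE heartbeat exchange

    def ping(peer):
        return False  # stand-in for an ICMP echo round trip

    def check_peer(peer):
        if heartbeat(peer):
            return "alive"
        if any(ping(peer) for _ in range(3)):  # IP connectivity exists...
            if heartbeat(peer):
                return "alive"
            return "rebooted"  # ...but IKE state is gone: tear down tunnels
        if peer.failed_before:
            return "dead"      # nothing this time either: tunnels are dead
        peer.failed_before = True
        return "retry"         # note the fact; try again next lifespan end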

Finally... except in cases where we've decided that the other end is dead
or has rebooted, tunnel teardown should always be coordinated with the
other end. This means interpreting and sending Delete notifications, and
also Initial-Contacts. Receiving a Delete for the other party's tunnel
SAs should lead us to tear down our end too -- SAs (SA bundles, really)
need to be considered as paired bidirectional entities, even though the
low-level protocols don't think of them that way.



SGD, AWP

Given a packet destination, how do we decide who to (attempt to) negotiate
a tunnel with? And as a related issue, how do the negotiating parties
authenticate each other? DNSSEC obviously provides the tools for the
latter, but how exactly do we use them?

Having intercepted a packet, what we know is basically the IP addresses of
source and destination (plus, in principle, some information about the
desired communication, like protocol and port). We might be able to map
the source address to more information about the source, depending on how
well we control our local networks, but we know nothing further about the
destination.

The obvious first thing to do is a DNS reverse lookup on the destination
address; that's about all we can do with available data. Ideally, we'd
like to get all necessary information with this one DNS lookup, because
DNS lookups are time-consuming -- all the more so if they involve a DNSSEC
signature-checking treewalk by the name server -- and we've got to hurry.
While it is unusual for a reverse lookup to yield records other than PTR
records (or possibly CNAME records, for RFC 2317 classless delegation),
there's no reason why it can't.

(For purposes like logging, a reverse lookup is usually followed by a
forward lookup, to verify that the reverse lookup wasn't lying about the
host name. For our purposes, this is not vital, since we use stronger
authentication methods anyway.)

While we want to get as much data as possible (ideally all of it) from one
lookup, it is useful to first consider how the necessary information would
be obtained if DNS lookups were instantaneous. Two pieces of information
are absolutely vital at this point: the IP address of the other end's
security gateway, and the SG's public key*.

(* Actually, knowledge of the key can be postponed slightly -- it's not
needed until the second exchange of the negotiations, while we can't even
start negotiations without knowing the IP address. The SG is not
necessarily on the plain-IP route to the destination, especially when
multiple SGs are present.)

Given instantaneous DNS lookups, we would:

+ Start with a reverse lookup to turn the address into a name.

+ Look for something like RFC-2782 SRV records using the name, to find out
who provides this particular service. If none comes back, we can abandon
the whole process.

+ Select one SRV record, which gives us the name of a target host (plus
possibly one or more addresses, if the name server has supplied address
records as Additional Data for the SRV records -- this is recommended
behavior but is not required).

+ Use the target name to look up a suitable KEY record, and also address
record(s) if they are still needed.

This gives us the desired address(es) and key. However, it requires three
lookups, and we don't even find out whether there's any point in trying
until after the second.
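
For illustration, the three lookups might go as follows using the Python
dnspython package ("_ipsec._udp" is an invented service name, since no
SRV name has been fixed, and find_sg is a hypothetical helper):

    import dns.resolver, dns.reversename

    def find_sg(dst_ip):
        # 1: reverse lookup to turn the address into a name
        rev = dns.reversename.from_address(dst_ip)
        name = dns.resolver.resolve(rev, "PTR")[0].target
        # 2: SRV lookup to find who provides the service; pick by priority
        srvs = dns.resolver.resolve("_ipsec._udp." + str(name), "SRV")
        target = min(srvs, key=lambda r: r.priority).target
        # 3: KEY (and, if not supplied as Additional Data, address) lookup
        key = dns.resolver.resolve(target, "KEY")[0]
        addr = dns.resolver.resolve(target, "A")[0].address
        return addr, key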

With real DNS lookups, which are far from instantaneous, some optimization
is needed. At the very least, typical cases should need fewer lookups.

So when we do the reverse lookup on the IP address, instead of asking for
PTR, we ask for TXT. If we get none, we abandon opportunistic
negotiation, and set up a bypass/block with a relatively long life (say
6hr) because it's not worth trying again soon. (Note, there needs to be a
way to manually force an early retry -- say, by just clearing out all
memory of a particular address -- to cover cases where a configuration
error is discovered and fixed.)

xxx need to discuss multi-string TXTs

In the results, we look for at least one TXT record with content
"X-IPsec-Server(nnn)=a.b.c.d kkk", following RFC 1464 attribute/value
notation. (The "X-" indicates that this is tentative and experimental;
this design will probably need modification after initial experiments.)
Again, if there is no such record, we abandon opportunistic negotiation.

311"nnn" and the parentheses surrounding it are optional. If present, it
312specifies a priority (low number high priority), as for MX records, to
313control the order in which multiple servers are tried. If there are no
314priorities, or there are ties, pick one randomly.
315
316"a.b.c.d" is the dotted-decimal IP address of the SG. (Suitable extensions
317for IPv6, when the time comes, are straightforward.)
318
319"kkk" is either an RSA-MD5 public key in base-64 notation, as in the text
320form of an RFC 2535 KEY record, or "@hhh". In the latter case, hhh is a
321DNS name, under which one Host/Authentication/IPSEC/RSA-MD5 KEY record is
322present, giving the server's authentication key. (The delay of the extra
323lookup is undesirable, but practical issues of key management may make it
324advisable not to duplicate the key itself in DNS entries for many
325clients.)
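
A sketch of parsing that content, using only a regular expression (the
function name is illustrative):

    import re

    PATTERN = re.compile(
        r'X-IPsec-Server(?:\((\d+)\))?=(\d+\.\d+\.\d+\.\d+)\s+(\S+)')

    def parse_server_txt(txt):
        m = PATTERN.match(txt)
        if m is None:
            return None                    # not an X-IPsec-Server record
        prio = int(m.group(1)) if m.group(1) else None  # MX-style priority
        sg_addr = m.group(2)               # the SG's dotted-decimal address
        key = m.group(3)                   # base-64 RSA key, or "@hhh"
        lookup = key[1:] if key.startswith("@") else None  # hhh, if any
        return prio, sg_addr, key, lookup

For example, "X-IPsec-Server(10)=192.0.2.66 AQNJjkKl..." yields priority
10, gateway 192.0.2.66, and a directly supplied key.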

It unfortunately does appear that the authentication key has to be
associated with the server, not the client behind it. At the time when
the responder has to authenticate our SG, it does not know which of its
clients we are interested in (i.e., which key to use), and there is no
good way to tell it. (There are some bad ways; this decision may merit
re-examination after experimental use.)

The responder authenticates our SG by doing a reverse lookup on its IP
address to get a Host/Authentication/IPSEC/RSA-MD5 KEY record. He can
attempt this in parallel with the early parts of the negotiation (since he
knows our SG IP address from the first negotiation packet), at the risk of
having to abandon the attempt and do a different lookup if we use
something different as our ID (see below). Unfortunately, he doesn't yet
know what client we will claim to represent, so he'll need to do another
lookup as part of phase 2 negotiation (unless the client *is* our SG), to
confirm that the client has a TXT X-IPsec-Server record pointing to our
SG. (Checking that the record specifies the same key is not important,
since the responder already has a trustworthy key for our SG.)

Also unfortunately, opportunistic tunnels can only have degenerate subnets
(/32 subnets, containing one host) at their ends. It's superficially
attractive to negotiate broader connections... but without prearrangement,
you don't know whether you can trust the other end's claim to have a
specific subnet behind it. Fixing this would require a way to do a
reverse lookup on the *subnet* (you cannot trust information in DNS
records for a name or a single address, which may be controlled by people
who do not control the whole subnet) with both the address and the mask
included in the name. Except in the special case of a subnet masked on a
byte boundary (in which case RFC 1035's convention of an incomplete
in-addr.arpa name could be used), this would need extensions to the
reverse-map name space, which is awkward, especially in the presence of
RFC 2317 delegation. (IPv6 delegation is more flexible and it might be
easier there.)

There is a question of what ID should be used in later steps of
negotiation. However, the desire not to put more DNS lookups in the
critical path suggests avoiding the extra complication of varied IDs,
except in the Road Warrior case (where an extra lookup is inevitable).
Also, figuring out what such IDs *mean* gets messy. To keep things simple,
except in the RW case, all IDs should be IP addresses identical to those
used in the packet headers.

For Road Warrior, the RW must be the initiator, since the home-base SG has
no idea what address the RW will appear at. Moreover, in general the RW
does not control the DNS entries for his address. This inherently denies
the home base any authentication of the RW's IP address; the most it can
do is to verify an identity he provides, and perhaps decide whether it
wishes to talk to someone with that identity, but this does not verify his
right to use that IP address -- nothing can, really.

(That may sound like it would permit some man-in-the-middle attacks, but
the RW can still do full authentication of the home base, so a man in the
middle cannot successfully impersonate home base. Furthermore, a man in
the middle must impersonate both sides for the DH exchange to work. So
either way, the IKE negotiation falls apart.)

A Road Warrior provides an FQDN ID, used for a forward lookup to obtain a
Host/Authentication/IPSEC/RSA-MD5 KEY record. (Note, an FQDN need not
actually correspond to a host -- e.g., the DNS data for it need not
include an A record.) This suffices, since the RW is the initiator and
the responder knows his address from his first packet.

Certain situations where a host has a more-or-less permanent IP address,
but does not control its DNS entries, must be treated essentially like
Road Warrior. It is unfortunate that DNS's old inverse-query feature
cannot be used (nonrecursively) to ask the initiator's local DNS server
whether it has a name for the address, because the address will almost
always have been obtained from a DNS name lookup, and it might be a lookup
of a name whose DNS entries the host *does* control. (Real examples of
this exist: the host has a preferred name whose host-controlled entry
includes an A record, but a reverse lookup on the address sends you to an
ISP-controlled name whose entry has an A record but not much else.) Alas,
inverse query is long obsolete and is not widely implemented now.

There are some questions in failure cases. If we cannot acquire the info
needed to set up a tunnel, this is the no-tunnel-possible case. If we
reach an SG but negotiation fails, this too is the no-tunnel-possible
case, with a relatively long bypass/block lifespan (say 1hr) since
fruitless negotiations are expensive. (In the multiple-SG case, it seems
unlikely to be worthwhile to try other SGs just in case one of them might
have a configuration permitting successful negotiation.)

Finally, there is a sticky problem with timeouts. If the other SG is down
or otherwise inaccessible, in the worst case we won't hear about this
except by not getting responses. Some other, more pathological or even
evil, failure cases can have the same result. The problem is that in the
case where a bypass is permitted, we want to decide whether a tunnel is
possible quickly. It gets even worse if there are multiple SGs, in which
case conceivably we might want to try them all (since some SGs being up
when others are down is much more likely than SGs differing in policy).

The patience setting needs to be configurable policy, with a reasonable
default (to be determined by experiment). If it expires, we simply have
to declare the attempt a failure, and set up a bypass/block. (Setting up
a tentative bypass/block, and replacing it with a real tunnel if remaining
attempts do produce one, looks attractive at first glance... but exposing
the first few seconds of a connection is often almost as bad as exposing
the whole thing!) Such a bypass/block should have a short lifespan, say
10min, because the SG(s) might be only temporarily unavailable.
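
Gathering the tentative negative-result lifespans proposed in this
document into one place (all values in seconds, all subject to
experiment):

    NO_DNS_INFO = 6 * 3600     # no TXT record: not worth retrying soon (6hr)
    NEGOTIATION_FAILED = 3600  # SG reached but negotiation failed (1hr)
    SG_UNRESPONSIVE = 600      # timed out: SG may be back shortly (10min)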

The flip side of IKE waiting for a timeout is that all other forms of
feedback, e.g. "host not reachable", should be *ignored*, because you
cannot trust them! This may need kernel changes.

Can AWP be done by non-opportunistic SGs? Probably not; existing SG
implementations generally aren't prepared to do anything suitable, except
perhaps via the messy business of certificates. There is one borderline
exception: some implementations rely on LDAP for at least some of their
information fetching, and it might be possible to substitute a custom LDAP
server which does the right things for them. Feasibility of this depends
on details, which we don't know well enough.

[This could do with a full example, a complete packet by packet walkthrough
including all DNS and IKE traffic.]



MFP

Our current conn database simply isn't flexible enough to cover all this
properly. In particular, the responding Pluto needs a way to figure out
whether the connection it is being asked to make is legitimate.

This is more subtle than it sounds, given the problem noted earlier, that
there's no clear way to authenticate claims to represent a non-degenerate
subnet. Our database has to be able to say "a connection to any host in
this subnet is okay" or "a connection to any subnet within this subnet is
okay", rather than "a connection to exactly this subnet is okay". (There
is some analogy to the Road Warrior case here, which may be relevant.)
This will require at least a re-interpretation of ipsec.conf.

Interim stages of implementation of this will require a bit of thought.
Notably, we need some way of dealing with the lack of fully signed DNSSEC
records. Without user interaction, probably the best we can do is to
remember the results of old fetches, compare them to the results of new
fetches, and complain and disbelieve all of it if there's a mismatch.
This does mean that somebody who gets fake data into our very first fetch
will fool us, at least for a while, but that seems an acceptable tradeoff.
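
That is essentially trust-on-first-use; a sketch (illustrative names,
and real storage would of course have to be persistent):

    remembered = {}  # query -> results from the first (trusted) fetch

    def complain(query):
        print("DNS data for", query, "changed; disbelieving all of it")

    def checked_fetch(query, fresh_results):
        old = remembered.get(query)
        if old is None:
            remembered[query] = fresh_results  # first fetch: trust it
            return fresh_results
        if old != fresh_results:
            complain(query)                    # mismatch: trust neither
            return None
        return fresh_results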



Negotiation Issues

There are various options which are nominally open to negotiation as part
of setup, but which have to be nailed down at least well enough that
opportunistic SGs can reliably interoperate. Somewhat arbitrarily and
tentatively, opportunistic SGs must support Main Mode, Oakley group 5 for
D-H, 3DES encryption and MD5 authentication for both ISAKMP and IPsec SAs,
RSA digital-signature authentication with keys between 2048 and 8192 bits,
and ESP doing both encryption and authentication. They must do key PFS
in Quick Mode, but not identity PFS.
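
In ipsec.conf terms, that baseline would look roughly like this (an
untested sketch; the conn name is arbitrary, and Oakley group 5 is the
1536-bit MODP group):

    conn opportunistic-baseline
        keyexchange=ike
        ike=3des-md5-modp1536   # Main Mode; 3DES, MD5, Oakley group 5
        esp=3des-md5            # ESP doing both encryption and auth
        pfs=yes                 # key PFS in Quick Mode
        authby=rsasig           # RSA digital-signature authentication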



What we need from DNS

Fortunately, we don't need any new record types or suchlike to make this
all work. We do, however, need attention to a couple of areas in DNS
implementation.

First, size limits. Although the information we directly need from a
lookup is not enormous -- the only potentially-big item is the KEY record,
and there should be only one of those -- there is still a problem with
DNSSEC authentication signatures. With a 2048-bit key and assorted
supporting information, we will fill most of a 512-byte DNS UDP packet...
and if the data is to have DNSSEC authentication, at least one quite large
SIG record will come too. Plus maybe a TSIG signature on the whole
response, to authenticate it to our resolver. So: DNSSEC-capable name
servers must fix the 512-byte UDP limit. We're told there are provisions
for this (notably EDNS0, RFC 2671); implementation of them is mandatory.

Second, interface. It is unclear how the resolver interface will let us
ask for DNSSEC authentication. We would prefer to ask for "authentication
where possible", and get back the data with each item flagged by whether
authentication was available (and successful!) or not available. Having
to ask separately for authenticated and non-authenticated data would
probably be acceptable, *provided* both will be cached on the first
request, so the two requests incur only one set of (non-local) network
traffic. Either way, we want to see the name server and resolver do this
for us; that makes sense in any case, since it's important that
verification be done somewhere where it can be cached, the more centrally
the better.

Finally, a wistful note: the ability to do a limited form of inverse
queries (an almost forgotten feature), to ask the local name server which
hostname it recently mapped to a particular address, would be quite
helpful. Note, this is *NOT* the same as a reverse lookup, and crude
fakes like putting a dotted-decimal address in brackets do not suffice.