1.DA "3 May 2001"
2.ds LH "
3.ds CH "Opportunistic Encryption
4.ds RH "
5.ds LF "Draft 4+
6.ds CF "\\*(DY
7.ds RF %
8.de P
9.LP
10..
11.de R
12.LP
13\fBRationale:\fR
14..
15.de A
16.LP
17\fBAhem:\fR
18..
19.TL
20Opportunistic Encryption
21.AU
22Henry Spencer
23D. Hugh Redelmeier
24.AI
25henry@spsystems.net
26hugh@mimosa.com
27Linux FreeS/WAN Project
28.AB no
29xxx cases where reverses not controlled, all possibilities.
30xxx DHR suggests okay if gateway doesn't control reverse but destination does.
31xxx level of patience where Responder just doesn't answer the phone.
32xxx IKE finger to get basic keying info, to be confirmed via DNSSEC?
33xxx packets from some OE connections might get special status,
34if the other end is definitely someone we trust.
35Opportunistic encryption permits secure (encrypted, authenticated)
36communication via IPsec without connection-by-connection prearrangement,
37either explicitly between hosts (when the hosts are capable of it) or
38transparently via packet-intercepting security gateways.
39It uses DNS records (authenticated with DNSSEC) to provide
40the necessary information for gateway discovery and gateway authentication,
41and constrains negotiation enough to guarantee success.
42.sp
43Substantive changes since draft 3:
44write off inverse queries as a lost cause;
45use Invalid-SPI rather than Delete as notification of unknown SA;
46minor wording improvements and clarifications.
47This document takes over from the older ``Implementing Opportunistic
48Encryption'' document.
49.AE
50.NH 1
51Introduction
52.P
53A major goal of the FreeS/WAN project is opportunistic encryption:
54a (security) gateway intercepts an outgoing packet aimed at a
55remote host, and quickly attempts to negotiate an IPsec tunnel to that
56host's security gateway.
57If the attempt succeeds, traffic can then be secure,
58transparently (without changes to the host software).
59If the attempt fails,
60the packet (or a retry thereof) passes through in clear or is dropped,
61depending on local policy.
62Prearranged tunnels bypass the packet interception etc., so static VPNs
63can coexist with opportunistic encryption.
64.P
65This generalizes trivially to the end-to-end case:
66host and security gateway simply are one and the same.
67Some optimizations are possible in that case,
68but the basic scheme need not change.
69.P
70The objectives for security systems need to be explicitly stated.
71Opportunistic encryption is meant to achieve secure communication,
72without prearrangement of the individual connection
73(although some prearrangement on a per-host basis is required),
74between any two hosts which implement the protocol
75(and, if they act as security gateways,
76between hosts behind them).
77Here ``secure'' means strong encryption and authentication of packets,
78with authentication of participants\(emto prevent man-in-the-middle
79and impersonation attacks\(emdependent on several factors.
80The biggest factor is the authentication of DNS records,
81via DNSSEC or equivalent means.
82A lesser factor is which exact variant
83of the setup procedure (see section 2.2) is used,
84because there is a tradeoff between strong authentication of the other end
85and ability
86to negotiate opportunistic encryption with hosts which have limited
87or no control of their reverse-map DNS records:
88without reverse-map information,
89we can verify that the host has the right to use a particular FQDN
90(Fully Qualified Domain Name),
91but not whether that FQDN is authorized to use that IP address.
92Local policy must decide whether authentication
93or connectivity has higher priority.
94.P
95Apart from careful attention to detail in various areas,
96there are three crucial design problems for opportunistic encryption.
97It needs a way to quickly identify the remote host's security gateway.
98It needs a way to quickly obtain an authentication key for the
99security gateway.
100And the numerous options which can be specified with IKE
101must be constrained sufficiently that two independent implementations are
102guaranteed to reach agreement,
103without any explicit prearrangement or preliminary negotiation.
104The first two problems are solved using DNS,
105with DNSSEC ensuring that the data obtained is reliable;
106the third is solved by specifying a minimum standard which must be supported.
107.P
108A note on philosophy:
109we have deliberately avoided providing six different
110ways to do each job, in favor of specifying one good one.
111Choices are
112provided only when they appear to be necessary,
113or at least important.
114.P
115A note on terminology:
116to avoid constant circumlocutions,
117an ISAKMP/IKE SA, possibly recreated occasionally by rekeying,
118will be referred to as a ``keying channel'',
119and a set of IPsec SAs providing bidirectional communication between
120two IPsec hosts,
121possibly recreated occasionally by rekeying,
122will be referred to as a ``tunnel''
123(it could conceivably use transport mode in the host-to-host case,
124but we advocate using tunnel mode even there).
125The word ``connection'' is here used in a more generic sense.
126The word ``lifetime'' will be avoided in favor of ``rekeying interval'',
127since many of the connections will have useful lives far shorter
128than any reasonable rekeying interval,
129and hence the two concepts must be separated.
130.P
131A note on document structure:
132Discussions of \fIwhy\fR things were done a particular way,
133or not done a particular way,
134are broken out in paragraphs headed ``Rationale:''
135(to preserve the flow of the text, many such paragraphs are deferred
136to the ends of sections).
137Paragraphs headed ``Ahem:'' are discussions of where the problem is being
138made significantly harder by problems elsewhere,
139and how that might be corrected.
140Some meta-comments are enclosed in [].
141.R
142The motive is to get the Internet encrypted.
143That requires encryption without connection-by-connection prearrangement:
144a system must be able to
145reliably negotiate an encrypted, authenticated
146connection with a total stranger.
147While end-to-end encryption is preferable,
148doing opportunistic encryption in security gateways
149gives enormous leverage for quick deployment of this technology,
150in a world where end-host software is often primitive, rigid, and outdated.
151.R
152Speed is of the essence in tunnel setup:
153a connection-establishment delay longer than about 10 seconds
154begins to cause problems for users and applications.
155Thus the emphasis on rapidity in gateway discovery and key fetching.
156.A
157Host-to-host opportunistic encryption
158would be utterly trivial if a fast public-key
159encryption/signature
160algorithm were available.
161You would do a reverse lookup on the destination address to obtain a
162public key for that address,
163and simply encrypt all packets going to it with that key,
164signing them with your own private key.
165Alas, this is impractical with current CPU speeds and current algorithms
166(although as noted later, it might be of some use for limited purposes).
167Nevertheless, it is a useful model.
168.NH 1
169Connection Setup
170.P
171For purposes of discussion, the network is taken to look like this:
172.DS
173Source----Initiator----...----Responder----Destination
174.DE
175The intercepted packet comes from the Source,
176bound for the Destination,
177and is intercepted at the Initiator.
178The Initiator communicates over the insecure Internet to the Responder.
179The Source and the Initiator might be the same host,
180or the Source might be an end-user host and the Initiator a
181security gateway (SG).
182Likewise for the Responder and the Destination.
183.P
184Given an intercepted packet,
185whose useful information (for our purposes)
186is essentially only the Destination's IP address,
187the Initiator
188must quickly determine the Responder (the Destination's SG) and
189fetch everything needed to authenticate it.
190The Responder must do likewise for the Initiator.
191Both must eventually also confirm that the other is authorized to act
192on behalf of the client host behind it (if any).
193.P
194An important subtlety here is that if the alternative to an IPsec tunnel
195is plaintext transmission, negative results must be obtained quickly.
196That is,
197the decision that \fIno\fR tunnel can be established must also be made rapidly.
198.NH 2
199Packet Interception
200.P
201Interception of outgoing packets is relatively straightforward
202in principle.
203It is preferable to put the intercepted packet on hold rather than
204dropping it, since higher-level retries are not necessarily well-timed.
205There is a problem of hosts and applications retrying during negotiations.
206ARP implementations, which face the same problem,
207use the approach of keeping the \fImost recent\fR
208packet for an as-yet-unresolved address,
209and throwing away older ones.
210(Incrementing of request numbers etc. means that replies to older ones may no
211longer be accepted.)
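.P
The ARP-style holding policy can be sketched as follows.
This is a minimal illustration, not part of the specification:
the class name HoldQueue, its methods, and the use of address strings as keys
are all assumptions made for the example.

```python
class HoldQueue:
    """Holds at most one pending packet per as-yet-unresolved destination."""

    def __init__(self):
        # destination address -> most recent intercepted packet
        self._pending = {}

    def hold(self, dest, packet):
        """Keep `packet` for `dest`; return the older packet it displaces, if any."""
        older = self._pending.get(dest)
        self._pending[dest] = packet
        return older

    def release(self, dest):
        """Negotiation is over: hand back the held packet, or None if there is none."""
        return self._pending.pop(dest, None)


queue = HoldQueue()
queue.hold("192.0.2.7", b"request #1")
queue.hold("192.0.2.7", b"request #2 (retry)")
```

Only the retry survives; the displaced packet is returned to the caller so that
it can be discarded or counted.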
212.P
213Is it worth intercepting \fIincoming\fR packets, from the outside world, and
214attempting tunnel setup based on them?
215No, unless and until a way can be devised to initiate opportunistic encryption
216to a non-opportunistic responder,
217because
218if the other end has not initiated tunnel setup itself, it will not be
219prepared to do so at our request.
220.R
221Note, however, that most incoming packets will promptly be followed by
222an outgoing packet in response!
223Conceivably it might be useful to start early stages of negotiation,
224at least as far as looking up information,
225in response to an incoming packet.
226.R
227If a plaintext incoming packet indicates that the other
228end is not prepared to do opportunistic encryption,
229it might seem that this fact should be noted, to
230avoid consuming resources and delaying
231traffic in an attempt at opportunistic setup which is doomed to fail.
232However, this would be a major security hole,
233since the plaintext packet is not authenticated;
234see section 2.5.
235.NH 2
236Algorithm
237.P
238For clarity,
239the following defers most discussion of error handling to the end.
240.nr x \w'Step 3A.'u+1n
241.de S
242.IP "Step \\$1." \nxu
243..
244.S 1
245Initiator does a DNS reverse lookup on the Destination address,
246asking not for the usual PTR records,
247but for TXT records.
248Meanwhile, Initiator also sends a ping to the Destination,
249to cause any other dynamic setup actions to start happening.
250(Ping replies are disregarded;
251the host might not be reachable with plaintext pings.)
252.S 2A
253If at least one suitable TXT record (see section 2.3) comes back,
254each contains a potential Responder's IP address
255and that Responder's public key (or where to find it).
256Initiator picks one TXT record, based on priority (see 2.3),
257thus picking a Responder.
258If there was no public key in the TXT record,
259the Initiator also starts a DNS lookup (as specified by the TXT record)
260to get KEY records.
261.S 2B
262If no suitable TXT record is available,
263and policy permits,
264Initiator designates the Destination itself as the Responder
265(see section 2.4).
266If policy does not permit,
267or the Destination is unresponsive to the negotiation,
268then opportunistic encryption is not possible,
269and Initiator gives up (see section 2.5).
270.S 3
271If there already is a keying channel to the Responder's IP address,
272the Initiator uses the existing keying channel;
273skip to step 10.
274Otherwise, the Initiator starts an IKE Phase 1 negotiation
275(see section 2.7 for details)
276with the Responder.
277The address family of the Responder's IP address dictates whether
278the keying channel and the outside of the tunnel should be IPv4 or IPv6.
279.S 4
280Responder gets the first IKE message,
281and responds.
282It also starts a DNS reverse lookup on the Initiator's IP address,
283for KEY records, on speculation.
284.S 5
285Initiator gets Responder's reply,
286and sends first message of IKE's D-H exchange (see 2.4).
287.S 6
288Responder gets Initiator's D-H message,
289and responds with a matching one.
290.S 7
291Initiator gets Responder's D-H message;
292encryption is now established, authentication remains to be done.
293Initiator sends IKE authentication message,
294with an FQDN identity if a reverse lookup on its address will not yield a
295suitable KEY record.
296(Note, an FQDN need not
297actually correspond to a host\(eme.g., the DNS data for it need not
298include an A record.)
299.S 8
300Responder gets Initiator's authentication message.
301If there is no identity included,
302Responder waits for step 4's speculative DNS lookup to finish;
303it should yield a suitable KEY record (see 2.3).
304If there is an FQDN identity,
305Responder discards any data obtained from step 4's DNS lookup;
306does a forward lookup on the FQDN, for a KEY record;
307waits for that lookup to return;
308it should yield a suitable KEY record.
309Either way, Responder uses the KEY data to verify the message's hash.
310Responder replies with an authentication message,
311with an FQDN identity if a reverse lookup on its address will not yield a
312suitable KEY record.
313.S 9A
314(If step 2A was used.)
315The Initiator gets the Responder's authentication message.
316Step 2A has provided a key (from the TXT record or via DNS lookup).
317Verify message's hash.
318Encrypted and authenticated keying channel established,
319man-in-middle attack precluded.
320.S 9B
321(If step 2B was used.)
322The Initiator gets the Responder's authentication message,
323which must contain an FQDN identity (if the Responder can't put a TXT in his
324reverse map he presumably can't do a KEY either).
325Do forward lookup on the FQDN,
326get suitable KEY record, verify hash.
327Encrypted keying channel established,
328man-in-middle attack precluded,
329but authentication weak (see 2.4).
330.S 10
331Initiator initiates IKE Phase 2 negotiation (see 2.7) to establish tunnel,
332specifying Source and Destination identities as IP addresses (see 2.6).
333The address family of those addresses also determines whether the inside
334of the tunnel should be IPv4 or IPv6.
335.S 11
336Responder gets first Phase 2 message.
337Now the Responder finally knows what's going on!
338Unless the specified Source is identical to the Initiator,
339Responder initiates DNS reverse lookup on Source IP address,
340for TXT records;
341waits for result;
342gets suitable TXT record(s) (see 2.3),
343which should contain either the Initiator's IP address
344or an FQDN identity identical to that supplied by the Initiator in step 7.
345This verifies that the Initiator is authorized
346to act as SG for the Source.
347Responder replies with second Phase 2 message,
348selecting acceptable details (see 2.7),
349and establishes tunnel.
350.S 12
351Initiator gets second Phase 2 message,
352establishes tunnel (if he didn't already),
353and releases the intercepted packet into it, finally.
354.S 13
355Communication proceeds.
356See section 3 for what happens later.
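.P
The reverse lookups in steps 1, 4, and 11 query the usual reverse-map name
for an address, merely asking for TXT or KEY records instead of PTR.
As a small sketch, the name can be derived with Python's standard ipaddress
module; the function name reverse_name is an assumption of the example,
and the DNS query itself is left out.

```python
import ipaddress


def reverse_name(address):
    """Return the reverse-map domain name for an IPv4 or IPv6 address string."""
    return ipaddress.ip_address(address).reverse_pointer


ipv4_name = reverse_name("192.0.2.7")
ipv6_name = reverse_name("2001:db8::1")
```

For the IPv4 address above the result is 7.2.0.192.in-addr.arpa;
IPv6 addresses yield the nibble form under ip6.arpa.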
357.P
358As additional information becomes available,
359notably in steps 1, 2, 4, 8, 9, 11, and 12,
360there is always a possibility that local policy
361(e.g., access limitations) might prevent further progress.
362Whenever possible,
363at least attempt to inform the other end of this.
364.P
365At any time, there is a possibility of the negotiation failing due to
366unexpected responses, e.g. the Responder not responding at all
367or rejecting all of the Initiator's proposals.
368If multiple SGs were found as possible Responders,
369the Initiator should try at least one more before giving up.
370The number tried should be influenced by what the alternative is:
371if the traffic will otherwise be discarded, trying the full list is
372probably appropriate,
373while if the alternative is plaintext transmission,
374it might be based on how long the tries are taking.
375The Initiator should try as many as it reasonably can,
376ideally all of them.
377.P
378There is a sticky problem with timeouts.
379If the Responder is down
380or otherwise inaccessible, in the worst case we won't hear about this
381except by not getting responses.
382Some other, more pathological or even
383evil, failure cases can have the same result.
384The problem is that in the
385case where plaintext is permitted, we want to decide whether a tunnel is
386possible quickly.
387There is no good solution to this, alas;
388we just have to take the time and do it right.
389(Passing plaintext meanwhile
390looks attractive at first glance... but exposing
391the first few seconds of a connection is often almost as bad as exposing
392the whole thing.
393Worse, if the user checks the status of the connection,
394after that brief window it looks secure!)
395.P
396The flip side of waiting for a timeout is that all other forms of
397feedback, e.g. ``host not reachable'',
398arguably should be \fIignored\fR,
399because in the absence of authenticated ICMP,
400you cannot trust them!
401.R
402An alternative, sometimes suggested, to the use of explicit DNS records
403for SG discovery is to directly attempt IKE negotiation with the
404destination host,
405and assume that any relevant SG will be on the packet path,
406will intercept the IKE packets,
407and will impersonate the destination host for the IKE negotiation.
408This is superficially attractive but is a very bad idea.
409It assumes that routing is stable throughout negotiation,
410that the SG is on the plaintext-packets path,
411and that the destination host is routable
412(yes, it is possible to have (private) DNS data for an unroutable host).
413Playing extra games in the plaintext-packet path hurts performance and
414can be expected to be unpopular.
415Various difficulties ensue when there are multiple SGs along the path
416(there is already bad experience with this, in RSVP),
417and the presence of even one can make it impossible
418to do IKE direct to the host when that is what's wanted.
419Worst of all, such impersonation breaks the IP network model badly,
420making problems difficult to diagnose and impossible to work around
421(and there is already bad experience with this, in areas like web caching).
422.R
423(Step 1.)
424Dynamic setup actions might include establishment of demand-dialed links.
425These might be present anywhere along the path,
426so one cannot rely on out-of-band communication at the Initiator to
427trigger them.
428Hence the ping.
429.R
430(Step 2.)
431In many cases, the IP address on the intercepted packet will be the
432result of a name lookup just done.
433Inverse queries, an obscure DNS feature from the distant past,
434in theory can be used to ask a DNS server to reverse that lookup,
435giving the name that produced the address.
436This is not the same as a reverse lookup,
437and the difference can matter a great deal in cases where a host
438does not control its reverse map
439(e.g., when the host's IP address is dynamically assigned).
440Unfortunately, inverse queries were never widely implemented and
441are now considered obsolete.
442Phooey.
443.A
444Support for a small subset of this admittedly-obscure feature
445would be useful.
446Unfortunately, it seems unlikely.
447.R
448(Step 3.)
449Using only IP addresses to decide whether there is already a relevant
450keying channel avoids some
451difficult problems.
452In particular, it might seem that this should be based on identities,
453but those are not known until very late in IKE Phase 1 negotiations.
454.R
455(Step 4.)
456The DNS lookup is done on speculation
457because the data will probably be useful and the lookup can be done
458in parallel with IKE activity,
459potentially speeding things up.
460.R
461(Steps 7 and 8.)
462If an SG does not control its reverse map,
463there is no way it can prove its right to use an IP address,
464but it can nevertheless supply both an identity (as an FQDN) and
465proof of its right to use that identity.
466This is somewhat better than nothing,
467and may be quite useful if the SG is representing a client host
468which \fIcan\fR prove its right to \fIits\fR IP address.
469(For example, a fixed-address subnet might live behind an SG with
470a dynamically-assigned address;
471such an SG has to be the Initiator, not the Responder,
472so the subnet's TXT records can contain FQDN identities,
473but with that restriction, this works.)
474It might sound like this would permit some man-in-the-middle attacks
475in important cases like Road Warrior,
476but the RW can still do full authentication of the home base,
477so a man in the middle cannot successfully impersonate home base,
478and the D-H exchange doesn't work unless the man in the middle
479impersonates \fIboth\fR ends.
480.R
481(Steps 7 and 8.)
482Another situation where proof of the right to use an identity can be
483very useful is when access is deliberately limited.
484While opportunistic encryption is intended as a general-purpose
485connection mechanism between strangers,
486it may well be convenient for prearranged connections to use
487the same mechanism.
488.R
489(Steps 7 and 8.)
490FQDNs as identities are avoided where possible,
491since they can involve synchronous DNS lookups.
492.R
493(Step 11.)
494Note that only here, in Phase 2,
495does the Responder actually learn who the
496Source and Destination hosts are.
497This unfortunately demands a synchronous DNS lookup to verify that the
498Initiator is authorized to represent the Source,
499unless they are one and the same.
500This and the initial TXT lookup are the only synchronous DNS lookups
501absolutely required by the algorithm,
502and they appear to be unavoidable.
503.R
504While it might seem unlikely that a refusal to cooperate from one SG
505could be remedied by trying another\(empresumably they all use the
506same policies\(emit's conceivable that one might be misconfigured.
507Preferably they should all be tried,
508but it may be necessary to set some limits on this
509if alternatives exist.
510.NH 2
511DNS Records
512.P
513Gateway discovery and key lookup are based on TXT and KEY DNS records.
514The TXT record specifies IP address or other identity of a host's SG,
515and possibly supplies its public key as well,
516while the KEY record supplies public keys not found in TXT records.
517.NH 3
518TXT
519.P
520Opportunistic-encryption SG discovery uses TXT records with the content:
521.DS
522X-IPsec-Gateway(\fInnn\fR)=\fIiii\fR\ \fIkkk\fR
523.DE
524following RFC 1464 attribute/value
525notation.
526Records which
527do not contain an ``='',
528or which do not have exactly the specified form to the left of it,
529are ignored.
530(Near misses perhaps should be reported.)
531.P
532The \fInnn\fR is an unsigned integer which will fit in 16 bits,
533specifying an MX-style preference
534(lower number = stronger preference) to
535control the order in which multiple SGs are tried.
536If there are ties, pick one,
537randomly enough that the choice will probably be different each time.
538xxx rollover.
539The preference field is not optional;
540use ``0'' if there is no meaningful preference ordering.
541.P
542The \fIiii\fR part identifies the SG.
543Normally this is a dotted-decimal IPv4 address or
544a colon-hex IPv6 address.
545The sole exception is if the SG has no fixed address (see 2.4) but
546the host(s) behind it do,
547in which case \fIiii\fR is of the form ``@fqdn'',
548where \fIfqdn\fR is the FQDN that the SG will use to
549identify itself (in step 7 of section 2.2);
550such a record cannot be used for SG discovery by an Initiator,
551but can be used for
552SG verification (step 11 of 2.2) by a Responder.
553.P
554The \fIkkk\fR part is optional.
555If it is present,
556it is an RSA-MD5 public key in base-64 notation, as in the text
557form of an RFC 2535 KEY record.
558If it is not present,
559this specifies that the public key can be found in a KEY
560record located based on the SG's identification:
561if \fIiii\fR is an IP address,
562do a reverse lookup on that address,
563else do a forward lookup on the FQDN.
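.P
The record format and the preference rule can be sketched as a small parser.
This is illustrative only.
The names GatewayRecord, parse_gateway_txt, and pick_responder are assumptions,
as are the case-sensitive match on the attribute name,
the tolerance of any whitespace between the identity and the key,
and the removal of whitespace inside the key text.

```python
import ipaddress
import random
import re
from typing import NamedTuple, Optional

# Exact form required to the left of the "=".
_LEFT = re.compile(r"X-IPsec-Gateway\(([0-9]+)\)")


class GatewayRecord(NamedTuple):
    preference: int          # MX-style: lower number = stronger preference
    identity: str            # IP address text, or "@fqdn"
    is_fqdn: bool
    key_b64: Optional[str]   # None means: fetch the key from a KEY record


def parse_gateway_txt(text):
    """Parse one TXT string; return a GatewayRecord, or None if it is to be ignored."""
    left, sep, right = text.partition("=")
    if not sep:
        return None                      # no "=" at all
    match = _LEFT.fullmatch(left)
    if match is None:
        return None                      # left side not of the specified form
    preference = int(match.group(1))
    if preference > 0xFFFF:
        return None                      # must fit in 16 bits
    fields = right.split(None, 1)
    if not fields:
        return None                      # the identity is mandatory
    identity = fields[0]
    is_fqdn = identity.startswith("@")
    if is_fqdn:
        if len(identity) == 1:
            return None
    else:
        try:
            ipaddress.ip_address(identity)   # accepts IPv4 and IPv6 text forms
        except ValueError:
            return None
    key = "".join(fields[1].split()) if len(fields) > 1 else None
    return GatewayRecord(preference, identity, is_fqdn, key or None)


def pick_responder(records, rng=random):
    """Pick a Responder: lowest preference wins, ties are broken randomly."""
    # "@fqdn" records cannot be used for SG discovery by an Initiator.
    usable = [r for r in records if r is not None and not r.is_fqdn]
    if not usable:
        return None
    best = min(r.preference for r in usable)
    return rng.choice([r for r in usable if r.preference == best])


records = [
    parse_gateway_txt("X-IPsec-Gateway(10)=192.0.2.1 AQOF8tZ2"),
    parse_gateway_txt("X-IPsec-Gateway(20)=2001:db8::1"),
    parse_gateway_txt("X-IPsec-Gateway(0)=@sg.example.org"),
    parse_gateway_txt("v=spf1 -all"),
]
chosen = pick_responder(records)
```

Here the preference-10 record is chosen: the preference-0 record names an FQDN
and so is useful only for verification by a Responder,
and the unrelated TXT record is ignored.
The key text in the example is a placeholder, not a real key.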
564.R
565While it is unusual for a reverse lookup to go for records other than PTR
566records (or possibly CNAME records, for RFC 2317 classless delegation),
567there's no reason why it can't.
568The TXT record is a temporary stand-in
569for (we hope, someday) a new DNS record for SG identification and keying.
570Keeping the setup process fast requires minimizing the number of DNS
571lookups, hence the desire to put all the information in one place.
572.R
573The use of RFC 1464 notation avoids collisions with other uses of TXT
574records.
575The ``X-'' in the attribute name
576indicates that this format is tentative and experimental;
577this design will probably need modification after initial experiments.
578The format is chosen with an eye on eventual binary encoding.
579Note, in particular,
580that the TXT record normally contains the \fIaddress\fR of the SG,
581not (repeat, not) its name.
582Name-to-address conversion is the job of
583whatever generates the TXT record,
584which is expected to be a program, not a human\(emthis is conceptually
585a \fIbinary\fR record, temporarily using a text encoding.
586The ``@fqdn'' form of the SG identity is
587for specialized uses and is never mapped to an address.
588.A
589A DNS TXT record contains one or more character strings,
590but RFC 1035 does not describe exactly how
591a multi-string TXT record is interpreted.
592This is relevant because a string can be at most 255 characters,
593and public keys can exceed this.
594Empirically, the standard pattern is that
595each string which is
596both less than 255 characters \fIand\fR not the final string of the
597record should have a blank appended to it,
598and the strings of the record
599should then be concatenated.
600(This observation is based on how BIND 8 transforms a TXT record
601from text to DNS binary.)
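.P
The empirical joining rule can be written down directly.
This is a sketch of the rule as stated above, nothing more;
the function name join_txt_strings is an assumption.

```python
def join_txt_strings(strings):
    """Reassemble the character strings of one TXT record into a single text."""
    parts = []
    last = len(strings) - 1
    for index, piece in enumerate(strings):
        # A string that is both shorter than 255 characters and not the
        # final string of the record gets a blank appended.
        if index != last and len(piece) < 255:
            piece = piece + " "
        parts.append(piece)
    return "".join(parts)


joined = join_txt_strings(["A" * 255, "BC", "D"])
```

A full 255-character string runs straight into its successor,
which is what lets a long public key span several strings.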
602.NH 3
603KEY
604.P
605An opportunistic-encryption KEY record
606is an Authentication-permitted,
607Entity (host),
608non-Signatory,
609IPsec,
610RSA/MD5 record
611(that is, its first four bytes are 0x42000401),
612as per RFCs 2535 and 2537.
613KEY records with other \fIflags\fR, \fIprotocol\fR, or \fIalgorithm\fR
614values are ignored.
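.P
The four-byte value can be checked by assembling it from its fields.
The sketch below follows our reading of the RFC 2535 field layout
(16 bits of flags, then 8 bits each of protocol and algorithm);
the constant and function names are assumptions of the example.

```python
import struct

# Flag bits, numbered from the most significant bit as in RFC 2535.
NO_CONFIDENTIALITY = 0x4000   # bits 0-1 = 01: key is for authentication only
NAME_TYPE_ENTITY = 0x0200     # bits 6-7 = 10: non-zone entity (host)
SIGNATORY_NONE = 0x0000       # bits 12-15 = 0: non-Signatory

PROTOCOL_IPSEC = 4
ALGORITHM_RSAMD5 = 1

OE_KEY_HEADER = struct.pack(
    "!HBB",
    NO_CONFIDENTIALITY | NAME_TYPE_ENTITY | SIGNATORY_NONE,
    PROTOCOL_IPSEC,
    ALGORITHM_RSAMD5,
)


def is_oe_key(rdata):
    """True if KEY RDATA has the required header followed by some key material."""
    return len(rdata) > 4 and rdata[:4] == OE_KEY_HEADER


header_hex = OE_KEY_HEADER.hex()
```

The packed header comes out as 42000401, matching the value given above;
records with any other header are ignored.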
615.R
616Unfortunately, the public key has to be
617associated with the SG, not the client host behind it.
618The Responder does not know which client it is supposed to be representing,
619or which client the Initiator is representing,
620until far too late.
621.A
622Per-client keys would reduce vulnerability to key compromise,
623and simplify key changes,
624but they would require changes to IKE Phase 1, to separately identify
625the SG and its initial client(s).
626(At present, the client identities are not known to the Responder
627until IKE Phase 2.)
628While the current IKE standard does not actually specify (!) who is
629being identified by identity payloads,
630the overwhelming consensus is that they identify the SG,
631and as seen earlier,
632this has important uses.
633.NH 3
634Summary
635.P
636For reference, the minimum set of DNS records needed to make this
637all work is either:
638.IP 1. \w'1.'u+2n
639TXT in Destination reverse map, identifying Responder and providing public key.
640.IP 2.
641KEY in Initiator reverse map, providing public key.
642.IP 3.
643TXT in Source reverse map, verifying relationship to Initiator.
644.P
645or:
646.IP 1. \w'1.'u+2n
647TXT in Destination reverse map, identifying Responder.
648.IP 2.
649KEY in Responder reverse map, providing public key.
650.IP 3.
651KEY in Initiator reverse map, providing public key.
652.IP 4.
653TXT in Source reverse map, verifying relationship to Initiator.
654.P
655Slight complications ensue for dynamic addresses,
656lack of control over reverse maps, etc.
657.NH 3
658Implementation
659.P
660In the long run, we need either a tree of trust or a web of trust,
661so we can trust our DNS data.
662The obvious approach for DNS is a tree of trust,
663but there are various practical problems with running all of this
664through the root servers,
665and a web of trust is arguably more robust anyway.
666This is logically independent of opportunistic encryption,
667and a separate design proposal will be prepared.
668.P
669Interim stages of implementation of this will require a bit of thought.
670Notably, we need some way of dealing with the lack of fully signed DNSSEC
671records right away.
672Without user interaction, probably the best we can do is to
673remember the results of old fetches, compare them to the results of new
674fetches, and complain and disbelieve all of it if there's a mismatch.
675This does mean that somebody who gets fake data into our very first fetch
676will fool us, at least for a while, but that seems an acceptable tradeoff.
677(Obviously there needs to be a way to manually flush the remembered results
678for a specific host, to permit deliberate changes.)
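.P
The remember-and-compare interim measure might look like this.
It is a sketch under stated assumptions:
the class name FetchMemory, the in-memory dictionary,
and the treatment of a fetch as an unordered set of record strings
are all choices made for the example.

```python
class FetchMemory:
    """Remembers the first DNS answer seen for a name and distrusts changes."""

    def __init__(self):
        self._seen = {}   # name -> frozenset of record strings first fetched

    def check(self, name, records):
        """Return True if `records` may be believed, False on a mismatch."""
        fetched = frozenset(records)
        remembered = self._seen.get(name)
        if remembered is None:
            # Very first fetch: nothing to compare against, so it is believed.
            self._seen[name] = fetched
            return True
        return remembered == fetched

    def flush(self, name):
        """Manual override, to permit deliberate changes for one host."""
        self._seen.pop(name, None)


memory = FetchMemory()
name = "7.2.0.192.in-addr.arpa"
first = memory.check(name, ["X-IPsec-Gateway(10)=192.0.2.1"])
changed = memory.check(name, ["X-IPsec-Gateway(10)=203.0.113.9"])
memory.flush(name)
after_flush = memory.check(name, ["X-IPsec-Gateway(10)=203.0.113.9"])
```

The changed answer is disbelieved until the administrator flushes the
remembered result, after which the new data is accepted as a first fetch.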
679.NH 2
680Responders Without Credentials
681.P
682In cases where the Destination simply does not control its
683DNS reverse-map entries,
684there is no verifiable way to determine a suitable SG.
685This does not make communication utterly impossible, though.
686.P
687Simply attempting negotiation directly with the host is a last resort.
688(An aggressive implementation might wish to attempt it in parallel,
689rather than waiting until other options are known to be unavailable.)
690In particular, in many cases involving dynamic addresses, it will work.
691It has the disadvantage of delaying the discovery that opportunistic
692encryption is entirely impossible,
693but the case seems common enough to justify the overhead.
694.P
695However, there are policy issues here either way, because
696it is possible to impersonate such a host.
697The host can supply an FQDN identity and verify its right to use that
698identity,
699but except by prearrangement,
700there is no way to verify that the FQDN is the right one for that
701IP address.
702(The data from forward lookups may be controlled by people
703who do not own the address, so it cannot be trusted.)
704The encryption is still solid, though,
705so in many cases this may be useful.
706.NH 2
707Failure of Opportunism
708.P
709When there is no way to do opportunistic encryption, a policy issue arises:
710whether to put in a bypass (which allows plaintext traffic through)
711or a block (which discards it, perhaps with notification back to the sender).
712The choice is very much a matter of local policy,
713and may depend on details such as the higher-level protocol being used.
714For example,
715an SG might well permit plaintext HTTP but forbid plaintext Telnet,
716in which case \fIboth\fR a block and a bypass would be set up if
717opportunistic encryption failed.
.P
A bypass/block must, in practice,
be treated much like an IPsec tunnel.
It should persist for a while,
so that high-overhead processing doesn't have to be done for every packet,
but should go away eventually to return resources.
It may be simplest to treat it as a degenerate tunnel.
It should have a relatively long lifetime (say 6h) to keep the frequency
of negotiation attempts down,
except in the case where the other SG simply did not respond to IKE packets,
where the lifetime should be short (say 10min) because
the other SG is presumably down and might come back up again.
(Cases where the other SG responded to IKE with unauthenticated error
reports like ``port unreachable'' are borderline,
and might deserve to be treated as an intermediate case:
while such reports cannot be trusted unreservedly,
in the absence of any other response,
they do give some reason to suspect that the other SG is unable or
unwilling to participate in opportunistic encryption.)
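.P
The lifetime rule might be sketched as follows.
The 6h and 10min figures are the ones suggested above;
the 1h value for the borderline case and all names are assumptions,
since the text only says that case might deserve intermediate treatment.

```python
from datetime import timedelta
from enum import Enum, auto


class Failure(Enum):
    OE_UNAVAILABLE = auto()         # the other end cannot or will not do OE
    NO_IKE_RESPONSE = auto()        # other SG silent, presumably down
    UNAUTHENTICATED_ERROR = auto()  # e.g. an ICMP port-unreachable report


def bypass_block_lifetime(failure: Failure) -> timedelta:
    """Pick how long a bypass/block should persist before a retry."""
    if failure is Failure.NO_IKE_RESPONSE:
        return timedelta(minutes=10)  # it might come back up soon
    if failure is Failure.UNAUTHENTICATED_ERROR:
        return timedelta(hours=1)     # assumed intermediate value
    return timedelta(hours=6)         # keep negotiation attempts rare
```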
.P
As noted in section 2.1, one might think that
arrival of a plaintext incoming packet should cause a
bypass/block to be set up for its source host:
such a packet is almost always followed by an outgoing reply packet;
the incoming packet is clear evidence that opportunistic encryption is
not available at the other end;
attempting it will waste resources and delay traffic to no good purpose.
Unfortunately, doing this would mean that anyone out on the Internet
who can forge a source address could prevent encrypted communication!
Since their source addresses are not authenticated,
plaintext packets cannot be taken as evidence of anything,
except perhaps that communication from that host is likely to occur soon.
.P
There needs to be a way for local administrators to remove a bypass/block
ahead of its normal expiry time,
to force a retry after a problem at the other end is known to have been fixed.
.NH 2
Subnet Opportunism
.P
In principle, when the Source or Destination host belongs to a subnet
and the corresponding SG is willing to provide tunnels to the whole subnet,
this should be done.
There is no extra overhead,
and considerable potential for avoiding later overhead if
similar communication occurs with other members of the subnet.
Unfortunately,
at the moment,
opportunistic tunnels can only have degenerate subnets (single hosts)
at their ends.
(This does, at least, set up the keying channel,
so that negotiations for tunnels to other hosts in the same subnets
will be considerably faster.)
.P
The crucial problem is step 11 of section 2.2:
the Responder must verify that the Initiator is authorized to represent
the Source,
and this is impossible for a subnet because
there is no way to do a reverse lookup on it.
Information in DNS
records for a name or a single address cannot be trusted,
because those records may be controlled by people who do not control the whole subnet.
.A
Except in the special case of a subnet masked on a
byte boundary (in which case RFC 1035's convention of an incomplete
in-addr.arpa name could be used), subnet lookup would need extensions to the
reverse-map name space, perhaps along the lines of what is commonly done for
RFC 2317 delegation.
IPv6 already has suitable name syntax, as in RFC 2874,
but has no specific provisions for subnet entries in its reverse maps.
Fixing all this is not conceptually difficult,
but is logically independent of opportunistic encryption,
and will be proposed separately.
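.P
For the byte-boundary special case, the incomplete in-addr.arpa name can be
derived mechanically.
The sketch below uses Python's standard ipaddress module;
the function name is hypothetical,
and any subnet not masked on a byte boundary is rejected,
since that case is exactly what would need the extension discussed above.

```python
import ipaddress


def byte_boundary_reverse_name(cidr: str) -> str:
    """Return the incomplete in-addr.arpa name for an IPv4 /8, /16 or /24."""
    net = ipaddress.ip_network(cidr, strict=True)
    if net.version != 4 or net.prefixlen not in (8, 16, 24):
        raise ValueError("only IPv4 subnets masked on a byte boundary")
    # Keep only the network octets, then reverse them as the reverse map does.
    octets = str(net.network_address).split(".")[: net.prefixlen // 8]
    return ".".join(reversed(octets)) + ".in-addr.arpa"
```

For example, 192.0.2.0/24 maps to 2.0.192.in-addr.arpa,
while 192.0.2.64/26 raises ValueError.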
.P
A less-troublesome problem is that the Initiator,
in step 10 of section 2.2,
must know exactly what subnet is present on the Responder's end
so it can propose a tunnel to it.
This information could be included in the TXT record
of the Destination
(it would have to be verified with a subnet lookup,
but that could be done in parallel with other operations).
The Initiator presumably
can be configured to know what subnet(s) are present on its end.
.NH 2
Option Settings
.P
IPsec and IKE have far too many useless options, and a few useful ones.
IKE negotiation is quite simplistic, and cannot handle even simple
discrepancies between the two SGs.
So it is necessary to be quite specific about what should be done and
what should be proposed,
to guarantee interoperability without prearrangement or
other negotiation protocols.
.R
The prohibition of other negotiations is simply because there is no time.
The setup algorithm (section 2.2) is lengthy already.
.P
[Open question:
should opportunistic IKE use a different port than normal IKE?]
.P
Somewhat arbitrarily and
tentatively, opportunistic SGs must support Main Mode, Oakley group 5 for
D-H, 3DES encryption and MD5 authentication for both ISAKMP and IPsec SAs,
RSA/MD5 digital-signature authentication with keys between 2048 and 8192 bits,
and ESP doing both encryption and authentication.
They must do key PFS
in Quick Mode, but not identity PFS.
They may support IPComp, preferably using Deflate,
but must not insist on it.
They may support AES as an alternative to 3DES,
but must not insist on it.
.R
Identity PFS essentially requires establishing
a complete new keying channel for each new tunnel,
but key PFS just does a new Diffie-Hellman exchange for each rekeying,
which is relatively cheap.
.P
Keying channels must remain in existence at least as long as any
tunnel created with them remains (they are not costly, and keeping
the management path up and available simplifies various issues).
See section 3.1 for related issues.
Given the use of key PFS,
frequent rekeying of keying channels does not seem critical.
In the absence of strong reason to do otherwise,
the Initiator should propose rekeying a keying channel at 8hr-or-1MB.
The Responder must accept any proposal which specifies
a rekeying time between 1hr and 24hr inclusive
and a rekeying volume between 100KB and 10MB inclusive.
.P
Given the short expected useful life of most tunnels (see section 3.1),
very few of them will survive long enough to be rekeyed.
In the absence of strong reason to do otherwise,
the Initiator should propose rekeying a tunnel at 1hr-or-100MB.
The Responder must accept any proposal which specifies
a rekeying time between 10min and 8hr inclusive
and a rekeying volume between 1MB and 1000MB inclusive.
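.P
The Responder's acceptance test amounts to two inclusive range checks.
The sketch below reads the first set of limits as applying to keying
channels and the second to tunnels;
that reading, the decimal interpretation of KB and MB,
and all names are assumptions.

```python
from dataclasses import dataclass

MINUTE, HOUR = 60, 3600
KB, MB = 1000, 1000 * 1000  # assumed decimal units; the text does not say


@dataclass(frozen=True)
class RekeyLimits:
    min_seconds: int
    max_seconds: int
    min_bytes: int
    max_bytes: int

    def accepts(self, seconds: int, nbytes: int) -> bool:
        """True if a proposed time and volume fall within both ranges."""
        return (self.min_seconds <= seconds <= self.max_seconds
                and self.min_bytes <= nbytes <= self.max_bytes)


# Limits the Responder must accept, as given in the text.
KEYING_CHANNEL_LIMITS = RekeyLimits(1 * HOUR, 24 * HOUR, 100 * KB, 10 * MB)
TUNNEL_LIMITS = RekeyLimits(10 * MINUTE, 8 * HOUR, 1 * MB, 1000 * MB)

# The Initiator's suggested default proposals.
KEYING_CHANNEL_DEFAULT = (8 * HOUR, 1 * MB)
TUNNEL_DEFAULT = (1 * HOUR, 100 * MB)
```

Both default proposals fall inside the corresponding ranges,
so two SGs using the defaults will always agree.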
.P
It is highly desirable to add some random jitter
to the times of actual rekeying attempts,
to break up ``convoys'' of rekeying events;
this and certain other aspects of robust rekeying practice will be the subject
of a separate design proposal.
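.P
Pending that proposal, one simple way to jitter is to rekey a random
amount early.
The 10% bound, the choice to shorten rather than lengthen the interval,
and the function name are all assumptions for this sketch.

```python
import random


def jittered_rekey_delay(nominal_seconds, max_fraction=0.1, rng=random):
    """Return a delay up to max_fraction shorter than nominal_seconds."""
    if not 0.0 <= max_fraction < 1.0:
        raise ValueError("max_fraction must be in [0, 1)")
    # Rekey early, never late, so the SA does not expire while in use.
    return nominal_seconds * (1.0 - rng.uniform(0.0, max_fraction))
```

Passing a seeded random.Random instance as rng makes the result
reproducible for testing.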
.R
The numbers used here for rekeying intervals are chosen quite arbitrarily
and should be re-assessed after some implementation experience is gathered.
.NH 1
Renewal and Teardown
.NH 2
Aging
.P
When to tear tunnels down is a bit problematic, but if we're setting up a
potentially unbounded number of them,
we have to tear them down \fIsomehow sometime\fR.
.P
Set a short initial tentative lifespan, say 1min,
since most net flows in fact last only a few seconds.
When that expires, look to see if
the tunnel is still in use (definition:
has had traffic, in either direction,
in the last half of the tentative lifespan).
If so, assign it a somewhat longer tentative lifespan, say 20min,
after which, look again.
If not, close it down.
(This tentative lifespan is
independent of rekeying; it is just the time when the tunnel's future
is next considered.
This should happen reasonably frequently, unlike
rekeying, which is costly and shouldn't be too frequent.)
Multi-step backoff algorithms are not worth the trouble; looking every
20min doesn't seem onerous.
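.P
The aging rule can be sketched as a single review function called each
time a tentative lifespan expires.
Times are in seconds; the names and the use of None to mean
``close it down'' are assumptions.

```python
INITIAL_LIFESPAN = 60        # 1min, for a newly created tunnel
EXTENDED_LIFESPAN = 20 * 60  # 20min, for a tunnel found still in use


def review_tunnel(now, last_traffic, lifespan):
    """Return the next tentative lifespan, or None to close the tunnel.

    last_traffic is the time of the most recent packet in either
    direction, or None if the tunnel has carried no traffic at all.
    """
    in_use = (last_traffic is not None
              and now - last_traffic <= lifespan / 2)
    return EXTENDED_LIFESPAN if in_use else None
```

A tunnel created at time 0 is first reviewed at time 60 with
lifespan 60; if it survives, every later review uses lifespan 1200.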
.P
If the security gateway and the client host are one and the same,
tunnel teardown decisions might wish to pay attention to TCP connection status,
as reported by the local TCP layer.
A still-open
TCP connection is almost a guarantee that more traffic is coming, while
the demise of the only TCP connection through a tunnel is a strong hint
that none is.
If the SG and the client host are separate machines,
though, tracking TCP connection status requires packet snooping,
which is complicated and probably not worthwhile.
.P
IKE keying channels likewise are torn down when it appears the need has
passed.
They always linger longer than the last tunnel they administer,
in case they are needed again; the cost of retaining them is low.
Other than that,
unless the number of keying channels on the SG gets large,
the SG should simply retain all of them until rekeying time,
since rekeying is the only costly event.
When about to rekey a keying channel which has no current tunnels,
note when the last actual keying-channel traffic occurred,
and close the keying channel down if it wasn't in the last, say, 30min.
When rekeying a keying channel (or perhaps shortly before rekeying is expected),
Initiator and Responder should re-fetch the public keys used for
SG authentication,
against the possibility that they have changed or disappeared.
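.P
The decision made at keying-channel rekeying time might look like this.
The function name, the string return values, and times in seconds are
assumptions; the 30min figure is the one suggested above.
Re-fetching the public keys is left out of the sketch.

```python
IDLE_LIMIT = 30 * 60  # 30min without keying-channel traffic


def keying_channel_action(now, tunnel_count, last_channel_traffic):
    """Decide what to do with a keying channel that is due for rekeying."""
    idle = now - last_channel_traffic
    if tunnel_count == 0 and idle > IDLE_LIMIT:
        return "close"  # nobody has needed it lately; let it go
    return "rekey"      # still administering tunnels, or recently used
```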
.P
See section 2.7 for discussion of rekeying intervals.
.P
Given the low user impact of tearing down and rebuilding a connection
(a tunnel or a keying channel),
rekeying attempts should not be too persistent:
one can always just rebuild when needed,
so heroic efforts to preserve an existing connection are unnecessary.
Say, try every 10s for a minute and every minute for 5min,
and then give up and declare the connection
(and all other connections to that IKE peer) dead.
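.P
One reading of that retry schedule, as offsets in seconds from the first
failed attempt, is sketched below.
The text does not say whether the two phases overlap,
so treating them as consecutive is an assumption, as is the function name.

```python
def rekey_retry_offsets():
    """Return the times, in seconds after the first failure, of each retry."""
    fast = list(range(10, 61, 10))             # every 10s for a minute
    slow = [60 + 60 * k for k in range(1, 6)]  # then every minute for 5min
    return fast + slow
```

Under this reading there are eleven retries, the last at 360s,
after which the connection is declared dead.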
.R
In future, more sophisticated versions of this protocol,
examining the initial packet might permit a more intelligent guess at
the tunnel's useful life.
HTTP connections in particular are
notoriously bursty and repetitive.
.R
Note that rekeying a keying channel basically consists of building a
new keying channel from scratch,
using IKE Phase 1,
and abandoning the old one.
.NH 2
Teardown and Cleanup
.P
Teardown should always be coordinated with the other end.
This means interpreting and sending Delete notifications.
.P
On receiving a Delete for the outbound SAs of a tunnel
(or some subset of them),
tear down the inbound ones too, and notify the other end
with a Delete.
Tunnels need to be considered as bidirectional entities,
even though the low-level protocols don't think of them that way.
.P
When the deletion is initiated locally,
rather than as a response to a received Delete,
send a Delete for (all) the inbound SAs of a tunnel.
If no responding Delete is received for the outbound SAs,
try re-sending the original Delete.
Three tries spaced 10s apart seems a reasonable level of effort.
(Indefinite persistence is not necessary;
whether the other end isn't cooperating because it doesn't feel like
it, or because it is down/disconnected/etc.,
the problem will eventually be cleared up by other means.)
.P
After rekeying,
transmission should switch to using the new SAs (ISAKMP or IPsec)
immediately,
and the old leftover SAs should be cleared out promptly
(and Deletes sent) rather than waiting for them to expire.
This reduces clutter and minimizes confusion.
.P
Since there is only one keying channel per remote IP address,
the question of whether a Delete notification has appeared on a
``suitable'' keying channel does not arise.
.R
The pairing of Delete notifications effectively constitutes an
acknowledged Delete, which is highly desirable.
.NH 2
Outages and Reboots
.P
Tunnels sometimes go down because the other
end crashes, or disconnects, or has a network link break,
and there is no notice of this in the general case.
(Even in the event of a crash and
successful reboot, other SGs don't hear about it unless the
rebooted SG has specific reason to talk to them immediately.)
Over-quick response to temporary network outages is undesirable...
but note that a tunnel can be torn
down and then re-established without any user-visible effect except
a pause in traffic,
whereas if one end does reboot,
the other end can't get packets to it \fIat all\fR (except via IKE)
until the situation is noticed.
So a bias toward quick response is appropriate,
even at the cost of occasional false alarms.
.P
Heartbeat mechanisms are somewhat unsatisfactory for this.
Unless they are very frequent, which causes other problems,
they do not detect the problem promptly.
.A
What is really wanted is authenticated ICMP.
This might be a case where public-key encryption/authentication
of network packets is the right thing to do,
despite the expense.
.P
In the absence of that, a two-part approach seems warranted.
.P
First,
when an SG receives an IPsec packet that is addressed to it,
and otherwise appears healthy,
but specifies an unknown SA and is from a host that the receiver currently
has no keying channel to,
the receiver must attempt to inform the sender
via an IKE Initial-Contact notification
(necessarily sent in plaintext,
since there is no suitable keying channel).
This must be severely rate-limited on \fIboth\fR ends;
one notification per SG pair per minute seems ample.
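.P
The rate limit could be kept with a small table of last-notification
times per SG pair.
The class below is a hypothetical sketch;
the same object could gate both the sending of notifications and the
acting on received ones, which is one way to limit ``both ends''.

```python
class NotificationLimiter:
    """Allow at most one notification per SG pair per interval."""

    def __init__(self, interval=60.0):
        self.interval = interval  # seconds; one minute as suggested
        self._last = {}           # (local SG, remote SG) -> last time

    def allow(self, local_sg, remote_sg, now):
        """Return True and record the time if a notification may pass."""
        key = (local_sg, remote_sg)
        last = self._last.get(key)
        if last is not None and now - last < self.interval:
            return False
        self._last[key] = now
        return True
```

A long-running SG would also need to expire old table entries,
which this sketch omits.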
.P
Second, there is an obvious difficulty with this:
the Initial-Contact notification is unauthenticated
and cannot be trusted.
So it must be taken as a hint only:
there must be a way to confirm it.
.P
What is needed here is something that's desirable for
debugging and testing anyway:
an IKE-level ping mechanism.
Pinging directly at the IP level instead will not tell us about a
crash/reboot event.
Sending pings through tunnels has
various complications (they should stop at the far mouth of the tunnel
instead of going on to a subnet; they should not count against idle
timers; etc.).
What is needed is a continuity check on a keying channel.
(This could also be used as a heartbeat,
should that seem useful.)
.P
IKE Ping delivery need not be reliable, since the whole point of a ping is
simply to provoke an acknowledgement.
IKE Pings should preferably be authenticated,
but it is not clear that this is absolutely necessary;
if they are not authenticated, they need
encryption plus a timestamp or a nonce,
to foil replay mischief.
How they are implemented is a secondary issue,
and a separate design proposal will be prepared.
.A
Some existing implementations are already using
(private) notify value 30000 (``LIKE_HELLO'') as ping
and (private) notify value 30002 (``SHUT_UP'') as ping reply.
.P
If an IKE Ping gets no response, try some (say 8) IP pings,
spaced a few seconds apart, to check IP connectivity;
if one comes back, try another IKE Ping;
if that gets no response,
the other end probably has rebooted, or otherwise been re-initialized,
and its tunnels and keying channel(s) should be torn down.
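.P
The check sequence might be sketched as follows.
The two ping operations are passed in as callables returning True on a
reply, because IKE Ping itself is not yet defined;
the 3s spacing and the decision to do nothing when no IP ping comes back
(the text does not cover that case) are assumptions.

```python
import time


def peer_needs_teardown(ike_ping, ip_ping, ip_tries=8, spacing=3.0,
                        sleep=time.sleep):
    """Return True if the peer appears to have rebooted."""
    if ike_ping():
        return False  # keying channel is alive
    for attempt in range(ip_tries):
        if ip_ping():
            # IP connectivity exists, so a second IKE silence is meaningful.
            return not ike_ping()
        if attempt < ip_tries - 1:
            sleep(spacing)
    # No IP connectivity at all: assumed to be a network outage,
    # so nothing is concluded about the peer.
    return False
```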
.P
In a similar vein,
given limited rekeying persistence,
a short network outage could take some tunnels down without
disrupting others.
On receiving a packet for an unknown SA from a host that a keying
channel is currently open to,
send that host an Invalid-SPI notification for that SA.
xxx that's not what Invalid-SPI is for.
The other host can then tear down the half-torn-down tunnel,
and negotiate a new tunnel for the traffic
it presumably still wants to send.
.P
Finally,
it would be helpful if SGs made some attempt to deal intelligently
with crashes and reboots.
A deliberate shutdown should include an attempt to notify all other SGs
currently connected by keying channels,
using Deletes,
that communication is about to fail.
(Again, these will be taken as teardowns;
attempts by the other SGs to negotiate new tunnels as replacements
should be ignored at this point.)
And when possible, SGs should attempt to preserve information
about currently-connected SGs in non-volatile storage,
so that after a crash,
an Initial-Contact can be sent to previous partners to
indicate loss of all previously-established connections.
.NH 1
Conclusions
.P
This design appears to achieve the objective of setting up encryption
with strangers.
The authentication aspects also seem adequately addressed if the
destination controls its reverse-map DNS entries
and the DNS data itself can be reliably authenticated
as having originated from the legitimate administrators of that
subnet/FQDN.
The authentication situation is less satisfactory when DNS is less helpful,
but it is difficult to see what else could be done about it.
.NH 1
References
.P
[TBW]
.NH 1
Appendix: Separate Design Proposals TBW
.IP \(bu \w'\(bu'u+2n
How can we build a web of trust with DNSSEC?
(See section 2.3.4.)
.IP \(bu
How can we extend DNS reverse lookups to permit reverse lookup
on a subnet?
(Both address and mask must appear in the name to be looked up.)
(See section 2.6.)
.IP \(bu
How can rekeying be done as robustly as possible?
(At least partly, this is just documenting current FreeS/WAN practice.)
(See section 2.7.)
.IP \(bu
How should IKE Pings be implemented?
(See section 3.3.)