7 Network Working Group Ralph Droms
8 INTERNET DRAFT Kim Kinnear
9 Mark Stapp
10 Cisco Systems
11
12 Bernie Volz
13 Ericsson
14
15 Steve Gonczi
16 Relicore
17
18 Greg Rabil
19 Lucent Technologies
20
21 Michael Dooley
22 Diamond IP Technologies
23
24 Arun Kapur
25 K5 Networks
26
27 March 2003
28 Expires September 2003
29
30
31 DHCP Failover Protocol
32 <draft-ietf-dhc-failover-12.txt>
33
34 Status of this Memo
35
36 This document is an Internet-Draft and is in full conformance with
37 all provisions of Section 10 of RFC2026.
38
39 Internet-Drafts are working documents of the Internet Engineering
40 Task Force (IETF), its areas, and its working groups. Note that
41 other groups may also distribute working documents as Internet-
42 Drafts.
43
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
48
49 The list of current Internet-Drafts can be accessed at
50 http://www.ietf.org/ietf/1id-abstracts.txt
51
52 The list of Internet-Draft Shadow Directories can be accessed at
53 http://www.ietf.org/shadow.html.
54
55
56
57
58 Droms, et. al. Expires September 2003 [Page 1]
60 Internet Draft DHCP Failover Protocol March 2003
61
62
63 Copyright Notice
64
65 Copyright (C) The Internet Society (2003). All Rights Reserved.
66
67 Abstract
68
DHCP [RFC 2131] allows for multiple servers to be operating on a
single network. Some sites are interested in running multiple
servers in such a way as to provide redundancy in case of server
failure. In order for this to work reliably, the cooperating primary
and secondary servers must maintain a consistent database of the
lease information. This implies that servers will need to coordinate
any and all lease activity so that this information is synchronized
in case of failover.
77
78 This document defines a protocol to provide such synchronization
79 between two servers. One server is designated the "primary" server,
80 the other is the "secondary" server. This document also describes a
81 way to integrate the failover protocol with the DHCP load balancing
82 approach.
83
84
85 Table of Contents
86
87
88 1. Introduction................................................. 4
89 2. Terminology.................................................. 5
90 2.1. Requirements terminology................................... 5
91 2.2. DHCP and failover terminology.............................. 5
92 3. Background and External Requirements......................... 9
93 3.1. Key aspects of the DHCP protocol........................... 9
94 3.2. BOOTP relay agent implementation........................... 11
95 3.3. What does it mean if a server can't communicate with its partner? 12
96 3.4. Challenging scenarios for a Failover protocol.............. 13
97 3.5. Using TCP to detect partner server failure................. 14
98 4. Design Goals................................................. 15
99 4.1. Design goals for this protocol............................. 15
100 4.2. Limitations of this protocol............................... 17
101 5. Protocol Overview............................................ 17
102 5.1. Messages and States........................................ 18
103 5.2. Fundamental guarantees..................................... 20
104 5.3. Load balancing............................................. 27
105 5.4. IP address allocations between servers..................... 28
106 5.5. Operating in NORMAL state.................................. 30
107 5.6. Operating in COMMUNICATIONS-INTERRUPTED state.............. 31
108 5.7. Operating in PARTNER-DOWN state............................ 31
120 5.8. Operating in RECOVER state................................. 31
121 5.9. Operating in STARTUP state................................. 31
122 5.10. Time synchronization between servers...................... 32
123 5.11. IP address binding-status................................. 33
124 5.12. DNS dynamic update considerations......................... 36
125 5.13. Reservations and failover................................. 41
126 5.14. Dynamic BOOTP and failover................................ 42
127 5.15. Guidelines for selecting MCLT............................. 43
128 5.16. What is sent in response to an UPDREQ or UPDREQALL message? 43
129 5.17. How do you determine that your partner is "up to date" for 45
130 6. Common Message Format........................................ 45
131 6.1. Message header format...................................... 46
132 6.2. Common option format....................................... 48
133 6.3. Batching multiple binding update transactions in one BNDUPD mes- 49
134 7. Protocol Messages............................................ 51
135 7.1. BNDUPD message [3]......................................... 51
136 7.2. BNDACK message [4]......................................... 62
137 7.3. UPDREQ message [9]......................................... 65
138 7.4. UPDREQALL message [7]...................................... 66
139 7.5. UPDDONE message [8]........................................ 67
140 7.6. POOLREQ message [1]........................................ 68
141 7.7. POOLRESP message [2]....................................... 69
142 7.8. CONNECT message [5]........................................ 70
143 7.9. CONNECTACK message [6]..................................... 74
144 7.10. STATE message [10]........................................ 78
145 7.11. CONTACT message [11]...................................... 79
146 7.12. DISCONNECT message [12]................................... 80
147 8. Connection Management........................................ 81
148 8.1. Connection granularity..................................... 81
149 8.2. Creating the TCP connection................................ 81
150 8.3. Using the TCP connection for determining communications status 83
151 8.4. Using the TCP connection for binding data.................. 85
152 8.5. Using the TCP connection for control messages.............. 85
153 8.6. Losing the TCP connection.................................. 85
154 9. Failover Endpoint States..................................... 86
155 9.1. Server Initialization...................................... 86
156 9.2. Server State Transitions................................... 86
157 9.3. STARTUP state.............................................. 90
158 9.4. PARTNER-DOWN state......................................... 93
159 9.5. RECOVER state.............................................. 95
160 9.6. RECOVER-WAIT state......................................... 97
161 9.7. RECOVER-DONE state......................................... 98
162 9.9. COMMUNICATIONS-INTERRUPTED State........................... 101
163 9.10. POTENTIAL-CONFLICT state.................................. 105
164 9.11. RESOLUTION-INTERRUPTED state.............................. 107
165 9.12. CONFLICT-DONE state....................................... 108
166 9.13. PAUSED state.............................................. 108
175 9.14. SHUTDOWN state............................................ 109
176 10. Safe Period................................................. 110
177 11. Security.................................................... 111
178 11.1. Simple shared secret...................................... 112
179 11.2. TLS....................................................... 113
180 12. Failover Options............................................ 113
181 12.1. addresses-transferred..................................... 114
182 12.2. assigned-IP-address....................................... 114
183 12.3. binding-status............................................ 114
184 12.4. client-identifier......................................... 115
185 12.5. client-hardware-address................................... 115
186 12.6. client-last-transaction-time.............................. 115
187 12.7. client-reply-options...................................... 116
188 12.8. client-request-options.................................... 116
189 12.9. DDNS...................................................... 117
190 12.10. delayed-service-parameter................................ 118
191 12.11. hash-bucket-assignment................................... 118
192 12.12. IP-flags................................................. 119
193 12.13. lease-expiration-time.................................... 120
194 12.14. max-unacked-bndupd....................................... 120
195 12.15. MCLT..................................................... 120
196 12.16. message.................................................. 121
197 12.17. message-digest........................................... 121
198 12.18. potential-expiration-time................................ 122
199 12.19. receive-timer............................................ 122
200 12.20. protocol-version......................................... 122
201 12.21. reject-reason............................................ 123
202 12.22. relationship-name........................................ 124
203 12.23. server-flags............................................. 124
204 12.24. server-state............................................. 125
205 12.25. start-time-of-state...................................... 125
206 12.26. TLS-reply................................................ 126
207 12.27. TLS-request.............................................. 126
208 12.28. vendor-class-identifier.................................. 126
209 12.29. vendor-specific-options.................................. 127
210 13. IANA Considerations......................................... 127
211 14. Acknowledgments............................................. 127
212 15. References.................................................. 129
213 16. Author's information........................................ 131
214 17. Full Copyright Statement.................................... 132
215
216
217 1. Introduction
218
DHCP [RFC 2131] allows for multiple servers to be operating on a single
network. Some sites are interested in running multiple servers in
such a way as to provide redundancy in case of server failure, since
the DHCP subsystem is in many cases a critical part of the network
infrastructure.
232
233 This document defines a protocol to provide synchronization between
234 two servers in order that each can take over for the other should
235 either one fail or become unreachable.
236
237 One server is designated the "primary" server, the other is the
238 "secondary" server, and most DHCP client requests are sent to each
239 server (see section 3.1.1 for details).
240
241 In order to provide a high availability DHCP service, these
242 cooperating primary and secondary servers must maintain a consistent
243 database of lease information. This implies that servers will need
244 to coordinate all lease activity so that this information is syn-
245 chronized in case failover is required. The protocol messages and
246 processing techniques required to maintain a consistent database are
247 specified in the protocol described here.
248
249 The failover protocol also contains a way to integrate the DHCP load-
250 balancing algorithm described in [RFC 3074] with the failover proto-
251 col.
252
253 2. Terminology
254
255 This section discusses both the generic requirements terminology com-
256 mon to many IETF protocol specifications as well as specialized DHCP
257 and failover protocol specific terminology.
258
259 2.1. Requirements terminology
260
261 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
262 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
263 document are to be interpreted as described in RFC 2119 [RFC 2119].
264
265
266 2.2. DHCP and failover terminology
267
268 This document uses the following terms:
269
270 o "available IP address"
271
An IP address is "available" if it may be allocated by a
specific DHCP server. An IP address is considered (for the
purposes of this document) to be available to a single server
for allocation unless otherwise noted. An IP address available
for allocation on a primary server has binding-status FREE, and
an IP address available for allocation on a secondary server has
binding-status BACKUP.
279
287 o "binding"
288
289 A binding is a collection of configuration parameters, includ-
290 ing at least an IP address, associated with or "bound to" a
291 DHCP client. Bindings are managed by DHCP servers.
292
293 o "binding database"
294
The collection of bindings managed by a primary server and its
secondary server.
296
297 o "binding update transaction"
298
A binding update transaction refers to the set of information
(contained in options) necessary to perform a binding update
for a single IP address. It comprises the assigned-IP-address
option and the binding-status option, along with other options
as appropriate.
304
305 o "binding-status"
306
307 The binding-status is the status of an IP address with respect
308 to its association with a client. There are specific binding-
309 status values defined for use by the failover protocol, e.g.,
310 ACTIVE, FREE, RELEASED, ABANDONED, etc. These are designed to
311 map more or less directly onto the binding-status values used
312 internally in most DHCP server implementations. The term
313 binding-status refers to the concept also sometimes known as
314 "lease state" or "IP address state", but in this document the
315 term "state" is reserved for the failover state of a failover
316 endpoint, and binding-status is always used to refer to the
317 state associated with an IP address or lease.
318
319 o "DHCP client" or "client"
320
321 A DHCP client is an Internet host using DHCP to obtain confi-
322 guration parameters such as a network address. The term
323 "client" used within this document always means a DHCP client,
324 and never one of the two failover servers.
325
326 o "DHCP server" or "server"
327
328 A DHCP server is an Internet host that returns configuration
329 parameters to DHCP clients.
330
331 o "DDNS"
332
An abbreviation for "Dynamic DNS", which refers to the capability
to update a DNS server's name (actually resource record) database
using an on-the-wire protocol defined in [RFC 2136].
344
345 o "DNS"
346
347 An abbreviation for "Domain Name System", a scheme where a cen-
348 tral name repository is used to map names to IP addresses and IP
349 addresses to names.
350
351 o "failover endpoint"
352
The failover protocol allows for there to be a unique failover
endpoint per partner per role per relationship (where role is
primary or secondary and the relationship is defined by the
relationship-name option). This failover endpoint can take
actions and hold unique states. Typically, there is one failover
endpoint per partner, although there may be more.
359
360 o "FQDN"
361
An FQDN is a "fully qualified domain name". A fully qualified
domain name generally is a host name with at least one zone
name; for example, "www.dhcp.org" is a fully qualified domain
name.
366
367 o "lazy update"
368
369 Lazy update refers to the requirement placed on a server imple-
370 menting a failover protocol to update its failover partner when-
371 ever the binding database changes. A failover protocol which
372 didn't support lazy update would require the failover partner
373 update to be complete before a DHCP server could respond to a
374 DHCP client request with a DHCPACK. A failover protocol which
375 does support lazy update places no such restriction on the
376 update of the failover partner server, and so a server can allo-
377 cate an IP address or extend a lease on an IP address and then
378 update its failover partner as time permits. A failover proto-
379 col which supports lazy update not only removes the requirement
380 to update the failover partner prior to responding to a DHCP
381 client with a DHCPACK, but also allows gathering up batches of
382 updates from one failover server to its partner.
383
384 o "MCLT"
385
MCLT stands for maximum client lead time. This time is configured
on the primary server and transmitted from the primary to the
secondary server in the CONNECT message. It is the maximum amount
of time by which one server can extend a lease for a client's
binding beyond the time known by the partner server. See section
5.2.1 for details.
400
401 o "partner"
402
403 A "partner", for the purposes of this document, refers to a
404 failover server, typically the other failover server. In many
405 (if not most) cases, the failover protocol is symmetric with
406 respect to the primary or secondary nature of the servers, and
407 so it is often appropriate to discuss "updating the partner
408 server", since it could be a primary server updating a secondary
409 server or a secondary server updating a primary server.
410
411 o "Primary server" or "Primary"
412
413 A DHCP server configured to provide primary service to a set of
414 DHCP clients for a particular set of subnet address pools.
415
416 o "RR"
417
418 "RR" is an abbreviation for "resource record". All records in
419 the DNS are resource records. The resource records of most
420 relevance to this document are the "A" resource record, which
421 maps a DNS name to a particular IP address, the "PTR" resource
422 record, which allows a "reverse map", from the IP address back
423 to a DNS name, and the "KEY" resource record, which is used in
424 ways defined in [FQDN] to tag a DNS name with the identity of
425 the DHCP client with which it is associated.
426
427 o "Secondary server" or "Secondary"
428
429 A DHCP server configured to act as backup to a primary server
430 for a particular set of subnet address pools.
431
432 o "stable storage"
433
434 Every DHCP server is assumed to have some form of what is called
435 "stable storage". Stable storage is used to hold information
436 concerning IP address bindings (among other things) so that this
437 information is not lost in the event of a server failure which
438 requires restart of the server.
439
440 o "state"
441
442 In this document, the term "state" refers exclusively to the
443 state of a failover endpoint, for example: NORMAL,
444 COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN. It is not used to
445 refer to any attributes of an IP address or a binding of an IP
446 address. See "binding-status".
447
455 o "subnet address pool"
456
457 A subnet address pool is the set of IP addresses which is asso-
458 ciated with a particular network number and subnet mask. In the
459 simple case, there is a single network number and subnet mask
460 and a set of IP addresses. In the more complex case (sometimes
461 called "secondary subnets", sometimes "superscopes"), several
462 (apparently unrelated) network number and subnet mask combina-
463 tions with their associated IP addresses may all be configured
464 together into one subnet address pool.
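
As an illustration of the "binding update transaction" and
"binding-status" terms defined above, the following Python sketch
models a transaction as a small record. The class, field, and function
names are hypothetical and merely mirror the option names; the
enumeration lists only the binding-status values named in this
section, and no wire encoding is implied.

```python
from dataclasses import dataclass, field
from enum import Enum
from ipaddress import IPv4Address


class BindingStatus(Enum):
    """Binding-status values named in this section (not the full set)."""
    FREE = "FREE"            # available for allocation on the primary
    BACKUP = "BACKUP"        # available for allocation on the secondary
    ACTIVE = "ACTIVE"
    RELEASED = "RELEASED"
    ABANDONED = "ABANDONED"


@dataclass
class BindingUpdateTransaction:
    """Information needed to update the partner about one IP address."""
    assigned_ip_address: IPv4Address
    binding_status: BindingStatus
    # Hypothetical container for "other options as appropriate".
    other_options: dict = field(default_factory=dict)


def is_available_to(role: str, status: BindingStatus) -> bool:
    """Return True if a server in the given role may allocate the address."""
    if role == "primary":
        return status is BindingStatus.FREE
    if role == "secondary":
        return status is BindingStatus.BACKUP
    raise ValueError(f"unknown role: {role!r}")
```

Keeping the binding-status on the address record, separate from any
failover endpoint state, follows the distinction this section draws
between "binding-status" and "state".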
465
466
467 3. Background and External Requirements
468
469 This section highlights key aspects of the DHCP protocol on which the
470 failover protocol depends. It also discusses the requirements that
471 the failover protocol places on other aspects of the network infras-
472 tructure, and some general issues surrounding server failure detec-
473 tion. Some failure scenarios that provide particular challenges to a
474 failover protocol are discussed. Finally, the challenges inherent in
475 using a TCP connection as a means to detect failure of a partner
476 server are elaborated.
477
478 3.1. Key aspects of the DHCP protocol
479
480 The failover protocol is designed to augment the DHCP protocol as
481 described in RFC 2131 [RFC 2131]. There are several key aspects of
482 the DHCP protocol which are required by the failover protocol in
483 order to successfully meet its design goals.
484
485 3.1.1. Broadcast behavior
486
There are two aspects of the broadcast behavior of the DHCP protocol
which are key to making the failover protocol operate successfully.
The first is simply that the DHCP protocol requires a DHCP client to
broadcast all DHCPDISCOVER and DHCPREQUEST/INIT-REBOOT messages.
Because of this requirement, a DHCP client that was communicating with
one server will automatically be able to communicate with another
server if one is available.
494
The second aspect of broadcast behavior is similar to the first, but
involves the distinction between a DHCPREQUEST/RENEW and a
DHCPREQUEST/REBINDING. A DHCPREQUEST/RENEW is the message that a
DHCP client uses to extend its lease. It is unicast to the DHCP
server from which it acquired the lease. However, the DHCP protocol
(in a farsighted move) was explicitly designed so that, in the event
that a DHCP client cannot contact the server from which it received a
lease on an IP address using a DHCPREQUEST/RENEW, the client is
required to broadcast its renewal using a DHCPREQUEST/REBINDING to
any available DHCP server. Since all DHCP clients are required to
implement this algorithm, the failover protocol can have a different
server from the one that initially granted a lease be the server to
renew a lease. Thus, one server can take over for another with no
interruption in the service as experienced by the DHCP client or its
associated applications software.
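
The RENEW/REBINDING distinction described above can be sketched from
the client's point of view. The Python fragment below is a minimal
illustration, assuming the default T1 and T2 timer values from RFC
2131 (0.5 and 0.875 of the lease time) and a lease that has not yet
been extended; the function and enumeration names are hypothetical.

```python
from enum import Enum


class ClientAction(Enum):
    NONE = "bound, nothing to send"
    RENEW_UNICAST = "unicast DHCPREQUEST/RENEW to the leasing server"
    REBIND_BROADCAST = "broadcast DHCPREQUEST/REBINDING to any server"
    STOP_USING_ADDRESS = "lease expired, stop using the address"


def client_action(elapsed: float, lease_time: float,
                  t1_fraction: float = 0.5,
                  t2_fraction: float = 0.875) -> ClientAction:
    """Pick what a client does `elapsed` seconds into an unextended lease."""
    if elapsed >= lease_time:
        return ClientAction.STOP_USING_ADDRESS
    if elapsed >= t2_fraction * lease_time:
        # The leasing server did not answer: any server may now renew.
        return ClientAction.REBIND_BROADCAST
    if elapsed >= t1_fraction * lease_time:
        return ClientAction.RENEW_UNICAST
    return ClientAction.NONE
```

The broadcast branch is the one the failover protocol relies on: it is
the point at which a partner server can see, and answer, a renewal for
a lease it did not originally grant.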
518
519 3.1.2. Client responsibility
520
521 In the DHCP protocol the DHCP clients are entrusted with a consider-
522 able responsibility. In particular, after they are granted a lease
523 on an IP address, they are enjoined to only use that IP address while
524 their lease is valid. Every DHCP client is expected to stop using an
525 IP address if the expiration time on the lease has passed and if it
526 cannot get an extension on the lease for that IP address from some
527 DHCP server. Thus, the correct behavior of every DHCP client in this
528 regard is required to ensure the integrity of the DHCP service. On
529 the other hand, incorrect behavior by a client in this area will tend
530 to adversely affect at most one other DHCP client.
531
532 Furthermore, any DHCP client which sends in a DHCPREQUEST/RENEW or
533 DHCPREQUEST/REBINDING to a DHCP server (either unicast for a RENEW or
534 broadcast for a REBINDING) MUST still have time to run on the lease
535 for that IP address. The DHCP server sends the DHCPACK back unicast
536 to the IP address from which the RENEW or REBINDING originated.
537
538 Given the existing responsibility placed on the client to only use an
539 IP address when the lease is valid, and to only send in a RENEW or
540 REBINDING if the lease is valid, the failover protocol relies on DHCP
541 clients to perform responsibly and will, in the absence of conflict-
542 ing information, believe a DHCP client that is attempting to RENEW or
543 REBIND a lease on an IP address is the legitimate owner of that IP
544 address.
545
If clients do not follow these rules, it is possible for an address
to be in use by more than one client. For a single server, this
happens because the server has leased the expired address to another
client and the original client is also attempting to use the address.
The server would NAK the renewal request. This is made slightly worse
in the failover protocol if the two servers are unable to communicate
with each other and one server leases an available address to a new
client while the other server receives a renewal from a different
client. In this case, both servers lease the same address to
different clients for the duration of the MCLT.
556
One troublesome issue is that of the DHCP client responsibility when
sending in DHCPREQUEST/INIT-REBOOT requests. While the original DHCP
RFC was written to require a DHCP client to have time left to run on
the lease for an IP address if the client is sending an INIT-REBOOT
request, it was sufficiently unclear that some client vendors didn't
realize this until recently. Since the INIT-REBOOT request is sent
with the IP address in the dhcp-requested-address option and not in
the ciaddr (for perfectly good reasons), the similarity to the RENEW
and REBINDING case was lost on many people.
574
575 At present, the failover protocol does not assume that a client send-
576 ing in an INIT-REBOOT request necessarily has a valid lease on the IP
577 address appearing in the dhcp-requested-address option in the INIT-
578 REBOOT request.
579
580 The implications of this are as follows: Assume that there is a DHCP
581 client that gets a lease from one server while that server is unable
582 to communicate with its failover partner. Then, assume that after
583 that client reboots it is able only to communicate with the other
584 failover server. If the failover servers have not been able to com-
585 municate with each other during this process, then the DHCP client
586 will get a new IP address instead of being able to continue to use
587 its existing IP address. This will affect no applications on the DHCP
588 client, since it is rebooting. However, it will use up an additional
589 IP address in this marginal case.
590
591 3.1.3. Stable storage update before DHCPACK
592
593 The DHCP protocol allocates resources, and in order to operate
594 correctly it requires that a DHCP server update some form of stable
595 storage prior to sending a DHCPACK to a DHCP client in order to grant
596 that client a lease on an IP address.
597
598 One of the goals of the failover protocol is that it not add signifi-
599 cant additional time to this already time consuming requirement to
600 update stable storage prior to a DHCPACK. In particular, adding a
601 requirement to communicate with another server prior to sending a
602 DHCPACK would greatly simplify the failover protocol, but it would
603 unacceptably limit the potential scalability of any DHCP server which
604 employed the failover protocol.
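
The ordering constraint above, combined with the "lazy update"
behavior defined in section 2.2, can be sketched as follows. This
Python fragment is only an illustration: the class, its attributes,
and the `send_batch` callback are hypothetical, and in-memory
containers stand in for real stable storage and a real partner
connection.

```python
from collections import deque


class LazyUpdateServer:
    """Toy server: commit locally, ACK the client, update the partner later."""

    def __init__(self):
        self.stable_storage = {}        # ip -> (client_id, expiry)
        self.pending_updates = deque()  # updates not yet sent to the partner
        self.log = []                   # ordered record of what happened

    def grant_lease(self, ip, client_id, expiry):
        # 1. Update stable storage before replying, as DHCP requires.
        self.stable_storage[ip] = (client_id, expiry)
        self.log.append(("stable-storage", ip))
        # 2. Reply to the client without waiting for the partner.
        self.log.append(("DHCPACK", ip))
        # 3. Queue the partner update to be sent as time permits.
        self.pending_updates.append((ip, client_id, expiry))

    def flush_updates(self, send_batch):
        """Send all queued updates to the partner as one batch."""
        batch = list(self.pending_updates)
        self.pending_updates.clear()
        if batch:
            send_batch(batch)
            self.log.append(("partner-update", len(batch)))
        return len(batch)
```

Because the partner update happens after the DHCPACK, the time a
client waits for its reply depends only on the local stable storage
write, not on a round trip to the other server.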
605
606 3.2. BOOTP relay agent implementation
607
Many DHCP clients are not resident on the same network segment as a
DHCP server. In order to support this form of network architecture,
most contemporary routers implement something known as a BOOTP Relay
Agent. This capability inside a router listens for all broadcasts
at the DHCP server port, UDP port 67, and will relay any broadcasts
that it receives on to a DHCP server. The IP address of the DHCP
server must have been previously configured into the router. As part
of the relay process, the relay agent will place the address of the
interface on which it received the broadcast into the giaddr field of
the DHCP packet.
626
627 Since the failover protocol requires two DHCP servers to receive any
628 broadcast DHCP messages, in order to work with DHCP clients which are
629 not local to the DHCP server, the BOOTP relay agent on the router
630 closest to the DHCP client must be configured to point at more than
631 one DHCP server.
632
633 Most BOOTP relay agent implementations allow this duplication of
634 packets.
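
The relay behavior described above can be sketched in a few lines of
Python. The packet type, function name, and addresses are hypothetical
and model only the giaddr field. Leaving an already-set giaddr
untouched follows the relay agent rules of RFC 1542, which are outside
the scope of this document.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class DhcpPacket:
    """Only the fields this sketch needs; a real packet has many more."""
    xid: int
    giaddr: str = "0.0.0.0"


def relay_broadcast(packet: DhcpPacket, receiving_interface_addr: str,
                    servers: List[str]) -> List[Tuple[str, DhcpPacket]]:
    """Return (server, packet) pairs a relay agent would unicast."""
    # Record the receiving interface so servers can pick the right subnet.
    # An already-set giaddr (an earlier relay) is left untouched.
    if packet.giaddr == "0.0.0.0":
        packet = replace(packet, giaddr=receiving_interface_addr)
    # Duplicate the packet to every configured server.
    return [(server, packet) for server in servers]
```

For a failover pair, `servers` would contain the addresses of both the
primary and the secondary, so that each receives every client
broadcast.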
635
636 If this is not possible, an administrator might be able to configure
637 the relay agent with a subnet broadcast address, but in this case the
638 primary and secondary DHCP servers in a failover pair must both
639 reside on the same subnet.
640
641 3.3. What does it mean if a server can't communicate with its partner?
642
643 In any protocol designed to allow one server to take over some
644 responsibilities from a partner server in the event of "failure" of
645 that partner server, there is an inherent difficulty in determining
646 when that partner server has failed.
647
In fact, it is fundamentally impossible for one server to distinguish
a network communications failure from the outright failure of the
server with which it is trying to communicate. In the case where each
server is handing out resources (in this case IP addresses) to a
client community, mistaking an inability to communicate with a
partner server for failure of that partner server could easily cause
both servers to be handing out the same IP addresses to different
clients.
656
One way that this is sometimes handled is for there to be more than
two servers. In the case of an odd number of servers, the servers
that can still communicate with a majority of other servers will
consider themselves operational, and any server which can't communicate
with a majority of other servers must immediately cease operations.
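
The majority rule just described can be written as a one-line test.
The Python sketch below is illustrative only; the function name is
hypothetical, and it assumes the common convention that a server
counts itself when deciding whether it belongs to a majority.

```python
def has_majority(reachable_peers: int, total_servers: int) -> bool:
    """True if this server plus the peers it can reach form a majority."""
    if total_servers < 1 or not 0 <= reachable_peers < total_servers:
        raise ValueError("inconsistent server counts")
    # Count this server itself along with the peers it can still reach.
    return 2 * (reachable_peers + 1) > total_servers
```

Under this convention, a server in a group of three keeps running if
it can reach one peer, whereas with only two servers a partition
leaves neither side with a majority.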
662
While this technique works in some domains, having the only server
with which a DHCP client can communicate voluntarily shut itself down
seems like something worth avoiding.
666
The failover protocol will operate correctly while the two servers are
unable to communicate with each other, whether or not both are
running. At some point there may be resource contention, and if one
of the servers is actually down, then the operator can inform the
operational server, and the operational server will then be able to
use all of the failed server's resources.
681
682 The protocol also allows detection of an orderly shutdown of a parti-
683 cipating server.
684
685 3.4. Challenging scenarios for a Failover protocol
686
687 There exist two failure scenarios which provide particular challenges
688 to the correctness guarantees of a failover protocol.
689
690 3.4.1. Primary Server crash before "lazy" update:
691
692 In the case where the primary server sends a DHCPACK to a client for
693 a newly allocated IP address and then crashes prior to sending the
694 corresponding update to the secondary server, the secondary server
695 will have no record of the IP address allocation. When the secondary
696 server takes over, it may well try to allocate that IP address to a
697 different client. If the first client to receive the IP address is
698 not on the network at that time (even though its lease has not yet
699 expired), an ICMP echo (i.e., ping) will not prevent the secondary
700 server from allocating that IP address to a different
701 client.
702
703 The failover protocol deals with this situation by having the primary
704 and secondary servers allocate addresses for new clients from dis-
705 joint address pools. See section 5.5 for details.
706
707 A more likely (in that DHCPREQUEST/RENEWs are presumably more common
708 than DHCPDISCOVERs) and more subtle version of this problem is where
709 the primary server crashes after extending a client's lease time, and
710 before updating the secondary with a new time using a lazy update.
711 After the secondary takes over, if the client is not connected to the
712 network the secondary will believe the client's lease has expired
713 when, in fact, it has not. In this case as well, the IP address
714 might be reallocated to a different client while the first client is
715 still using it.
716
717 This scenario is handled by the failover protocol through control of
718 the lease time and the use of the maximum client lead time (MCLT).
719 See section 5.2.1 for details.
720
721 3.4.2. Network partition where DHCP servers can't communicate but each
722 can talk to clients:
723
724 Several conditions are required for this situation to occur. First,
725 due to a network failure, the primary and secondary servers cannot
726 communicate. As well, some of the DHCP clients must be able to
735 communicate with the primary server, and some of the clients must now
736 be able to communicate only with the secondary server. When this
737 condition occurs, both primary and secondary servers could attempt to
738 allocate IP addresses for new clients from the same pool of available
739 addresses. At some point, then, two clients will end up being allo-
740 cated the same IP address. This will cause problems when the network
741 failure that created this situation is corrected.
742
743 The failover protocol deals with this situation by having the primary
744 and secondary servers allocate addresses for new clients from dis-
745 joint address pools. See section 5.5 for details.
746
747 3.5. Using TCP to detect partner server failure
748
749 There are several characteristics of TCP that are important to the
750 functioning of the failover protocol, which uses one TCP connection
751 both for bulk data transfer and to assess communications
752 integrity with the other server. Reliable and ordered message
753 delivery are chief among these important characteristics.
754
755 It would be convenient to rely on the capabilities built into TCP to
756 determine whether communications integrity exists with the failover
757 partner, but this strategy has some problems which require
758 analysis. There exist three fundamental cases for an open TCP con-
759 nection that must be examined.
760
761 1. When no data is being sent on a TCP connection, the TCP layer
762 also does not exchange any signaling messages to assure that
763 the peer is still up.
764
765 2. When data is queued to be sent, and the receiver has not
766 blocked the sending of additional data, then messages are
767 flowing across the TCP connection containing the application's
768 data.
769
770 3. When data is queued to be sent, and the receiver has blocked
771 the transmission of additional data, then persist messages are
772 flowing from the receiver to the sender to ensure that the
773 sender doesn't miss the receiver opening the window for
774 further transmissions.
775
776 The first case can be turned into the second case by sending
777 application-level keep-alive messages periodically when there is no
778 other data queued to be sent. Note that TCP keep-alive messages might
779 be used as well, but they present additional problems.
780
781 Thus, we can ensure that the TCP connection has messages flowing
782 periodically across the connection fairly easily. The question
791 remains as to what TCP will do if the other end of the connection
792 fails to respond (either because of network partition or because the
793 receiving server crashes). TCP will attempt to retransmit a message
794 with an exponential backoff, and will eventually time out that
795 retransmission. However, the length of that timeout cannot, in gen-
796 eral, be set on a per-connection basis, and is frequently as long as
797 nine minutes, though in some cases it may be as short as two minutes.
798 On some systems it can be set system-wide, while on other systems it
799 cannot be changed at all.
800
801 A value for this timeout that would be appropriate for the failover
802 protocol, say less than 1 minute, could have unpleasant side-effects
803 on other applications running on the same server, assuming that it
804 could be changed at all on the host operating system.
805
806 Nine minutes is a long time for the DHCP service to be unavailable to
807 any new clients that were being served by the server which has
808 crashed, when there is another server running that could respond to
809 them as soon as it determines that its partner is not operational.
810
811 The conclusion drawn from this analysis is that TCP provides very
812 useful support for the failover protocol in the areas of reliable and
813 ordered message delivery, but cannot by itself be relied upon to
814 detect partner server failure in a fashion acceptable to the needs of
815 the failover protocol. Additional failover protocol capabilities
816 have been created to support timely detection of partner server
817 failure. See section 8.3 for details on this mechanism.
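
As a rough illustration of the application-level keep-alive idea discussed in this section (and not the mechanism specified in section 8.3), the following Python sketch tracks when traffic was last sent and last received on one failover connection. The class name, the method names, and both timer values are assumptions made for this example only; the draft text above does not define them.

```python
from dataclasses import dataclass


@dataclass
class ContactMonitor:
    """Tracks traffic on one failover connection (illustrative only)."""

    send_interval: float    # assumed: longest idle time before a keep-alive is sent
    receive_timeout: float  # assumed: silence after which the partner is presumed unreachable
    last_sent: float = 0.0
    last_received: float = 0.0

    def note_sent(self, now: float) -> None:
        """Record that any message was sent to the partner."""
        self.last_sent = now

    def note_received(self, now: float) -> None:
        """Record that any message arrived from the partner."""
        self.last_received = now

    def should_send_contact(self, now: float) -> bool:
        """True when nothing has been sent for at least send_interval."""
        return now - self.last_sent >= self.send_interval

    def partner_silent_too_long(self, now: float) -> bool:
        """True when nothing has been received for more than receive_timeout."""
        return now - self.last_received > self.receive_timeout
```

Because the decision is made from application-level timestamps, the detection time is chosen by the failover implementation for this one connection and does not depend on the host's TCP retransmission timeout.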
818
819 4. Design Goals
820
821 This section lists the design goals and the limitations of the fail-
822 over protocol.
823
824 4.1. Design goals for this protocol
825
826 The following is a list of goals that are met by this protocol. They
827 are listed in priority order.
828
829 1. Implementations of this protocol must work with existing DHCP
830 client implementations based on the DHCP protocol [RFC 2131].
831
832 2. Implementations of the protocol must work with existing BOOTP
833 relay agent implementations.
834
835 3. The protocol must provide failover redundancy between servers
836 that are not located on the same subnet.
837
838 4. Provide for continued service to DHCP clients through an
847 automated mechanism in the event of failure of the primary
848 server.
849
850 5. Avoid binding an IP address to a client while that binding is
851 currently valid for another client. In other words, do not
852 allocate the same IP address to two clients.
853
854 6. Minimize any need for manual administrative intervention.
855
856 7. Introduce no additional delays in server response time as a
857 result of the network communications required to implement the
858 failover protocol, i.e., don't require communications with the
859 partner between the receipt of a DHCPREQUEST and the
860 corresponding DHCPACK.
861
862 8. Share IP address ranges between primary and secondary servers;
863 i.e., impose no requirement that the pool of available
864 addresses be manually or permanently divided between servers.
865
866 9. Continue to meet the goals and objectives of this protocol in
867 the event of server failure or network partition.
868
869 10. Provide graceful reintegration of full protocol service after
870 server failure or network partition.
871
872 11. Allow for one computer to act as a secondary server for multi-
873 ple primary servers. The protocol must allow failover primary
874 and secondary configuration choices to be made at a granular-
875 ity smaller than "all of the subnets served by a single
876 server", though individual implementations may choose not to
877 allow such flexibility.
878
879 12. Ensure that an existing client can keep its existing IP
880 address binding if it can communicate with either the primary
881 or secondary DHCP server implementing this protocol - not just
882 whichever server originally offered it the binding.
883
884 13. Ensure that a new client can get an IP address from some
885 server. Ensure that in the face of partition, where servers
886 continue to run but cannot communicate with each other, the
887 above goals and requirements may be met. In addition, when
888 the partition condition is removed, allow graceful automatic
889 re-integration without requiring human intervention.
890
891 14. If either primary or secondary server loses all of the infor-
892 mation that it has stored in stable storage, ensure that it be
893 able to refresh its stable storage from the other server.
894
903 15. Support load balancing between the primary and secondary
904 servers, and allow configuration of the percentage of the
905 client population served by each with a moderately fine granu-
906 larity.
907
908
909 4.2. Limitations of this protocol
910
911 The following are explicit limitations of this protocol.
912
913 1. This protocol provides only one level of redundancy through a
914 single secondary server for each primary server.
915
916 2. A subset of the address pool is reserved for secondary server
917 use. In order to handle the failure case where both servers
918 are able to communicate with DHCP clients, but unable to com-
919 municate with each other, a subset of the IP address pool must
920 be set aside as a private address pool for the secondary
921 server. The secondary can use these to service newly arrived
922 DHCP clients during such a period. The required size of this
923 private pool is based only on the arrival rate of new DHCP
924 clients and the length of expected downtime, and is not influ-
925 enced in any way by the total number of DHCP clients supported
926 by the server pair.
927
928 The failover protocol can be used in a mode where both the
929 primary and secondary servers can share the load between them
930 when both are operating. In this load balancing mode, the
931 addresses allocated by the primary server to the secondary
932 server are not unused, but are used instead to service the
933 portion of the client base to which the secondary server is
934 required to respond. See section 5.3 for more information on
935 load balancing.
936
937 3. The primary and secondary servers do not respond to client
938 requests at all while recovering from a failure that could
939 have resulted in duplicate IP assignments (that is, while
940 synchronizing in POTENTIAL-CONFLICT state).
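
Limitation 2 above states that the required size of the secondary's private pool depends only on the arrival rate of new DHCP clients and the expected downtime. One plausible reading, shown here as a hedged Python sketch, is to multiply the two and round up. The function name, the parameter names, and the choice of a simple product are assumptions; the draft does not give a formula.

```python
import math


def private_pool_size(new_clients_per_hour: float,
                      expected_outage_hours: float) -> int:
    """Estimate how many addresses the secondary needs in its private pool.

    Assumption: new clients arrive at a steady rate, so the number of
    addresses needed is the arrival rate times the outage length.
    The total number of clients served by the pair does not appear.
    """
    if new_clients_per_hour < 0 or expected_outage_hours < 0:
        raise ValueError("rate and outage length must be non-negative")
    return math.ceil(new_clients_per_hour * expected_outage_hours)
```

For instance, 12 new clients per hour over an 8 hour partition would call for 96 addresses under this assumption.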
941
942
943 5. Protocol Overview
944
945 This section discusses the failover protocol at a relatively high
946 level. In the event that a description in this section
947 conflicts (or appears to conflict due to the overview nature of this
948 section) with information in later sections of this draft, the infor-
949 mation in the later sections should be considered authoritative.
950
959 5.1. Messages and States
960
961 This protocol is centered around the message exchange used by one
962 server to update the other server of binding database changes result-
963 ing from DHCP client activity:
964
965 o Communication of binding database changes
966
967 The binding update (BNDUPD) message is used to send the binding
968 database changes to the partner server, and the partner server
969 responds with a binding acknowledgement (BNDACK) message when it
970 has successfully committed those changes to its own stable
971 storage.
972
973 All of the other messages involve ancillary issues:
974
975 o Management of available IP addresses
976
977 The pool request (POOLREQ) message is used by the secondary
978 server to request an allocation of IP addresses from the primary
979 server. The pool response (POOLRESP) message is used by the
980 primary server to inform the secondary server how many IP
981 addresses were allocated to the secondary server as the result
982 of the pool request.
983
984 o Synchronization of the binding databases between the servers
985 after they've been out of communications
986
987 The update request (UPDREQ) message is used by one server to
988 request that its partner send it all binding database informa-
989 tion that it has not already seen. The update request all
990 (UPDREQALL) message is used by one server to request that all
991 binding database information be sent in order to recover from a
992 total loss of its binding database by the requesting server.
993 The update done (UPDDONE) message is used by the responding
994 server to indicate that all requested updates have been sent by the
995 responding server and acked by the requesting server.
996
997 o Connection establishment
998
999 The connect (CONNECT) message is used by the primary server to
1000 establish a high level connection with the other server, and to
1001 transmit several important configuration data items between the
1002 servers. The connect acknowledgement message (CONNECTACK) is
1003 used by the secondary server to respond to a CONNECT message
1004 from the primary server. The disconnect (DISCONNECT) message is
1005 used by either server when closing a connection.
1006
1015 o Server synchronization
1016
1017 The state change (STATE) message is used by either server to
1018 inform the other server of a change of failover state.
1019
1020 o Connection integrity management
1021
1022 The contact (CONTACT) message is used by either server to ensure
1023 that the other server continues to see the connection as opera-
1024 tional. It MUST be transmitted periodically over every esta-
1025 blished connection if other message traffic is not flowing, and
1026 it MAY be sent at any time.
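
The message overview above can be summarized as a table of which role is described as sending each message. The Python sketch below is one way to encode that summary. The message names come from this section; the `Role` enum, the `ALLOWED_SENDERS` table, and `may_send` are hypothetical names, and the table reflects only what this overview says, not the detailed rules in later sections.

```python
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


EITHER = frozenset(Role)

# Which role this overview describes as the sender of each message.
ALLOWED_SENDERS = {
    "BNDUPD": EITHER,
    "BNDACK": EITHER,
    "POOLREQ": frozenset({Role.SECONDARY}),
    "POOLRESP": frozenset({Role.PRIMARY}),
    "UPDREQ": EITHER,
    "UPDREQALL": EITHER,
    "UPDDONE": EITHER,
    "CONNECT": frozenset({Role.PRIMARY}),
    "CONNECTACK": frozenset({Role.SECONDARY}),
    "DISCONNECT": EITHER,
    "STATE": EITHER,
    "CONTACT": EITHER,
}


def may_send(role: Role, message: str) -> bool:
    """Return True if the overview lists `role` as a sender of `message`."""
    return role in ALLOWED_SENDERS[message]
```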
1027
1028 5.1.1. Failover endpoints
1029
1030 The proper operation of the failover protocol requires more than the
1031 transmission of messages between one server and the other. Each end-
1032 point might seem to be a single DHCP server, but in fact there are
1033 many situations where additional flexibility in configuration is use-
1034 ful.
1035
1036 For instance, there might be several servers which are each primary
1037 for a distinct set of address pools, and one server which is secon-
1038 dary for all of those address pools. The situation with the pri-
1039 maries is straightforward, but the secondary will need to maintain a
1040 separate failover state, partner state, and communications up/down
1041 status for each of the separate primary servers for which it is act-
1042 ing as a secondary.
1043
1044 The failover protocol SHOULD be configured with one failover
1045 relationship between each pair of failover servers. In this case there is
1046 one failover endpoint for that relationship on each partner. This
1047 failover relationship MUST have a unique name, which is communicated
1048 using the relationship-name option in the CONNECT and CONNECTACK mes-
1049 sages.
1050
1051 There is typically little need for additional relationships between
1052 any two servers but there MAY be more than one failover relationship
1053 between two servers -- however each MUST have a unique relationship
1054 name (stored in the relationship-name option).
1055
1056 Any failover endpoint can take actions and hold unique states.
1057
1058 Thus, in the case where there are two primary servers A and B each
1059 backed up by a single common secondary server C, there is one fail-
1060 over endpoint on each of A and B, and two different failover end-
1061 points on C. The two different failover endpoints on C each have
1062 unique states, unique relationship names, and independent TCP
1071 connections.
1072
1073 This document frequently describes the behavior of the protocol in
1074 terms of primary and secondary servers, not primary and secondary
1075 failover endpoints. However, it is important to remember that every
1076 'server' described in this document is in reality a failover endpoint
1077 that resides in a particular process, and that many failover end-
1078 points may reside in the same server process.
1079
1080 It is not the case that there is a unique failover endpoint for each
1081 subnet address pool that participates in a failover relationship. On
1082 one server, there is (typically) one failover endpoint per partner,
1083 regardless of how many subnet address pools are managed by that com-
1084 bination of partner and role. Conversely, on a particular server,
1085 any given subnet address pool will be associated with exactly one
1086 failover endpoint.
1087
1088 When a connection is received from the partner, the unique failover
1089 endpoint to which the message is directed is determined solely by the
1090 IP address of the partner, the relationship-name, and the role of the
1091 receiving server. See section 8.2.
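
The endpoint selection rule above (partner IP address, relationship-name, and role of the receiving server) maps naturally onto a lookup table keyed by that triple. The following Python sketch models the example of secondary server C backing up primaries A and B. All identifiers, the IP addresses (taken from the documentation range 192.0.2.0/24), and the relationship names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple, Optional


class EndpointKey(NamedTuple):
    partner_ip: str
    relationship_name: str
    local_role: str  # role of the receiving server: "primary" or "secondary"


@dataclass
class FailoverEndpoint:
    key: EndpointKey
    state: str = "unknown"          # this endpoint's own failover state
    partner_state: str = "unknown"  # last state reported by the partner
    comms_up: bool = False          # communications up/down status


endpoints: Dict[EndpointKey, FailoverEndpoint] = {}


def register(partner_ip: str, relationship_name: str, local_role: str) -> FailoverEndpoint:
    """Create and store the endpoint for one failover relationship."""
    key = EndpointKey(partner_ip, relationship_name, local_role)
    endpoints[key] = FailoverEndpoint(key)
    return endpoints[key]


def find_endpoint(partner_ip: str, relationship_name: str,
                  local_role: str) -> Optional[FailoverEndpoint]:
    """Select the endpoint for an incoming connection, or None if unknown."""
    return endpoints.get(EndpointKey(partner_ip, relationship_name, local_role))


# Server C acts as secondary for two primaries, A and B.
ep_a = register("192.0.2.1", "A-C", "secondary")
ep_b = register("192.0.2.2", "B-C", "secondary")
```

Each endpoint on C carries its own state, so a loss of communication with A changes `ep_a` only and leaves `ep_b` untouched.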
1092
1093 5.2. Fundamental guarantees
1094
1095 There are several fundamental restrictions this protocol places on what
1096 one server can do in the absence of knowledge of the other server.
1097 Operating within these restrictions allows certain guarantees to be
1098 made to the partner server, and these are key to the correct opera-
1099 tion of the protocol.
1100
1101 5.2.1. Control of lease time
1102
1103 The key problem with lazy update is that when a server fails after
1104 updating a client with a particular lease time and before updating
1105 its partner, the partner will believe that a lease has expired even
1106 though the client still retains a valid lease on that IP address.
1107
1108 In order to handle this problem, a period of time known as the "Max-
1109 imum Client Lead Time" (MCLT) is defined and must be known to both
1110 the primary and secondary servers. Proper use of this time interval
1111 places an upper bound on the difference allowed between the lease
1112 time provided to a DHCP client by a server and the lease time known
1113 by that server's partner. However, the MCLT is typically much less
1114 than the lease time that a server has been configured to offer a
1115 client, and so some strategy must exist to allow a server to offer
1116 the configured lease time to a client. During a lazy update the
1117 updating server typically updates its partner with a potential
1118 expiration time which is longer than the lease time previously given
1127 to the client and which is longer than the lease time that the server
1128 has been configured to give a client. This allows that server to
1129 give a longer lease time to the client the next time the client
1130 renews its lease, since the time that it will give to the client will
1131 not exceed the potential expiration time acknowledged by its partner
1132 by more than the MCLT.
1133
1134 The PARTNER-DOWN state exists so that a server can be sure that its
1135 partner is, indeed, down. Correct operation while in that state
1136 requires (generally) that the server wait the MCLT after anything
1137 that happened prior to its transition into PARTNER-DOWN state (or,
1138 more accurately, when the other server went down if that is known).
1139 Thus, the server MUST wait the MCLT after the partner server went
1140 down before allocating any of the partner's addresses which were
1141 available for allocation. In the event the partner was not in com-
1142 munication prior to going down, it might have allocated one or more
1143 of its FREE addresses to a DHCP client and been unable to inform the
1144 server entering PARTNER-DOWN prior to going down itself. By waiting
1145 the MCLT after the time the partner went down, the server in
1146 PARTNER-DOWN state ensures that any clients which have a lease on one
1147 of the partner's FREE addresses will either time out or contact the
1148 server in PARTNER-DOWN by the time that period ends.
1149
1150 In addition, once a server has made a transition to PARTNER-DOWN
1151 state, it MUST NOT reallocate an IP address from one client to
1152 another client until the longer of the following two times:
1153
1154 o The MCLT after the time the partner server went down (see
1155 above).
1156
1157 o An additional MCLT interval after the lease by the original
1158 client expires. (Actually, until the maximum client lead time
1159 after what it believes to be the lease expiration time of the
1160 client.)
1161
1162 Some optimizations exist for this restriction, in that it only
1163 applies to leases that were issued BEFORE entering PARTNER-DOWN. Once
1164 a server has entered PARTNER-DOWN and it leases out an address, it
1165 need not wait this time as long as it has never communicated with the
1166 partner since the lease was given out.
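
The "longer of the following two times" rule above can be written as a single `max` expression. The Python sketch below assumes all times are seconds on a common clock; the function and parameter names are hypothetical. It covers only leases issued before entering PARTNER-DOWN and does not model the exemption for leases issued afterwards.

```python
def earliest_reallocation_time(partner_down_at: float,
                               believed_lease_expiry: float,
                               mclt: float) -> float:
    """Earliest time an address may be given to a different client.

    partner_down_at:       when the partner went down (or, if that is
                           not known, when PARTNER-DOWN was entered)
    believed_lease_expiry: what this server believes to be the original
                           client's lease expiration time
    mclt:                  maximum client lead time
    """
    after_partner_down = partner_down_at + mclt
    after_lease_expiry = believed_lease_expiry + mclt
    return max(after_partner_down, after_lease_expiry)
```

With an MCLT of 3600 seconds and the partner down at time 1000, a lease believed to expire at time 500 may be reallocated at 4600, while one believed to expire at time 5000 must wait until 8600.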
1167
1168 The fundamental relationship on which much of the correctness of this
1169 protocol depends is that the lease expiration time known to a DHCP
1170 client MUST NOT be more than the maximum client lead time greater
1171 than the potential expiration time known to a server's partner.
1172
1173 The remainder of this section makes the above fundamental relation-
1174 ship more explicit.
1175
1183 This protocol requires a DHCP server to deal with several different
1184 lease intervals and places specific restrictions on their relation-
1185 ships. The purpose of these restrictions is to allow the other server
1186 in the pair to be able to make certain assumptions in the absence of
1187 an ability to communicate between servers.
1188
1189 The different lease times are:
1190
1191 o desired lease interval
1192
1193 The desired lease interval is the lease interval that a DHCP server
1194 would like to give to a DHCP client in the absence of any restric-
1195 tions imposed by the Failover protocol. Its determination is out-
1196 side of the scope of this protocol. Typically this is the result of
1197 external configuration of a DHCP server.
1198
1199 o actual lease interval
1200
1201 The actual lease interval is the lease interval that a DHCP server
1202 gives out to a DHCP client in the dhcp-lease-time option of a
1203 DHCPACK packet. It may be shorter than the desired client lease
1204 interval (as explained below).
1205
1206 o potential lease interval
1207
1208 The potential lease interval is the lease expiration interval the
1209 local server communicates to its partner in the potential-expiration-time
1210 option of a BNDUPD message.
1211
1212 o acknowledged potential lease interval
1213
1214 The acknowledged potential lease interval is the potential lease
1215 interval the partner server has most recently acknowledged in the
1216 potential-expiration-time option of a BNDACK message.
1217
1218 The key restriction (and guarantee) that any server makes with
1219 respect to lease intervals is that the actual client lease interval
1220 never exceeds the acknowledged potential lease interval (if any) by
1221 more than a fixed amount. This fixed amount is called the "Maximum
1222 Client Lead Time" (MCLT).
1223
1224 The MCLT MAY be configurable on the primary server, but for correct
1225 server operation it MUST be the same and known to both the primary
1226 and secondary servers. The secondary server determines the MCLT from
1227 the MCLT option sent from the primary server to the secondary server
1228 in the CONNECT message.
1229
1230 A server MUST record in its stable storage both the actual lease
1239 interval and the most recently acknowledged potential lease interval
1240 for each IP address binding. It is assumed that the desired client
1241 lease interval can be determined through techniques outside of the
1242 scope of this protocol. See section 7.1.5 for more details concern-
1243 ing the times that the server MUST record in its stable storage and
1244 the way that they interact with the lease time that may be offered to
1245 a DHCP client.
1246
1247 Again, the fundamental relationship among these times which MUST be
1248 maintained is:
1249
1250 actual lease interval <=
1251 ( acknowledged potential lease interval + MCLT )
1252
1253
1254 Figure 5.2.1-1 illustrates an initial lease to a client using the
1255 rules discussed in the example which follows it. Note that this is
1256 only one example -- as long as the fundamental relationship is
1257 preserved, the actual times used could be quite different.
1258
1296        DHCP                    Primary               Secondary
1297 time  Client                   Server                 Server
1298
1299          | (time in intervals)    | (absolute time)      |
1300          |                        |                      |
1301          |    >-DHCPDISCOVER->    |                      |
1302          |    <---DHCPOFFER-<     |                      |
1303          |    lease-time=MCLT     |                      |
1304          |                        |                      |
1305          |    >-DHCPREQUEST->     |                      |
1306          |      (selecting)       |                      |
1307          |                        |                      |
1308 t        |  <--------DHCPACK-<    |                      |
1309          |    lease-time=MCLT     |                      |
1310          |                        |     >-BNDUPD-->      |
1311          |                        | lease-expiration=t+MCLT
1312          |                        | potential-expiration=t+(MCLT/2)+X
1313          |                        |                      |
1314          |                        |     <-BNDACK-<       |
1315          |                        | potential-expiration=t+(MCLT/2)+X
1316         ...                      ...                    ...
1317          |                        |                      |
1318 t+MCLT/2 |    >-DHCPREQUEST->     |                      |
1319          |        (renew)         |                      |
1320          |                        |                      |
1321 t1       |  <--------DHCPACK-<    |                      |
1322          |    lease-time=X        |                      |
1323          |                        |     >-BNDUPD-->      |
1324          |                        | lease-expiration=t1+X
1325          |                        | potential-expiration=t1+(X/2)+X
1326          |                        |                      |
1327          |                        |     <-BNDACK-<       |
1328          |                        | potential-expiration=t1+(X/2)+X
1329         ...                      ...                    ...
1330
1331          Figure 5.2.1-1: Lazy Update Message Traffic
1332          X = Desired Lease Interval
1333          Assumes renewal interval = lease interval / 2
1334
1335
1336 DISCUSSION:
1337
1338 This protocol mandates only that the above fundamental relation-
1339 ship concerning lease intervals is preserved.
1340
1341 In the interests of clarity, however, let's examine a specific
1342 example. The MCLT in this case is 1 hour. The desired lease
1351 interval is 3 days, and its renewal time is half the lease inter-
1352 val.
1353
1354 The rules for this example are:
1355
1356 o What to tell the client:
1357
1358 Take the remainder of the acknowledged potential lease interval.
1359 If this is a new lease, then this value will be zero. If this
1360 remainder plus the MCLT is greater than the desired lease inter-
1361 val, give the client the desired lease interval else give the
1362 client the remainder plus the MCLT.
1363
1364 o What to tell the failover partner server:
1365
1366 Take the renewal interval (typically half of the actual client
1367 lease interval), add to it the desired lease interval, and add
1368 it to the current time to yield the value that goes into the
1369 potential-expiration-time option.
1370
1371 Also tell the failover partner the actual lease interval by
1372 adding it to the current time to yield the value that goes into
1373 the lease-expiration option.
1374
1375 In operation this might work as follows:
1376
1377 When a server makes an offer for a new lease on an IP address to a
1378 DHCP client, it determines the desired lease interval (in this
1379 case, 3 days). It then examines the acknowledged potential lease
1380 interval (which in this case is zero) and determines the remainder
1381 of the time left to run, which is also zero. To this it adds the
1382 MCLT. Since the actual lease interval cannot be allowed to exceed
1383 the remainder of the current acknowledged potential lease interval
1384 plus the MCLT, the offer made to the client is for the remainder
1385 of the current acknowledged potential lease interval (i.e., zero)
1386 plus the MCLT. Thus, the actual lease interval is 1 hour.
1387
1388 Once the server has performed the DHCPACK to the DHCP client, it
1389 will update the secondary server with the lease information. How-
1390 ever, the desired potential lease interval will be composed of one
1391 half of the current actual lease interval added to the desired
1392 lease interval. Thus, the secondary server is updated with a BNDUPD
1393 with a potential lease interval of 3 days + 1/2 hour specified in the
1394 potential-expiration-time option.
1395
1396 When the primary server receives a BNDACK to its update of the
1397 secondary server's (partner's) potential lease interval, it
1398 records that as the acknowledged potential lease interval. A
1407 server MUST NOT send a BNDACK in response to a BNDUPD message
1408 until it is sure that the information in the BNDUPD message
1409 resides in its stable storage. Thus, the primary server in this
1410 case can be sure that the secondary server has recorded the poten-
1411 tial lease interval in its stable storage when the primary server
1412 receives a BNDACK message from the secondary server.
1413
1414 When the DHCP client attempts to renew at T1 (approximately one
1415 half an hour from the start of the lease), the primary server
1416 again determines the desired lease interval, which is still 3
1417 days. It then compares this with the remaining acknowledged
1418 potential lease interval (3 days + 1/2 hour) and adjusts for the
1419 time passed since the secondary was last updated (1/2 hour). Thus
1420 the time remaining of the acknowledged potential lease interval is
1421 3 days. Adding the MCLT to this yields 3 days plus 1 hour, which
1422 is more than the desired lease interval of 3 days. So the client
1423 is renewed for the desired lease interval -- 3 days.
1424
1425 When the primary DHCP server updates the secondary DHCP server
1426 after the DHCP client's renewal ACK is complete, it will calculate
1427 the desired potential lease interval as the T1 fraction of the
1428 actual client lease interval (1/2 of 3 days this time = 1.5 days).
1429 To this it will add the desired client lease interval of 3 days,
1430 yielding a total desired partner server lease interval of 4.5
1431 days. In this way, the primary attempts to have the secondary
1432 always "lead" the client in its understanding of the client's
1433 lease interval so as to be able to always offer the client the
1434 desired client lease interval.
1435
1436 Once the initial actual client lease interval of the MCLT is past,
1437 the protocol operates effectively like the DHCP protocol does
1438 today in its behavior concerning lease intervals. However, the
1439 guarantee that the actual client lease interval will never exceed
1440 the remaining acknowledged partner server lease interval by more
1441 than the MCLT allows full recovery from a variety of failures.
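
The arithmetic in the DISCUSSION above can be reproduced with a short Python sketch. It implements the two example rules ("what to tell the client" and "what to tell the failover partner server") using integer seconds, and replays the walk-through with an MCLT of 1 hour and a desired lease interval of 3 days. The function and variable names are hypothetical, and these rules are only the example given in this section, not the only behaviour that satisfies the fundamental relationship.

```python
HOUR = 3600
DAY = 24 * HOUR


def actual_lease_interval(desired: int, ack_potential_remaining: int, mclt: int) -> int:
    """Rule 1: give the desired interval unless it would exceed the
    remaining acknowledged potential lease interval plus the MCLT."""
    return min(desired, ack_potential_remaining + mclt)


def potential_lease_interval(actual: int, desired: int) -> int:
    """Rule 2: renewal interval (assumed to be half the actual lease
    interval) plus the desired lease interval."""
    return actual // 2 + desired


MCLT = 1 * HOUR
DESIRED = 3 * DAY

# Initial lease: nothing has been acknowledged yet, so the remainder is 0.
first_actual = actual_lease_interval(DESIRED, 0, MCLT)             # 1 hour
first_potential = potential_lease_interval(first_actual, DESIRED)  # 3 days + 1/2 hour

# Renewal half an hour later: subtract the time that has passed.
remaining = first_potential - first_actual // 2                      # 3 days
second_actual = actual_lease_interval(DESIRED, remaining, MCLT)      # 3 days
second_potential = potential_lease_interval(second_actual, DESIRED)  # 4.5 days
```

In both steps the actual lease interval is no more than the MCLT greater than the remaining acknowledged potential lease interval, which is the relationship the protocol requires.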
1442
1443 5.2.2. Controlled re-allocation of IP addresses
1444
1445 When in PARTNER-DOWN state there is a waiting period after which an
1446 IP address can be re-allocated to another client. For IP addresses
1447 which are available when the server enters PARTNER-DOWN state, the
1448 period is the MCLT from entry into PARTNER-DOWN state. For IP
1449 addresses which are not available when the server enters PARTNER-DOWN
1450 state, the period is the MCLT after the IP address becomes available.
1451 See section 9.4.2 for more details.
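
As a rough sketch of the waiting period just described (the function name and the use of None for "already available at entry" are assumptions made for this illustration, not part of the protocol):

```python
from datetime import datetime, timedelta


def earliest_reallocation(partner_down_entry, became_available, mclt):
    """Earliest time an address may be given to a different client.

    became_available is None when the address was already available
    when the server entered PARTNER-DOWN state.
    """
    if became_available is None or became_available <= partner_down_entry:
        return partner_down_entry + mclt
    return became_available + mclt
```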
1452
1453 In any other state, a server cannot reallocate an address from one
1454 client to another without first notifying its partner (through a
1463 BNDUPD message) and receiving acknowledgement (through a BNDACK mes-
1464 sage) that its partner is aware that that first client is not using
1465 the address.
1466
1467 This could be modeled in the following way. Though this specific
1468 implementation is in no way required, it may serve to better illus-
1469 trate the concept.
1470
1471 An "available" IP address on a server may be allocated to any client.
1472 An IP address which was leased to a client and which expired or was
1473 released by that client would take on a new state, EXPIRED or
1474 RELEASED respectively. The partner server would then be notified
1475 that this IP address was EXPIRED or RELEASED through a BNDUPD. When
1476 the sending server received the BNDACK for that IP address showing it
1477 was FREE, it would move the IP address from EXPIRED or RELEASED to
1478 FREE, and it would be available for allocation by the primary server
1479 to any clients.
1480
1481 A server MAY reallocate an IP address in the EXPIRED or RELEASED
1482 state to the same client with no restrictions provided it has not
1483 sent a BNDUPD message to its partner. This situation would exist if
1484 the lease expired or was released after the transition into PARTNER-
1485 DOWN state, for instance.
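
The model described in this section could be sketched as below. The class and method names are hypothetical, and the sketch takes the point of view of a single server tracking one address; it is not a required implementation.

```python
class Address:
    """One IP address as seen by the server that leased it."""

    def __init__(self, client):
        self.state = "ACTIVE"
        self.client = client
        self.bndupd_sent = False

    def lease_ended(self, released):
        # The lease either was released by the client or ran out.
        self.state = "RELEASED" if released else "EXPIRED"
        self.bndupd_sent = False

    def send_bndupd(self):
        # The partner is being told about the EXPIRED/RELEASED state.
        self.bndupd_sent = True

    def bndack_received(self):
        # Only an acknowledged update moves the address to FREE.
        if self.state in ("EXPIRED", "RELEASED") and self.bndupd_sent:
            self.state = "FREE"
            self.client = None

    def may_allocate_to(self, client):
        if self.state == "FREE":
            return True
        if self.state in ("EXPIRED", "RELEASED"):
            # Same client only, and only before any BNDUPD was sent.
            return client == self.client and not self.bndupd_sent
        return False
```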
1486
1487
1488 5.3. Load balancing
1489
1490 In order to implement load balancing between a primary and secondary
1491 server pair, each server must respond to DHCPDISCOVER requests from
1492 some clients and not from other clients. In order to do this suc-
1493 cessfully, each server must be able to determine immediately upon
1494 receipt of a DHCP client request whether it is to service this
1495 request or to ignore it in order to allow the other server to service
1496 the request.
1497
1498 In addition, it should be possible to configure the percentage of
1499 clients which will be serviced by either the primary or secondary
1500 server. This configuration should be more or less continuous, from
1501 all clients serviced by the primary through an even split with half
1502 serviced by each, to all clients serviced by the secondary.
1503
1504 The technique chosen to support these goals is described in [RFC
1505 3074].
1506
1507 A bitmap-style Hash Bucket Assignment (as described in [RFC 3074]) is
1508 used to determine which DHCP clients can be processed. There are two
1509 potential HBA's in a failover server -- a server HBA and a failover
1510 HBA. The way that a server acquires a server HBA is outside of the
1519 scope of the failover protocol, but both servers in a failover pair
1520 MUST have the same server HBA. The failover HBA (which specifies the
1521 clients that the secondary is supposed to process) is sent by the
1522 primary server to the secondary server whenever a connection is esta-
1523 blished, using the hash-bucket-assignment option defined in section
1524 12.11.
1525
1526 When using the server HBA (if any) and the failover HBA (if any) to
1527 decide whether to process a DHCP request, the server HBA always
1528 applies in every failover state, and the failover HBA (which MUST be
1529 a subset of the server HBA) is used by the secondary server to decide
1530 which packets to process when in NORMAL state.
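
A minimal sketch of this decision follows. Several details are assumptions made for illustration: the bucket number (0..255) is taken to be already computed by the hash of [RFC 3074], the bit ordering inside the 32-byte bitmap is chosen arbitrarily here, and the primary is assumed to handle the server-HBA buckets that the failover HBA does not assign to the secondary.

```python
def bucket_in_hba(hba: bytes, bucket: int) -> bool:
    """True if the bit for `bucket` is set in a 32-byte bitmap.

    The bit order used here is an assumption of this sketch.
    """
    return bool(hba[bucket // 8] & (1 << (bucket % 8)))


def should_process(is_secondary, state, bucket,
                   server_hba=None, failover_hba=None):
    """Decide whether this server answers a load-balanced request."""
    # The server HBA, when present, applies in every failover state.
    if server_hba is not None and not bucket_in_hba(server_hba, bucket):
        return False
    # The failover HBA only matters in NORMAL state.
    if state == "NORMAL" and failover_hba is not None:
        assigned_to_secondary = bucket_in_hba(failover_hba, bucket)
        return assigned_to_secondary if is_secondary else not assigned_to_secondary
    return True
```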
1531
1532 5.4. IP address allocations between servers
1533
1534 The failover protocol allows a DHCP server which implements it to
1535 operate correctly in spite of the uncertainty over whether its
1536 partner has failed or whether the communications link to its partner
1537 has failed. This is made possible in part by the existence of
1538 separate address pools on each server for allocation to newly arrived
1539 DHCP clients.
1540
1541 Thus, each server has its own pool of available IP addresses. Note
1542 that an IP address is not "owned" by a particular server throughout
1543 its entire lifetime. Only an IP address which is available is
1544 "owned" by a particular server -- once it has been leased to a DHCP
1545 client, it is not owned by either failover partner. When it finally
1546 becomes available again, it will be owned initially by the primary
1547 server, and it may or may not be allocated to the secondary server by
1548 the primary server.
1549
1550 So, the flow of IP address ownership is as follows: initially an IP
1551 address is owned by the primary server. It may be allocated to the
1552 secondary server if it is available, and then it is owned by the
1553 secondary server. Either server can allocate available IP addresses
1554 which they own to DHCP clients, in which case they cease to own them.
1555 When the DHCP client releases the address or the lease on it expires,
1556 it will again become available and will be owned by the primary.
1557
1558 An IP address will not become owned by the server which allocated it
1559 initially when it is released or the lease expires because, in gen-
1560 eral, that server will have had to replenish its pool of available
1561 addresses well in advance of any likely lease expirations. Thus,
1562 having a particular IP address cycle back to the secondary might well
1563 put the secondary more out of balance with respect to the primary
1564 instead of enhancing the balance of available addresses between them.
1565
1566 These address pools are used when in COMMUNICATIONS-INTERRUPTED state
1575 and while waiting for the MCLT expiration in PARTNER-DOWN state. In
1576 addition, when using load balancing, these pools are used when in
1577 NORMAL state as well.
1578
1579 The allocation and maintenance of these address pools is an area of
1580 some sensitivity, since the goal is to maintain a more or less con-
1581 stant ratio of available addresses between the two servers.
1582
1583 The initial allocation when the servers first integrate is triggered
1584 by the POOLREQ message from the secondary to the primary. This is
1585 followed by the POOLRESP message where the primary tells the secon-
1586 dary how many IP addresses it allocated to the secondary. Then, the
1587 primary sends the allocated IP addresses to the secondary via BNDUPD
1588 messages. The POOLREQ/POOLRESP message is a trigger to the primary
1589 to perform a scan of its database and to ensure that the secondary
1590 has enough IP addresses (based on some configured ratio).
1591
1592 The actual IP addresses are sent to the secondary using the BNDUPD
1593 message with a state of BACKUP, which indicates the IP address is now
1594 available for allocation by the secondary. Once the message is sent,
1595 the primary MUST NOT use these addresses for allocation to DHCP
1596 clients.
1597
1598 The POOLREQ/POOLRESP message exchange initiated by the secondary is
1599 valid at any time, and the primary server SHOULD, whenever it
1600 receives the POOLREQ message, scan its database of address pools and
1601 determine if the secondary needs more IP addresses from any of the IP
1602 address pools.
1603
1604 However, in order to support a reasonably dynamic balance of the IP
1605 addresses between the failover partners, the primary server needs to
1606 do additional work to ensure that the secondary server has as many IP
1607 addresses as it needs (but that it doesn't have *more* than it needs
1608 either).
1609
1610 The primary server SHOULD examine the balance of available addresses
1611 between the primary and secondary for a particular address pool when-
1612 ever the number of available addresses for either the primary or
1613 secondary changes. The primary server SHOULD adjust the available
1614 address balance as required to ensure the configured address balance,
1615 except that the primary server SHOULD apply some threshold
1616 mechanism to such a balance adjustment in order to minimize the over-
1617 head of maintaining this balance.
1618
1619 An example of a threshold approach is: do not attempt to re-balance
1620 the available pools on the primary and secondary until the out of
1621 balance value exceeds a configured value.
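
One way such a threshold could work is sketched below. The function and parameter names, and the idea of expressing the configured balance as the secondary's share of all available addresses, are assumptions of this example.

```python
def rebalance(primary_free, secondary_free, secondary_share, threshold):
    """Number of addresses to move to the secondary.

    A positive result means "send this many with BNDUPD state BACKUP";
    a negative result means "try to take this many back with BNDUPD
    state FREE". Zero means the imbalance is within the threshold.
    """
    total = primary_free + secondary_free
    target = int(total * secondary_share)
    delta = target - secondary_free
    if abs(delta) <= threshold:
        return 0
    return delta
```

For example, with 100 available addresses, an even split, and a threshold of 5, a 52/48 distribution is left alone while a 20/80 distribution asks for 30 addresses back from the secondary.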
1622
1631 The primary server can, at any time, send an available IP address to
1632 the secondary using a BNDUPD with the state BACKUP. The primary
1633 server can attempt to take an available IP address away from the
1634 secondary by sending a BNDUPD with the state FREE. If the secondary
1635 accepts the BNDUPD, then the address is now available to the primary and not
1636 available to the secondary. Of course, the secondary MUST reject
1637 that BNDUPD if it has already used that IP address for a DHCP client.
1638
1639 Whenever the primary server examines the possible available IP
1640 addresses which it could send to the secondary server, the primary
1641 server SHOULD take into account whether load balancing is in use, and
1642 it SHOULD attempt to send to the secondary any IP addresses whose
1643 most recent client would be processed by the secondary under the
1644 current load balancing regime in use. Likewise, when removing avail-
1645 able IP addresses from the secondary server when load balancing is in
1646 use, the primary server SHOULD first remove those IP addresses whose
1647 most recent client would be processed by the primary server under the
1648 current load balancing regime in use.
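
The preference described above amounts to an ordering of the candidate addresses. A small sketch, in which the pair layout and the `handled_by_secondary` callback are hypothetical:

```python
def order_for_secondary(available, handled_by_secondary):
    """Order candidate addresses for transfer to the secondary.

    available: list of (ip, last_client_bucket) pairs, where the bucket
    is None if the address has no recorded previous client.
    Addresses whose most recent client would be processed by the
    secondary come first; the original order is otherwise preserved.
    """
    def preferred(item):
        bucket = item[1]
        return bucket is not None and handled_by_secondary(bucket)

    # sorted() is stable, and False sorts before True.
    return sorted(available, key=lambda item: not preferred(item))
```

The reverse case (removing addresses from the secondary) can reuse the same function with a callback that reports buckets handled by the primary.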
1649
1650 5.5. Operating in NORMAL state
1651
1652 When in NORMAL state, each server services DHCPDISCOVER's and all
1653 other DHCP requests other than DHCPREQUEST/RENEWAL or
1654 DHCPREQUEST/REBINDING from the client set defined by the load balanc-
1655 ing algorithm [RFC 3074]. Each server services DHCPREQUEST/RENEWAL
1656 or DHCPREQUEST/REBINDING requests from any client.
1657
1658 In general, whenever the binding database is changed in stable
1659 storage (other than a change resulting from receiving a BNDUPD from
1660 the failover partner), then a BNDUPD message is sent with the con-
1661 tents of that change to the partner server. The partner server then
1662 writes the information about that binding in its bindings database in
1663 stable storage and replies with a BNDACK message.
1664
1665 The binding database in a DHCP server would normally be changed as a
1666 result of DHCP protocol activity with a DHCP client (e.g., granting
1667 a lease to a DHCP client through the familiar
1668 DISCOVER/OFFER/REQUEST/ACK cycle or extending a lease due to a
1669 renewal from a DHCP client) or possibly (on some servers) because a
1670 lease has expired or undergone another state change that must be
1671 recorded in the DHCP binding database. These are the state changes
1672 that would be communicated to the partner server using a BNDUPD mes-
1673 sage. Of course, receipt of a BNDUPD message itself will normally
1674 cause an update of the binding database for all of the IP addresses
1675 contained in the BNDUPD, and a binding database change such as this
1676 MUST NOT trigger a corresponding BNDUPD message to the partner.
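
The rule that a change caused by a received BNDUPD must not itself generate a BNDUPD can be sketched as follows. The class, the in-memory dictionary standing in for stable storage, and the `send_bndupd` callback are all assumptions of this illustration.

```python
class BindingDatabase:
    def __init__(self, send_bndupd):
        self.bindings = {}              # stand-in for stable storage
        self.send_bndupd = send_bndupd  # hypothetical transport hook

    def commit(self, ip, binding, from_partner=False):
        self.bindings[ip] = binding
        # Only locally originated changes are reported to the partner.
        if not from_partner:
            self.send_bndupd(ip, binding)

    def on_bndupd(self, ip, binding):
        # Record the partner's update, then acknowledge it.
        self.commit(ip, binding, from_partner=True)
        return ("BNDACK", ip)
```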
1677
1687 5.6. Operating in COMMUNICATIONS-INTERRUPTED state
1688
1689 When operating in COMMUNICATIONS-INTERRUPTED state, each server is
1690 operating independently, but does not assume that its partner is not
1691 operating. The partner server might be operating and simply unable
1692 to communicate with this server, or might not be operating.
1693
1694 Each server responds to the full range of DHCP client messages that
1695 it receives (subject to server load balancing [RFC 3074]), but in
1696 such a way that graceful reintegration is always possible when its
1697 partner comes back into contact with it.
1698
1699 5.7. Operating in PARTNER-DOWN state
1700
1701 When operating in PARTNER-DOWN state, a server assumes that its
1702 partner is not currently operating, but does make allowances for the
1703 possibility that that server was operating in the past, though possi-
1704 bly out of communications with this server. It responds to all DHCP
1705 client requests in PARTNER-DOWN state (subject to server load balanc-
1706 ing [RFC 3074]).
1707
1708 5.8. Operating in RECOVER state
1709
1710 A server operating in RECOVER state assumes that it is reintegrating
1711 with a server that has been operating in PARTNER-DOWN state, and that
1712 it needs to update its bindings database before it services DHCP
1713 client requests.
1714
1715 A server may also operate in RECOVER state in order to fully recover
1716 its bindings database from its partner server.
1717
1718 5.9. Operating in STARTUP state
1719
1720 A server operating in STARTUP state assumes that failover is opera-
1721 tional, and it spends a short time whenever it comes up attempting to
1722 contact the partner. During this short time, the server is unrespon-
1723 sive to DHCP client requests. This period exists in order to give a
1724 server a chance to determine that its partner has changed state since
1725 it was last in communications, and to react to that changed state (if
1726 any) prior to responding to DHCP client requests.
1727
1728 The startup period SHOULD be conditioned on the length of time the
1729 server has been down (if that can be determined). If the server has
1730 been down less than the MCLT then it can wait only a few (say 5 or
1731 10) seconds. If it has been down a longer time (such that the
1732 partner may well have moved to PARTNER-DOWN state), a considerably
1733 longer startup period of 30 to 60 seconds may be warranted, since the
1734 consequences of running while the partner is in PARTNER-DOWN state
1743 are unpleasant.
1744
1745 The period of time a server remains in STARTUP state SHOULD be long
1746 enough to ensure that it will connect to the other server if that
1747 server is available for connections.
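
A possible way to condition the startup period on downtime is sketched here. The default values are picked from the ranges mentioned above, and treating an unknown downtime like a long one is an assumption of this example.

```python
def startup_wait_seconds(down_seconds, mclt_seconds,
                         short_wait=10, long_wait=60):
    """How long to stay in STARTUP state before serving clients."""
    if down_seconds is None:
        # Downtime could not be determined: be conservative.
        return long_wait
    return short_wait if down_seconds < mclt_seconds else long_wait
```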
1748
1749 5.10. Time synchronization between servers
1750
1751 The failover protocol is designed to operate between two servers
1752 which have time values which differ by an arbitrarily large amount.
1753 A particular implementation MAY choose to only support servers whose
1754 time values differ by an arbitrarily small amount.
1755
1756 Note that if an implementation that requires time synchronization
1757 between servers encounters a case where the time is not synchronized
1758 to its satisfaction between two servers, then this failure will prob-
1759 ably prevent the two servers from reaching communications OK status.
1760 If this occurs, and if both servers continue to operate and deal with
1761 clients, potentially troublesome things can happen. For instance, if
1762 there is a safe period configured on either server, then it will
1763 eventually go into PARTNER-DOWN state, but in this case the partner
1764 will not be down. This will almost certainly create problems. Thus,
1765 some method to prevent this sort of situation SHOULD exist in imple-
1766 mentations that can be configured to require time synchronization.
1767
1768 In any event, whether large or only small differences in time values
1769 are supported, every message that is sent MUST include the time and
1770 every packet that is received MUST be tagged with a time value as
1771 soon as possible after receipt. This time value is used along with
1772 the time value that is sent in every message between the failover
1773 partners to develop a delta time between the servers. This delta
1774 time is used during the connection process to establish a baseline
1775 delta time between the servers, and upon receipt of each message, the
1776 delta time for that message is used to refine the delta time for the
1777 server pair.
1778
1779 While the algorithm for this refinement of delta time is not speci-
1780 fied as part of this protocol, a server SHOULD allow the delta time
1781 value for a pair of failover servers to be periodically updated to
1782 account for time drift. In addition, the delta time value between
1783 servers SHOULD be smoothed in some fashion, so that transient network
1784 delays will not cause it to vary wildly.
1785
1786 A server SHOULD recognize a drastic change in the delta time value as
1787 an event to be signaled to a network administrator, as well as reset-
1788 ting the time delta between the failover partners.
1789
1790 The specific definitions of a minor or drastic change in delta time
1799 as well as the algorithm used to smooth minor changes into the run-
1800 ning delta time are implementation issues and are not further
1801 addressed in this document.
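
Since the refinement algorithm is left to the implementation, the following is only one possible approach: an exponentially weighted moving average with a fixed cutoff for "drastic" changes. The class name, the smoothing weight, the cutoff, and the sign convention (local receive time minus the partner's send time) are all assumptions.

```python
class DeltaTime:
    def __init__(self, alpha=0.125, drastic=60.0):
        self.alpha = alpha      # weight given to each new sample
        self.drastic = drastic  # seconds; larger jumps reset the delta
        self.delta = None

    def update(self, sent_time, received_time):
        """Fold one message into the running delta.

        Returns True when the change is drastic and should be signaled
        to an administrator.
        """
        sample = received_time - sent_time
        if self.delta is None:
            self.delta = sample  # baseline from the connection process
            return False
        if abs(sample - self.delta) > self.drastic:
            self.delta = sample  # reset rather than smooth
            return True
        self.delta += self.alpha * (sample - self.delta)
        return False
```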
1802
1803 5.11. IP address binding-status
1804
1805 In most DHCP servers an IP address can take on several different
1806 binding-status values, sometimes also called states. While no two
1807 DHCP servers probably have exactly the same possible binding-status
1808 values, the DHCP RFC enforces some commonality among the general
1809 semantics of the binding-status values used by various DHCP server
1810 implementations.
1811
1812 In order to transmit binding database updates between one server and
1813 another using the failover protocol, some common denominator
1814 binding-status values must be defined. It is not expected that these
1815 binding-status-values correspond with any actual implementation of
1816 the DHCP protocol in a DHCP server, but rather that the binding-
1817 status values defined in this document should be a common denominator
1818 of those in use by many DHCP server implementations. It is a goal of
1819 this protocol that any DHCP server can map the various IP address
1820 binding-status values that it uses internally into these failover IP
1821 address binding-status values on transmission of binding database
1822 updates to its partner, and likewise that it can map any failover IP
1823 address binding-status values it received in a binding update into
1824 its internal IP address binding-status values.
1825
1826 The IP address binding-status values defined for the failover proto-
1827 col are listed below. Unless otherwise noted below, there MAY be
1828 client information associated with each of these binding-status
1829 values.
1830
1831 o ACTIVE -- Lease is assigned to a client. Client identification
1832 MUST appear.
1833
1834 o EXPIRED -- indicates that a client's binding on an IP address
1835 has expired. When the partner server ACK's the BNDUPD of an
1836 EXPIRED IP address, the server sets its internal state to FREE.
1837 It is then available for allocation to any client of the primary
1838 server. It may be allocated to the same client on the server
1839 where the lease expired if a BNDUPD containing the EXPIRED state
1840 has not yet been sent to the partner (e.g., in the event that
1841 the servers are not in communication). Client identification
1842 SHOULD appear.
1843
1844 o RELEASED -- indicates that a DHCP client sent in a DHCPRELEASE
1845 message. When the partner server ACK's the BNDUPD of a
1846 RELEASED IP address, the server sets its internal state to FREE,
1855 and it is available for allocation by the primary server to any
1856 DHCP client. It may be allocated to the same client if a BNDUPD
1857 has not yet been sent to the partner. Client identification
1858 SHOULD appear.
1859
1860 o FREE -- is used when a DHCP server needs to communicate that an
1861 IP address is unused by any DHCP client, but it was not just
1862 released, expired, or reset by a network administrator. When
1863 the partner server ACK's the BNDUPD of a FREE IP address, the
1864 server sets its internal state such that it is available for
1865 allocation by the primary DHCP server to any DHCP client. (Note
1866 that in PARTNER-DOWN state, after waiting the MCLT, the IP
1867 address MAY be allocated to a DHCP client by the secondary
1868 server.)
1869
1870 Note that when an IP address that was allocated by the secondary
1871 reverts to the FREE state, it must (like any other IP address)
1872 be assigned to the secondary through the POOLREQ/BNDUPD process
1873 before the secondary can reallocate it.
1874
1875 Client identification MAY appear.
1876
1877 o ABANDONED -- indicates that an IP address is considered unusable
1878 by the DHCP subsystem. An IP address for which a valid PING
1879 response was received SHOULD be set to ABANDONED. An IP address
1880 for which a DHCPDECLINE was received should be set to ABANDONED.
1881 Client identification MUST NOT appear.
1882
1883 o RESET -- indicates that this IP address was made available by
1884 operator command. This is a distinct state so that the reason
1885 that the IP address became FREE can be determined. Client iden-
1886 tification MAY appear.
1887
1888 o BACKUP -- indicates that this IP address can be allocated by the
1889 secondary server to a DHCP client at any time. When the MCLT has
1890 passed after its time of entry into PARTNER-DOWN state, the IP
1891 address may be allocated by the primary to any DHCP client.
1892 Client identification MAY appear.
1893
1894 These binding-status values are communicated from one failover
1895 partner to another using the binding-status option; see section 12.3
1896 for details of this option. Unless otherwise noted above there MAY
1897 be client information associated with each of these binding-status
1898 values.
1899
1900 An IP address will move between these binding-status values using the
1901 following state transition diagram:
1902
1913                                      DHCP client DECLINE or
1914                                     server detected problem
1915                                          from any state
1916                                                |
1917                                                V
1918                         +----------+        +--+------+
1919         External >----->|  RESET   |  (3)   |ABANDONED|
1920         command         |          |<-------+         |
1921                         +----------+        +---------+
1922                              |
1923                      Comm w/Partner(1)
1924                              V
1925 +---------+   Comm(1)   +----------+     Comm(1)      +---------+
1926 | EXPIRED |------------>|   FREE   |<-----------------| RELEASED|
1927 |         |  w/Partner  |          |    w/Partner     |         |
1928 +---------+             +----------+                  +---------+
1929  ^     ^                   |    |  +--------------+           ^
1930  |     |                   |    |                 |           |
1931  | Exp. grace   IP address |    | IP addr alloc.  | IP addr   |
1932  | period ends  leased by  |    | to sec.(2)      | reserved  |
1933  |     |        primary    |    V                 |           |
1934  |     |                   |  +----------+        |           |
1935  |     |                   |  |  BACKUP  |<-------+           |
1936  |  wait for               |  |          |                    |
1937  | grace period            |  +----------+                    |
1938  |     |                   |      |                           |
1939  |     |                   |      | IP addr leased by         |
1940  | Expired grace           |      | secondary                 |
1941  | period exists           V      V                           |
1942  |     |                +----------+                          |
1943  |     |    Lease on    |  ACTIVE  |      DHCPRELEASE         |
1944  +-----+----IP addr-----|          |--------------------------+
1945             expires     +----------+
1946
1947
1948 Figure 5.11-1: Transitions between binding-status values.
1949
1950 (1) This transition MAY also occur if the server is in
1951 PARTNER-DOWN state and the MCLT has passed since the entry
1952 in the RELEASED, EXPIRED, or RESET states.
1953
1954 (2) This transition MAY occur if the server is the secondary
1955 and the MCLT has passed since its entry into PARTNER-DOWN state.
1956
1957 (3) This transition MAY occur due to an implementation specific
1958 handling of ABANDONED IP addresses.
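
For readers who prefer a table, the arrows of Figure 5.11-1 can be written down as data. The sketch below encodes only the transitions drawn in the figure; the names and the `operator_command` flag (modeling the "External command" input) are assumptions of this example, and the MCLT conditions of notes (1) and (2) are not modeled.

```python
# Transitions drawn explicitly in Figure 5.11-1 (old state -> new states).
TRANSITIONS = {
    "FREE": {"ACTIVE", "BACKUP"},
    "BACKUP": {"ACTIVE"},
    "ACTIVE": {"EXPIRED", "RELEASED"},
    "EXPIRED": {"FREE"},
    "RELEASED": {"FREE"},
    "RESET": {"FREE"},
    "ABANDONED": {"RESET"},   # note (3): implementation specific
}


def is_allowed(old, new, operator_command=False):
    """Check a binding-status change against the figure."""
    if new == "ABANDONED":
        return True           # DECLINE or detected problem, from any state
    if new == "RESET" and operator_command:
        return True           # external command
    return new in TRANSITIONS.get(old, set())
```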
1959
1970 Again, note that a DHCP server implementing the failover protocol
1971 does not have to implement either this state machine or use these
1972 particular binding-status values in its normal operation of allocat-
1973 ing IP addresses to DHCP clients. It only needs to map its internal
1974 binding-status-values onto these "standard" binding-status values,
1975 and map these "standard" binding-status values back into its internal
1976 binding-status values. For example, a server which implements a
1977 grace period for an IP address binding SHOULD simply wait to update
1978 its partner server until the grace period on that binding has run
1979 out.
1980
1981 The process of setting an IP address to FREE deserves some detailed
1982 discussion. When an IP address is moved to the EXPIRED, RELEASED, or
1983 RESET binding-status on a server, it will send a BNDUPD with the
1984 binding-status of EXPIRED, RELEASED, or RESET to its partner. If its
1985 partner agrees that the update is acceptable (see sections 7.1.2 and 7.1.3
1986 concerning why a server might not accept a BNDUPD), it will return a
1987 BNDACK with no reject-reason, signifying that it accepted the update.
1988 As part of the BNDUPD processing, the server returning the BNDACK
1989 will set the binding-status of the IP address to FREE, and upon
1990 receipt of the BNDACK the server which sent the BNDUPD will set the
1991 binding-status of the IP address to FREE. Thus, the EXPIRED,
1992 RELEASED, or RESET binding-status is something of a transitory state.
1993 This process is encoded in the transition diagram above by "Comm
1994 w/Partner".
1995
1996 5.12. DNS dynamic update considerations
1997
1998 DHCP servers (and clients) can use DNS Dynamic Updates as described
1999 in [RFC 2136] to maintain DNS name-mappings as they maintain DHCP
2000 leases. Many different administrative models for DHCP-DNS integra-
2001 tion are possible. Descriptions of several of these models, and
2002 guidelines that DHCP servers and clients should follow in carrying
2003 them out, are laid out in [FQDN]. The nature of the DHCP failover
2004 protocol introduces some issues concerning dynamic DNS updates that
2005 are not part of non-failover DHCP environments. This section
2006 describes these issues, and defines the information which failover
2007 partners should exchange and the protocol which they should follow in
2008 order to ensure consistent behavior. The presence of this section
2009 should not be interpreted as requiring that implementations of the
2010 DHCP failover protocol must also support DDNS updates. The purpose
2011 of this discussion is to clarify the areas where the DHCP failover
2012 and DHCP-DDNS protocols intersect for the benefit of implementations
2013 which support both protocols, not to introduce a new requirement into
2014 the DHCP failover protocol. Thus, a DHCP server which implements the
2023 failover protocol MAY also support dynamic DNS updates, but if it
2024 does support dynamic DNS updates it SHOULD utilize the techniques
2025 described here in order to correctly distribute them between the
2026 failover partners. See [FQDN], [DNSRES], and [DHCID] for details of
2027 how DHCP servers update DNS.
2028
2029 From the standpoint of the failover protocol, there is no reason why
2030 a server which is utilizing the DDNS protocol to update a DNS server
2031 should not be a partner with a server which is not utilizing the DDNS
2032 protocol to update a DNS server. However, a server which is not able
2033 to support DDNS or is not configured to support DDNS SHOULD output a
2034 warning message when it receives BNDUPD messages which indicate that
2035 its failover partner is configured to support the DDNS protocol to
2036 update a DNS server. An implementation MAY consider this an error
2037 and refuse to operate, or it MAY choose to operate anyway, having
2038 warned the user of the problem in some way.
2039
2040 5.12.1. Relationship between failover and dynamic DNS update
2041
2042 The failover protocol describes the conditions under which each fail-
2043 over server may renew a lease to its current DHCP client, and
2044 describes the conditions under which it may grant a lease to a new
2045 DHCP client. An analogous set of conditions determines when a fail-
2046 over server should initiate a DDNS update, and when it should attempt
2047 to remove records from the DNS. The failover protocol's conditions
2048 are based on the desired external behavior: avoiding duplicate
2049 address assignments; allowing clients to continue using leases which
2050 they obtained from one failover partner even if they can only commun-
2051 icate with the other partner; allowing the backup DHCP server to
2052 grant new leases even if it is unable to communicate with the primary
2053 server. The desired external DDNS behavior for DHCP failover servers
2054 is:
2055
2056 1. Allow timely DDNS updates from the server which grants a
2057 client a lease. Recognize that there is often a DDNS update
2058 lifecycle which parallels the DHCP lease lifecycle. This is
2059 likely to include the addition of records when the lease is
2060 granted, and the removal of DNS records when the lease is sub-
2061 sequently made available for allocation to a different client.
2062
2063 2. Communicate enough information between the two failover
2064 servers to allow one to complete the DDNS update 'lifecycle'
2065 even if the other server originally granted the lease.
2066
2067 3. Avoid redundant or overlapping DDNS updates, where both fail-
2068 over servers are attempting to perform DDNS updates for the
2069 same lease-client binding. Avoid situations where one partner
2070 is attempting to add RRs related to a lease binding while the
2079 other partner is attempting to remove RRs related to the same
2080 lease binding.
2081
2082 5.12.2. Use of the DDNS option
2083
2084 In order for either server to be able to complete a DDNS update, or
2085 to remove DNS records which were added by its partner, both servers
2086 need to know the FQDN associated with the lease-client binding. The
2087 FQDN associated with the client's A RR and PTR RR SHOULD be communi-
2088 cated from the server which adds records into the DNS to its partner.
2089 The initiating server SHOULD use the DDNS option in the BNDUPD mes-
2090 sages to inform the partner server of the status of any DDNS updates
2091 associated with a lease binding. Failover servers MAY choose not to
2092 include the DDNS option in BNDUPD messages if there has been no
2093 change in the status of any DDNS update related to the lease binding.
2094 The partner server receiving BNDUPD messages containing the DDNS
2095 option SHOULD compare the status flags and the FQDN contained in the
2096 option data with the current DDNS information it has associated with
2097 the lease binding, and update its notion of the DDNS status accord-
2098 ingly.
2099
2100 The initiating server MAY send a BNDUPD to its partner before the
2101 DDNS update has been successfully completed. If it does so, it SHOULD
2102 leave the 'C' bit in the Flags field clear, to indicate to the
2103 partner that the DDNS update may not be complete. When the DDNS
2104 update has been successfully acknowledged by the DNS server, the ini-
2105 tiating DHCP server SHOULD include the DDNS option in its next BNDUPD
2106 message about the binding, so that the partner server will be able to
2107 record the final status of the DDNS update. The initiating server
2108 SHOULD set the 'C' bit in the DDNS option if the DDNS update was suc-
2109 cessfully accepted by the DNS server.
2110
2111 Some implementations will choose to send a BNDUPD without waiting for
2112 the DDNS update to complete, and then will send a second BNDUPD once
2113 the DDNS update is complete. Other implementations will delay sending
2114 the partner a BNDUPD until the DDNS update has been acknowledged by
2115 the DNS server, or until some time-limit has elapsed, in order to
2116 avoid sending a second BNDUPD.
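
The choice of when to include the DDNS option, and when to set the 'C'
bit, can be sketched as follows. This is only an illustration under
stated assumptions: the DdnsOption class, the FLAG_C bit value, and the
function name are hypothetical and are not defined by this document;
the real encoding of the option is given with the option definitions
in section 12.

```python
from dataclasses import dataclass
from typing import Optional

FLAG_C = 0x01  # hypothetical bit value for the 'C' (completed) flag


@dataclass
class DdnsOption:
    flags: int
    fqdn: str


def ddns_option_for_bndupd(fqdn: str, update_acked: bool,
                           status_changed: bool) -> Optional[DdnsOption]:
    """Return the DDNS option for the next BNDUPD, or None to omit it.

    The option may be omitted when the DDNS status has not changed.
    The 'C' bit is set only once the DNS server has accepted the update.
    """
    if not status_changed:
        return None
    flags = FLAG_C if update_acked else 0
    return DdnsOption(flags=flags, fqdn=fqdn)
```

A server that sends two BNDUPD messages would call this once with
update_acked false and again with it true; a server that delays the
BNDUPD would call it only once, after the DNS server has answered.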
2117
2118 The Domain Name field in the DDNS option contains the FQDN that will
2119 be associated with the A RR (if the server is performing an A RR
2120 update for the client) and the PTR RR. This FQDN may be composed in
2121 any of several ways, depending on server configuration and the infor-
2122 mation provided by the client in its DHCP messages. The client may
2123 supply a hostname which it would like the server to use in forming
2124 the FQDN, or it may supply the entire FQDN. The server may be config-
2125 ured to attempt to use the information the client supplies, it may be
2126 configured with an FQDN to use for the client, or it may be
2135 configured to synthesize an FQDN. The responsive server SHOULD
2136 include the FQDN that it will be using in DDNS updates it initiates
2137 when it sends the DDNS option.
2138
2139 Since the responsive server may not have completed the DDNS update at
2140 the time it sends the first BNDUPD about the lease binding, there may
2141 be cases where the FQDN in later BNDUPD messages does not match the
2142 FQDN included in earlier messages. For example, the responsive
2143 server may be configured to handle situations where two or more DHCP
2144 client FQDNs are identical by modifying the most-specific label in
2145 the FQDNs of some of the clients in an attempt to generate unique
2146 FQDNs for them (a process sometimes called "disambiguation"). Alter-
2147 natively, at sites which use some or all of the information which
2148 clients supply to form the FQDN, it's possible that a client's confi-
2149 guration may be changed so that it begins to supply new data. The
2150 responsive server may react by removing the DNS records which it ori-
2151 ginally added for the client, and replacing them with records that
2152 refer to the client's new FQDN. In such cases, the responsive server
2153 SHOULD include the actual FQDN that was used in subsequent DDNS
2154 options. The responsive server SHOULD include relevant client-option
2155 data in the client-request-options option in its BNDUPD messages.
2156 This information may be necessary in order to allow the non-
2157 responsive partner to detect client configuration changes that change
2158 the hostname or FQDN data which the client includes in its DHCP
2159 requests.
2160
2161 5.12.3. Adding RRs to the DNS
2162
2163 A failover server which is going to perform DDNS updates SHOULD ini-
2164 tiate the DDNS update when it grants a new lease to a client. The
2165 non-responsive partner SHOULD NOT initiate a DDNS update when it
2166 receives the BNDUPD after the lease has been granted. The failover
2167 protocol ensures that only one of the partners will grant a lease to
2168 any individual client, so it follows that this requirement will
2169 prevent both partners from initiating updates simultaneously. The
2170 server initiating the update SHOULD follow the protocol in [FQDN].
2171 The server may be configured to perform an A RR update on behalf of
2172 its clients, or not. Ordinarily, a failover server will not initiate
2173 DDNS updates when it renews leases. In two cases, however, a failover
2174 server MAY initiate a DDNS update when it renews a lease to its
2175 existing client:
2176
2177 1. When the lease was granted before the server was configured to
2178 perform DDNS updates, the server MAY be configured to perform
2179 updates when it next renews existing leases. Since both
2180 servers are responsive to renewals in NORMAL state, it is not
2181 enough to simply require the non-responsive server to avoid a
2182 DNS update in this case. The server which would be responsive
2191 to a DHCPDISCOVER from this client (even though the current
2192 request is a DHCPREQUEST/RENEW) is the server which should
2193 initiate the DDNS update.
2194
2195 2. If a server is in PARTNER-DOWN state, it can conclude that its
2196 partner is no longer attempting to perform an update for the
2197 existing client. If the remaining server has not recorded that
2198 an update for the binding has been successfully completed, the
2199 server MAY initiate a DDNS update. It MAY initiate this
2200 update immediately upon entry to PARTNER-DOWN state, it may
2201 perform this in the background, or it MAY initiate this update
2202 upon next hearing from the DHCP client.
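
The rules of this section can be summarized in a small decision
function. This is a sketch, not a normative algorithm: the event names
and parameter names are hypothetical, the server is assumed to be
configured to perform DDNS updates, and only the renewal-time variant
of case 2 is modeled (a server may equally act on entry to
PARTNER-DOWN state or in the background).

```python
def should_initiate_ddns_update(event, *, state="NORMAL",
                                update_recorded=False,
                                lease_predates_ddns_config=False,
                                responsive_to_discover=False):
    """Decide whether this failover server initiates a DDNS update.

    event is one of "grant" (this server grants a new lease),
    "bndupd-received" (the partner granted the lease), or "renew".
    """
    if event == "grant":
        # The server which grants the lease performs the update.
        return True
    if event == "bndupd-received":
        # The non-responsive partner does not initiate an update.
        return False
    if event == "renew":
        if update_recorded:
            return False
        if state == "PARTNER-DOWN":
            # Case 2: the partner is no longer attempting the update.
            return True
        # Case 1: only the server which would answer a DHCPDISCOVER
        # from this client initiates the update.
        return lease_predates_ddns_config and responsive_to_discover
    raise ValueError(f"unknown event: {event!r}")
```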
2203
2204 5.12.4. Deleting RRs from the DNS
2205
2206 The failover server which makes an IP address FREE SHOULD initiate
2207 any DDNS deletes, if it has recorded that DNS records were added on
2208 behalf of the client.
2209
A server not in PARTNER-DOWN state "makes an IP address FREE" when it
initiates a BNDUPD with a binding-status of FREE, EXPIRED, or
RELEASED. Its partner confirms this status by acking that BNDUPD,
and upon receipt of the ACK the server has "made the IP address
FREE". Conversely, a server in PARTNER-DOWN state "makes an IP
address FREE" when it sets the binding-status to FREE, since in
PARTNER-DOWN state no communication with the partner is required.
2217
It is at this point that the server which made the IP address FREE
should initiate the DDNS operations to delete RRs from the DNS. Its
partner SHOULD NOT initiate DDNS deletes for DNS records related to
the lease binding as part of sending the BNDACK message. The partner
MAY have issued BNDUPD messages with a binding-status of FREE,
EXPIRED, or RELEASED previously, but the other server will have NAKed
these BNDUPD messages.
2224
2225 The failover protocol ensures that only one of the two partner
2226 servers will be able to make a lease FREE. The server making the
2227 lease FREE may be doing so while it is in NORMAL communication with
2228 its partner, or it may be in PARTNER-DOWN state. If a server is in
2229 PARTNER-DOWN state, it may be performing DDNS deletes for RRs which
2230 its partner added originally. This allows a single remaining partner
2231 server to assume responsibility for all of the DDNS activity which
2232 the two servers were undertaking.
2233
2234 Another implication of this approach is that no DDNS RR deletes will
2235 be performed while either server is in COMMUNICATIONS-INTERRUPTED
2236 state, since no IP addresses are moved into the FREE state during
2237 that period.
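
The conditions under which a server initiates DDNS deletes can be
sketched as below. The function and parameter names are hypothetical;
states and binding-status values are represented as plain strings for
illustration only.

```python
FREEING_STATUSES = {"FREE", "EXPIRED", "RELEASED"}


def should_delete_rrs(state, binding_status, *, bndupd_acked=False,
                      rrs_recorded=True):
    """Return True if this server should initiate DDNS deletes.

    rrs_recorded: the server has recorded that DNS records were added
    on behalf of the client.
    bndupd_acked: the partner has acked the BNDUPD carrying
    binding_status (meaningful only outside PARTNER-DOWN state).
    """
    if not rrs_recorded:
        return False
    if state == "PARTNER-DOWN":
        # No communication with the partner is needed.
        return binding_status == "FREE"
    if state == "COMMUNICATIONS-INTERRUPTED":
        # No address becomes FREE while communications are interrupted.
        return False
    return binding_status in FREEING_STATUSES and bndupd_acked
```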
2238
2247 5.13. Reservations and failover
2248
Some DHCP servers support a capability to offer specific pre-
configured IP addresses to DHCP clients. These are real DHCP
clients which run the entire DHCP protocol, but these servers always
offer each such client a specific pre-configured IP address, and they
offer that IP address to no other clients. Such a capability has
several names, but it is sometimes called a "reservation", in that
the IP address is reserved for a particular DHCP client.
2256
In a situation where there are two DHCP servers serving the same
subnet without using failover, the two DHCP servers need to have
disjoint IP address pools, but identical reservations for the DHCP
clients.
2261
2262 In a failover context, both servers need to be configured with the
2263 proper reservations in an identical manner, but if we stop there
2264 problems can occur around the edge conditions where reservations are
2265 made for an IP address that has already been leased to a different
2266 client. Different servers handle this conflict in different ways,
2267 but the goal of the failover protocol is to allow correct operation
2268 with any server's approach to the normal processing of the DHCP pro-
2269 tocol.
2270
2271 The general solution with regards to reservations is as follows.
2272 Whenever a reserved IP address becomes FREE (i.e., when first config-
2273 ured or whenever a client frees it or it expires or is reset), the
2274 primary server MUST show that IP address as FREE (and thus available
2275 for its own allocation) and it MUST send it to the secondary server
2276 with the R bit set in the IP-flags option and the binding-status
2277 BACKUP.
2278
2279 Note that this implies that a reserved IP address goes through the
2280 normal state changes from FREE to ACTIVE (and possibly back to FREE).
2281 The failover protocol supports this approach to reservations, i.e.,
2282 where the IP address undergoes the normal state changes of any IP
2283 address, but it can only be offered to the client for which it is
2284 reserved. Other approaches to the support of reservations exist in
2285 some DHCP server implementations (e.g., where the IP address is
2286 apparently leased to a particular client forever, without any expira-
2287 tion). The goal is for the failover protocol to support any of the
2288 usual approaches to reservations, both those that allow an IP address
2289 to go through different states when reserved, and those that don't.
2290
From the above, it follows that a reservation solely on the secondary
will not necessarily allow the secondary to offer that address to the
client for which it is reserved. The reservation must also appear on
the primary for the secondary to be able to offer the IP address to
the client for which it is reserved.
2304
2305 When the reservation on an IP address is cancelled, if the IP address
2306 is currently FREE and the server is the primary, or BACKUP and the
2307 server is the secondary, the server MUST send a BNDUPD to the other
2308 server with the binding-status FREE and the R bit clear.
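
The two BNDUPD cases described in this section (a reserved address
becoming FREE, and a reservation being cancelled) can be sketched as
follows. The dictionary keys and function names are hypothetical
stand-ins for the options defined in section 12; the sketch only shows
which values are chosen, not how they are encoded.

```python
def reserved_address_freed(ip):
    """Primary side: a reserved address has become FREE.

    Returns (local binding status, BNDUPD contents for the secondary).
    """
    update = {"assigned-IP-address": ip,
              "binding-status": "BACKUP",
              "R": True}
    return "FREE", update


def reservation_cancelled(role, local_status, ip):
    """Return the BNDUPD to send when a reservation is cancelled,
    or None if the conditions stated in the text are not met."""
    if (role, local_status) in {("primary", "FREE"),
                                ("secondary", "BACKUP")}:
        return {"assigned-IP-address": ip,
                "binding-status": "FREE",
                "R": False}
    return None
```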
2309
2310 5.14. Dynamic BOOTP and failover
2311
2312 Some DHCP servers support a capability to offer IP addresses to BOOTP
2313 clients without having a particular address previously allocated for
2314 those clients. This capability is often called something like
2315 "dynamic BOOTP". It is discussed briefly in RFC 1534 [RFC 1534].
2316
2317 This capability has a negative interaction with the fundamental ele-
2318 ments of the failover protocol, in that an address handed out to a
2319 BOOTP device has no term (or effectively no term, in that usually
2320 they are considered leases for "forever"). There is no opportunity
2321 to hand out a lease which is only the MCLT long when first hearing
2322 from a BOOTP device, because they may only interact once with the
2323 DHCP server and they have no notion of a lease expiration time. Thus
2324 the entire concept of the MCLT and waiting the MCLT after entering
2325 PARTNER-DOWN state is defeated when dealing with BOOTP devices.
2326
With some restrictions, however, dynamic BOOTP devices can be
supported in a server on a subnet where failover is supported. The
only restriction (and it is not small) is that on any portion of the
subnet (in any address pool) where dynamic BOOTP devices can be
allocated IP addresses, a DHCP server MUST NOT ever use any of the IP
addresses which were previously available for allocation by its
failover partner. Thus, addresses which the primary has made
available to the secondary for allocation, and which might therefore
have been allocated to BOOTP devices, MUST NOT ever be used by the
primary server, even if it is in PARTNER-DOWN state and has waited
the MCLT after entering that state. Conversely, addresses available
for allocation by the primary MUST NOT be used by the secondary, even
if it is in PARTNER-DOWN state. The reason for this is that one of
those IP addresses could have been allocated by the secondary server
to a BOOTP device, and the primary server would have no way of ever
knowing that this had happened.
2342
Whenever a server sends a BNDUPD message to its partner, if the
client associated with the IP address is a BOOTP client, then the
server MUST set the B bit in the IP-flags option.
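
The allocation restriction for pools which admit dynamic BOOTP can be
sketched as a predicate. This is a simplified illustration: the
function and parameter names are hypothetical, and the only condition
modeled for ordinary pools is the wait of the MCLT in PARTNER-DOWN
state.

```python
def may_allocate(address_owner, server, *, state, waited_mclt,
                 pool_allows_dynamic_bootp):
    """Return True if `server` may allocate an address which is
    available for allocation by `address_owner`."""
    if address_owner == server:
        return True
    if pool_allows_dynamic_bootp:
        # The partner may have given this address to a BOOTP device
        # without this server ever learning of it.
        return False
    return state == "PARTNER-DOWN" and waited_mclt
```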
2346
2347 There is a very slight possibility that a BOOTP client could get an
2348 IP address on each server of a failover pair. When these two servers
2349 eventually attempt to resolve this conflict, they SHOULD agree to
2350 disagree, since it is not possible to know which IP address the BOOTP
2359 client will actually use -- indeed, it could use both. Operator
2360 intervention will, in general, be required to rectify this situation.
2361 Fortunately, it is extremely unlikely to ever actually occur.
2362
2363 5.15. Guidelines for selecting MCLT
2364
2365 There is no one correct value for the MCLT. There is an explicit
2366 tradeoff between various factors in selecting an MCLT value.
2367
2368 5.15.1. Short MCLT
2369
2370 A short MCLT value will mean that after entering PARTNER-DOWN state,
2371 a server will only have to wait a short time before it can start
2372 allocating its partner's IP addresses to DHCP clients. Furthermore,
2373 it will only have to wait a short time after the expiration of a
2374 lease on an IP address before it can reallocate that IP address to
2375 another DHCP client.
2376
However, the downside of a short MCLT value is that the initial lease
interval that will be offered to every new DHCP client will be short,
which will cause increased traffic as those clients will need to send
in their first renewal after half of the short MCLT. In addition,
the lease extensions that a server in COMMUNICATIONS-INTERRUPTED
state can give will be only the MCLT after the server has been in
COMMUNICATIONS-INTERRUPTED for around the desired client lease
period. If a server stays in COMMUNICATIONS-INTERRUPTED for that
long, then the leases it hands out will be short and that will
increase the load on that server, possibly causing difficulty.
2387
2388 5.15.2. Long MCLT
2389
2390 A long MCLT value will mean that the initial lease period will be
2391 longer and the time that a server in COMMUNICATIONS-INTERRUPTED state
2392 will be able to extend leases (after it has been in COMMUNICATIONS-
2393 INTERRUPTED state for around the desired client lease period) will be
2394 longer.
2395
2396 However, a server entering PARTNER-DOWN state will have to wait the
2397 longer MCLT before being able to allocate its partner's IP addresses
2398 to new DHCP clients. This may mean that additional IP addresses are
2399 required in order to cover this time period. Further, the server in
2400 PARTNER-DOWN will have to wait the longer MCLT from every lease
2401 expiration before it can reallocate an IP address to a different DHCP
2402 client.
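
The tradeoff can be made concrete with a little arithmetic. The helper
below is only a sketch with hypothetical names; it restates the
quantities discussed above (an initial lease of the MCLT, a first
renewal at half of that lease, and a wait of the MCLT in PARTNER-DOWN
state) for a given MCLT in seconds.

```python
def mclt_consequences(mclt_seconds):
    """Return the time intervals, in seconds, which follow directly
    from a chosen MCLT value."""
    return {
        # Initial lease interval offered to a new client.
        "initial_lease_s": mclt_seconds,
        # The client sends its first renewal at half of that lease.
        "first_renewal_after_s": mclt_seconds / 2,
        # Wait before the partner's addresses may be allocated after
        # entering PARTNER-DOWN state, and wait after a lease expires
        # before the address may be given to a different client.
        "partner_down_wait_s": mclt_seconds,
    }
```

For example, an MCLT of 600 seconds gives a first renewal after 300
seconds and a wait of 600 seconds in PARTNER-DOWN state, while an MCLT
of 3600 seconds gives 1800 and 3600 seconds respectively.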
2403
2404 5.16. What is sent in response to an UPDREQ or UPDREQALL message?
2405
2406 In section 7.3, the UPDREQ message is defined, and it says that the
2415 receiving server sends to the requesting server "all of the binding
2416 database information that it has not already seen". In section
2417 7.4.2, the UPDREQALL message is defined, and it says that the receiv-
2418 ing server sends to the requesting server "all binding database
2419 information".
2420
2421 Both of these statements need further elaboration.
2422
2423 First, for the UPDREQ message, the information to be sent in BNDUPD
2424 messages concerns "all of the binding database information it has not
2425 already seen". Since every BNDUPD is acked by the receiving server,
2426 the sending server need only keep track of which IP addresses have
2427 binding database changes not yet seen by the partner, and when they
2428 are finally acked by the partner it can record that. Thus, at any
2429 time, it knows which IP addresses have unacked binding database
2430 information. This is less simple when, across reconfigurations of
2431 the servers, an IP address can change the failover partner to which
2432 it is associated. In that case, it is important to reset the indica-
2433 tion that the partner has seen this binding information. See section
2434 5.17, below, for a more complete discussion of this issue.
2435
Second, in the event that a failover server's binding database
information is restored from a backup, it will be partially out of
date. In this case, its partner's indication of which binding
database information the restored server has seen will also be out of
date.

The solution to this problem is for a server which is connecting with
its partner to check the partner's last communicated time, and if it
is very much ahead of its own last communicated time, go into
RECOVER state and transmit an UPDREQALL to allow it to refresh its
state. See section 9.3.2, step 5. If the partner's last communicated
time is very much behind its own record of when it last communicated
with the partner, then it SHOULD invalidate its information on which
binding database information the partner server knows, so that it
will send all of its relevant binding database information to the
partner.
2451
Third, in the event that a server receives an UPDREQALL message, what
constitutes "all binding database information"? At first glance this
would seem to be information on every configured IP address in the
server. While this would be technically correct, it may impose a
serious and unacceptable performance penalty on servers which have
millions of configured IP addresses. What can be done to lessen the
data that must be sent for an UPDREQALL?
2459
2460 When sending "all binding database information", if the sending
2461 server sends only information concerning IP addresses which have been
2462 at some time associated with clients, it will send enough information
2471 to satisfy the needs of the failover protocol. It need not send
2472 information on any IP addresses that have never been used, since
2473 presumably they will be initialized as available to the primary
2474 server (i.e. FREE) on any server employing failover.
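
The selection of bindings for each request can be sketched as follows.
The Binding record and its field names are hypothetical; a real server
would keep this information in its own binding database.

```python
from dataclasses import dataclass


@dataclass
class Binding:
    ip: str
    ever_used: bool      # has been associated with a client at some time
    partner_acked: bool  # partner has acked the latest change


def bindings_for_updreq(database):
    """Bindings with changes the partner has not yet acked."""
    return [b for b in database if b.ever_used and not b.partner_acked]


def bindings_for_updreqall(database):
    """Every binding which has ever been associated with a client.

    Addresses which were never used are skipped, since they are
    presumed to be initialized as FREE on any failover server.
    """
    return [b for b in database if b.ever_used]
```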
2475
5.17. How do you determine that your partner is "up to date" for a
specific binding?
2478
2479 Throughout this document, one server is assumed to know for each IP
2480 address binding whether or not its partner is "up to date" for that
2481 binding. There are some subtle issues involved in recording this "up
2482 to date" information about a specific binding.
2483
2484 In a steady state world, it would suffice to have a single bit in the
2485 binding database to represent the information about whether the
2486 partner was or was not up to date.
2487
2488 In a more complex environment a configuration change affecting a par-
2489 ticular IP address may change the failover endpoint with which it is
2490 associated, and if this should happen, any "up to date" bit which is
2491 written into the bindings database will be accurate for only the pre-
2492 vious failover endpoint, but not the current failover endpoint. If
2493 failover is disabled and then re-enabled (and the "up to date" bits,
2494 if used, are not cleared) problems can also occur.
2495
A server MUST be able to relate the "up to date" condition to a
particular failover endpoint and even a particular instantiation of
that failover endpoint. The techniques to do this are implementation
dependent.
2500
2501 In addition, section 7.4 requires that a server be able to remember
2502 that an UPDREQALL message has been received and to treat every UPDREQ
2503 message as an UPDREQALL message until the first UPDDONE message is
2504 sent. One way to do this is to clear all of the "up to date" indica-
2505 tions for an entire failover endpoint upon receipt of an UPDREQALL
2506 message, thereby ensuring that every active binding will be sent to
2507 the partner whether through the completion of this UPDREQALL or
2508 through processing of a subsequent UPDREQ message. This is actually
2509 better than remembering that an UPDREQALL was received and turning
2510 every UPDREQ into an UPDREQALL, since any information sent in an
2511 incomplete UPDREQALL (or subsequent UPDREQ messages turned into "all"
2512 messages) will be remembered and not re-sent.
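
Since the technique is implementation dependent, the following is only
one possible sketch. It assumes a hypothetical per-endpoint generation
counter: an "up to date" mark is valid only for the endpoint and
generation under which it was recorded, so incrementing the counter
clears every mark for that endpoint at once (for example upon receipt
of an UPDREQALL, or when failover is disabled and re-enabled).

```python
class UpToDateTracker:
    """Tracks, per IP address, whether the partner is up to date."""

    def __init__(self):
        self._generation = {}  # endpoint name -> current generation
        self._acked = {}       # ip -> (endpoint name, generation)

    def _current(self, endpoint):
        return self._generation.setdefault(endpoint, 0)

    def mark_acked(self, ip, endpoint):
        """The partner at `endpoint` acked the latest change for ip."""
        self._acked[ip] = (endpoint, self._current(endpoint))

    def mark_changed(self, ip):
        """The binding changed locally; the partner has not seen it."""
        self._acked.pop(ip, None)

    def reset_endpoint(self, endpoint):
        """Invalidate all "up to date" marks for this endpoint."""
        self._generation[endpoint] = self._current(endpoint) + 1

    def is_up_to_date(self, ip, endpoint):
        return self._acked.get(ip) == (endpoint, self._current(endpoint))
```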
2513
2514 6. Common Message Format
2515
This section discusses the message format common to all failover
messages, including the message header format as well as the common
option format. See section 12 for the definitions of the specific
options used in the failover protocol.
2528
2529 6.1. Message header format
2530
2531 The options contained in the payload data section of the failover
2532 message all use a two byte option number and two byte length format.
2533
2534 All failover protocol messages are sent over the TCP connection
2535 between failover endpoints and encoded using a message format
2536 specific to the failover protocol.
2537
2538 There exists a common message format for all failover messages, which
2539 utilizes the options in a way similar to the DHCP protocol. For each
2540 message type, some options are required and some are optional. In
2541 addition, when a message is received any options that are not under-
2542 stood by the receiving server MUST be ignored.
2543
2544 All of the fields in the fixed portion of the message MUST be filled
2545 with correct data in every message sent.
2546
2547 0 1 2 3
2548 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2549 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2550 | message length (2) | msg type (1) |payload off (1)|
2551 +---------------+---------------+---------------+---------------+
2552 | time (4) |
2553 +---------------------------------------------------------------+
2554 | xid (4) |
2555 +---------------------------------------------------------------+
2556 | 0 or more additional header bytes (variable) |
2557 +---------------------------------------------------------------+
2558 | payload data (variable) |
2559 | |
2560 | formatted as DHCP-style options |
2561 | using a two byte option code and two byte length |
2562 | See section 6.2 for details. |
2563 +---------------------------------------------------------------+
2564
2565
2566
2567 message length - 2 bytes, network byte order
2568
2569 This is the length of the message in bytes. It includes the two byte
2570 message length itself. The maximum length is 2048 bytes. The
2571 minimum length is 12.
2572
2583 msg type - 1 byte
2584
2585 The message type field is used to distinguish between messages.
2586
2587 The following message types are defined:
2588
Value  Message Type  Description
-----  ------------  -----------
    0  reserved      not used
    1  POOLREQ       request allocation of addresses
    2  POOLRESP      respond with allocation count
    3  BNDUPD        update partner with binding info
    4  BNDACK        acknowledge receipt of binding update
    5  CONNECT       establish connection with the secondary
    6  CONNECTACK    respond to attempt to establish connection
                     with partner
    7  UPDREQALL     request full transfer of binding info
    8  UPDDONE       ack send and ack of req'd binding info
    9  UPDREQ        request transfer of un-acked binding info
   10  STATE         inform partner of current state or state change
   11  CONTACT       probe communications integrity with partner
   12  DISCONNECT    close a connection
2604
2605
New message types should be defined in one of two ranges, 0-127 or
128-255. The range of 0-127 is used for messages that MUST be
supported by every server, and if a server receives a message in the
range of 0-127 that it doesn't understand, it MUST close the TCP
connection. The range of 128-255 is used for messages which MAY be
supported but are not required, and if a server receives a message in
this range that it does not understand it SHOULD ignore the message.
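
The handling of message types that a server does not understand can
be sketched as below. The function name and the returned action
strings are hypothetical; KNOWN_TYPES lists the message types 1
through 12 defined above.

```python
KNOWN_TYPES = frozenset(range(1, 13))  # POOLREQ (1) .. DISCONNECT (12)


def action_for_message_type(msg_type, known=KNOWN_TYPES):
    """Return "process", "close-connection", or "ignore"."""
    if not 0 <= msg_type <= 255:
        raise ValueError("msg type must fit in one byte")
    if msg_type in known:
        return "process"
    if msg_type <= 127:
        # Required range: an unknown type closes the TCP connection.
        return "close-connection"
    # Optional range (128-255): an unknown type is ignored.
    return "ignore"
```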
2613
2614 payload offset - 1 byte
2615
2616 The byte offset of the Payload Data, from the beginning of the
2617 failover message header. The value for the current protocol version
2618 (version 1) is 8.
2619
2620 time - 4 bytes, network byte order
2621
2622 The absolute time in GMT when the message was transmitted,
2623 represented as seconds elapsed since Jan 1, 1970 (i.e., similar to
2624 the ANSI C time_t time value representation). While the ANSI C
2625 time_t value is signed, the value used in this specification is
2626 unsigned.
2627
2628 A server SHOULD set this time as close to the actual transmission of
2629 the message as possible.
2630
2639 xid - 4 bytes, network byte order
2640
2641 This is the transaction id of the failover message. The sender of a
2642 failover protocol message is responsible for setting this number, and
2643 the receiver of the message copies the number over into any response
2644 message, treating it as opaque data. The sender MUST ensure that
2645 every message sent from a particular failover endpoint over the
2646 associated TCP connection has a unique transaction id.
2647
2648 For failover messages that have no corresponding response message,
2649 the XID value is meaningless, but MUST be supplied. The XID value is
2650 used solely by the receiver of a response message to determine the
2651 corresponding request message.
2652
2653 Request messages where the XID is used in the corresponding response
2654 messages are: POOLREQ, BNDUPD, CONNECT, UPDREQALL, and UPDREQ. The
2655 corresponding response messages are POOLRESP, BNDACK, CONNECTACK,
2656 UPDDONE, and UPDDONE, respectively.
2657
2658 As requests/responses don't survive connection reestablishment, XIDs
2659 only need to be unique during a specific connection.
2660
2661
2662 payload data - variable length
2663
2664 The options are placed after the header, after skipping payload
2665 offset bytes from beginning of the message. The payload data options
2666 are not preceded by a "cookie" value.
2667
2668 The payload data is formatted as DHCP style options using two byte
2669 option codes and two byte option lengths. The option codes are in a
2670 namespace which is unique to the failover protocol.
2671
2672 The maximum length of the payload data in octets is 2048 less the
2673 size of the header, i.e., the maximum message length is 2048 octets.
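
A receiver might decode the fixed portion of the header as sketched
below. This is an illustration only: the function name and the
returned dictionary are hypothetical. The sketch takes the start of
the payload from the payload offset field carried in the message
rather than from a fixed constant, so that any additional header
bytes are skipped.

```python
import struct

# message length (2), msg type (1), payload offset (1), time (4),
# xid (4), all in network byte order
HEADER = struct.Struct("!HBBII")
MIN_MESSAGE_LENGTH = 12
MAX_MESSAGE_LENGTH = 2048


def parse_message(data: bytes) -> dict:
    """Split one failover message into header fields and payload."""
    if len(data) < HEADER.size:
        raise ValueError("message shorter than the fixed header")
    length, msg_type, payload_off, time, xid = HEADER.unpack_from(data)
    if not MIN_MESSAGE_LENGTH <= length <= MAX_MESSAGE_LENGTH:
        raise ValueError("message length out of range")
    if length > len(data):
        raise ValueError("message is truncated")
    if payload_off > length:
        raise ValueError("payload offset beyond end of message")
    return {
        "msg_type": msg_type,
        "time": time,  # unsigned seconds since Jan 1, 1970
        "xid": xid,
        "payload": data[payload_off:length],
    }
```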
2674
2675 6.2. Common option format
2676
2677 The options contained in the payload data section of the failover
2678 message all use a two byte option number and two byte length format.
2679
2680 The option numbers are drawn from an option number space unique to
2681 the failover protocol. All of the message types share a common
2682 option number space and common options definitions, though not all
2683 options are required or meaningful for every message.
2684
In contrast to the options which appear in DHCP client and server
messages, the options in failover messages are ordered. That is, for
some messages the order in which the options appear in the payload
data area is significant. The messages for which option ordering is
significant explicitly describe the ordering requirements. If no
ordering requirements are mentioned, then the order is not
significant for that message.
2700
All options which refer to time use an absolute time in GMT. Time
synchronization has already been achieved between the source and the
target server using the CONNECT message and is updated and refined
using the time in every packet.
2705
2706 The time value is an unsigned 32 bit integer in network byte order
2707 giving the number of seconds since 00:00 UTC, 1st January 1970. This
2708 can be converted to an NTP timestamp by adding decimal 2208988800.
2709 This time format will not wrap until the year 2106. Until sometime
2710 in 2038, it is equal to the ANSI C time_t value (which is a signed 32
2711 bit value and will overflow into a negative number in 2038).
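
The conversion to NTP seconds and the wrap year can be checked with a
short calculation. The function name below is hypothetical; the
constant is the one given in the text.

```python
from datetime import datetime, timedelta, timezone

NTP_OFFSET = 2208988800  # seconds from 1 Jan 1900 to 1 Jan 1970


def failover_time_to_ntp_seconds(t: int) -> int:
    """Convert an unsigned 32 bit failover time to NTP seconds."""
    if not 0 <= t <= 0xFFFFFFFF:
        raise ValueError("time must be an unsigned 32 bit value")
    return t + NTP_OFFSET


# Largest representable time, expressed as a calendar date in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
LAST_REPRESENTABLE = EPOCH + timedelta(seconds=0xFFFFFFFF)
```

Here LAST_REPRESENTABLE falls in the year 2106, which is the wrap
point stated above.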
2712
2713 Options should appear once only in each message (except for BNDUPD
2714 and BNDACK messages where bulking is used, see section 6.3 for
2715 details.) An option that appears twice is not concatenated, but
2716 treated as an error.
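
A parser for the payload data might look like the following sketch.
It assumes, as in DHCP, that the two byte length counts only the
option data and not the four bytes of option code and length; the
function name and the allow_repeats parameter (for BNDUPD and BNDACK
messages where bulking is used) are hypothetical. Options are
returned as a list so that their order is preserved.

```python
import struct


def parse_options(payload: bytes, allow_repeats: bool = False):
    """Return a list of (option code, option data) in wire order."""
    options = []
    seen = set()
    pos = 0
    while pos < len(payload):
        if pos + 4 > len(payload):
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", payload, pos)
        pos += 4
        if pos + length > len(payload):
            raise ValueError("truncated option data")
        if code in seen and not allow_repeats:
            # A repeated option is an error, not a concatenation.
            raise ValueError(f"option {code} appears twice")
        seen.add(code)
        options.append((code, payload[pos:pos + length]))
        pos += length
    return options
```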
2717
2718 Specific option values are described in section 12.
2719
2720 See section 13 for how to define additional options.
2721
2722 6.3. Batching multiple binding update transactions in one BNDUPD mes-
2723 sage
2724
2725 Implementations of this protocol MAY send multiple binding update
2726 transactions in one BNDUPD message, where a binding update transac-
2727 tion is defined as the set of options which are associated with the
2728 update of a single IP address. All implementations of this protocol
2729 MUST be prepared to receive BNDUPD messages which contain multiple
2730 binding update transactions and respond correctly to them, including
2731 replying with a BNDACK message which contains status for the multiple
2732 binding update transactions contained in the BNDUPD message.
2733
2734 In the discussion of sending and receiving BNDUPD messages in section
2735 7.1 and BNDACK messages in section 7.2, each BNDUPD message and
2736 BNDACK message is assumed to contain a single binding update transac-
2737 tion in order to reduce the complexity of the discussions in section
2738 7.
2739
2740 Multiple binding update transactions MAY be batched together in one
2741 BNDUPD protocol message with the data sets for the individual tran-
2742 sactions delimited by the assigned-IP-address option, which MUST
2751 appear first in the option set for each transaction. Ordering of
2752 options between the assigned-IP-address options is not significant.
2753 This is illustrated in the following schematic representation:
2754
2755
2756 Non-IP Address/Non-client specific options first
2757 assigned-IP-address option for the first IP address
2758 Options pertaining to first address, including at least the
2759 binding-status option and others as required.
2760 assigned-IP-address option for the second IP address
2761 Options pertaining to second address, including at least the
2762 binding-status option and others as required.
2763 ...
2764 Trailing options (message digest).
2765
2766
2767 There MUST be a one-to-one correspondence between BNDUPD and BNDACK
2768 messages, and every BNDACK message MUST contain status for all of the
2769 binding update transactions in the corresponding BNDUPD message.
2770
2771 The BNDACK message corresponding to a BNDUPD message MUST contain
2772 assigned-IP-address options for all of the binding update transac-
2773 tions in the BNDUPD message. Thus, every BNDACK message contains
2774 exactly the same assigned-IP-address options as does its correspond-
2775 ing BNDUPD message. The order of the assigned-IP-address options
2776 MAY, however, be different. Here is a schematic representation of a
2777 BNDACK:
2778
2779
2780 Non-IP Address/Non-client specific options first
2781 assigned-IP-address option for the first IP address
2782    If rejected, reject-reason option and message option.
2783 assigned-IP-address option for the second IP address
2784    If rejected, reject-reason option and message option.
2785 ...
2786 Trailing options (message digest).
2787
2788
2789 If the server rejects some or all of the binding update transactions
2790 in a BNDUPD message, the corresponding BNDACK message MUST contain a
2791 reject-reason option following the assigned-IP-address option of every
2792 rejected transaction, in order to indicate that the binding update
2793 transaction for that IP address was not accepted, and why. As
2794 with a BNDACK message containing a single binding update transaction,
2795 an assigned-IP-address option without any associated reject-reason
2796 option indicates a successful binding update transaction.
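
The delimiting rule for batched transactions can be sketched in Python.
This is only an illustration under stated assumptions: options are
taken to be already decoded into (name, value) pairs in wire order,
trailing options such as the message digest are assumed to have been
removed by the caller, and the function name split_transactions and the
placeholder option name "leading-option" are hypothetical.

```python
def split_transactions(options):
    """Split decoded BNDUPD options into binding update transactions.

    Each assigned-IP-address option starts a new transaction. Options
    seen before the first one are returned separately. The order of
    options inside a transaction is not significant, so each
    transaction is stored as a dict.
    """
    leading = []
    transactions = []
    for name, value in options:
        if name == "assigned-IP-address":
            transactions.append({"assigned-IP-address": value})
        elif transactions:
            transactions[-1][name] = value
        else:
            leading.append((name, value))
    return leading, transactions


leading, txns = split_transactions([
    ("leading-option", "x"),  # placeholder for a non-IP-address option
    ("assigned-IP-address", "10.0.0.1"),
    ("binding-status", "ACTIVE"),
    ("assigned-IP-address", "10.0.0.2"),
    ("binding-status", "FREE"),
])
```

A receiver would then decide on each transaction separately and report
one status per assigned-IP-address in the single BNDACK reply.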
2797
2807 7. Protocol Messages
2808
2809 This section contains the detailed definition of the protocol mes-
2810 sages, including the information to include when sending the message,
2811 as well as the actions to take upon receiving the message. The mes-
2812 sage type for each message appears as [n] in the heading for the mes-
2813 sage (see section 6.1).
2814
2815 7.1. BNDUPD message [3]
2816
2817 The binding update (BNDUPD) message is used to send the binding data-
2818 base changes (known as binding update transactions) to the partner
2819 server, and the partner server responds with a binding acknowledge-
2820 ment (BNDACK) message when it has successfully committed those
2821 changes to its own stable storage.
2822
2823 The rest of the failover protocol exists to determine whether the
2824 partner server is able to communicate or not, and to enable the
2825 partners to exchange BNDUPD/BNDACK messages in order to keep their
2826 binding databases in stable storage synchronized.
2827
2828 The rest of this section is written as though every BNDUPD message
2829 contains only a single binding update transaction in order to reduce
2830 the complexity of the discussion. See section 6.3 for information on
2831 how to create and process BNDUPD and BNDACK messages which contain
2832 multiple binding update transactions. Note that while a server MAY
2833 generate BNDUPD messages with multiple binding update transactions,
2834 every server MUST be able to process a BNDUPD message which contains
2835 multiple binding update transactions and generate the corresponding
2836 BNDACK messages with status for multiple binding update transactions.
2837
2838 The following table summarizes the various options for the BNDUPD
2839 message.
2840
2865                             binding-status                   BACKUP
2866                                                              RESET
2867                                                              ABANDONED
2868 Option                      ACTIVE   EXPIRED     RELEASED    FREE
2869 ------                      ------   -------     --------    ----
2870 assigned-IP-address (3)     MUST     MUST        MUST        MUST
2871 IP-flags                    MUST(4)  MUST(4)     MUST(4)     MUST(4)
2872 binding-status              MUST     MUST        MUST        MUST
2873 client-identifier           MAY      MAY         MAY         MAY(2)
2874 client-hardware-address     MUST     MUST        MUST        MAY(2)
2875 lease-expiration-time       MUST     MUST NOT    MUST NOT    MUST NOT
2876 potential-expiration-time   MUST     MUST NOT    MUST NOT    MUST NOT
2877 start-time-of-state         SHOULD   SHOULD      SHOULD      SHOULD
2878 client-last-trans.-time     MUST     SHOULD      MUST        MAY
2879 DDNS(1)                     SHOULD   SHOULD      SHOULD      SHOULD
2880 client-request-options      SHOULD   SHOULD NOT  SHOULD      SHOULD NOT
2881 client-reply-options        SHOULD   SHOULD NOT  SHOULD NOT  SHOULD NOT
2882
2883 (1) MUST if server is performing dynamic DNS for this IP address, else
2884 MUST NOT.
2885 (2) MUST NOT if binding-status is ABANDONED.
2886 (3) assigned-IP-address MUST be the first option for an IP address
2887 (4) IP-flags option MUST appear if any flags are non-zero, else it
2888 MAY appear.
2889
2890 Table 7.1-1: Options used in a BNDUPD message
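
As an illustration, the MUST and MUST NOT entries of Table 7.1-1 can be
held as data and checked mechanically. The sketch below is not part of
the protocol; it assumes a transaction is a dict keyed by option name,
reads the last table column as covering FREE, ABANDONED, RESET, and
BACKUP, and leaves out the conditional footnotes (1) and (4). The names
TABLE and check_transaction are hypothetical.

```python
# Columns: ACTIVE, EXPIRED, RELEASED, then the shared last column.
COLUMN = {"ACTIVE": 0, "EXPIRED": 1, "RELEASED": 2}
TABLE = {
    "assigned-IP-address": ("MUST", "MUST", "MUST", "MUST"),
    "binding-status": ("MUST", "MUST", "MUST", "MUST"),
    "client-identifier": ("MAY", "MAY", "MAY", "MAY"),
    "client-hardware-address": ("MUST", "MUST", "MUST", "MAY"),
    "lease-expiration-time": ("MUST", "MUST NOT", "MUST NOT", "MUST NOT"),
    "potential-expiration-time": ("MUST", "MUST NOT", "MUST NOT", "MUST NOT"),
    # Listed as client-last-trans.-time in the table.
    "client-last-transaction-time": ("MUST", "SHOULD", "MUST", "MAY"),
}
CLIENT_INFO = ("client-identifier", "client-hardware-address")


def check_transaction(txn):
    """Return MUST / MUST NOT violations for one binding update."""
    status = txn["binding-status"]
    col = COLUMN.get(status, 3)
    problems = []
    for option, levels in TABLE.items():
        level = levels[col]
        # Footnote (2): no client information when ABANDONED.
        if status == "ABANDONED" and option in CLIENT_INFO:
            level = "MUST NOT"
        if level == "MUST" and option not in txn:
            problems.append("missing " + option)
        elif level == "MUST NOT" and option in txn:
            problems.append("unexpected " + option)
    return problems


abandoned = {
    "assigned-IP-address": "10.0.0.2",
    "binding-status": "ABANDONED",
    "client-hardware-address": b"\x01\x00\x11\x22\x33\x44\x55",
}
problems = check_transaction(abandoned)
```

SHOULD and MAY entries are deliberately not enforced here, since a
receiver cannot treat their absence as a protocol violation.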
2891
2892
2893 7.1.1. Sending the BNDUPD message
2894
2895 A BNDUPD message SHOULD be generated whenever any binding changes. A
2896 change might be in the binding-status, the lease-expiration-time, or
2897 even just the client-last-transaction-time. In general, any time a
2898 DHCP server writes to its stable storage, a BNDUPD message SHOULD be
2899 generated. This will often be the result of the processing of a DHCP
2900 client request, but it might also be the result of a successful
2901 dynamic DNS update operation. Stable storage updates due to BNDUPD
2902 or BNDACK messages SHOULD NOT result in additional BNDUPD messages.
2903
2904 BNDUPD (and BNDACK) messages refer to the binding-status of the IP
2905 address, and this protocol defines a series of binding-statuses, dis-
2906 cussed in more detail below. Some servers may not support all of
2907 these binding-statuses, and so in those cases they will not be sent.
2908 Upon receipt of a BNDUPD message which contains an unsupported
2909 binding-status, a reasonable interpretation should be made (see sec-
2910 tion 5.10).
2911
2919 All BNDUPD messages MUST contain the IP address of the binding update
2920 transaction in the assigned-IP-address option.
2921
2922 All binding update transactions MUST contain an IP-flags option if
2923 the value of any of the flags would be non-zero. The IP-flags option
2924 MAY be omitted if all of the flags that it contains are zero. The
2925 IP-flags option contains a flag which indicates if the IP address is
2926 currently reserved on the server sending the BNDUPD message. It also
2927 contains a flag which indicates that the lease is associated with a
2928 client that used the BOOTP protocol (as opposed to the DHCP protocol)
2929 to interact with the DHCP server.
2930
2931 All binding update transactions contain a binding-status option, and
2932 it will have one of the values found in section 5.11. Client infor-
2933 mation consists of client-hardware-address and possibly a client-
2934 identifier, and is explained in more detail later in this section.
2935 The following table indicates whether client information should or
2936 should not appear with each binding-status in a binding update tran-
2937 saction:
2938
2939
2940 binding-status    includes client information
2941 ------------------------------------------------
2942 ACTIVE            MUST
2943 EXPIRED           SHOULD
2944 RELEASED          SHOULD
2945 FREE              MAY
2946 ABANDONED         MUST NOT
2947 RESET             MAY
2948 BACKUP            MAY
2949
2950 Table 7.1.1-1: Client information required by various
2951 binding-status values.
2952
2953
2954 The ACTIVE binding-status requires some options to indicate the
2955 length of the binding:
2956
2957
2958 o lease-expiration-time
2959
2960 The lease-expiration-time option MUST appear, and be set to the
2961 expiration time most recently ACKed to the DHCP client. Note
2962 that the time ACKed to a DHCP client is a lease duration in
2963 seconds, while the lease-expiration-time option in a BNDUPD mes-
2964 sage is an absolute time value.
2965
2966 o potential-expiration-time
2967
2975 The potential-expiration-time option MUST appear, and be set to
2976 a value beyond that of the lease-expiration time. This is the
2977 value that is ACKed by the BNDACK message. A server sending a
2978 BNDUPD message MUST be able to recover the potential-
2979 expiration-time sent in every BNDUPD, not just those that
2980 receive a corresponding BNDACK, in order to be able to protect
2981 against possible duplicate allocation of IP addresses after
2982 transitioning to PARTNER-DOWN state. See section 5.2.1 for
2983 details as to why the potential-expiration-time exists and
2984 guidelines for how to decide on the value.
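
The relationship between the duration ACKed to the client and the two
absolute times sent for an ACTIVE binding can be sketched as follows.
The function name binding_times and the parameter extra_seconds are
hypothetical; how far the potential-expiration-time should lie beyond
the lease-expiration-time is a policy decision that this sketch does
not attempt to make.

```python
def binding_times(now, lease_seconds, extra_seconds):
    """Return (lease-expiration-time, potential-expiration-time).

    `now` is an absolute time in seconds and `lease_seconds` is the
    lease duration ACKed to the DHCP client. `extra_seconds` is an
    assumed, implementation-chosen margin and must be positive so
    that the potential-expiration-time lies beyond the
    lease-expiration-time.
    """
    if extra_seconds <= 0:
        raise ValueError("extra_seconds must be positive")
    lease_expiration = now + lease_seconds
    potential_expiration = lease_expiration + extra_seconds
    return lease_expiration, potential_expiration


times = binding_times(now=1000, lease_seconds=3600, extra_seconds=1800)
```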
2985
2986 The following option information applies to all BNDUPD messages,
2987 regardless of the value of the binding-status, unless otherwise
2988 noted.
2989
2990 o Identifying the client
2991
2992 For many of the binding-status values a client MUST appear while
2993 for others a client MAY appear, and for some a client MUST NOT
2994 appear.
2995
2996 A client is identified in a BNDUPD message by at least one and pos-
2997 sibly two options. The client-hardware-address option MUST appear
2998 any time that a client appears in a BNDUPD message, and contains
2999 the hardware type and chaddr information from the DHCP request
3000 packet. A failover client-identifier option MUST appear any time
3001 that a client appears in a BNDUPD message if and only if that
3002 client used a DHCP client-identifier option when communicating with
3003 the DHCP server. See sections 12.5 and 12.4 for details of how to
3004 construct these two options from a DHCP request packet.
3005
3006 o start-time-of-state
3007
3008 The start-time-of-state SHOULD appear. It is set to the time at
3009 which this IP address first took on the state that corresponds to
3010 the current value of binding-status.
3011
3012 o client-last-transaction-time
3013
3014 The client-last-transaction-time value SHOULD appear. This is the
3015 time at which this DHCP server last received a packet from the DHCP
3016 client referenced by the client-identifier or client-hardware-address
3017 that was associated with the IP address referenced by the
3018 assigned-IP-address option.
3019
3020 o DDNS
3021
3022 If the DHCP server is performing dynamic DNS operations on behalf
3031 of the DHCP client represented by the client-identifier or client-
3032 hardware-address, then it should include a DDNS option containing
3033 the domain name and status of any dynamic DNS operations enabled.
3034
3035 o client-request-options
3036
3037 If the BNDUPD was triggered by a request from a DHCP client (typi-
3038 cally those with binding-status of ACTIVE and RELEASED), then the
3039 server SHOULD include options of interest to a failover partner
3040 from the client's request packet in the client-request-options for
3041 transmission to its partner (see section 12.8).
3042
3043 A server sending a BNDUPD SHOULD remember the "interesting" options
3044 or the information that would appear in an "interesting" option for
3045 transmission at a time when the BNDUPD is not closely associated
3046 with a DHCP client request.
3047
3048 A server SHOULD send the following "interesting" options. It MAY
3049 send any DHCP client options. As new options are defined, the RFC
3050 defining these options SHOULD include information that they are
3051 "interesting to failover servers" if they should be sent as part of
3052 a BNDUPD.
3053
3054
3055 option   option
3056 number   name
3057 -----------------------------------------
3058
3059  12      host-name
3060  81      client-FQDN [FQDN]
3061  82      relay-agent-information [RFC 3046]
3062  77      user-class [RFC 3004]
3063  60      vendor-class-identifier
3064 118      subnet-selection [RFC 3011]
3065
3066 Table 7.1.1-2: Options which SHOULD be sent in
3067 the client-request-options option in a BNDUPD message.
3068
3069
3070 o client-reply-options
3071
3072 If the BNDUPD was triggered by a request from a DHCP client (typi-
3073 cally those with binding-status of ACTIVE and RELEASED), then the
3074 server SHOULD include options of interest to a failover partner
3075 from the server's DHCP reply packet in the client-reply-options for
3076 transmission to its partner (see section 12.7).
3077
3078 A server sending a BNDUPD SHOULD remember the "interesting" options
3087 or the information that would appear in an "interesting" option for
3088 transmission at a time when the BNDUPD is not closely associated
3089 with a DHCP client request.
3090
3091 A server SHOULD send the following "interesting" options. It MAY
3092 send any DHCP client options. As new options are defined, the RFC
3093 defining these options SHOULD include information that they are
3094 "interesting to failover servers" if they should be sent as part of
3095 a BNDUPD.
3096
3097
3098 option   option
3099 number   name
3100 -----------------------------------------
3101
3102  58      renewal-time
3103  59      rebinding-time
3104
3105 Table 7.1.1-3: Options which SHOULD be sent in
3106 the client-reply-options option in a BNDUPD message.
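
Selecting the "interesting" options listed in Tables 7.1.1-2 and
7.1.1-3 amounts to filtering a packet's options by option number. The
sketch below assumes a DHCP packet's options are available as a dict
from option number to raw value; the names used are hypothetical, and
the encoding of the selected options into the client-request-options
and client-reply-options options is not shown.

```python
# Option numbers taken from Tables 7.1.1-2 and 7.1.1-3.
INTERESTING_REQUEST_OPTIONS = frozenset({12, 60, 77, 81, 82, 118})
INTERESTING_REPLY_OPTIONS = frozenset({58, 59})


def select_options(packet_options, interesting):
    """Return the options of a DHCP packet worth sending to a partner.

    `packet_options` maps DHCP option number to its raw bytes value.
    """
    return {
        code: value
        for code, value in packet_options.items()
        if code in interesting
    }


request = {12: b"host1", 53: b"\x03", 82: b"\x01\x02"}
selected = select_options(request, INTERESTING_REQUEST_OPTIONS)
```

Because a server MAY send any DHCP client options, a superset of these
numbers would also be acceptable.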
3107
3108
3109 The BNDUPD message SHOULD be sent as soon as possible after the DHCP
3110 client has been sent a response and the lease bindings database has
3111 been written to stable storage.
3112
3113 7.1.2. Receiving the BNDUPD message
3114
3115 When a server receives a BNDUPD message, it needs to decide how to
3116 process the binding update transaction it contains and whether that
3117 transaction represents a conflict of any sort. The conflict resolu-
3118 tion process MUST be used on the receipt of every BNDUPD message, not
3119 just those that are received while in POTENTIAL-CONFLICT state, in
3120 order to increase the robustness of the protocol.
3121
3122 There are three sorts of conflicts:
3123
3124 o Two clients, one IP address conflict
3125
3126 This is the duplicate IP address allocation conflict. There are
3127 two different clients each allocated the same address. See sec-
3128 tion 7.1.3 for how to resolve this conflict.
3129
3130 o Two IP addresses, one client conflict
3131
3132 This conflict exists when a client on one server is associated
3133 with one IP address, and on the other server with a different
3134 IP address in the same or a related subnet. This does not refer
3143 to the case where a single client has addresses in multiple dif-
3144 ferent subnets or administrative domains, but rather the case
3145 where on the same subnet the client has a lease on one IP
3146 address in one server and on a different IP address on the other
3147 server.
3148
3149 This conflict may or may not be a problem for a given DHCP
3150 server implementation. In the event that a DHCP server requires
3151 that a DHCP client have only one outstanding lease for an IP
3152 address on one subnet, this conflict should be resolved by
3153 accepting the lease information which has the latest client-
3154 last-transaction-time.
3155
3156 o binding-status conflict
3157
3158 This is a normal conflict, where one server is updating the other
3159 with newer information. See section 7.1.3 for details of how to
3160 resolve these conflicts.
3161
3162 7.1.3. Deciding whether to accept the binding update transaction in a
3163 BNDUPD message
3164
3165 When analyzing a BNDUPD message from a partner server, if there is
3166 insufficient information in the BNDUPD to process it, then reject the
3167 BNDUPD with reject-reason 3: "Missing binding information".
3168
3169 If the IP address in the BNDUPD is not an IP address associated with
3170 the failover endpoint which received the BNDUPD message, then reject
3171 it with reject-reason 1: "Illegal IP address (not part of any address
3172 pool)".
3173
3174 IP addresses undergo binding status changes for several reasons,
3175 including receipt and processing of DHCP client requests, administra-
3176 tive inputs and receipt of BNDUPD messages. Every DHCP server needs
3177 to respond to DHCP client requests and administrative inputs with
3178 changes to its internal record of the binding-status of an IP
3179 address, and this response is not in the scope of the failover proto-
3180 col. However, the receipt of BNDUPD messages implies at least a pos-
3181 sible change of the binding-status for an IP address, and must be
3182 discussed here. See section 7.1.2 for general actions to take upon
3183 receipt of a BNDUPD message.
3184
3185 Every BNDUPD message SHOULD contain a client-last-transaction-time
3186 option, which MUST, if it appears, be the time that the server last
3187 interacted with the DHCP client. It MUST NOT be, for instance, the
3188 time that the lease on an IP address expired. If there has been no
3189 interaction with the DHCP client in question (or there is no DHCP
3190 client presently associated with this IP address), then there will be
3199 no client-last-transaction-time option in the BNDUPD message.
3200
3201 The list in Figure 7.1.3-1 is indexed by the binding-status that a
3202 server receives in a BNDUPD message. In many cases, the binding-
3203 status of an IP address within the receiving server's data storage
3204 will have an effect upon the checks performed prior to accepting the
3205 new binding-status in a BNDUPD message.
3206
3207 In Figure 7.1.3-1, to "accept" a BNDUPD means to update the server's
3208 bindings database with the information contained in the BNDUPD and
3209 once that update is complete, send a BNDACK message corresponding to
3210 the BNDUPD message. To "reject" a BNDUPD means to respond to the
3211 BNDUPD with a BNDACK with a reject-reason option included.
3212
3213 When interpreting the information in the following table (Figure
3214 7.1.3-1), for those rules that are listed with "time" -- if a BNDUPD
3215 doesn't have a client-last-transaction-time value, then it MUST NOT
3216 be considered later than the client-last-transaction-time in the
3217 receiving server's binding. If the BNDUPD contains a client-last-
3218 transaction-time value and the receiving server's binding does not,
3219 then the client-last-transaction-time value in the BNDUPD MUST be
3220 considered later than the server's.
3221
3256                      binding-status in received BNDUPD
3257 binding-status
3258 in receiving                                    FREE       RESET
3259 server         ACTIVE     EXPIRED    RELEASED   BACKUP     ABANDONED
3260
3261 ACTIVE         accept(5)  time(2)    time(1)    time(2)    accept
3262 EXPIRED        time(1)    accept     accept     accept     accept
3263 RELEASED       time(1)    time(1)    accept     accept     accept
3264 FREE/BACKUP    accept     accept     accept     accept     accept
3265 RESET          time(3)    accept     accept     accept     accept
3266 ABANDONED      reject(4)  reject(4)  reject(4)  reject(4)  accept
3267
3268 time(1): If the client-last-transaction-time in the BNDUPD
3269 is later than the client-last-transaction-time in the
3270 receiving server's binding, accept it, else reject it.
3271
3272 time(2): If the current time is later than the receiving
3273 server's lease-expiration-time, accept it, else reject it.
3274
3275 time(3): If the client-last-transaction-time in the BNDUPD
3276 is later than the start-time-of-state in the receiving server's
3277 binding, accept it, else reject it.
3278
3279 (1,2,3): If rejecting, use reject reason 15: "Outdated binding
3280 information".
3281
3282 (4): Use reject reason 16: "Less critical binding information".
3283
3284 (5): If the clients in a BNDUPD message and in a receiving
3285 server's binding differ, then if the receiving server is a
3286 secondary accept it, else reject it with a reject reason of 2:
3287 "Fatal conflict exists: address in use by other client".
3288
3289 Figure 7.1.3-1: Accepting BNDUPD messages
3290
3291
3292
3293 If the IP address in the BNDUPD message has the R flag set in the
3294 IP-flags option, indicating it is a reserved IP address, and if the
3295 binding-status in the BNDUPD is BACKUP, then if the receiving server
3296 does not show the IP address as reserved, the receiving server SHOULD
3297 reject the BNDUPD using reject reason 19: "IP not reserved on this
3298 server".
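
The decision rules of Figure 7.1.3-1 can be sketched as a lookup table
plus three time checks. This is an illustrative reading, not a
definitive implementation: bindings and updates are assumed to be dicts
with the hypothetical keys "status", "client", "cltt"
(client-last-transaction-time), "lease_expiration", and
"start_time_of_state"; the names RULES and decide are invented here.
The preliminary checks for reject-reasons 1 and 3 and the reserved
address check for reject reason 19 are not covered. Applying the
missing-value rule to start-time-of-state in time(3) is an assumption.

```python
# Rows: status in the receiving server. Columns: status in the BNDUPD.
# BACKUP is folded into FREE; anything not listed is "accept".
RULES = {
    "ACTIVE": {"ACTIVE": "accept5", "EXPIRED": "time2",
               "RELEASED": "time1", "FREE": "time2"},
    "EXPIRED": {"ACTIVE": "time1"},
    "RELEASED": {"ACTIVE": "time1", "EXPIRED": "time1"},
    "FREE": {},
    "RESET": {"ACTIVE": "time3"},
    "ABANDONED": dict.fromkeys(
        ("ACTIVE", "EXPIRED", "RELEASED", "FREE"), "reject4"),
}


def _norm(status):
    return "FREE" if status == "BACKUP" else status


def _later(update_time, stored_time):
    """A missing update time is never later; a missing stored time
    always loses to a present update time."""
    if update_time is None:
        return False
    if stored_time is None:
        return True
    return update_time > stored_time


def decide(update, binding, now, is_secondary):
    """Return (accepted, reject_reason) for one transaction."""
    row = RULES[_norm(binding["status"])]
    rule = row.get(_norm(update["status"]), "accept")
    if rule == "accept":
        return True, None
    if rule == "reject4":
        return False, 16  # less critical binding information
    if rule == "accept5":
        differ = update.get("client") != binding.get("client")
        if differ and not is_secondary:
            return False, 2  # address in use by other client
        return True, None
    if rule == "time1":
        ok = _later(update.get("cltt"), binding.get("cltt"))
    elif rule == "time2":
        ok = now > binding["lease_expiration"]
    else:  # time3
        ok = _later(update.get("cltt"),
                    binding.get("start_time_of_state"))
    return (True, None) if ok else (False, 15)  # outdated information
```

For example, an EXPIRED update for a binding that the receiver still
holds as ACTIVE is rejected with reason 15 until the receiver's own
lease-expiration-time has passed.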
3299
3300 7.1.4. Accepting the BNDUPD message
3301
3302 When accepting a BNDUPD message, the information contained in the
3311 client-request-options and client-reply-options SHOULD be examined
3312 for any information of interest to this server. For instance, a
3313 server which wished to detect changes in client specified host names
3314 might want to examine and save information from the host-name or
3315 client-FQDN options. Servers which expect to utilize information
3316 from the relay-agent-information option SHOULD store this informa-
3317 tion.
3318
3319 7.1.5. Time values related to the BNDUPD message
3320
3321 There are four time values that MAY be sent in a BNDUPD message.
3322
3323 o lease-expiration-time
3324
3325 The time that the server gave to the client, i.e., the time that
3326 the server believes that the client's lease will expire.
3327
3328 o potential-expiration-time
3329
3330 The time that the server wants to be sure its partner waits
3331 (added to the MCLT) before assuming that this lease has expired.
3332 Typically some time beyond the desired client lease time.
3333
3334 o client-last-transaction-time
3335
3336 The time that the client last interacted with this server.
3337
3338 o start-time-of-state
3339
3340 The time at which the binding first went into the current state.
3341
3342 As discussed in section 5.2, each server knows what its partner has
3343 ACKed with regard to potential-expiration-time. In addition, each
3344 server needs to remember what it has told its partner as the
3345 potential-expiration-time. Moreover, each server must remember what
3346 it has ACKed to the *other* server as the most recent potential-
3347 expiration-time from that server.
3348
3349 Remember that each server both sends a potential-expiration-time and
3350 receives an ACK for it, and also receives a potential-expiration-time
3351 from its partner and must remember what it has ACKed for it.
3352
3353 While they don't have to be named in any particular way, the times
3354 that a server needs to remember for every IP address in order to
3355 implement the failover protocol are:
3356
3357 o lease-expiration-time
3358
3367 The time that a server gave to the DHCP client. A DHCP server
3368 needs to remember this time already, just to be a DHCP server.
3369 A server SHOULD update this time with the lease-expiration-time
3370 received from a partner in a BNDUPD if the received lease-
3371 expiration-time is later than the lease-expiration-time recorded
3372 for this binding.
3373
3374 o sent-potential-expiration-time
3375
3376 The latest time sent to the partner for a potential-expiration-
3377 time.
3378
3379 o acked-potential-expiration-time
3380
3381 The latest time that the partner has ACKed for a potential-
3382 expiration-time. Typically the same as sent-potential-
3383 expiration-time if no BNDUPD is outstanding.
3384
3385 o received-potential-expiration-time
3386
3387 The latest time that this server has ever received as a
3388 potential-expiration-time from its partner in a BNDUPD that this
3389 server ACKed.
3390
3391 So, a server has to remember two additional times concerning BNDUPD
3392 messages that it has initiated, and one additional time concerning
3393 BNDUPD messages that it has received. How are these times used?
3394
3395 First, let's look at the time that a DHCP server can offer to a DHCP
3396 client. A server can offer to a DHCP client a time that is no longer
3397 than the MCLT beyond the max( received-potential-expiration-time,
3398 acked-potential-expiration-time). One might think that the server
3399 should be able to offer only the MCLT beyond the acked-potential-
3400 expiration-time, and while that is certainly simple and easy to
3401 understand, it has negative consequences in actual operation.
3402
3403 To illustrate this, in the simple case where the primary updates the
3404 secondary for a while and then fails, if the secondary can then renew
3405 the client for only the MCLT beyond the acked-potential-expiration-
3406 time, then the secondary will only be able to renew the client for
3407 the MCLT, because the secondary has never sent a BNDUPD packet to the
3408 primary concerning this IP address and client, and so its acked-
3409 potential-expiration-time is zero.
3410
3411 However, since the secondary is allowed to renew the client with the
3412 MCLT beyond the max( received-potential-expiration-time, acked-
3413 potential-expiration-time), then the secondary can usually renew the
3414 client for the full lease period, at least for the first renew it
3423 sees from the client, since the received-potential-expiration-time is
3424 generally longer than the client's desired lease interval. The
3425 difference in renew times could make a big difference in server load
3426 on the secondary in this case.
3427
3428 What are the consequences of allowing a server to offer a DHCP client
3429 a lease term of the MCLT beyond the max( received-potential-
3430 expiration-time, acked-potential-expiration-time)? The consequences
3431 appear whenever a server enters PARTNER-DOWN state, and affect how
3432 long that server has to wait before reallocating expired leases.
3433 With this approach, when a server goes into PARTNER-DOWN state, it
3434 must wait the MCLT beyond the max( lease-expiration-time, sent-
3435 potential-expiration-time, acked-potential-expiration-time,
3436 received-potential-expiration-time ) for each IP address before it
3437 can reallocate that IP address to another DHCP client. One might
3438 normally think that it needed to wait only the MCLT beyond the max(
3439 lease-expiration-time, received-potential-expiration-time ), i.e.,
3440 beyond what it has told the client and what it has explicitly acked
3441 to the other server. But with the optimization discussed above,
3442 where either server can offer the DHCP client a lease term of the
3443 MCLT beyond the max( received-potential-expiration-time, acked-
3444 potential-expiration-time ), the additional times sent-
3445 potential-expiration-time and acked-potential-expiration-time must be
3446 added into the expression, since the partner could have used those
3447 times as part of its own lease time calculation.
3448
3449 Thus this optimization may require a longer waiting time when enter-
3450 ing PARTNER-DOWN state, but will generally allow servers to operate
3451 considerably more effectively when running in COMMUNICATIONS-
3452 INTERRUPTED state.
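
The two expressions discussed in this section can be written out as a
short Python sketch. All times are absolute values in seconds, with
zero standing for a time that has never been set. The function names
are hypothetical. Including the current time in the first max() is an
assumption drawn from the illustration above, in which a secondary
with no recorded times can still renew the client for the MCLT.

```python
def latest_offerable_expiration(now, mclt, received_pet, acked_pet):
    """Latest expiration time a server may give a DHCP client.

    received_pet: received-potential-expiration-time
    acked_pet:    acked-potential-expiration-time
    """
    return max(now, received_pet, acked_pet) + mclt


def earliest_reallocation_time(mclt, lease_expiration, sent_pet,
                               acked_pet, received_pet):
    """Earliest time at which a server in PARTNER-DOWN state may
    reallocate the IP address to another DHCP client."""
    return max(lease_expiration, sent_pet, acked_pet, received_pet) + mclt


# A secondary that has never sent a BNDUPD for this address.
limit_without_update = latest_offerable_expiration(1000, 3600, 0, 0)
limit_with_update = latest_offerable_expiration(1000, 3600, 90000, 0)
realloc = earliest_reallocation_time(3600, 5000, 9000, 8000, 7000)
```

In the last call the sent-potential-expiration-time dominates, which is
why it has to be part of the PARTNER-DOWN expression even though the
partner never acknowledged it.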
3453
3454 7.2. BNDACK message [4]
3455
3456 A server sends a binding acknowledgement (BNDACK) message when it has
3457 processed a BNDUPD message and after it has successfully committed to
3458 stable storage any binding database changes made as a result of pro-
3459 cessing the BNDUPD message. A BNDACK message is used either to accept
3460 or to reject a BNDUPD message. A BNDACK message which contains a
3461 reject-reason option is a rejection of the corresponding BNDUPD mes-
3462 sage.
3463
3464 In order to reduce the complexity of the discussion, the rest of this
3465 section is written as though every BNDUPD message contains only a
3466 single binding update transaction and thus every corresponding BNDACK
3467 message would also contain reply information about only a single
3468 binding update transaction. See section 6.3 for information on how
3469 to create and process BNDUPD and BNDACK messages which contain multi-
3470 ple binding update transactions.
3471
3479 Note that while a server MAY generate BNDUPD messages with multiple
3480 binding update transactions, every server MUST be able to process a
3481 BNDUPD message which contains multiple binding update transactions
3482 and generate the corresponding BNDACK messages with status for multi-
3483 ple binding update transactions. If a server does not ever create
3484 BNDUPD messages which contain multiple binding update transactions,
3485 then it does not need to be able to process a received BNDACK message
3486 with multiple binding update transactions. However, all servers MUST
3487 be able to create BNDACK messages which deal with multiple binding
3488 update transactions received in a BNDUPD message.
3489
3490 Every BNDUPD message that is received by a server MUST be responded
3491 to with a corresponding BNDACK message. The receiving server SHOULD
3492 respond quickly to every BNDUPD message but it MAY choose to respond
3493 preferentially to DHCP client requests instead of BNDUPD messages,
3494 since there is no absolute time period within which a BNDACK must be
3495 sent in response to a BNDUPD message, while DHCP clients frequently
3496 have strict time constraints.
3497
3498 A BNDACK message can only be sent in response to a BNDUPD message
3499 using the same TCP connection from which the BNDUPD message was
3500 received, since the XIDs in BNDUPD messages are guaranteed unique
3501 only during the life of a single TCP connection. When a connection
3502 to a partner server goes down, a server with unprocessed BNDUPD mes-
3503 sages MAY simply drop all of those messages, since it can be sure
3504 that the partner will resend them when they are next in communica-
3505 tions (albeit with a different XID), or it MAY instead choose to pro-
3506 cess those BNDUPD messages, but it MUST NOT send any BNDACK messages
3507 in response.
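
Seen from the sender of the BNDUPD messages, the consequence is that
outstanding updates are tracked per TCP connection and re-queued when
the connection is lost. The class below is a minimal sketch of that
bookkeeping; the class and method names are hypothetical and no real
networking is performed.

```python
class FailoverConnection:
    """Tracks unacknowledged BNDUPD messages for one TCP connection."""

    def __init__(self):
        self._next_xid = 1
        self._pending = {}  # xid -> binding update transaction

    def send_bndupd(self, transaction):
        """Record a BNDUPD as outstanding and return its XID."""
        xid = self._next_xid
        self._next_xid += 1
        self._pending[xid] = transaction
        return xid

    def receive_bndack(self, xid):
        """Return the acknowledged transaction, or None if the XID
        is unknown on this connection."""
        return self._pending.pop(xid, None)

    def close(self):
        """Connection lost: return the updates that must be resent
        on the next connection, where they will get new XIDs."""
        unacked = list(self._pending.values())
        self._pending.clear()
        return unacked


conn = FailoverConnection()
first = conn.send_bndupd("update for 10.0.0.1")
second = conn.send_bndupd("update for 10.0.0.2")
acked = conn.receive_bndack(first)
to_resend = conn.close()
```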
3508
3509 The following table summarizes the options for the BNDACK message.
3510
3537 Option                      accept      reject
3538 ------                      ------      ------
3539 assigned-IP-address (1)     MUST        MUST
3540 IP-flags                    SHOULD NOT  SHOULD NOT
3541 binding-status              SHOULD NOT  SHOULD NOT
3542 client-identifier           SHOULD NOT  SHOULD NOT
3543 client-hardware-address     SHOULD NOT  SHOULD NOT
3544 reject-reason               SHOULD NOT  MUST
3545 message                     SHOULD NOT  SHOULD
3546 lease-expiration-time       SHOULD NOT  SHOULD NOT
3547 potential-expiration-time   SHOULD NOT  SHOULD NOT
3548 start-time-of-state         SHOULD NOT  SHOULD NOT
3549 client-last-trans.-time     SHOULD NOT  SHOULD NOT
3550 DDNS(1)                     SHOULD NOT  SHOULD NOT
3551
3552 (1) assigned-IP-address MUST be the first option for an IP address
3553
3554 Table 7.2-1: Options used in a BNDACK message
3555
3556
3557 7.2.1. Sending the BNDACK message
3558
3559 The BNDACK message MUST contain the same xid as the corresponding
3560 BNDUPD message.
3561
The assigned-IP-address option from the BNDUPD message MUST be
included in the BNDACK message. Any additional options from the
BNDUPD message SHOULD NOT appear in the BNDACK message. Note that
any information sent in options in the BNDACK message (e.g., a later
lease-expiration time) MUST NOT be assumed to be recorded in the
stable storage of the server that receives the BNDACK message,
because there is no corresponding ACK of the BNDACK message. Any
information that SHOULD be recorded in the partner server's stable
storage MUST be transmitted in a subsequent BNDUPD.
3571
3572 If the server is accepting the BNDUPD, the BNDACK message includes
3573 only the assigned-IP-address option. If the server is rejecting the
3574 BNDUPD, the additional option reject-reason MUST appear in the BNDACK
3575 message, and the message option SHOULD appear in this case containing
3576 a human-readable error message describing in some detail the reason
3577 for the rejection of the BNDUPD message.
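
As an illustration of the accept and reject cases, a BNDACK could be
assembled along the following lines. The dictionary layout, the
function name build_bndack, and the numeric reject-reason used in the
usage note are assumptions made for this sketch; they are not the
wire format defined by this document.

```python
def build_bndack(xid, assigned_ip, reject_reason=None, message=None):
    """Return a BNDACK as a plain dict (illustrative layout, not wire format)."""
    # assigned-IP-address MUST be present and MUST come first.
    options = [("assigned-IP-address", assigned_ip)]
    if reject_reason is not None:
        options.append(("reject-reason", reject_reason))  # MUST on reject
        if message:
            options.append(("message", message))          # SHOULD on reject
    # The xid MUST equal the xid of the BNDUPD being answered.
    return {"type": "BNDACK", "xid": xid, "options": options}
```

For example, build_bndack(42, "192.0.2.10") yields an accepting
BNDACK with a single option, while passing reject_reason=3 (an
arbitrary placeholder code) and a message yields a rejecting one.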
3578
3579 If the server rejects the BNDUPD message with a BNDACK and a reject-
3580 reason option, it may be because the server believes that it has
3581 binding information that the other server should know. A server
3582 which is rejecting a BNDUPD may initiate a BNDUPD of its own in order
3591 to update its partner with what it believes is better binding infor-
3592 mation, but it MUST ensure through some means that it will not end up
3593 in a situation where each server is sending BNDUPD messages as fast
3594 as possible because they can't agree on which server has better bind-
3595 ing data. Placing a considerable delay on the initiation of a BNDUPD
3596 message after sending a BNDACK with a reject-reason would be one way
3597 to ensure this situation doesn't occur.
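
One possible way to impose such a delay is sketched below. The
RejectHoldoff class and the 30 second default are assumptions chosen
for illustration; this document does not prescribe a particular
mechanism or value.

```python
class RejectHoldoff:
    """Hypothetical guard: delay our own BNDUPD for an address we just rejected."""

    def __init__(self, delay_seconds: float = 30.0):  # value is an assumption
        self.delay = delay_seconds
        self._rejected_at = {}  # ip -> time the rejecting BNDACK was sent

    def note_reject(self, ip: str, now: float) -> None:
        self._rejected_at[ip] = now

    def may_send_bndupd(self, ip: str, now: float) -> bool:
        rejected = self._rejected_at.get(ip)
        # Allowed if we never rejected this address or the delay has elapsed.
        return rejected is None or now - rejected >= self.delay
```

Because each server waits before countering a rejected update, two
servers that disagree cannot exchange BNDUPD messages at line rate.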
3598
3599 7.2.2. Receiving the BNDACK message
3600
3601 When a server receives a BNDACK message, if it doesn't contain a
3602 reject-reason option that means that the BNDUPD message was accepted,
3603 and the server which sent the BNDUPD SHOULD update its stable storage
3604 with the potential-expiration-time value sent in the BNDUPD message.
3605
3606 If the BNDACK message contains a reject-reason option, that means
3607 that the BNDUPD was rejected. There SHOULD be a message option in
3608 the BNDACK giving a text reason for the rejection, and the server
3609 SHOULD log the message in some way. The server MUST NOT immediately
3610 try to resend the BNDUPD message as there is no reason to believe the
3611 partner won't reject it a second time. However a server MAY choose
3612 to send another BNDUPD at some future time, for instance when the
3613 server next processes an update request from its partner.
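
The receive-side rules above can be summarized in a short sketch.
The message layout (a dict with an "options" list), the function name
handle_bndack, and the use of a dict as "stable storage" are all
assumptions made for illustration.

```python
def handle_bndack(bndack: dict, sent_bndupd: dict,
                  stable_storage: dict, log: list) -> str:
    """Apply the BNDACK receive rules; returns "accepted" or "rejected"."""
    options = dict(bndack["options"])
    ip = options["assigned-IP-address"]
    if "reject-reason" not in options:
        # Accepted: record the potential-expiration-time sent in the BNDUPD.
        stable_storage[ip] = sent_bndupd["potential-expiration-time"]
        return "accepted"
    # Rejected: log the text reason if present; do not resend immediately.
    log.append((ip, options["reject-reason"], options.get("message", "")))
    return "rejected"
```

A caller that sees "rejected" would defer any further BNDUPD for that
address to some later time rather than retrying at once.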
3614
3615 7.3. UPDREQ message [9]
3616
3617 The update request (UPDREQ) message is used by one server to request
3618 that its partner send it all of the binding database information that
3619 it has not already seen. Since each server is required to keep
3620 track at all times of the binding information the other server has
3621 ACKed, one server can request transmission of all un-ACKed binding
3622 database information held by the other server by using the UPDREQ
3623 message.
3624
3625 The UPDREQ message is used whenever the sending server cannot proceed
3626 before it has processed all previously un-ACKed binding update infor-
3627 mation, since the UPDREQ message should yield a corresponding UPDDONE
3628 message. The UPDDONE message is not sent until the server that sent
3629 the UPDREQ message has responded to all of the BNDUPD messages gen-
3630 erated by the UPDREQ message with BNDACK messages (they may either be
3631 accepted or rejected by the BNDACK messages, but they MUST have been
3632 responded to). Thus, the sender of the UPDREQ message can be sure
3633 upon receipt of an UPDDONE message that it has received and committed
3634 to stable storage all outstanding binding database updates.
3635
3636 See section 9, Failover Endpoint States, for the details of when the
3637 UPDREQ message is sent.
3638
3647 7.3.1. Sending the UPDREQ message
3648
3649 The UPDREQ message has no message specific options.
3650
3651 7.3.2. Receiving the UPDREQ message
3652
3653 A server receiving an UPDREQ message MUST send all binding database
3654 changes that have not yet been ACKed by the sending server. These
3655 changes are sent as undistinguished BNDUPD messages.
3656
3657 However, the server which received and is processing the UPDREQ mes-
3658 sage MUST track the BNDACK messages that correspond to the BNDUPD
3659 messages triggered by the UPDREQ message and, when they are all
3660 received, the server MUST send an UPDDONE message.
3661
The server processing the UPDREQ message and sending BNDUPD messages
to its partner SHOULD track only the BNDUPD and BNDACK message pairs
for un-ACKed binding database changes that were present upon receipt
of the UPDREQ message. A server which has received an UPDREQ message
SHOULD still send BNDUPD messages for binding database changes that
occur after receipt of the UPDREQ message, but it SHOULD NOT include
those additional BNDUPD messages and their corresponding BNDACK
messages in the accounting that determines when the UPDREQ is
complete and the UPDDONE message can be sent. If some later binding
database changes end up in the set of BNDUPD messages considered part
of the UPDREQ (due to whatever algorithm the server uses to scan its
binding database for un-ACKed changes), it will probably not cause
any difficulty, but a server MUST NOT attempt to include all such
later BNDUPD messages in the accounting for the UPDREQ, so that it
remains able to transmit an UPDDONE message.
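
The accounting described above amounts to taking a snapshot when the
UPDREQ arrives and never adding to it. A minimal sketch follows; the
UpdreqTracker class is hypothetical, and keying the outstanding set
by IP address rather than by BNDUPD xid is a simplifying assumption.

```python
class UpdreqTracker:
    """Tracks only the changes that were un-ACKed when the UPDREQ arrived."""

    def __init__(self, updreq_xid: int, unacked_ips_at_receipt):
        self.updreq_xid = updreq_xid
        self.outstanding = set(unacked_ips_at_receipt)  # snapshot, never grows
        self.done_sent = False

    def maybe_done(self):
        """Return an UPDDONE exactly once, when nothing is outstanding."""
        if not self.outstanding and not self.done_sent:
            self.done_sent = True
            return {"type": "UPDDONE", "xid": self.updreq_xid}
        return None

    def on_bndack(self, ip: str):
        """Record a BNDACK (accept or reject both count as a response)."""
        self.outstanding.discard(ip)  # BNDACKs for later changes are ignored
        return self.maybe_done()
```

Calling maybe_done() right after construction covers the case where
nothing was un-ACKed at the time the UPDREQ was received.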
3677
3678 When queuing up the BNDUPD messages for transmission to the sender of
3679 the UPDREQ message, the server processing the UPDREQ message MUST
3680 honor the value returned in the max-unacked-bndupd option in the CON-
3681 NECT or CONNECTACK message that set up the connection with the send-
3682 ing server. It MUST NOT send more BNDUPD messages without receiving
3683 corresponding BNDACKs than the value returned in max-unacked-bndupd.
3684 (See section 8 for more details.)
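
A sender-side window that honors max-unacked-bndupd might look like
the sketch below. The BndupdWindow class and its method names are
assumptions for illustration only.

```python
from collections import deque


class BndupdWindow:
    """Hypothetical sender-side limiter for max-unacked-bndupd."""

    def __init__(self, max_unacked: int):
        self.max_unacked = max_unacked  # value the partner sent at connect time
        self.queue = deque()            # BNDUPD xids waiting to be transmitted
        self.in_flight = set()          # xids sent but not yet ACKed

    def enqueue(self, xid: int) -> None:
        self.queue.append(xid)

    def sendable(self) -> list:
        """Pop as many queued xids as the window allows; mark them in flight."""
        out = []
        while self.queue and len(self.in_flight) < self.max_unacked:
            xid = self.queue.popleft()
            self.in_flight.add(xid)
            out.append(xid)
        return out

    def on_bndack(self, xid: int) -> None:
        self.in_flight.discard(xid)     # frees one slot in the window
```

With max_unacked set to 2 and three BNDUPDs queued, the first call to
sendable() releases two of them, and the third is released only after
a BNDACK arrives.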
3685
3686 7.4. UPDREQALL message [7]
3687
3688 The update request all (UPDREQALL) message is used by one server to
3689 request that its partner send it all of the binding database informa-
3690 tion. This message is used to allow one server to recover from a
3691 failure of stable storage and to restore its binding database in its
3692 entirety from the other server.
3693
3694 A server which sends an UPDREQALL message cannot proceed until all of
3703 its binding update information is restored, and it knows that all of
3704 that information is restored when an UPDDONE message is received.
3705
See section 9, Failover Endpoint States, for the details of when the
UPDREQALL message is sent.
3708
3709 The UPDREQALL message has no message specific options.
3710
3711 7.4.1. Sending the UPDREQALL message
3712
3713 The UPDREQALL is sent.
3714
3715 7.4.2. Receiving the UPDREQALL message
3716
3717 A server receiving an UPDREQALL message MUST send all binding data-
3718 base information to the sending server. See section 5.16 for details
3719 of what might actually comprise "all binding database information".
3720
A server receiving an UPDREQALL message MUST remember that such a
message has been received, and MUST ensure that all binding
information extant at that point is sent to the partner prior to any
UPDDONE message being sent to that partner. One way to do this is to
remember the receipt of an UPDREQALL message and to treat every
subsequent UPDREQ message as an UPDREQALL message until it sends the
first UPDDONE message after receipt of the UPDREQALL message. This
requirement exists because communications may fail and become
re-established between the two servers, and the specific conditions
which provoked the UPDREQALL message may no longer exist even though
the UPDREQALL message may not yet have completed. See section 5.17
for information on a more efficient way to meet the above
requirement.
3734
The binding database information is sent as undistinguished BNDUPD
messages. Otherwise the processing is the same as for the UPDREQ
message. See section 7.3.2 for details.
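
The "remember the UPDREQALL" approach suggested above can be sketched
as a single sticky flag. The UpdateRequestPolicy class is
hypothetical; a real server would also need the flag to survive
connection loss, which this sketch only notes in a comment.

```python
class UpdateRequestPolicy:
    """Remembers an UPDREQALL until the first UPDDONE sent after it."""

    def __init__(self):
        # Assumption: kept across reconnects by whatever owns this object.
        self.updreqall_pending = False

    def on_request(self, msg_type: str) -> str:
        """Return which kind of update to perform for an incoming request."""
        if msg_type == "UPDREQALL":
            self.updreqall_pending = True
        # While an UPDREQALL is unfinished, a plain UPDREQ is upgraded.
        return "send-all" if self.updreqall_pending else "send-unacked"

    def on_upddone_sent(self) -> None:
        self.updreqall_pending = False
```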
3738
3739 7.5. UPDDONE message [8]
3740
3741 The update done (UPDDONE) message is used by a server receiving an
3742 UPDREQ or UPDREQALL message to signify that it has sent all of the
3743 BNDUPD messages requested by the UPDREQ or UPDREQALL request and that
3744 it has received a BNDACK for each of those messages.
3745
3746 While a BNDACK message MUST have been received for each BNDUPD mes-
3747 sage prior to the transmission of the UPDDONE message, this doesn't
3748 necessarily mean that all of the BNDUPD messages were accepted, only
3749 that all of them were responded to with a BNDACK message. Thus, a
3750 NAK (comprised of a BNDACK message containing a reject-reason option)
could be used to reject a BNDUPD, but for the purposes of the UPDDONE
message, such a NAK would count as a response to the associated
BNDUPD message, and would not block the eventual transmission of the
UPDDONE message.
3763
3764 The xid in an UPDDONE message MUST be identical to the xid in the
3765 UPDREQ or UPDREQALL message that initiated the update process.
3766
3767 The UPDDONE message has no message specific options.
3768
3769 7.5.1. Sending the UPDDONE message
3770
3771 The UPDDONE message SHOULD be sent as soon as the last BNDACK message
3772 corresponding to a BNDUPD message requested by the UPDREQ or
3773 UPDREQALL is received from the server which sent the UPDREQ or
3774 UPDREQALL. The XID of the UPDDONE message MUST be the same as the
3775 XID of the corresponding UPDREQ or UPDREQALL message.
3776
3777 7.5.2. Receiving the UPDDONE message
3778
3779 A server receiving the UPDDONE message knows that all of the informa-
3780 tion that it requested by sending an UPDREQ or UPDREQALL message has
3781 now been sent and that it has recorded this information in its stable
3782 storage. It typically uses the receipt of an UPDDONE message to move
3783 to a different failover state. See sections 9.5.2 and 9.8.3 for
3784 details.
3785
3786 7.6. POOLREQ message [1]
3787
3788 The pool request (POOLREQ) message is used by the secondary server to
3789 request an allocation of IP addresses from the primary server. It
3790 MUST be sent by a secondary server to a primary server to request IP
3791 address allocation by the primary. The IP addresses allocated are
3792 transmitted using normal BNDUPD messages from the primary to the
3793 secondary.
3794
3795 The POOLREQ message SHOULD be sent from the secondary to the primary
3796 whenever the secondary makes a transition into NORMAL state. It
3797 SHOULD periodically be resent in order that any change in the number
3798 of available IP addresses on the primary be reflected in the pool on
3799 the secondary. The period may be influenced by the secondary
3800 server's leasing activity.
3801
3802 The POOLREQ message has no message specific options.
3803
3804 7.6.1. Sending the POOLREQ message
3805
3806 The POOLREQ message is sent.
3807
3815 7.6.2. Receiving the POOLREQ message
3816
3817 When a primary server receives a POOLREQ message it SHOULD examine
3818 the binding database and determine how many IP addresses the secon-
3819 dary server should have, and set these IP addresses to BACKUP state.
3820 It SHOULD then send BNDUPD messages concerning all of these IP
3821 addresses to the secondary server.
3822
3823 Servers frequently have several kinds of IP addresses available on a
3824 particular network segment. The failover protocol assumes that both
3825 primary and secondary servers are configured in such a way that each
3826 knows the type and number of IP addresses on every network segment
3827 participating in the failover protocol. The primary server is
3828 responsible for allocating the secondary server the correct propor-
3829 tion of available IP addresses of each kind, and the secondary server
3830 is responsible for being configured in such a way that it can tell
3831 the kind of every IP address based solely on the IP address itself.
3832
3833 A primary server MUST keep track of how many IP addresses were allo-
3834 cated as a result of processing the POOLREQ message, and send that
3835 number in the POOLRESP message.
3836
3837 A primary server MAY choose to defer processing a POOLREQ message
3838 until a more convenient time to process it, but it should not depend
3839 on the secondary server to resend the POOLREQ message in that case.
3840
3841 If a secondary server receives a POOLREQ message it SHOULD report an
3842 error.
3843
3844 7.7. POOLRESP message [2]
3845
3846 A primary server sends a POOLRESP message to a secondary server after
3847 the allocation process for available addresses to the secondary
3848 server is complete. Typically this message will precede some of the
3849 BNDUPD messages that the primary uses to send the actual allocated IP
3850 addresses to the secondary.
3851
3852 The xid in the POOLRESP message MUST be identical to the xid in the
3853 POOLREQ message for which this POOLRESP is a response.
3854
3855
3856 7.7.1. Sending the POOLRESP message
3857
3858 The POOLRESP message MUST contain the same xid as the corresponding
3859 POOLREQ message.
3860
Exactly one option MUST appear in a POOLRESP message:
3862
3871 o addresses-transferred
3872
3873 The number of addresses allocated to the secondary server by the
3874 primary server as a result of a POOLREQ is contained in the
3875 addresses-transferred option in a POOLRESP message. Note this
3876 is the number of addresses that are transferred to the secondary
3877 in the primary's binding database as a result of the correspond-
3878 ing POOLREQ message, and that it may be some time before they
3879 can all be transmitted to the secondary server through the use
3880 of BNDUPD messages.
3881
3882 7.7.2. Receiving the POOLRESP message
3883
3884 When a secondary server receives a POOLRESP message, it SHOULD send
3885 another POOLREQ message if the value of the addresses-transferred
3886 option is non-zero.
3887
3888 Typically, no other action is taken on the reception of a POOLRESP
3889 message.
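
The POOLREQ/POOLRESP exchange can be illustrated with two small
handlers. The dict-based message layout, the function names, and the
allocate callback are assumptions made for this sketch; how the
primary actually chooses addresses is not shown.

```python
def handle_poolreq(poolreq: dict, allocate) -> dict:
    """Primary side: allocate, then answer with the same xid and the count."""
    # Hypothetical callback: moves addresses to BACKUP state, returns how many.
    count = allocate()
    return {"type": "POOLRESP", "xid": poolreq["xid"],
            "options": {"addresses-transferred": count}}


def handle_poolresp(poolresp: dict) -> bool:
    """Secondary side: True means another POOLREQ SHOULD be sent."""
    return poolresp["options"]["addresses-transferred"] != 0
```

A secondary following this rule keeps asking until the primary
reports that no further addresses were transferred.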
3890
3891 7.8. CONNECT message [5]
3892
The connect (CONNECT) message is used to establish an
application-level connection over a newly created TCP connection. It
gives the source information for the connection and critical
configuration information. Either server can initiate a TCP
connection, but the CONNECT message MUST be sent only by the primary
server.

The CONNECT message MUST be the first message sent down a newly
established connection.
3902
3903 The following table summarizes the options that are associated with
3904 the CONNECT message:
3905
Option
------
relationship-name           MUST
max-unacked-bndupd          MUST
receive-timer               MUST
vendor-class-identifier     MUST
protocol-version            MUST
TLS-request                 MUST (1)
MCLT                        MUST
hash-bucket-assignment      MUST
3939
3940 (1) MUST NOT if CONNECT is being sent over a TLS connection
3941
3942 Table 7.8-1: Options used in a CONNECT message
3943
3944
3945 7.8.1. Sending the CONNECT message
3946
3947 The CONNECT message MUST be the first message sent by the primary
3948 server after the establishment of a new TCP connection with a secon-
3949 dary server participating in the failover protocol.
3950
3951 The xid of the CONNECT message is not related to any previous xid
3952 sequence, but initiates the sequence for this connection.
3953
3954 The name of the failover relationship MUST be placed in the
3955 relationship-name option. This information is placed in an option
3956 inside of the message in order to allow the identity of the sender to
3957 be covered by a shared secret.
3958
3959 The number of BNDUPD messages the primary server can accept without
3960 blocking the TCP connection MUST be placed in the max-unacked-bndupd
3961 option. This MUST be a number equal to or greater than 1, SHOULD be
3962 a number greater than 10, and SHOULD be a number less than 100.
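
The three constraints on max-unacked-bndupd can be checked
mechanically. The function below is a sketch; its name and the
wording of the findings are assumptions, and it treats the SHOULD
bounds as warnings rather than errors.

```python
def check_max_unacked_bndupd(value: int) -> list:
    """Return a list of findings; an empty list means the value is fine."""
    if value < 1:
        return ["error: MUST be equal to or greater than 1"]
    findings = []
    if value <= 10:
        findings.append("warning: SHOULD be greater than 10")
    if value >= 100:
        findings.append("warning: SHOULD be less than 100")
    return findings
```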
3963
3964 The length of the receive timer (tReceive, see section 8.3) MUST be
3965 placed in the receive-timer option.
3966
3967 The MCLT MUST be placed in the MCLT option.
3968
3969 The hash-bucket-assignment option MUST be included in the CONNECT
3970 message. In the event that load balancing is not configured for this
3971 server, the hash-bucket-assignment option will indicate that. The
3972 value of the hash-bucket-assignment option is determined from the
3973 specific buckets that the primary server has determined that the
3974 secondary server MUST service as part of the load-balancing
3983 algorithm. The way in which the primary server determines this
3984 information is outside the scope of this protocol definition. The
3985 primary server SHOULD be configured with a percentage of clients that
3986 the secondary server will be instructed to service, and the primary
3987 server SHOULD use the algorithm in [RFC 3074] to generate a Hash
3988 Bucket Assignment which it sends to the secondary server.
3989
3990 The vendor class identifier MUST be placed in the vendor-class-
3991 identifier option.
3992
3993 The protocol-version option MUST be included in every CONNECT mes-
3994 sage. The current value of the protocol version is 1.
3995
The TLS-request option MUST be sent and contains the desired TLS
connection request as well as information concerning whether TLS is
supported. If this CONNECT message is being sent over an already
created TLS connection, the TLS-request option MUST NOT appear.
4000
4001 7.8.2. Receiving the CONNECT message
4002
When a server establishes a TCP connection on a failover port, if it
is a primary server it should send a CONNECT message, and if it is a
secondary server it should wait for a CONNECT message before sending
any messages. To avoid denial of service attacks, a secondary should
only wait for a CONNECT message on a new connection for a limited
amount of time and close the connection if none is received during
that time.
4010
4011 When a secondary server receives a CONNECT message it should:
4012
4013 1. Record the time at which the message was received.
4014
4015 2. Examine the protocol-version option, and decide if this server
4016 is capable of interoperating with another server running that
4017 protocol version. If not, send the CONNECTACK message with
4018 the reject reason 14: "Protocol version mismatch". The server
4019 MUST include its protocol-version in the CONNECTACK message.
4020
4021 3. Examine the TLS-request option. Figure out the TLS-reply
4022 value based on the capabilities and configuration of this
4023 server. If the result for the TLS-reply value is a 1 and the
4024 connection is accepted, indicating use of TLS, then immedi-
4025 ately send the CONNECTACK message and go into TLS negotiation.
4026 If the TLS-reply value implies rejection of the connection,
4027 then immediately send the CONNECTACK message with the TLS-
4028 reply value and the appropriate reject-reason option value.
4029 In all other cases, save the TLS-reply option information for
4030 the eventual CONNECTACK message.
4031
4039 The possibilities for TLS-request and TLS-reply are:
4040
CONNECT  CONNECTACK
TLS      TLS
request  reply       Reject
t1       t1          Reason  Comments
--       --          ------  --------
0        0                   no TLS used
0        1           11      primary won't use TLS, secondary requires TLS
1        0                   primary desires TLS, secondary doesn't
1        1                   primary desires TLS, secondary will use TLS
2        0           9, 10   primary requires TLS and secondary won't
2        1                   primary requires TLS and secondary will use TLS
4053
4054
4055
4. Check to see if there is a message-digest option in the CONNECT
   message. If there is, and the server does not support
   message-digests, then reject the connection with reject reason
   12: "Message digest not supported" in the CONNECTACK. If the
   server does support message-digests, then check this message
   for validity based on the message-digest and, if the digest
   indicates the message was altered, reject it with reject reason
   20: "Message digest failed to compare".
4064
4065 5. Determine if the sender (from the relationship-name option)
4066 and the implicit role of the sender (i.e., primary) represents
4067 a server with which the receiver was configured to engage in
4068 failover activity. This is performed after any TLS or message
4069 digest processing so that it occurs after a secure connection
4070 is created, to ensure that there is no tampering with the
4071 relationship name of the partner. In the absence of any other
4072 security capability (i.e., when TLS or a message digest is not
4073 used), the server MAY wish to be configured with the IP
4074 address of the partner and check the source-ip of the CONNECT
4075 message against that IP address as a weak form of security.
4076
   If the sender is not such a server, then the receiving server
   should reject the CONNECT request by sending a CONNECTACK
   message with a reject-reason value of 8, invalid failover
   partner.

   If it is, then the receiving failover endpoint should be
   determined.
4083
4084 6. Decide if the time delta between the sending of the message,
4085 in the time field, and the receipt of the message, recorded in
4086 step 1 above, is acceptable. A server MAY require an
4095 arbitrarily small delta in time values in order to set up a
4096 failover connection with another server. See section 5.10 for
4097 information on time synchronization.
4098
4099 If the delta between the time values is too great, the server
4100 should reject the CONNECT request by sending a CONNECTACK mes-
4101 sage with a reject-reason of 4, time mismatch too great.
4102
   If the time mismatch is not considered too great then the
   receiving server MUST record the delta between the servers.
   The receiving server MUST use this delta to correct all of the
   absolute times received from the other server in all
   time-valued options. Note that servers can participate in
   failover with arbitrarily great time mismatches, as long as the
   mismatch is more or less constant.
4110
4111 7. Examine the MCLT option in the CONNECT request and use the
4112 value of the MCLT as the MCLT for this failover endpoint.
4113
4114 The secondary server SHOULD be able to operate with any MCLT
4115 sent by the primary, but if it cannot, then it should send a
4116 CONNECTACK with a reject-reason of 5, MCLT mismatch. In the
4117 event that the MCLT from the primary does not match that con-
4118 figured on the secondary, and the secondary will run with the
4119 primary's value, then the secondary MUST save the MCLT in
4120 secondary storage since it will need it even if it cannot con-
4121 tact the primary. The secondary MUST NOT use a different MCLT
4122 value than it received from the primary even if it cannot con-
4123 tact the primary.
4124
8. The server MUST store the hash-bucket-assignment option for use
   during processing in NORMAL state. If this hash bucket
   assignment conflicts with the secondary server's configured
   hash bucket assignment for use in other than NORMAL state, the
   secondary server should send a CONNECTACK with a reject reason
   of 19, Hash bucket assignment conflict.
4131
4132 9. The receiving server MAY use the vendor-class-identifier to do
4133 vendor specific processing.
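
The TLS-request and TLS-reply combinations listed in step 3 can be
expressed as a lookup table. The sketch below copies the reject
reasons from that table; the names TLS_OUTCOMES and tls_outcome and
the shape of the returned tuple are assumptions for illustration.

```python
# (TLS-request t1, TLS-reply t1) -> reject reasons listed for that combination.
TLS_OUTCOMES = {
    (0, 0): (),        # no TLS used
    (0, 1): (11,),     # primary won't use TLS, secondary requires TLS
    (1, 0): (),        # primary desires TLS, secondary doesn't
    (1, 1): (),        # primary desires TLS, secondary will use TLS
    (2, 0): (9, 10),   # primary requires TLS and secondary won't
    (2, 1): (),        # primary requires TLS and secondary will use TLS
}


def tls_outcome(request_t1: int, reply_t1: int):
    """Return (rejected, use_tls, candidate_reject_reasons)."""
    reasons = TLS_OUTCOMES[(request_t1, reply_t1)]
    rejected = bool(reasons)
    # A reply of 1 on an accepted connection indicates that TLS is used.
    use_tls = reply_t1 == 1 and not rejected
    return rejected, use_tls, reasons
```

Which of reasons 9 and 10 applies in the (2, 0) case depends on why
the secondary will not use TLS; the sketch returns both candidates.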
4134
4135 7.9. CONNECTACK message [6]
4136
4137 The CONNECTACK message is sent to accept or reject a CONNECT message.
4138 It is sent by the secondary server which received a CONNECT message.
4139
4140 Attempting immediately to reconnect after either receiving a CONNEC-
4141 TACK with a reject-reason or after sending a CONNECTACK with a
4142 reject-reason could yield unwanted looping behavior, since the reason
4151 that the connection was rejected may well not have changed since the
4152 last attempt. A simple suggested solution is to wait a minute or two
4153 after sending or receiving a CONNECTACK message with a reject-reason
4154 before attempting to reestablish communication.
4155
4156 The following table summarizes the options associated with the CON-
4157 NECTACK message:
4158
4159
Option                      accept      reject
------                      ------      ------
relationship-name           MUST        MUST
max-unacked-bndupd          MUST        MUST NOT
receive-timer               MUST        MUST NOT
vendor-class-identifier     MUST        MUST NOT
protocol-version            MUST        MUST
TLS-reply                   (1)         (2)
reject-reason               MUST NOT    MUST
message                     MUST NOT    SHOULD
MCLT                        MUST NOT    MUST NOT
hash-bucket-assignment      MUST NOT    MUST NOT
4172
4173 (1) MUST NOT if sending CONNECTACK after TLS negotiation, MUST
4174 if TLS-request in CONNECT, else MUST NOT.
4175 (2) MUST if TLS-request in CONNECT message, else MUST NOT.
4176
4177 Table 7.9-1: Options used in a CONNECTACK message
4178
4179
4180 7.9.1. Sending the CONNECTACK message
4181
4182 The xid of the CONNECTACK message MUST be that of the corresponding
4183 CONNECT message.
4184
4185 The name of the relationship MUST be placed in the relationship-name
4186 option. This information is placed in an option inside of the mes-
4187 sage in order to allow the identity of the sender to be covered by a
4188 shared secret.
4189
4190 The protocol-version option MUST be included in every CONNECTACK mes-
4191 sage. The current value of the protocol version is 1.
4192
4193 If the connection has been rejected, the reject-reason option MUST be
4194 placed in the CONNECTACK message with an appropriate reason, and a
4195 message option SHOULD be included with a human-readable error message
4196 describing the reason for the rejection in some detail. If the
4197 reject-reason option appears, then the remaining options listed below
4198 do not appear. The sending server should close the connection after
4207 sending the CONNECTACK if the connection was rejected.
4208
4209 The results of the TLS negotiation MUST be placed in the TLS-reply
4210 option. If this CONNECTACK message is being sent over an already TLS
4211 secured connection, then there MUST NOT be a TLS-reply option.
4212
4213 If there was a message-digest option in the CONNECT message, then
4214 there MUST be a message-digest in the CONNECTACK message and any sub-
4215 sequent messages if the CONNECTACK does not contain a reject-reason.
4216
4217 The number of BNDUPD messages the server can accept without blocking
4218 the TCP connection MUST be placed in the max-unacked-bndupd option.
4219 This SHOULD be a number greater than 10, and SHOULD be a number less
4220 than 100.
4221
4222 The length of the receive timer (tReceive, see section 8.3) MUST be
4223 placed in the receive-timer option.
4224
4225 The vendor class identifier MUST be placed in the vendor-class-
4226 identifier option.
4227
4228 After a connection is created (either by sending a CONNECTACK message
4229 to the first CONNECT message, or sending a CONNECTACK message to a
4230 CONNECT message received over a TLS connection), the server MUST send
4231 a STATE message.
4232
After a connection is created, the server MUST start two timers for
the connection: tSend and tReceive. The tSend timer SHOULD be
approximately 33 percent of the time in the receive-timer option in
the corresponding CONNECT message. The tReceive timer SHOULD be the
time sent in the receive-timer option in the CONNECTACK message.

The tReceive timer is reset whenever a message is received from this
TCP connection. If it ever expires, the TCP connection is dropped
and communication with this partner is considered not OK. The
reject reason 17: "No traffic within sufficient time" is placed in
the DISCONNECT message sent prior to dropping the TCP connection.
4244
4245 The tSend timer is reset whenever a message is sent over this connec-
4246 tion. When it expires, a CONTACT message MUST be sent.
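
The interaction of the two timers might be sketched as follows. The
ConnectionTimers class, its polling style, and the injectable clock
are assumptions made for illustration; the fraction used for tSend is
a parameter so that the sketch does not fix a particular percentage.

```python
import time


class ConnectionTimers:
    """Hypothetical tSend/tReceive bookkeeping for one failover connection."""

    def __init__(self, partner_receive_timer: float, own_receive_timer: float,
                 send_fraction: float = 1 / 3, now=time.monotonic):
        self._now = now
        # tSend derives from the partner's receive-timer, tReceive from ours.
        self.t_send = partner_receive_timer * send_fraction
        self.t_receive = own_receive_timer
        self.last_sent = self.last_received = now()

    def on_sent(self) -> None:
        self.last_sent = self._now()          # any sent message resets tSend

    def on_received(self) -> None:
        self.last_received = self._now()      # any received message resets tReceive

    def poll(self) -> str:
        """Return the action due now: "disconnect", "send-CONTACT" or "idle"."""
        t = self._now()
        if t - self.last_received >= self.t_receive:
            return "disconnect"  # send DISCONNECT (reason 17), then drop TCP
        if t - self.last_sent >= self.t_send:
            return "send-CONTACT"
        return "idle"
```

Keeping tSend well below the partner's tReceive means that an idle
but healthy connection always carries a CONTACT message before the
partner's receive timer can expire.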
4247
4248 7.9.2. Receiving the CONNECTACK message
4249
4250 If a CONNECTACK message is received with a different XID from the one
4251 in the CONNECT that was sent, it SHOULD be ignored. To avoid denial
4252 of service attacks, a primary should only wait for a CONNECTACK mes-
4253 sage on a new connection for a limited amount of time and close the
4254 connection if none is received during that time.
4255
4263 When a CONNECTACK message is received, the following actions should
4264 be taken:
4265
4266 1. Record the time the message was received.
4267
4268 2. Check to see if the xid on the CONNECTACK matches an outstand-
4269 ing CONNECT message on this TCP connection.
4270
3. Check to see if there is a reject-reason option in the
   CONNECTACK message. If not, continue with step 4. If there is a
   reject-reason option, the server SHOULD report the error code.
   If a message option appears, a server SHOULD display the string
   from the message option in a user visible way. The server
   MUST close the connection if a reject-reason option appears.
4277
4278 4. Check the value of the TLS-reply option (if any, which there
4279 won't be if this CONNECT is taking place utilizing TLS), and
4280 if it was 1, then skip processing of the rest of the CONNEC-
4281 TACK message, and immediately enter into TLS connection setup.
4282
4283 This step occurs prior to steps 5 and 6 in order to allow
4284 creation of a secure connection (if required) prior to pro-
4285 cessing the protocol version and IP address information.
4286
4287 5. Examine the value of the protocol-version option. If this
4288 server is able to establish connections with another server
4289 running this protocol version, then continue, else close the
4290 connection.
4291
4292 6. Decide if the time delta between the sending of the message,
4293 in the time field, and the receipt of the message, recorded in
4294 step 1 above, is acceptable. A server MAY require an arbi-
4295 trarily small delta in time values in order to set up a fail-
4296 over connection with another server.
4297
4298 If the delta between the time values is too great, the server
4299 should drop the TCP connection (see section 7.12).
4300
4301 If the time mismatch is not considered too great then the
4302 receiving server MUST record the delta between the servers.
4303 The receiving server MUST use this delta to correct all of the
4304 absolute times received from the other server in all time-
4305 valued options. Note that the failover protocol is con-
4306 structed so that two servers can be failover partners with
4307 arbitrarily great time mismatches.
4308
4309 7. The receiving server MAY use the vendor-class-identifier to do
4310 vendor specific processing.
4311
4319 8. After accepting a CONNECTACK message, the server MUST send a
4320 STATE message.
4321
4322 After receiving a CONNECTACK message, the server MUST start
4323 two timers for the connection: tSend and tReceive. The tSend
4324 timer SHOULD be approximately 20 percent of the time in the
4325 receiver-timer option in the corresponding CONNECTACK message.
4326 The tReceive timer SHOULD be set to the time sent in the
4327 receiver-timer option in the CONNECT message.
4328
4329 The tReceive timer is reset whenever a message is received
4330 from this TCP connection. If it ever expires, the TCP connec-
4331 tion is dropped and communications with this partner is con-
4332 sidered not ok. The reject reason 17: "No traffic within suf-
4333 ficient time" is placed in the DISCONNECT message sent prior
4334 to dropping the TCP connection.
4335
4336 The tSend timer is reset whenever a message is sent over this
4337 connection. When it expires, a CONTACT message MUST be sent.
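
As a rough illustration of steps 6 and 8 above, the following sketch
computes the clock delta and the two timer values. All function and
parameter names (clock_delta, correct_time, connection_timers,
max_delta) are hypothetical and are not defined by this document;
times are assumed to be expressed in seconds, and network transit time
is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Timers:
    t_send: float     # send silence (seconds) after which a CONTACT is due
    t_receive: float  # receive silence (seconds) after which the connection is dropped


def clock_delta(sent_time: float, received_time: float) -> float:
    """Step 6: delta between the message's time field and the local receipt time."""
    return received_time - sent_time


def delta_acceptable(delta: float, max_delta: Optional[float]) -> bool:
    """A server MAY impose a limit; None models 'no limit configured' (assumption)."""
    return max_delta is None or abs(delta) <= max_delta


def correct_time(partner_time: float, delta: float) -> float:
    """Translate an absolute time received from the partner into local time."""
    return partner_time + delta


def connection_timers(connectack_receiver_timer: float,
                      connect_receiver_timer: float) -> Timers:
    """Step 8: tSend is about 20 percent of the receiver-timer in the
    CONNECTACK; tReceive is the receiver-timer sent in the CONNECT."""
    return Timers(t_send=0.2 * connectack_receiver_timer,
                  t_receive=connect_receiver_timer)
```

Keeping the delta as a single signed number means every time-valued
option from the partner can be corrected with one addition.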
4338
4339 7.10. STATE message [10]
4340
4341 The state (STATE) message is used to communicate the current failover
4342 state to the partner server.
4343
4344 The STATE message MUST be sent after sending a CONNECTACK message
4345 that didn't contain a reject-reason option, and MUST be sent after
4346 receiving a CONNECTACK message without a reject-reason option.
4347
4348 A STATE message MUST be sent whenever the failover endpoint changes
4349 its failover state and a connection exists to the partner.
4350
4351 The STATE message requires no response from the failover partner.
4352
4353 The following table shows the options that MUST appear in a STATE
4354 message:
4355
4356
   Option                  Requirement
   ------                  -----------
   server-state            MUST
   server-flags            MUST
   start-time-of-state     MUST

   Table 7.10-1: Options used in a STATE message
4364
4375 7.10.1. Sending the STATE message
4376
4377 The current failover state is placed in the server-state option and
4378 the current state of the STARTUP flag is placed in the server-flags
4379 option.
4380
4381 The message is sent with a unique xid.
4382
A server SHOULD send the STATE message only when the connection is
created (i.e., after sending or receiving a CONNECTACK message with no
reject-reason option), or when there is a change from the values sent
in a previous STATE message.
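
The sending rule above can be sketched as follows. This is only an
illustration under stated assumptions: the StateSender and StateValues
names, the dictionary message layout, and the use of a simple counter
for xids are hypothetical and not part of this document.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class StateValues:
    server_state: str          # e.g. "NORMAL"
    startup_flag: bool         # STARTUP bit carried in server-flags
    start_time_of_state: int   # time at which the state was entered


class StateSender:
    def __init__(self) -> None:
        self._xids = count(1)                      # unique xid per message
        self._last_sent: Optional[StateValues] = None
        self.outbox: list = []                     # stand-in for the TCP connection

    def on_connection_created(self, current: StateValues) -> None:
        """Called after a CONNECTACK without reject-reason is sent or received."""
        self._last_sent = None                     # nothing sent on this connection yet
        self.maybe_send(current)

    def maybe_send(self, current: StateValues) -> bool:
        """Send a STATE message only if the values differ from the last one sent."""
        if current == self._last_sent:
            return False
        self.outbox.append({
            "type": "STATE",
            "xid": next(self._xids),
            "server-state": current.server_state,
            "server-flags": {"STARTUP": current.startup_flag},
            "start-time-of-state": current.start_time_of_state,
        })
        self._last_sent = current
        return True
```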
4387
4388 7.10.2. Receiving the STATE message
4389
4390 Every STATE message SHOULD indicate a change in state or a change in
4391 the flags.
4392
4393 When a STATE message is received, any state transitions specified in
4394 section 9 are taken.
4395
4396 No response to a STATE message is required.
4397
4398 7.11. CONTACT message [11]
4399
4400 The contact (CONTACT) message is sent to verify communications
4401 integrity with a failover partner. The CONTACT message is sent when
4402 no messages have been sent to the failover partner for a specified
4403 period of time. This is determined by the tSend timer expiring (see
4404 section 8.3).
4405
4406 The CONTACT message has no message specific options.
4407
4408 7.11.1. Sending the CONTACT message
4409
The CONTACT message is sent when the tSend timer expires. It carries
no message-specific options.
4411
4412 7.11.2. Receiving the CONTACT message
4413
4414 When a CONTACT message is received, the tReceive timer is reset (as
4415 it is with any message that is received).
4416
4417 A server SHOULD use the time in the time field and the time the mes-
4418 sage was received to refine the delta time calculations between the
4419 servers.
4420
4431 7.12. DISCONNECT message [12]
4432
4433 The DISCONNECT is the last message sent over a connection before
4434 dropping an established connection (note that an established connec-
4435 tion is one where a CONNECTACK has been sent without a reject rea-
4436 son).
4437
4438 After sending or receiving a DISCONNECT message, a server needs to
4439 have some mechanism to prevent an error loop. Simply reconnecting to
4440 the partner immediately is not the best option, especially after
4441 several consecutive attempts.
4442
4443 A simple suggested solution is to wait a minute or two after sending
4444 or receiving a DISCONNECT before attempting to reestablish communica-
4445 tion.
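
One possible way to implement such a wait is sketched below. The
doubling policy, the 60-second base, and the 600-second cap are
assumptions chosen for illustration; this document only suggests
waiting "a minute or two". The function name reconnect_delay is
hypothetical.

```python
def reconnect_delay(consecutive_disconnects: int,
                    base: float = 60.0,
                    cap: float = 600.0) -> float:
    """Seconds to wait before the next connection attempt.

    The delay starts at `base` and doubles with each consecutive
    DISCONNECT, up to `cap`, so repeated failures do not produce a
    tight reconnect loop.
    """
    attempts = max(consecutive_disconnects, 1)
    return min(cap, base * 2 ** (attempts - 1))
```

A caller would reset the counter to zero once a connection has reached
the point where STATE messages have been exchanged.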
4446
4447 The DISCONNECT message MUST be the last message sent down a connec-
4448 tion before it is closed.
4449
4450 The following table summarizes the options that are associated with
4451 the DISCONNECT message:
4452
4453
   Option             Requirement
   ------             -----------
   reject-reason      MUST
   message            SHOULD

   Table 7.12-1: Options used in a DISCONNECT message
4460
4461
4462
4463 7.12.1. Sending the DISCONNECT message
4464
The DISCONNECT message MUST be the last message sent by a server that
is dropping a TCP connection.
4467
4468 The xid of the DISCONNECT message must be unique.
4469
4470 The reject-reason option MUST appear giving a reason why the connec-
4471 tion was dropped. A message option SHOULD appear giving a human
4472 readable error message with possibly more details.
4473
4474 7.12.2. Receiving the DISCONNECT message
4475
When a server receives a DISCONNECT message, it should log the message
option, if one was present, and may raise an alarm if the reject
reason is sufficiently serious.
4479
4487 8. Connection Management
4488
Servers participating in the failover protocol communicate over TCP
connections. These TCP connections are used both to transmit binding
information from one server to another and to allow each server to
determine whether communications is possible with the other server.
4494
Central to the operation of the failover protocol is a notion of
"communications okay" or "communications failed". Failover state
transitions are taken in many cases when the status of communications
with the partner changes, and the existence or non-existence of a TCP
connection between failover endpoints is used to determine if
communications is "okay" or "failed".
4501
4502 A single TCP connection exists which connects two failover endpoints.
4503
4504 8.1. Connection granularity
4505
4506 There exists one TCP connection between each set of failover end-
4507 points. See section 5.1.1 for an explanation of failover endpoints.
4508
Typically there is one failover endpoint for each end of a failover
relationship between two servers, and only a single relationship
between any two servers. Given the integration of load balancing into
the failover protocol, there is little value in having more than one
failover relationship between two servers, though the protocol will
support multiple relationships between two servers.
4515
4516 Each failover relationship MUST have a unique relationship-name, and
4517 the relationship-name option is used to communicate this name in the
4518 CONNECT and CONNECTACK messages.
4519
4520 8.2. Creating the TCP connection
4521
4522 All failover TCP connections are initiated over port 647. Every
4523 server implementing the failover protocol MUST listen on port 647.
4524
4525 Every server implementing the failover protocol SHOULD attempt to
4526 connect to all of its partners periodically, where the period is
4527 implementation dependent and SHOULD be configurable. In the event
4528 that a connection has been rejected by a CONNECTACK message with a
4529 reject-reason option contained in it or a DISCONNECT message, a
4530 server SHOULD reduce the frequency with which it attempts to connect
4531 to that server but it SHOULD continue to attempt to connect periodi-
4532 cally.
4533
4534 When a connection attempt succeeds, if the server generating the
4543 connection attempt is a primary server for that relationship, then it
4544 MUST send a CONNECT message down the connection. If it is not a pri-
4545 mary server for the relationship, then it MUST just drop the connec-
4546 tion and wait for the primary server to connect to it.
4547
4548 When a connection attempt is received on port 647, the only informa-
4549 tion that the receiving server has is the IP address of the partner
4550 initiating a connection. It also knows whether it has the primary
4551 role for any failover relationships with the connecting server. If
4552 it has any relationships for which it is a primary server, it should
4553 initiate a connection of its own to port 647 of the partner server,
4554 one for each primary relationship it has with that server.
4555
If it has any relationships with the connecting server for which it
is a secondary server, it should just await the CONNECT message to
determine which relationship this connection is to serve.
4559
4560 If it has no secondary relationships with the connecting server, it
4561 SHOULD drop the connection.
4562
4563 To summarize -- a primary server MUST use a connection that it has
4564 initiated in order to send a CONNECT message. Every server that is a
4565 secondary server in a relationship attempts to create a connection to
4566 the server which is primary in the relationship, but that connection
4567 is only used to stimulate the primary server into recognizing that
4568 the secondary server is ready for operation. The reason behind this
4569 is that the secondary server has no way to communicate to the primary
4570 server which relationship a connection is designed to serve.
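
The role-based rules summarized above can be sketched as two small
decision functions. The Role enum, the function names, and the string
results are hypothetical and serve only to make the rules concrete.

```python
from enum import Enum
from typing import Iterable


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


def after_outbound_connect(role: Role) -> str:
    """What to do when a connection this server initiated succeeds."""
    # A primary sends CONNECT; a secondary's connection was only a
    # stimulus, so it drops it and waits for the primary to connect.
    return "send-CONNECT" if role is Role.PRIMARY else "drop"


def on_inbound_connection(roles_with_peer: Iterable[Role]) -> dict:
    """What to do when a connection arrives on port 647 from a known partner.

    `roles_with_peer` lists this server's role in each relationship it
    has with the connecting server.
    """
    roles = list(roles_with_peer)
    primaries = sum(1 for r in roles if r is Role.PRIMARY)
    has_secondary = any(r is Role.SECONDARY for r in roles)
    return {
        # one new outbound connection per relationship where we are primary
        "connections_to_initiate": primaries,
        # keep the inbound connection only if a CONNECT may arrive on it
        "inbound": "await-CONNECT" if has_secondary else "drop",
    }
```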
4571
4572 A server which has multiple secondary relationships with a primary
4573 server SHOULD only send one stimulus connection attempt to the pri-
4574 mary server.
4575
4576 Once a connection is established, the primary server MUST send a CON-
4577 NECT message across the connection. A secondary server MUST wait for
4578 the CONNECT message from a primary server. If the secondary server
4579 doesn't receive a CONNECT message from the primary server in an ins-
4580 tallation dependent amount of time, it MAY drop the connection and
4581 send another stimulus connection attempt to the primary server.
4582
4583 Every CONNECT message includes a TLS-request option, and if the CON-
4584 NECTACK message does not reject the CONNECT message and the TLS-reply
4585 option says TLS MUST be used, then the servers will immediately enter
4586 into TLS negotiation.
4587
4588 Once TLS negotiation is complete, the primary server MUST resend the
4589 CONNECT message on the newly secured TLS connection and then wait for
4590 the CONNECTACK message in response. The TLS-request and TLS-reply
4599 options MUST NOT appear in either this second CONNECT or its associ-
4600 ated CONNECTACK message as they had in the first messages.
4601
4602 The second message sent over a new connection (either a bare TCP con-
4603 nection or a connection utilizing TLS) is a STATE message. Upon the
4604 receipt of this message, the receiver can consider communications up.
4605
4606 It is entirely possible that two servers will attempt to make connec-
4607 tions to each other essentially simultaneously, and in this case the
4608 secondary server will be waiting for a CONNECT message on each con-
4609 nection. The primary server MUST send a CONNECT message over one
4610 connection and it MUST close the other connection.
4611
4612 A secondary server MUST NOT respond to the closing of a TCP connec-
4613 tion with a blind attempt to reconnect -- there may be another TCP
4614 connection to the same failover partner already in use.
4615
4616 8.3. Using the TCP connection for determining communications status
4617
4618 The TCP connection is used to determine the communications status of
4619 the other server, i.e., communications-ok, or communications-
4620 interrupted.
4621
4622 Three things must happen for a server to consider that communications
4623 are ok with respect to another server:
4624
4625
4626 1. A TCP connection must be established to the other server.
4627
2. A CONNECT message must be received and a CONNECTACK message
   sent in response. The CONNECT message is used to determine
   the identity of the failover endpoint at the other end of the
   TCP connection -- without it, the failover endpoint cannot be
   uniquely determined. Without knowledge of the failover
   endpoint, the entity with which communications is ok is
   undetermined.

3. A STATE message must be received from the other server over
   the connection. This STATE message initializes important
   information necessary to the operation of the state machine
   that governs the behavior of this failover endpoint.
4640
4641 There are two ways that a server can determine that communications
4642 has failed:
4643
4644
4645 1. The TCP connection can go down, yielding an error when
4646 attempting to send or receive a message. This will happen at
4655 least as often as the period of the tSend timer.
4656
4657 2. The tReceive timer can expire.
4658
4659 In either of these cases, communications is considered interrupted.
4660
4661 If the tReceive timer expires, the connection MUST be dropped. The
4662 reject reason 17: "No traffic within sufficient time" is placed in
4663 the DISCONNECT message sent prior to dropping the TCP connection.
4664
Several difficulties arise when trying to use one TCP connection both
for bulk data transfer and to sense the communications status of the
other server. One aspect of the problem stems from the different
requirements of the two uses. The bulk data transfer is of course
critically important to the protocol, but the speed with which it is
processed is not terribly significant. It might well be minutes before
a BNDUPD message is processed, and while not optimal, such an
occasional delay doesn't compromise the correctness of the protocol.
However, the speed with which one server detects that the other server
is up (or, more importantly, down) is more tightly constrained.
Generally one server should be able to detect that the other server is
not communicating within a minute or less.
4677
These differing time constraints make it difficult to use the same
TCP connection both for data transfer and to sense communications
integrity. See section 3.5 for additional details on TCP.
4681
4682 The solution to this problem is to require that some message be
4683 received by each end of the connection within a limited time or that
4684 the connection will be considered down. If no messages have been
4685 sent recently, then a CONTACT message is sent.
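
A minimal sketch of this mechanism, driven by the tSend and tReceive
timers, is shown below. The CommsMonitor class, its method names, and
the tuple-based action list are hypothetical; the caller is assumed to
supply the current time in seconds and to perform the returned actions.

```python
class CommsMonitor:
    NO_TRAFFIC = 17  # reject reason "No traffic within sufficient time"

    def __init__(self, t_send: float, t_receive: float, now: float) -> None:
        self.t_send = t_send
        self.t_receive = t_receive
        self.last_sent = now
        self.last_received = now

    def message_sent(self, now: float) -> None:
        self.last_sent = now          # any outgoing message resets tSend

    def message_received(self, now: float) -> None:
        self.last_received = now      # any incoming message resets tReceive

    def poll(self, now: float) -> list:
        """Return the actions that are due at time `now`."""
        if now - self.last_received >= self.t_receive:
            # tReceive expired: send DISCONNECT, then drop the connection
            return [("DISCONNECT", self.NO_TRAFFIC), ("DROP", None)]
        if now - self.last_sent >= self.t_send:
            # tSend expired: a CONTACT message is due, and counts as a send
            self.last_sent = now
            return [("CONTACT", None)]
        return []
```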
4686
4687 In the case where there is no data queued to be sent, this is not a
4688 problem, but in the case where there is data queued to be sent to the
4689 partner, then the CONTACT message will not actually be transmitted
4690 until the queued data is sent. Section 3.5 explains why waiting for
4691 TCP to determine that the connection is down is not acceptable, and
4692 leads to a requirement that the receiving server never block the
4693 sending server from sending CONTACT messages.
4694
4695 In order to meet this requirement, each server tells the other server
4696 the number of outstanding BNDUPD messages that it will accept. The
4697 receiving server is required to always be able to accept that many
4698 BNDUPD messages off of the connection's input queue even if it cannot
4699 process them immediately, and to accept all other messages immedi-
4700 ately.
4701
4702 Thus, the sending server's TCP is never blocked from sending a
4711 message except for very short periods, less than a few seconds unless
4712 the network connection itself has problems. In this case, if the
4713 CONTACT messages don't make it to the partner then the partner will
4714 close the connection.
4715
4716 DISCUSSION:
4717
4718 When implementing this capability, one needs to be careful when
4719 sending any message on the TCP connection as TCP can easily block
4720 the server if the local TCP send buffers are full. This can't be
4721 prevented because if the receiver is not reachable (via the net-
4722 work), the sending TCP can't send and thus it will be unable to
4723 empty the local TCP send buffers. So, all send operations either
4724 need to assume they may block for some time or non-blocking sends
4725 must be used carefully.
4726
4727 8.4. Using the TCP connection for binding data
4728
4729 Binding data, in the form of BNDUPD messages and BNDACK messages to
4730 respond to them, are sent across the TCP connection.
4731
4732 In order to support timely detection of any failure in the partner
4733 server, the TCP connection MUST NOT block for more than a very short
4734 time, on the order of a few seconds. Therefore, a server that is
4735 sending BNDUPD messages MUST send only a restricted number before
4736 receiving BNDACK messages about previous messages sent.
4737
The number of outstanding BNDUPD messages that each server will
accept without causing TCP to block transmission of additional data
(i.e., CONTACT messages) is sent by each server in the CONNECT and
CONNECTACK messages in the max-unacked-bndupd option.
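
The sender side of this limit can be sketched as a simple window. The
BndupdWindow class and its methods are hypothetical; the sketch assumes
that each BNDUPD is identified by its xid and that max_unacked holds
the partner's max-unacked-bndupd value.

```python
from collections import deque


class BndupdWindow:
    def __init__(self, max_unacked: int) -> None:
        self.max_unacked = max_unacked
        self.unacked = set()      # xids sent but not yet acknowledged
        self.pending = deque()    # xids queued until the window opens

    def submit(self, xid: int) -> list:
        """Queue a BNDUPD; return the xids that may be transmitted now."""
        self.pending.append(xid)
        return self._drain()

    def ack(self, xid: int) -> list:
        """Record a BNDACK; return any queued xids that may now be sent."""
        self.unacked.discard(xid)
        return self._drain()

    def _drain(self) -> list:
        sent = []
        while self.pending and len(self.unacked) < self.max_unacked:
            xid = self.pending.popleft()
            self.unacked.add(xid)
            sent.append(xid)
        return sent
```

Because updates beyond the limit wait in a local queue rather than in
the TCP send buffer, a CONTACT message is never stuck behind them.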
4742
4743 8.5. Using the TCP connection for control messages
4744
4745 The TCP connection is used for control messages: POOLREQ, UPDREQ,
4746 STATE, CONTACT, UPDREQALL and the corresponding reply messages: POOL-
4747 RESP, UPDDONE. A server MUST immediately accept all of these mes-
4748 sages from the TCP connection. A server MUST immediately accept any
4749 BNDACK which is received as well.
4750
4751 8.6. Losing the TCP connection
4752
4753 When the TCP connection is lost, then communications is not ok with
4754 the other server. A server which has lost communications SHOULD
4755 immediately attempt to reconnect to the other server, and should
4756 retry these connection attempts periodically.
4757
An acknowledgement message (BNDACK, POOLRESP, UPDDONE) can
only be sent in response to a request message (BNDUPD, POOLREQ,
UPDREQ, UPDREQALL) on the same TCP connection from which the request
was received, in part since the XIDs in the request messages are
guaranteed unique only during the life of a single TCP connection.
4771
4772 When a connection to a partner server goes down, a server with unpro-
4773 cessed request messages MAY simply drop all of those messages, since
4774 it can be sure that the partner will resend them when they are next
4775 in communications. A server with unprocessed BNDUPD messages when a
4776 TCP connection goes down MAY instead choose to process those BNDUPD
4777 messages, but it MUST NOT send any BNDACK messages in response (again
4778 because of the issues surrounding XID uniqueness).
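
The two permitted behaviors can be sketched as follows. The function
name, the dictionary message layout, and the apply_updates switch are
hypothetical; the essential point is that no BNDACK is ever produced
once the connection is gone.

```python
from typing import Callable


def handle_connection_lost(unprocessed: list,
                           process_bndupd: Callable[[dict], None],
                           apply_updates: bool = False) -> list:
    """Dispose of request messages left over from a lost connection.

    Returns the xids of the BNDUPD messages that were applied locally.
    Nothing is acknowledged: xids are unique only within the lifetime
    of the connection that carried them.
    """
    applied = []
    if apply_updates:
        for msg in unprocessed:
            if msg["type"] == "BNDUPD":
                process_bndupd(msg)
                applied.append(msg["xid"])
    # everything else is dropped; the partner will resend it later
    unprocessed.clear()
    return applied
```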
4779
4780 When the TCP connection is closed explicitly, the DISCONNECT message
4781 with a reject-reason option (and, ideally, a message option) MUST be
4782 sent over the TCP connection.
4783
4784 9. Failover Endpoint States
4785
4786 This section discusses the various states that a failover endpoint
4787 may take, and the server actions required when entering the state,
4788 operating in the state, and leaving the state, as well as the events
4789 that cause transitions out of the state into another state.
4790
4791 The state transition diagram in Figure 9.2-1 is relevant for this
4792 section. This is the common state transition diagram for both servers
4793 in a failover pair. In the event that the textual description of a
4794 state differs from the state transition diagram, the textual descrip-
4795 tion is to be considered authoritative.
4796
4797 9.1. Server Initialization
4798
4799 When a server starts it starts out in STARTUP state. See section 9.3
4800 below for details.
4801
4802 9.2. Server State Transitions
4803
4804 Whenever a server makes a transition into a new state, it MUST record
4805 the state and the time at which it entered that state in stable
4806 storage. If communications is "ok", it MUST also send a STATE mes-
4807 sage to its failover partner.
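
A minimal sketch of this requirement follows. It assumes a JSON file
as the stable storage and a caller-supplied send_state callback; both
choices, and all function names, are illustrative only. The record is
written to a temporary file, synced, and renamed into place so that a
crash cannot leave a partially written state.

```python
import json
import os
import tempfile
import time
from typing import Callable, Optional


def record_state(path: str, state: str, entered_at: Optional[float] = None) -> dict:
    """Durably record the failover state and the time it was entered."""
    record = {"state": state,
              "entered_at": time.time() if entered_at is None else entered_at}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())       # force the data to disk before renaming
    os.replace(tmp, path)          # atomic replacement of the old record
    return record


def load_state(path: str) -> Optional[dict]:
    """Return the last recorded state, or None if none was ever stored."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None


def enter_state(path: str, new_state: str, comms_ok: bool,
                send_state: Callable[[dict], None],
                now: Optional[float] = None) -> dict:
    """Record the new state first, then notify the partner if reachable."""
    record = record_state(path, new_state, now)
    if comms_ok:
        send_state(record)
    return record
```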
4808
4809 Figure 9.2-1 is the diagram of the server state transitions. The
4810 remainder of this section contains information important to the
4811 understanding of that diagram.
4812
4813 The server stays in the current state until all of the actions speci-
4814 fied on the state transition are complete. If communications fails
4823 during one of the actions, the server simply stays in the current
4824 state and attempts a transition whenever the conditions for a transi-
4825 tion are later fulfilled.
4826
4827 In the state transition diagram below, the "+" or "-" in the upper
4828 right corner of each state is a notation about whether communication
4829 is ongoing with the other server.
4830
4831 The legend "responsive", "balanced", or "unresponsive" in each state
4832 indicates whether the server is responsive to all DHCP client
4833 requests, running in load balanced mode, or totally unresponsive in
the respective state. The terms "responsive" and "unresponsive" have
the obvious meanings, while "balanced" means that a DHCP server may
respond to all DHCPREQUEST messages that are RENEWAL or REBINDING,
and to all other messages from clients to which the load balancing
algorithm indicates that it MUST respond. See sections 5.3 and
9.8.2 for details on load balancing.
4840
4841 Note that in situations where a server does not respond to a DHCP
4842 client message, it MUST NOT remember any of the information from that
4843 message.
4844
4845 In the state transition diagram below, when communication is reesta-
4846 blished between the two servers, each must record the state of the
4847 partner when communication was restored. State transitions on one
4848 server in some cases imply state transitions on the partner server,
4849 so a record of the current state of the partner server must be kept
4850 by each server.
4851
If the state of the partner changes while communicating, a server
moves through the communications-failed transition and into whatever
state results. It then immediately moves through whatever state
transition is appropriate given the current state of the partner
server. A server performing this operation SHOULD NOT close the TCP
connection to its partner.
4858
4859 DISCUSSION:
4860
4861 The point of this technique is simplicity, both in explanation of
4862 the protocol and in its implementation. The alternative to this
4863 technique of memory of partner state and automatic state transi-
4864 tion on change of partner state is to have every state in the fol-
4865 lowing diagram have a state transition for every possible state of
4866 the partner. With the approach adopted, only the states in which
4867 communications are reestablished require a state transition for
4868 each possible partner state.
4869
4870 The current state of a server MUST be recorded in stable storage and
4879 thus be available to the server after a server restart.
4880
4881 A transition into SHUTDOWN or PAUSED state is not represented in the
4882 following figure, since other than sending that state to its partner,
4883 the remaining actions involved look just like the server halting in
4884 its otherwise current state, which then becomes the previous state
4885 upon server restart.
4886
   [State transition diagram. The states shown, with their legends,
   are STARTUP (unresponsive), RECOVER (unresponsive), RECOVER-WAIT
   (unresponsive), RECOVER-DONE (unresponsive), POTENTIAL-CONFLICT
   (unresponsive), RESOLUTION-INTERRUPTED (responsive), CONFLICT-DONE
   (responsive), PARTNER-DOWN (responsive), NORMAL (balanced), and
   COMMUNICATIONS-INTERRUPTED (responsive). Transition labels include
   "Comm. OK", "Comm. Failed", "Comm. Changed", conditions on the
   partner's state ("Other State"), "Wait MCLT", "Rcv UPDDONE",
   "External Command", and "Safe Period expiration".]

   Figure 9.2-1: Server state diagram.
4982
4992 9.3. STARTUP state
4993
4994 The STARTUP state affords an opportunity for a server to probe its
4995 partner server, before starting to service DHCP clients.
4996
4997 DISCUSSION:
4998
4999 Without the STARTUP state, a server would likely start in a state
5000 derived from its previously stored state (held in stable storage),
5001 if any. However, this may be inconsistent with the current state
5002 of the partner. The STARTUP state affords the opportunity for a
5003 server to potentially learn the partner's state and determine if
5004 that state is consistent with its derived starting state or
5005 whether some significant state change has occurred at the partner
5006 that forces the server to start in another state. This is
5007 especially critical if significant time has elapsed while the
5008 server was down.
5009
5010
5011 9.3.1. Operation while in STARTUP state
5012
5013 Whenever a server is in STARTUP state, it MUST be unresponsive to
5014 DHCP client requests, and so the time spent in the STARTUP state is
5015 necessarily short, typically on the order of a few seconds to a few
5016 tens of seconds. The exact time spent in the STARTUP state is imple-
5017 mentation dependent, and the primary and secondary server are not
5018 required to spend the same amount of time in the STARTUP state. See
5019 section 5.9 for some guidelines on the time to spend in STARTUP
5020 state.
5021
5022 Whenever a STATE message is sent to the partner while in STARTUP
5023 state the STARTUP bit MUST be set in the server-flags option and the
5024 previously recorded failover state MUST be placed in the server-state
5025 option.
5026
5027
5028 9.3.2. Transition out of STARTUP state
5029
Each server starts out in STARTUP state every time it initializes
itself, and performs the following algorithm as part of its
initialization:
5033
5034 1. Is there any record in stable storage of a previous failover
5035 state? If yes, set previous-state to the last recorded state
5036 in stable storage, and continue with step 2.
5037
5038 Is there any configuration information that indicates that
5047 this server was previously running but lost its stable
5048 storage? Such information must typically come from some
5049 administrative intervention, since it is difficult for a
5050 server to distinguish first startup from a startup after it
5051 has lost its stable storage. If yes, then set the previous-
5052 state to RECOVER, and set the time-of-failure to whatever time
5053 was configured, and go on to step 2. This time-of-failure
5054 will be used in the transition out of the RECOVER-WAIT state
5055 into the RECOVER-DONE state, below.
5056
5057 If there is no record of any previous failover state in stable
5058 storage for this server, then set the previous-state to
5059 RECOVER and set the time-of-failure to a time before the
5060 maximum-client-lead-time before now. If using standard Posix
5061 times, 0 would typically do quite well. This will allow two
5062 servers which already have lease information to synchronize
5063 themselves prior to operating.
5064
5065 Note that neither server is responsive to DHCP client requests
5066 while in the RECOVER state. If both servers can communicate,
5067 however, they will come out of the RECOVER state and progress
5068 through RECOVER-WAIT to RECOVER-DONE and thence to NORMAL or
5069 COMMUNICATIONS-INTERRUPTED state quickly. If both have state,
5070 then they will exchange information. If only one has state,
5071 then the one that does not will complete its update of its
5072 partner quickly (since it has nothing to send).
5073
5074 In some cases, an existing server will be commissioned as a
5075 failover server and brought back into operation where its
5076 partner is not yet available. In this case, the newly commis-
5077 sioned failover server will not operate until its partner
5078 comes online -- but it has operational responsibilities as a
5079 DHCP server nonetheless. To properly handle this situation, a
5080 server SHOULD be configurable in such a way as to move
5081 directly into PARTNER-DOWN state after the startup period
5082 expires if it has been unable to contact its partner during
5083 the startup period.
5084
5085 2. If the previous state is one where communications was "OK",
5086 then set the previous state to the state that is the result of
5087 the communications failed state transition in Figure 9.2-1 (if
5088 such transition is shown -- some states don't have a communi-
5089 cations failed state transition, since they allow both commun-
5090 ications OK and failed).
5091
5092 3. Start the STARTUP state timer. The time that a server remains
5093 in the STARTUP state (absent any communications with its
5094 partner) is implementation dependent and SHOULD be
5103 configurable. It SHOULD be long enough for a TCP connection
5104 to be created to a heavily loaded partner across a slow net-
5105 work.
5106
5107 4. Attempt to create a TCP connection to the failover partner.
5108 See section 8.2.
5109
5110 5. Wait for "communications okay", i.e., the process discussed in
5111 section 8.2 "Creating the TCP Connection", to complete,
5112 including the receipt of a STATE message from the partner.
5113
5114 When and if communications become "okay", clear the STARTUP
5115 flag, and set the current state to the previous-state.
5116
5117 If the partner is in PARTNER-DOWN state, and if the time at
5118 which it entered PARTNER-DOWN state (as received in the
5119 start-time-of-state option in the STATE message) is later than
5120 the last recorded time of operation of this server, then set
5121 the current state to RECOVER. If the time at which it entered
5122 PARTNER-DOWN state is earlier than the last recorded time of
5123 operation of this server, then set the current state to
5124 POTENTIAL-CONFLICT.
5125
5126 Then, transition to the current state and take the "communica-
5127 tions okay" state transition based on the current state of
5128 this server and the partner.
5129
5130 6. If the startup time expires, take an implementation dependent
5131 action: The server MAY go to the previous-state, or the
5132 server MAY wait.
5133
5134 Reasons to go to previous-state and begin processing:
5135
5136 If the current server is the only operational server, then if
5137 it waits, there will be no operational DHCP servers. This
5138 situation could occur very easily where one server fails and
5139 then the other crashes and reboots. If the rebooting server
5140 doesn't start processing DHCP client requests without first
5141 being in communication with the other server, then the level
5142 of DHCP redundancy is not particularly high. This is an
5143 appropriate approach if the possibility of partition is low,
5144 or if the safe period expiration time is well beyond the time
5145 at which an operator would notice and react to a partition
5146 situation. It is also quite appropriate if the safe period
5147 will never expire.
5148
5149 Reasons to wait:
5150
5159 If the current server has been down for longer than the
5160 maximum-client-lead-time, and it is partitioned from the other
5161 server, then when it returns it will attempt to use its own
5162 available addresses to allocate to new DHCP clients, and the
5163 other server may well be in PARTNER-DOWN state and may have
5164 already allocated some of those available addresses to DHCP
5165 clients. In cases where the possibility of partition is high,
5166 and the safe period expiration time is less than the likely
5167 operator reaction time, this is a good approach to use.
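
The state selection in steps 1, 2, and 5 above can be sketched as
follows. This is a minimal illustration, not part of the protocol: the
function names, the StartupDecision record, and the use of strings for
state names are hypothetical. The COMMS_FAILED_TRANSITION table lists
only the two communications-failed transitions stated in sections 9.8.4
and 9.10.3; Figure 9.2-1 remains authoritative. The text does not cover
the case where the two times compared in step 5 are equal, so the
sketch leaves the state unchanged there.

```python
from dataclasses import dataclass
from typing import Optional

# Communications-failed transitions stated in sections 9.8.4 and 9.10.3.
# Figure 9.2-1 is authoritative; this table is deliberately partial.
COMMS_FAILED_TRANSITION = {
    "NORMAL": "COMMUNICATIONS-INTERRUPTED",
    "POTENTIAL-CONFLICT": "RESOLUTION-INTERRUPTED",
}


@dataclass
class StartupDecision:
    previous_state: str
    time_of_failure: Optional[float]  # POSIX seconds, None when unused


def choose_previous_state(recorded_state: Optional[str],
                          configured_failure_time: Optional[float]) -> StartupDecision:
    """Steps 1 and 2 of the startup algorithm (hypothetical helper)."""
    if recorded_state is not None:
        state, failure = recorded_state, None
    elif configured_failure_time is not None:
        # An administrator indicated that stable storage was lost.
        state, failure = "RECOVER", configured_failure_time
    else:
        # First startup: POSIX time 0 is earlier than now minus any
        # practical MCLT.
        state, failure = "RECOVER", 0.0
    # Step 2: apply the communications-failed transition where one exists.
    return StartupDecision(COMMS_FAILED_TRANSITION.get(state, state), failure)


def adjust_for_partner_down(current_state: str,
                            partner_entered_partner_down: float,
                            last_operation_time: float) -> str:
    """Step 5 check, used only when the partner reports PARTNER-DOWN."""
    if partner_entered_partner_down > last_operation_time:
        return "RECOVER"
    if partner_entered_partner_down < last_operation_time:
        return "POTENTIAL-CONFLICT"
    return current_state  # equal times are not covered by the text
```

For example, a server whose stable storage records NORMAL starts with
COMMUNICATIONS-INTERRUPTED as its previous-state, while a server with
neither a recorded state nor any configuration starts in RECOVER with a
time-of-failure of 0.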
5168
5169 9.4. PARTNER-DOWN state
5170
5171 PARTNER-DOWN state is a state either server can enter. When in this
5172 state, the server does not assume that the other server could still
5173 be operating and servicing a different set of clients, but instead
5174 assumes that it is the only server operating. If one server is in
5175 PARTNER-DOWN state, the other server MUST NOT be operating.
5176
5177
5178 9.4.1. Upon entry to PARTNER-DOWN state
5179
5180 No special actions are required when entering PARTNER-DOWN state.
5181
5182 The server should continue to attempt to connect to the partner
5183 periodically.
5184
5185
5186 9.4.2. Operation while in PARTNER-DOWN state
5187
5188 A server in PARTNER-DOWN state MUST respond to DHCP client requests.
5189 It will allow renewal of all outstanding leases on IP addresses, and
5190 will allocate IP addresses from its own pool, and after a fixed
5191 period of time (the MCLT interval) has elapsed from entry into
5192 PARTNER-DOWN state, it will allocate IP addresses from the set of all
5193 available IP addresses.
5194
5195 Once a server has entered NORMAL state, the PARTNER-DOWN state is
5196 entered only on command of an external agency (typically an adminis-
5197 trator of some sort) or after the expiration of an externally config-
5198 ured minimum safe-time after the beginning of COMMUNICATIONS-
5199 INTERRUPTED state.
5200
5201 Any IP address tagged as available for allocation by the other server
5202 (at entry to PARTNER-DOWN state) MUST NOT be allocated to a new
5203 client until the maximum-client-lead-time beyond the entry into
5204 PARTNER-DOWN state has elapsed.
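
As an illustration of the rule above, the set of addresses a server may
hand to new clients while in PARTNER-DOWN state might be computed as
shown below. This is a sketch under stated assumptions: the function
and parameter names are hypothetical, times are POSIX seconds, and the
MCLT is treated as having elapsed at the exact boundary instant.

```python
from typing import Iterable, Set


def allocatable_addresses(own_free: Iterable[str],
                          partner_free: Iterable[str],
                          now: float,
                          entered_partner_down: float,
                          mclt: float) -> Set[str]:
    """Addresses that may be given to new clients in PARTNER-DOWN state."""
    # The server's own pool is usable immediately.
    pool = set(own_free)
    # Addresses that were available to the partner become usable only
    # once the MCLT has elapsed since entry into PARTNER-DOWN state.
    if now >= entered_partner_down + mclt:
        pool |= set(partner_free)
    return pool
```

With an MCLT of 3600 seconds, a server that entered PARTNER-DOWN state
at time 0 would allocate only from its own pool at time 100 and from
both pools at time 3600 or later.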
5205
5206 A server in PARTNER-DOWN state MUST NOT allocate an IP address to a
5215 DHCP client different from that to which it was allocated at the
5216 entrance to PARTNER-DOWN state until the maximum-client-lead-time
5217 beyond the maximum of the following times: client expiration time,
5218 most recently transmitted potential-expiration-time, most recently
5219 received ack of potential-expiration-time from the partner, and most
5220 recently acked potential-expiration-time to the partner. See section
5221 7.1.5 for details. If this time would be earlier than the current
5222 time plus the maximum-client-lead-time, then the time the server
5223 entered PARTNER-DOWN state plus the maximum-client-lead-time is used.
5224
5225 Two options exist for lease times given out while in PARTNER-DOWN
5226 state, with different ramifications flowing from each.
5227
5228 If the server wishes the Failover protocol to protect it from loss of
5229 stable storage in PARTNER-DOWN state, then it should ensure that the
5230 MCLT based lease time restrictions in section 5.1 are maintained,
5231 even in PARTNER-DOWN state.
5232
5233 If the server wishes to forgo the protection of the Failover
5234 protocol in the event of loss of stable storage, then it need
5235 recognize no restrictions on actual client lease times while in
5236 PARTNER-DOWN state.
5237
5238 A server in PARTNER-DOWN state MUST continue to attempt to establish
5239 communications and synchronization with its partner.
5240
5241 9.4.3. Transitions out of PARTNER-DOWN state
5242
5243 When a server in PARTNER-DOWN state succeeds in establishing a con-
5244 nection to its partner, its actions are conditional on the state and
5245 flags received in the STATE message from the other server as part of
5246 the process of establishing the connection.
5247
5248 If the STARTUP bit is set in the server-flags option of a received
5249 STATE message, a server in PARTNER-DOWN state MUST NOT take any state
5250 transitions based on reestablishing communications. Essentially, if a
5251 server is in PARTNER-DOWN state, it ignores all STATE messages from
5252 its partner that have the STARTUP bit set in the server-flags option
5253 of the STATE message.
5254
5255 If the STARTUP bit is not set in the server-flags option of a STATE
5256 message received from its partner, then a server in PARTNER-DOWN
5257 state takes the following actions based on the value of the server-
5258 state option in the received STATE message (either immediately after
5259 establishing communications or at any time later when a new state is
5260 received):
5261
5262 o partner in NORMAL, COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN,
5271 POTENTIAL-CONFLICT, RESOLUTION-INTERRUPTED, or CONFLICT-DONE
5272 state
5273
5274 transition to POTENTIAL-CONFLICT state
5275
5276 o partner in RECOVER, RECOVER-WAIT, SHUTDOWN, or PAUSED state
5277
5278 stay in PARTNER-DOWN state
5279
5280 o partner in RECOVER-DONE state
5281
5282 transition into NORMAL state
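
The transition rules of this section can be written as a small lookup.
The sketch below is illustrative only: the function name, the constant
names, and the use of strings for state names are hypothetical, and a
partner state that the list above does not mention is reported as an
error rather than guessed at.

```python
TO_POTENTIAL_CONFLICT = frozenset({
    "NORMAL", "COMMUNICATIONS-INTERRUPTED", "PARTNER-DOWN",
    "POTENTIAL-CONFLICT", "RESOLUTION-INTERRUPTED", "CONFLICT-DONE",
})
STAY_IN_PARTNER_DOWN = frozenset({
    "RECOVER", "RECOVER-WAIT", "SHUTDOWN", "PAUSED",
})


def next_state_from_partner_down(partner_state: str, startup_bit: bool) -> str:
    """State a PARTNER-DOWN server moves to after receiving a STATE message."""
    if startup_bit:
        # STATE messages with the STARTUP bit set are ignored.
        return "PARTNER-DOWN"
    if partner_state in TO_POTENTIAL_CONFLICT:
        return "POTENTIAL-CONFLICT"
    if partner_state in STAY_IN_PARTNER_DOWN:
        return "PARTNER-DOWN"
    if partner_state == "RECOVER-DONE":
        return "NORMAL"
    raise ValueError("partner state not covered by section 9.4.3: " + partner_state)
```

The same function would be called both immediately after the
connection is established and whenever a later STATE message reports a
new partner state.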
5283
5284 9.5. RECOVER state
5285
5286 This state indicates that the server has no information in its stable
5287 storage or that it is re-integrating with a server in PARTNER-DOWN
5288 state after it has been down. A server in this state MUST attempt to
5289 refresh its stable storage from the other server.
5290
5291 9.5.1. Operation in RECOVER state
5292
5293 A server in RECOVER state MUST NOT respond to DHCP client requests.
5294
5295 A server in RECOVER state will attempt to reestablish communications
5296 with the other server.
5297
5298 9.5.2. Transitions out of RECOVER state
5299
5300 If the other server is in POTENTIAL-CONFLICT, RESOLUTION-INTERRUPTED,
5301 or CONFLICT-DONE state when communications are reestablished, then
5302 the server in RECOVER state will move to POTENTIAL-CONFLICT state
5303 itself.
5304
5305 If the other server is in any other state, then the server in RECOVER
5306 state will request an update of missing binding information by
5307 sending an UPDREQ or UPDREQALL message. If the server has been
5308 instructed (through configuration or other external agency) that it
5309 has lost its stable storage, or if it has deduced that from the fact
5310 that it has no record of ever having talked to its partner while its
5311 partner does have a record of communicating with it, it MUST send an
5312 UPDREQALL message; otherwise it MUST send an UPDREQ message. See
5313 Figure 9.5.2-1.
5314
5315 It will wait for an UPDDONE message, and upon receipt of that message
5316 it will transition to RECOVER-WAIT state.
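
The decision a server in RECOVER state makes when communications are
reestablished might be sketched as follows. The function name, the
parameter names, and the (kind, value) return convention are
hypothetical and chosen only for illustration.

```python
from typing import Tuple

CONFLICT_STATES = frozenset({
    "POTENTIAL-CONFLICT", "RESOLUTION-INTERRUPTED", "CONFLICT-DONE",
})


def recover_action(partner_state: str,
                   told_storage_lost: bool,
                   has_record_of_partner: bool,
                   partner_has_record_of_us: bool) -> Tuple[str, str]:
    """Return ("transition", state) or ("send", message) for a RECOVER server."""
    if partner_state in CONFLICT_STATES:
        return ("transition", "POTENTIAL-CONFLICT")
    # Stable storage is considered lost either because the server was
    # told so, or because only the partner remembers earlier contact.
    storage_lost = told_storage_lost or (
        not has_record_of_partner and partner_has_record_of_us
    )
    return ("send", "UPDREQALL" if storage_lost else "UPDREQ")
```

After sending either message the server remains in RECOVER state until
the UPDDONE message arrives.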
5317
5318 If communications fails during the reception of the results of the
5327 UPDREQ or UPDREQALL message, the server will remain in RECOVER state,
5328 and will re-issue the UPDREQ or UPDREQALL when communications are
5329 re-established. (See section 5.17).
5330
5331 If an UPDDONE message isn't received within an implementation depen-
5332 dent amount of time, and no BNDUPD messages are being received, the
5333 connection SHOULD be dropped.
5334
5335
5336
5337
5338           A                                 B
5339         Server                            Server
5340
5341           |                                 |
5342        RECOVER                        PARTNER-DOWN
5343           |                                 |
5344           | >--UPDREQ-------------------->  |
5345           |                                 |
5346           | <---------------------BNDUPD--< |
5347           | >--BNDACK-------------------->  |
5348          ...                               ...
5349           |                                 |
5350           | <---------------------BNDUPD--< |
5351           | >--BNDACK-------------------->  |
5352           |                                 |
5353           | <--------------------UPDDONE--< |
5354           |                                 |
5355     RECOVER-WAIT                            |
5356           |                                 |
5357           | >--STATE-(RECOVER-WAIT)------>  |
5358           |                                 |
5359           |                                 |
5360 Wait MCLT from last known                   |
5361 time of failover operation                  |
5362           |                                 |
5363     RECOVER-DONE                            |
5364           |                                 |
5365           | >--STATE-(RECOVER-DONE)------>  |
5366           |                               NORMAL
5367           | <-------------(NORMAL)-STATE--< |
5368         NORMAL                              |
5369           | >--STATE-(NORMAL)------------>  |
5370           |                                 |
5371           |                                 |
5372
5373 Figure 9.5.2-1: Transition out of RECOVER state
5374
5384 If, at any time while a server is in RECOVER state, communications fails,
5385 the server will stay in RECOVER state. When communications are
5386 restored, it will restart the process of transitioning out of RECOVER
5387 state.
5388
5389 9.6. RECOVER-WAIT state
5390
5391 This state indicates that the server has done an UPDREQ or UPDREQALL
5392 and has received the UPDDONE message indicating that it has received
5393 all outstanding binding update information. In the RECOVER-WAIT
5394 state the server will wait for the MCLT in order to ensure that any
5395 processing that this server might have done prior to losing its
5396 stable storage will not cause future difficulties.
5397
5398 9.6.1. Operation in RECOVER-WAIT state
5399
5400 A server in RECOVER-WAIT state MUST NOT respond to DHCP client requests.
5401
5402 9.6.2. Transitions out of RECOVER-WAIT state
5403
5404 Upon entry to RECOVER-WAIT state the server MUST start a timer whose
5405 expiration is set to a time equal to the time the server went down
5406 (if known) or the time the server started (if the down-time is
5407 unknown) plus the maximum-client-lead-time. When this timer goes
5408 off, the server will transition into RECOVER-DONE state.
5409
5410 This is to allow any clients holding IP addresses that were allocated
5411 by this server prior to loss of its client binding information in
5412 stable storage to contact the other server, or their leases to time out.
5413
5414 If this is the first time this server has run failover -- as
5415 determined by the information received from the partner, not
5416 necessarily only as determined by this server's stable storage (as
5417 that may have been lost) -- then the waiting time discussed above may
5418 be skipped, and the server may transition immediately to RECOVER-DONE
5419 state.
5420
5421 See Figure 9.5.2-1.
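
The timer described in this section can be sketched as below. The
function and parameter names are hypothetical, times are POSIX seconds,
and the never_ran_failover flag stands for the determination (made from
the partner's information) that this server has never run failover
before.

```python
from typing import Optional


def recover_wait_deadline(start_time: float,
                          mclt: float,
                          down_time: Optional[float] = None) -> float:
    """Time at which the RECOVER-WAIT timer expires."""
    # Use the time the server went down if known, else the time it started.
    base = down_time if down_time is not None else start_time
    return base + mclt


def remaining_wait(now: float,
                   start_time: float,
                   mclt: float,
                   down_time: Optional[float] = None,
                   never_ran_failover: bool = False) -> float:
    """Seconds left before the transition to RECOVER-DONE may be taken."""
    if never_ran_failover:
        return 0.0  # the wait may be skipped entirely
    return max(0.0, recover_wait_deadline(start_time, mclt, down_time) - now)
```

With an MCLT of 3600 seconds, a server that went down at time 8000 and
restarted at time 10000 would wait a further 1600 seconds; if the down
time were unknown it would wait the full 3600 seconds.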
5422
5423 DISCUSSION:
5424
5425 The actual requirement on this wait period in RECOVER-WAIT is that it
5426 start no earlier than the time the recovering server went down, not
5427 necessarily when it came back up. If the time when the recovering
5428 server failed is known, it could be communicated to the recovering
5429 server (perhaps through actions of the network administrator), and the
5430 wait period could be reduced to the maximum-client-lead-time less
5439 the difference between the current time and the time the server
5440 failed. In this way, the waiting period could be minimized.
5441 Various heuristics could be used to estimate this time. For
5442 example, if the recovering server periodically updates stable
5443 storage with a time stamp, the wait period could be calculated to
5444 start at the time of the last update of stable storage plus the
5445 time required for the next update (which never occurred). This
5446 estimate is later than the time the server went down, but probably
5447 not too much later.
5448
5449 If the server has never before run failover, then there is no need
5450 to wait in this state -- but, again, to determine if this server
5451 has run failover it is vital that the information provided by the
5452 partner be utilized, since the stable storage of this server may
5453 have been lost.
5454
5455 If communications fails while a server is in RECOVER-WAIT state, it
5456 has no effect on the operation of this state. The server SHOULD
5457 continue to operate its timer, and if the timer goes off during the
5458 period where communications with the other server have failed, then
5459 the server SHOULD transition to RECOVER-DONE state. This is rare --
5460 failover state transitions are not usually made while communications
5461 are interrupted, but in this case there is no reason to inhibit the
5462 timer. A server MAY stay in RECOVER-WAIT state even after expiry of
5463 the timer and transition to RECOVER-DONE state upon re-establishing
5464 communications with the partner if desired. The key point here is to
5465 allow the timer to continue to operate, not whether or not the state
5466 transition is made before or after communications are re-established.
5467
5468
5469 9.7. RECOVER-DONE state
5470
5471 This state exists to allow an interlocked transition for one server
5472 from RECOVER state and another server from PARTNER-DOWN or
5473 COMMUNICATIONS-INTERRUPTED state into NORMAL state.
5474
5475 9.7.1. Operation in RECOVER-DONE state
5476
5477 A server in RECOVER-DONE state MUST respond only to
5478 DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING DHCP messages.
5479
5480 9.7.2. Transitions out of RECOVER-DONE state
5481
5482 When a server in RECOVER-DONE state determines that its partner
5483 server has entered NORMAL or RECOVER-DONE state, then it will transi-
5484 tion into NORMAL state.
5485
5486 If communications fails while in RECOVER-DONE state, a server will
5495 stay in RECOVER-DONE state.
5496
5497
5498 9.8. NORMAL state
5499
5500 NORMAL state is the state used by a server when it is communicating
5501 with the other server, and any required resynchronization has been
5502 performed. While some binding database synchronization is performed
5503 in NORMAL state, potential conflicts are resolved prior to entry into
5504 NORMAL state, as is binding database data loss.
5505
5506
5507 9.8.1. Upon entry to NORMAL state
5508
5509 When entering NORMAL state, a server will send to the other server
5510 all currently unacknowledged binding updates as BNDUPD messages.
5511
5512 When the above process is complete, if the server entering NORMAL
5513 state is a secondary server, then it will request IP addresses for
5514 allocation using the POOLREQ message.
5515
5516
5517 9.8.2. Processing DHCP client requests and load balancing
5518
5519 In NORMAL state, a server MUST process every DHCPREQUEST/RENEWAL or
5520 DHCPREQUEST/REBINDING request it receives. It processes other
5521 requests only for those clients dictated by the load balancing
5522 algorithm specified in [RFC 3074].
5523
5524 As discussed in section 5.3, each server will take the client-
5525 identifier from each DHCP client request (or the client-hardware-
5526 address, i.e., the chaddr if no client-identifier is present in the
5527 request) and use it as the 'Request ID' specified in [RFC 3074].
5528 After applying the algorithm specified in [RFC 3074] and comparing
5529 the result with the hash bucket assignment (performed during connect
5530 processing between failover servers), each failover server will be
5531 able to unambiguously determine if it should process the DHCP client
5532 request.
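
The decision described above might be sketched as follows. Note that
placeholder_hash is NOT the hash function of [RFC 3074]; it is a
stand-in (the byte sum modulo 256) used only so that the example is
self-contained, and a real implementation would substitute the RFC 3074
function. The function names, the parameter names, and the
representation of the hash bucket assignment as a set of bucket numbers
are likewise hypothetical.

```python
from typing import Callable, FrozenSet, Optional


def placeholder_hash(request_id: bytes) -> int:
    """Stand-in for the RFC 3074 hash: maps a Request ID to a bucket 0..255."""
    return sum(request_id) % 256


def should_process(client_id: Optional[bytes],
                   chaddr: bytes,
                   my_buckets: FrozenSet[int],
                   is_renew_or_rebind: bool = False,
                   hash_fn: Callable[[bytes], int] = placeholder_hash) -> bool:
    """Decide whether a server in NORMAL state processes a client request."""
    if is_renew_or_rebind:
        # DHCPREQUEST/RENEWAL and DHCPREQUEST/REBINDING are always processed.
        return True
    # The Request ID is the client-identifier, or the chaddr if none is present.
    request_id = client_id if client_id is not None else chaddr
    return hash_fn(request_id) in my_buckets
```

If the two servers hold complementary bucket sets, exactly one of them
answers any given request that is not a renewal or rebinding.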
5533
5534 9.8.3. Operation in NORMAL state
5535
5536 When in NORMAL state, for every DHCP client request that it
5537 processes, as determined by the algorithm described in section 9.8.2,
5538 above, a server will operate in the following manner:
5539
5540 o Lease time calculations
5541
5542 As discussed in section 5.2.1, "Control of lease time", the
5551 lease interval given to a DHCP client can never be more than the
5552 MCLT greater than the most recently received potential-
5553 expiration-time from the failover partner or the current time,
5554 whichever is later.
5555
5556 As long as a server adheres to this constraint, the specifics of
5557 the lease interval that it gives to a DHCP client or the value
5558 of the potential-expiration-time sent to its failover partner
5559 are implementation dependent. One possible approach is dis-
5560 cussed in section 5.2.1, but that particular approach is in no
5561 way required by this protocol.
5562
5563 See section 7.1.5 for details concerning the storage of time
5564 associated with IP addresses and how to use these times when
5565 calculating lease times for DHCP clients.
5566
5567 o Lazy update of partner server
5568
5569 After a DHCPACK of an IP address binding, the server servicing a
5570 DHCP client request attempts to update its partner with the new
5571 binding information. The lease time used in the update of the
5572 secondary MUST be at least that given to the DHCP client in the
5573 DHCPACK, and the potential-expiration-time MUST be at least the
5574 lease time, and SHOULD be considerably longer.
5575
5576 o Reallocation of IP addresses between clients
5577
5578 Whenever a client binding is released or expires, a BNDUPD mes-
5579 sage must be sent to the partner, setting the binding state to
5580 RELEASED or EXPIRED. However, until a BNDACK is received for
5581 this message, the IP address cannot be allocated to another
5582 client. It cannot be allocated to the same client again if a
5583 BNDUPD was sent; otherwise it can. See section 5.2.2.
5584
5585 In NORMAL state, each server receives binding updates from its
5586 partner server in BNDUPD messages. It records these in its client
5587 binding database in stable storage and then sends a corresponding
5588 BNDACK message to its partner server. It MUST ensure that the infor-
5589 mation is recorded in stable storage prior to sending the BNDACK mes-
5590 sage back to its partner.
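
The lease time constraint in the "Lease time calculations" item of this
section can be illustrated as follows. The sketch reads the constraint
as a bound on the lease expiration time: no later than the MCLT beyond
the later of the current time and the most recently received
potential-expiration-time. The function and parameter names are
hypothetical, times are POSIX seconds, and a
potential-expiration-time of 0 stands for "none received yet".

```python
def max_lease_interval(now: float,
                       partner_potential_expiration: float,
                       mclt: float) -> float:
    """Longest lease interval that may be given to a client right now."""
    latest_expiry = max(partner_potential_expiration, now) + mclt
    return latest_expiry - now


def lease_interval(desired: float,
                   now: float,
                   partner_potential_expiration: float,
                   mclt: float) -> float:
    """Lease interval actually offered: the desired one, capped by the bound."""
    return min(desired, max_lease_interval(now, partner_potential_expiration, mclt))
```

With an MCLT of 3600 seconds and a desired lease of one day, a client
seen for the first time receives 3600 seconds; once a
potential-expiration-time one day ahead has been received from the
partner, the full day can be granted.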
5591
5592
5593 9.8.4. Transitions out of NORMAL state
5594
5595 If an external command is received by a server in NORMAL state
5596 informing it that its partner is down, then transition into PARTNER-
5597 DOWN state. Generally, this would be an unusual situation, where
5598 some external agency knew the partner server was down. Using the
5607 command in this case would be appropriate if the polling interval and
5608 timeout were long.
5609
5610 If a server in NORMAL state fails to receive acks to messages sent to
5611 its partner for an implementation dependent period of time, it MAY
5612 move into COMMUNICATIONS-INTERRUPTED state. This situation might
5613 occur if the partner server was capable of maintaining the TCP
5614 connection between the servers and also capable of sending a CONTACT
5615 message every tSend seconds, but was (for some reason) incapable of
5616 processing BNDUPD messages.
5617
5618 If communications is determined not to be "ok" (as defined in
5619 section 8), then transition into COMMUNICATIONS-INTERRUPTED state.
5620
5621 If a server in NORMAL state receives any messages from its partner
5622 where the partner has changed state from that expected by the server
5623 in NORMAL state, then the server should transition into
5624 COMMUNICATIONS-INTERRUPTED state and take the appropriate state tran-
5625 sition from there. For example, it would be expected for the partner
5626 to transition from POTENTIAL-CONFLICT into NORMAL state, but not for
5627 the partner to transition from NORMAL into POTENTIAL-CONFLICT state.
5628
5629 If a server in NORMAL state receives any messages from its partner
5630 where the partner has changed into PAUSED state, the server should
5631 transition into COMMUNICATIONS-INTERRUPTED state. If a server in
5632 NORMAL state receives any messages from its partner where the partner
5633 has changed into SHUTDOWN state, the server should transition into
5634 PARTNER-DOWN state.
5635
5636 9.9. COMMUNICATIONS-INTERRUPTED State
5637
5638 A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is
5639 unable to communicate with the other server. Primary and secondary
5640 servers cycle automatically (without administrative intervention)
5641 between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network
5642 connection between them fails and recovers, or as the partner server
5643 cycles between operational and non-operational. No duplicate IP
5644 address allocation can occur while the servers cycle between these
5645 states.
5646
5647
5648 9.9.1. Upon entry to COMMUNICATIONS-INTERRUPTED state
5649
5650 When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been
5651 configured to support an automatic transition out of COMMUNICATIONS-
5652 INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period"
5653 has been configured, see section 10), then a timer MUST be started
5654 for the length of the configured safe period.
5655
5663 A server transitioning into the COMMUNICATIONS-INTERRUPTED state from
5664 the NORMAL state SHOULD raise some alarm condition to alert adminis-
5665 trative staff to a potential problem in the DHCP subsystem.
5666
5667
5668 9.9.2. Operation in COMMUNICATIONS-INTERRUPTED State
5669
5670 In this state a server MUST respond to all DHCP client requests, and
5671 the algorithm for load balancing described in section 5.3 MUST NOT be
5672 used. When allocating new IP addresses, each server allocates from
5673 its own IP address pool, where the primary MUST allocate only FREE IP
5674 addresses, and the secondary MUST allocate only BACKUP IP addresses.
5675 When responding to renewal requests, each server will allow continued
5676 renewal of a DHCP client's current lease on an IP address irrespec-
5677 tive of whether that lease was given out by the receiving server or
5678 not, although the renewal period MUST NOT exceed the maximum client
5679 lead time (MCLT) beyond the latest of: 1) the potential-expiration-
5680 time already acknowledged by the other server, or 2) the lease-
5681 expiration-time, or 3) the potential-expiration-time received from
5682 the partner server.
5683
5684 However, since the server cannot communicate with its partner in this
5685 state, the acknowledged-potential-expiration time will not be updated
5686 in any new bindings. This is likely to eventually cause the actual-
5687 client-lease-times to be the current time plus the maximum-client-
5688 lead-time (unless this is greater than the desired-client-lease-
5689 time).
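
The two rules of this section (which pool to allocate from, and how far
a renewal may extend) can be sketched as below. This is a literal
reading of the text above and not a definitive implementation: the
function and parameter names are hypothetical, times are POSIX
seconds, and the three time values are those stored for the binding
before the renewal is processed.

```python
def pool_for_new_allocations(role: str) -> str:
    """Binding state from which a server may allocate new addresses."""
    if role == "primary":
        return "FREE"
    if role == "secondary":
        return "BACKUP"
    raise ValueError("role must be 'primary' or 'secondary'")


def latest_renewal_expiry(acked_potential_expiration: float,
                          lease_expiration: float,
                          partner_potential_expiration: float,
                          mclt: float) -> float:
    """Latest expiration time a renewal may be given in this state."""
    latest_known = max(acked_potential_expiration,
                       lease_expiration,
                       partner_potential_expiration)
    return latest_known + mclt
```

For a binding whose three stored times are 5000, 4000, and 4500, an
MCLT of 3600 seconds allows a renewal to run no later than time 8600.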
5690
5691 The server should continue to try to establish a connection with its
5692 partner.
5693
5694
5695 9.9.3. Transition out of COMMUNICATIONS-INTERRUPTED State
5696
5697 If the safe period timer expires while a server is in the
5698 COMMUNICATIONS-INTERRUPTED state, it will transition immediately into
5699 PARTNER-DOWN state.
5700
5701 If an external command is received by a server in COMMUNICATIONS-
5702 INTERRUPTED state informing it that its partner is down, it will
5703 transition immediately into PARTNER-DOWN state.
5704
5705 If communications is restored with the other server, then the server
5706 in COMMUNICATIONS-INTERRUPTED state will transition into another
5707 state based on the state of the partner:
5708
5709 o partner in NORMAL or COMMUNICATIONS-INTERRUPTED
5710
5719 The partner SHOULD NOT be in NORMAL state here, since upon res-
5720 toration of communications it MUST have created a new TCP con-
5721 nection which would have forced it into COMMUNICATIONS-
5722 INTERRUPTED state. Still, we should account for every state
5723 just in case.
5724
5725 Transition into the NORMAL state.
5726
5727 o partner in RECOVER
5728
5729 Stay in COMMUNICATIONS-INTERRUPTED state.
5730
5731 o partner in RECOVER-DONE
5732
5733 Transition into NORMAL state.
5734
5735 o partner in PARTNER-DOWN, POTENTIAL-CONFLICT, CONFLICT-DONE, or
5736 RESOLUTION-INTERRUPTED
5737
5738 Transition into POTENTIAL-CONFLICT state.
5739
5740 o partner in PAUSED
5741
5742 Stay in COMMUNICATIONS-INTERRUPTED state.
5743
5744 o partner in SHUTDOWN
5745
5746 Transition into PARTNER-DOWN state.
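
The transitions listed above can be collected into one table. The
sketch below is illustrative only: the event names, the function name,
and the use of strings for state names are hypothetical. A partner
state that the list does not mention (RECOVER-WAIT, for example) is
reported as an error rather than guessed at.

```python
from typing import Optional

ON_RECONNECT = {
    "NORMAL": "NORMAL",
    "COMMUNICATIONS-INTERRUPTED": "NORMAL",
    "RECOVER": "COMMUNICATIONS-INTERRUPTED",
    "RECOVER-DONE": "NORMAL",
    "PARTNER-DOWN": "POTENTIAL-CONFLICT",
    "POTENTIAL-CONFLICT": "POTENTIAL-CONFLICT",
    "CONFLICT-DONE": "POTENTIAL-CONFLICT",
    "RESOLUTION-INTERRUPTED": "POTENTIAL-CONFLICT",
    "PAUSED": "COMMUNICATIONS-INTERRUPTED",
    "SHUTDOWN": "PARTNER-DOWN",
}


def next_state(event: str, partner_state: Optional[str] = None) -> str:
    """Next state for a server in COMMUNICATIONS-INTERRUPTED state."""
    if event in ("safe-period-expired", "partner-down-command"):
        return "PARTNER-DOWN"
    if event == "communications-restored":
        if partner_state not in ON_RECONNECT:
            raise ValueError("partner state not covered by section 9.9.3")
        return ON_RECONNECT[partner_state]
    raise ValueError("unknown event: " + event)
```

Keeping the table as data makes it straightforward to check each entry
against the list above.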
5747
5748 The following figure illustrates the transition from NORMAL to
5749 COMMUNICATIONS-INTERRUPTED state and then back to NORMAL state again.
5750
5776        Primary                          Secondary
5777         Server                            Server
5778
5779         NORMAL                            NORMAL
5780           | >--CONTACT------------------->  |
5781           | <--------------------CONTACT--< |
5782           |     [TCP connection broken]     |
5783    COMMUNICATIONS          :         COMMUNICATIONS
5784      INTERRUPTED           :           INTERRUPTED
5785           |  [attempt new TCP connection]   |
5786           |      [connection succeeds]      |
5787           |                                 |
5788           | >--CONNECT------------------->  |
5789           | <-----------------CONNECTACK--< |
5790           |                               NORMAL
5791           | <----------------------STATE--< |
5792         NORMAL                              |
5793           | >--STATE--------------------->  |
5794           |                                 |
5795           | >--BNDUPD-------------------->  |
5796           | <---------------------BNDACK--< |
5797           |                                 |
5798           | <---------------------BNDUPD--< |
5799           | >--BNDACK-------------------->  |
5800          ...                               ...
5801           |                                 |
5802           | <--------------------POOLREQ--< |
5803           | >--POOLRESP-(2)-------------->  |
5804           |                                 |
5805           | >--BNDUPD-(#1)--------------->  |
5806           | <---------------------BNDACK--< |
5807           |                                 |
5808           | <--------------------POOLREQ--< |
5809           | >--POOLRESP-(0)-------------->  |
5810           |                                 |
5811           | >--BNDUPD-(#2)--------------->  |
5812           | <---------------------BNDACK--< |
5813           |                                 |
5814
5815 Figure 9.9.3-1: Transition from NORMAL to COMMUNICATIONS-
5816 INTERRUPTED and back (example with 2
5817 addresses allocated to secondary)
5818
5832 9.10. POTENTIAL-CONFLICT state
5833
5834 This state indicates that the two servers are attempting to re-
5835 integrate with each other, but at least one of them was running in a
5836 state that did not guarantee automatic reintegration would be
5837 possible. In POTENTIAL-CONFLICT state the servers may determine that
5838 the same IP address has been offered and accepted by two different
5839 DHCP clients.
5840
5841 It is a goal of this protocol to minimize the possibility that
5842 POTENTIAL-CONFLICT state is ever entered.
5843
5844 9.10.1. Upon entry to POTENTIAL-CONFLICT state
5845
5846 When a primary server enters POTENTIAL-CONFLICT state it should
5847 request that the secondary send it all updates of which it is
5848 currently unaware by sending an UPDREQ message to the secondary
5849 server.
5850
5851 A secondary server entering POTENTIAL-CONFLICT state will wait for
5852 the primary to send it an UPDREQ message.
5853
5854 9.10.2. Operation in POTENTIAL-CONFLICT state
5855
5856 Any server in POTENTIAL-CONFLICT state MUST NOT process any incoming
5857 DHCP requests.
5858
5859
5860 9.10.3. Transitions out of POTENTIAL-CONFLICT state
5861
5862 If communications fails with the partner while in POTENTIAL-CONFLICT
5863 state, then the server will transition to RESOLUTION-INTERRUPTED
5864 state.
5865
5866 Whenever either server receives an UPDDONE message from its partner
5867 while in POTENTIAL-CONFLICT state, it MUST transition to a new state.
5868 The primary MUST transition to CONFLICT-DONE state, and the secondary
5869 MUST transition to NORMAL state. This will cause the primary server
5870 to leave POTENTIAL-CONFLICT state prior to the secondary, since the
5871 primary sends an UPDREQ message and receives an UPDDONE before the
5872 secondary sends an UPDREQ message and receives its UPDDONE message.
5873
5874 When a secondary server receives an indication that the primary
5875 server has made a transition from POTENTIAL-CONFLICT to CONFLICT-DONE
5876 state, it SHOULD send an UPDREQ message to the primary server.
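
The behavior of the two servers in POTENTIAL-CONFLICT state can be
summarized in a small event handler. The sketch is illustrative only:
the role strings, the event names, and the (next state, message to
send) return convention are hypothetical.

```python
from typing import Optional, Tuple


def on_entry(role: str) -> Optional[str]:
    """Message sent on entry: the primary asks for updates, the secondary waits."""
    return "UPDREQ" if role == "primary" else None


def on_event(role: str, event: str) -> Tuple[str, Optional[str]]:
    """Return (next state, message to send) for a POTENTIAL-CONFLICT server."""
    if event == "communications-failed":
        return ("RESOLUTION-INTERRUPTED", None)
    if event == "upddone-received":
        # The primary leaves first; the secondary goes straight to NORMAL.
        return ("CONFLICT-DONE" if role == "primary" else "NORMAL", None)
    if event == "partner-entered-conflict-done" and role == "secondary":
        # The secondary now requests the primary's updates in turn.
        return ("POTENTIAL-CONFLICT", "UPDREQ")
    return ("POTENTIAL-CONFLICT", None)
```

The ordering this produces (the primary's UPDREQ, then the
secondary's) is the one shown in Figure 9.10.3-1.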
5877
5889 Primary Secondary
5890 Server Server
5891
5892 | |
5893 POTENTIAL-CONFLICT POTENTIAL-CONFLICT
5894 | |
5895 | >--UPDREQ--------------------> |
5896 | |
5897 | <---------------------BNDUPD--< |
5898 | >--BNDACK--------------------> |
5899 ... ...
5900 | |
5901 | <---------------------BNDUPD--< |
5902 | >--BNDACK--------------------> |
5903 | |
5904 | <--------------------UPDDONE--< |
5905 CONFLICT-DONE |
5906 | >--STATE--(CONFLICT-DONE)----> |
5907 | <---------------------UPDREQ--< |
5908 | |
5909 | >--BNDUPD--------------------> |
5910 | <---------------------BNDACK--< |
5911 ... ...
5912 | >--BNDUPD--------------------> |
5913 | <---------------------BNDACK--< |
5914 | |
5915 | >--UPDDONE-------------------> |
5916 | NORMAL
5917 | <------------STATE--(NORMAL)--< |
5918 NORMAL |
5919 | >--STATE--(NORMAL)-----------> |
5920 | |
5921 | <--------------------POOLREQ--< |
5922 | >------POOLRESP-(n)----------> |
5923 | addresses |
5924
5925 Figure 9.10.3-1: Transition out of POTENTIAL-CONFLICT
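
The transitions out of POTENTIAL-CONFLICT described in section 9.10.3
can be sketched as a small function. This is only an illustration in
Python, not part of the protocol: the names Role, State and
leave_potential_conflict, and the event labels "comm-failed" and
"upddone", are hypothetical.

```python
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


class State(Enum):
    POTENTIAL_CONFLICT = "POTENTIAL-CONFLICT"
    RESOLUTION_INTERRUPTED = "RESOLUTION-INTERRUPTED"
    CONFLICT_DONE = "CONFLICT-DONE"
    NORMAL = "NORMAL"


def leave_potential_conflict(role: Role, event: str) -> State:
    """Return the next state for a server in POTENTIAL-CONFLICT.

    event is a hypothetical label: "comm-failed" when communications
    with the partner fails, "upddone" when an UPDDONE is received.
    """
    if event == "comm-failed":
        return State.RESOLUTION_INTERRUPTED
    if event == "upddone":
        # The primary receives its UPDDONE first and moves to
        # CONFLICT-DONE; the secondary moves directly to NORMAL.
        if role is Role.PRIMARY:
            return State.CONFLICT_DONE
        return State.NORMAL
    # Any other event leaves the server in POTENTIAL-CONFLICT.
    return State.POTENTIAL_CONFLICT
```

The asymmetry between the two roles is what lets the primary leave
POTENTIAL-CONFLICT before the secondary, as shown in the figure.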
5926
5943
5944 9.11. RESOLUTION-INTERRUPTED state
5945
5946 This state indicates that the two servers were attempting to re-
5947 integrate with each other in POTENTIAL-CONFLICT state, but
5948 communications failed prior to completion of re-integration.
5949
5950 If the servers remained in POTENTIAL-CONFLICT while communications
5951 was interrupted, neither server would be responsive to DHCP client
5952 requests, and if one server had crashed, then there might be no
5953 server able to process DHCP requests.
5954
5955 9.11.1. Upon entry to RESOLUTION-INTERRUPTED state
5956
5957 When a server enters RESOLUTION-INTERRUPTED state it SHOULD raise an
5958 alarm condition to alert administrative staff of a problem in the
5959 DHCP subsystem.
5960
5961 9.11.2. Operation in RESOLUTION-INTERRUPTED state
5962
5963 In this state a server MUST respond to all DHCP client requests, and
5964 any load balancing (described in section 5.3) MUST NOT be used. When
5965 allocating new IP addresses, each server SHOULD allocate from its own
5966 IP address pool (if that can be determined), where the primary SHOULD
5967 allocate only FREE IP addresses, and the secondary SHOULD allocate
5968 only BACKUP IP addresses. When responding to renewal requests, each
5969 server will allow continued renewal of a DHCP client's current lease
5970 on an IP address irrespective of whether that lease was given out by
the receiving server or not, although the renewal period MUST NOT
exceed the maximum client lead time (MCLT) beyond the latest of: 1)
the potential-expiration-time already acknowledged by the other
server, 2) the lease-expiration-time, or 3) the
potential-expiration-time received from the partner server.
5976
5977 However, since the server cannot communicate with its partner in this
5978 state, the acknowledged-potential-expiration time will not be updated
5979 in any new bindings.
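
The renewal limit in section 9.11.2 amounts to a small calculation.
The sketch below is an illustration only; the function and parameter
names are hypothetical, and all times are assumed to be absolute
times in seconds with the MCLT given as an interval in seconds.

```python
def latest_renewal_expiration(mclt: int,
                              acked_potential_expiration: int,
                              lease_expiration: int,
                              partner_potential_expiration: int) -> int:
    """Latest absolute time a renewed lease may run to.

    The limit is the MCLT added to the latest of the three
    expiration times named in the text.
    """
    latest = max(acked_potential_expiration,
                 lease_expiration,
                 partner_potential_expiration)
    return latest + mclt


def clamp_renewal(requested_expiration: int,
                  mclt: int,
                  acked_potential_expiration: int,
                  lease_expiration: int,
                  partner_potential_expiration: int) -> int:
    """Cut a requested expiration time down to the allowed limit."""
    limit = latest_renewal_expiration(mclt,
                                      acked_potential_expiration,
                                      lease_expiration,
                                      partner_potential_expiration)
    return min(requested_expiration, limit)
```

For example, with an MCLT of 3600 seconds and the three times at
1000, 5000 and 4000, no renewal may extend past 8600.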
5980
5981
5982 9.11.3. Transitions out of RESOLUTION-INTERRUPTED state
5983
5984 If an external command is received by a server in RESOLUTION-
5985 INTERRUPTED state informing it that its partner is down, it will
5986 transition immediately into PARTNER-DOWN state.
5987
5988 If communications is restored with the other server, then the server
5989 in RESOLUTION-INTERRUPTED state will transition into POTENTIAL-
5990 CONFLICT state.
5991
5999
6000 9.12. CONFLICT-DONE state
6001
6002 This state indicates that during the process where the two servers
6003 are attempting to re-integrate with each other, the primary server
has received all of the updates from the secondary server. It makes a
transition into CONFLICT-DONE state in order that it may be totally
responsive to the client load, as opposed to NORMAL state where it
would be in a "balanced" responsive state, running the load balancing
algorithm.
6009
6010 9.12.1. Upon entry to CONFLICT-DONE state
6011
6012 A secondary server should never enter CONFLICT-DONE state.
6013
6014 9.12.2. Operation in CONFLICT-DONE state
6015
6016 A primary server in CONFLICT-DONE state is fully responsive to all
6017 DHCP clients (similar to the situation in COMMUNICATIONS-INTERRUPTED
6018 state).
6019
If communications fails, the primary remains in CONFLICT-DONE state.
If communications becomes OK, it remains in CONFLICT-DONE state until
the conditions for transition out are satisfied.
6023
6024
6025 9.12.3. Transitions out of CONFLICT-DONE state
6026
6027 If communications fails with the partner while in CONFLICT-DONE
6028 state, then the server will remain in CONFLICT-DONE state.
6029
6030 When a primary server determines that the secondary server has made a
6031 transition into NORMAL state, the primary server will also transition
6032 into NORMAL state.
6033
6034 9.13. PAUSED state
6035
6036 This state exists to allow one server to inform another that it will
6037 be out of service for what is predicted to be a relatively short
6038 time, and to allow the other server to transition to COMMUNICATIONS-
6039 INTERRUPTED state immediately and to begin servicing all DHCP clients
6040 with no interruption in service to new DHCP clients.
6041
6042 A server which is aware that it is shutting down temporarily SHOULD
6043 send a STATE message with the server-state option containing PAUSED
6044 state and close the TCP connection.
6045
6046 While a server may or may not transition internally into PAUSED
state, the 'previous' state determined when it is restarted MUST be
the state the server was in prior to receiving the command to shut
down and restart, that is, the state which precedes its entry into
the PAUSED state. See section 9.3.2 concerning the use of the
previous state upon server restart.
6060
6061 9.13.1. Upon entry to PAUSED state
6062
6063 When entering PAUSED state, the server MUST store the previous state
6064 in stable storage, and use that state as the previous state when it
6065 is restarted.
6066
6067 9.13.2. Transitions out of PAUSED state
6068
6069 A server makes a transition out of PAUSED state by being restarted.
6070 At that time, the previous state MUST be the state the server was in
6071 prior to entering the PAUSED state.
6072
6073
6074 9.14. SHUTDOWN state
6075
6076 This state exists to allow one server to inform another that it will
6077 be out of service for what is predicted to be a relatively long time,
6078 and to allow the other server to transition immediately to PARTNER-
6079 DOWN state, and take over completely for the server going down.
6080
6081 9.14.1. Upon entry to SHUTDOWN state
6082
6083 When entering SHUTDOWN state, the server MUST record the previous
6084 state in stable storage for use when the server is restarted. It
6085 also MUST record the current time as the last time operational.
6086
6087 A server which is aware that it is shutting down SHOULD send a STATE
6088 message with the server-state field containing SHUTDOWN.
6089
6090 9.14.2. Operation in SHUTDOWN state
6091
6092 A server in SHUTDOWN state MUST NOT respond to any DHCP client input.
6093
6094 If a server receives any message indicating that the partner has
6095 moved to PARTNER-DOWN state while it is in SHUTDOWN state then it
6096 MUST record RECOVER state as the previous state to be used when it is
6097 restarted.
6098
6099 A server SHOULD wait for a few seconds after informing the partner of
6100 entry into SHUTDOWN state (if communications are okay) to determine
6101 if the partner entered PARTNER-DOWN state.
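
The bookkeeping required on entry to and during SHUTDOWN state can be
sketched as follows. This is an illustration only: the dictionary
named storage is a hypothetical stand-in for stable storage, and the
function names and key names are not defined by this document.

```python
def enter_shutdown(storage: dict, current_state: str, now: int) -> None:
    """Record what must survive a restart when entering SHUTDOWN."""
    # The state the server was in becomes the 'previous' state.
    storage["previous_state"] = current_state
    # The current time is recorded as the last time operational.
    storage["last_time_operational"] = now


def on_partner_state(storage: dict, partner_state: str) -> None:
    """Handle a partner state report received while in SHUTDOWN."""
    if partner_state == "PARTNER-DOWN":
        # The partner has taken over, so this server must come back
        # up as if its previous state had been RECOVER.
        storage["previous_state"] = "RECOVER"
```

A real server would write these values to disk before exiting; the
short wait recommended above gives on_partner_state a chance to run
before the process ends.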
6102
6110
6111 9.14.3. Transitions out of SHUTDOWN state
6112
6113 A server makes a transition out of SHUTDOWN state by being restarted.
6114
6115 10. Safe Period
6116
6117 Due to the restrictions imposed on each server while in
6118 COMMUNICATIONS-INTERRUPTED state, long-term operation in this state
is not feasible for either server. One reason that these states
exist at all is to allow the servers to easily survive transient
6121 network communications failures of a few minutes to a few days
6122 (although the actual time periods will depend a great deal on the
6123 DHCP activity of the network in terms of arrival and departure of
6124 DHCP clients on the network).
6125
6126 Eventually, when the servers are unable to communicate, they will
6127 have to move into a state where they no longer can re-integrate
6128 without some possibility of a duplicate IP address allocation. There
6129 are two ways that they can move into this state (known as PARTNER-
6130 DOWN).
6131
First, they can be informed by external command that the partner
server is, indeed, down. In this case, there is no difficulty in
moving into the PARTNER-DOWN state, since it is an accurate
reflection of reality and the protocol has been designed to operate
correctly (even during reintegration) as long as the partner is,
indeed, down whenever a server is in PARTNER-DOWN state.
6138
The more difficult scenario is when the servers are running
unattended for extended periods. For this case an option is provided
to configure a "safe-period" into each server. This OPTIONAL
safe-period is the period after which either the primary or secondary
server will automatically transition to PARTNER-DOWN from
COMMUNICATIONS-INTERRUPTED state. If this transition is completed
and the partner is not down, then the possibility of duplicate IP
address allocations will exist.
6147
6148 The goal of the "safe-period" is to allow network operations staff
6149 some time to react to a server moving into COMMUNICATIONS-INTERRUPTED
6150 state. During the safe-period the only requirement is that the net-
6151 work operations staff determine if both servers are still running --
6152 and if they are, to either fix the network communications failure
6153 between them, or to take one of the servers down before the expira-
6154 tion of the safe-period.
6155
6156 The length of the safe-period is installation dependent, and depends
6157 in large part on the number of unallocated IP addresses within the
6158 subnet address pool and the expected frequency of arrival of
6167 previously unknown DHCP clients requiring IP addresses. Many
6168 environments should be able to support safe-periods of several days.
6169
6170 During this safe period, either server will allow renewals from any
6171 existing client. The only limitation concerns the need for IP
6172 addresses for the DHCP server to hand out to new DHCP clients and the
6173 need to re-allocate IP addresses to different DHCP clients.
6174
6175 The number of "extra" IP addresses required is equal to the expected
6176 total number of new DHCP clients encountered during the safe period.
6177 This is dependent only on the arrival rate of new DHCP clients, not
6178 the total number of outstanding leases on IP addresses.
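
As a worked example of this sizing rule, the following sketch
computes the number of extra addresses for a given safe period, and
the longest safe period a given number of spare addresses can
support. The function names and the choice of hours as the time unit
are assumptions made for illustration.

```python
import math


def extra_addresses_needed(arrivals_per_hour: float,
                           safe_period_hours: float) -> int:
    """Expected number of new clients during the safe period."""
    return math.ceil(arrivals_per_hour * safe_period_hours)


def longest_safe_period_hours(spare_addresses: int,
                              arrivals_per_hour: float) -> float:
    """Longest safe period the spare addresses can cover."""
    if arrivals_per_hour <= 0:
        raise ValueError("arrival rate must be positive")
    return spare_addresses / arrivals_per_hour
```

With 5 new clients per hour, a safe period of three days (72 hours)
calls for 360 spare addresses, regardless of how many leases are
already outstanding.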
6179
6180 In the unlikely event that a relatively short safe period of an hour
6181 is all that can be used (given a dearth of IP addresses or a very
6182 high arrival rate of new DHCP clients), even that can provide sub-
6183 stantial benefits in allowing the DHCP subsystem to ride through
6184 minor problems that could occur and be fixed within that hour. In
6185 these cases, no possibility of duplicate IP address allocation
6186 exists, and re-integration after the failure is solved will be
6187 automatic and require no operator intervention.
6188
6189 11. Security
6190
6191 The Failover protocol communicates DHCP lease activity and this data
6192 is generally easily discovered via other means, such as by pinging
6193 addresses and doing DNS lookups. Therefore, the need to encrypt the
6194 data over the wire is likely not great (though some sites may feel
6195 differently).
6196
6197 However, it is very desirable to assure the integrity of failover
6198 partners and to thus ensure proper operation of the servers. For
6199 example, denial of service attacks are possible by the communication
6200 of invalid state information to one or both servers.
6201
6202 Therefore, the Failover protocol MUST be capable of being secured by
6203 using a simple shared secret message digest which covers each mes-
6204 sage. This provides authentication of the servers, but does not pro-
6205 vide encryption of the data exchange.
6206
6207 The Failover protocol MAY also be secured by using TLS [RFC 2246]
6208 (Transport Layer Security) if encryption of the data exchange is
6209 desired. The use of the shared secret or TLS will not protect
6210 against TCP or IP layer attacks (such as someone sending fake TCP RST
6211 segments). IPsec [RFC 2401] SHOULD be used to protect against most
6212 (if not all) of these kinds of attacks.
6213
6222
6223 11.1. Simple shared secret
6224
6225 Messages between the failover partners can be authenticated through
6226 the use of a shared secret, which is never sent over the network and
6227 must be known by each server. How each server is told about this
6228 shared secret and secures its storage of the shared secret is outside
6229 the scope of this document. If a server is configured with a shared
6230 secret for a partner, it MUST send the message-digest option in ALL
6231 messages to that partner and it MUST treat any messages received from
6232 that partner without a message-digest option as failing authentica-
6233 tion and reject them with reject reason 21: "Missing message digest".
6234 Note that the message digest option MUST be the first option in the
6235 message.
6236
6237 If a server is not configured with a shared secret for a partner, it
6238 MUST NOT send the message-digest option in any message to that
6239 partner and it MUST treat any messages received from that partner
6240 with a message-digest option as failing authentication with reject
6241 reason 13: "Message digest not configured".
6242
The shared secret is used to calculate a 16 octet message-digest
which is sent in every failover message in the message-digest option.
See section 12.17. The message-digest contains a one-way 16 octet
HMAC-MD5 [RFC 2104] hash calculated over a stream of octets
consisting of the entire message concatenated with the shared secret.
6248
6249 For calculation, the message includes the message-digest option with
6250 the message-digest data zeroed (16-octets of zero). Once the calcula-
6251 tion is complete, these 16 octets of zero are replaced by the 16-
6252 octet HMAC-MD5 hash and the message is sent.
6253
For verification, the received 16-octet message-digest is saved and
replaced with 16 octets of zero, and the hash is recalculated as
above. The resulting HMAC-MD5 hash is compared to the saved hash
and, if they match, the message is assumed authenticated.
6258
6259 A failover partner that fails to authenticate a received message or
6260 receives a message without a message-digest option when configured
6261 with a shared secret MUST close the connection immediately and take
6262 steps to notify operators.
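
The zero-then-fill procedure for calculating and verifying the digest
can be sketched with Python's standard hmac module. Several things
here are assumptions rather than requirements of this document: the
shared secret is used as the HMAC key, the message is handled as an
opaque byte string, and the caller supplies digest_offset, the
position of the 16 digest octets within the message. The function
names are hypothetical.

```python
import hashlib
import hmac

DIGEST_LEN = 16  # HMAC-MD5 produces 16 octets


def _zeroed(message: bytes, digest_offset: int) -> bytes:
    """Return the message with the digest octets replaced by zeros."""
    return (message[:digest_offset]
            + bytes(DIGEST_LEN)
            + message[digest_offset + DIGEST_LEN:])


def compute_digest(message: bytes, digest_offset: int,
                   secret: bytes) -> bytes:
    """HMAC-MD5 over the message with its digest field zeroed."""
    return hmac.new(secret, _zeroed(message, digest_offset),
                    hashlib.md5).digest()


def sign(message: bytes, digest_offset: int, secret: bytes) -> bytes:
    """Fill the digest field of an outgoing message."""
    digest = compute_digest(message, digest_offset, secret)
    return (message[:digest_offset]
            + digest
            + message[digest_offset + DIGEST_LEN:])


def verify(message: bytes, digest_offset: int, secret: bytes) -> bool:
    """Check the digest field of a received message."""
    received = message[digest_offset:digest_offset + DIGEST_LEN]
    expected = compute_digest(message, digest_offset, secret)
    # compare_digest avoids leaking timing information.
    return hmac.compare_digest(received, expected)
```

A receiver for which verify returns False would close the connection
and notify operators, as required above.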
6263
6264 Every time a CONNECT message is received, the time at which that mes-
6265 sage was sent by the partner (i.e., the time that actually appears in
6266 the message itself) MUST be saved. If a CONNECT message is ever
6267 received containing that time or containing a time before that time,
6268 it MUST be rejected.
6269
6270 The XID (see section 6.1) of every message received at a failover
6279 endpoint MUST be greater than that of the previous message received
6280 on that failover endpoint or the message just received MUST be
6281 rejected.
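
The two replay checks above (the CONNECT time and the XID) can be
sketched as a small per-endpoint object. This is an illustration
only: the class and method names are hypothetical, and XID
wrap-around is not handled.

```python
from typing import Optional


class ReplayGuard:
    """Tracks the last CONNECT time and last XID seen on an endpoint."""

    def __init__(self) -> None:
        self.last_connect_time: Optional[int] = None
        self.last_xid: Optional[int] = None

    def accept_connect(self, sent_time: int) -> bool:
        """Reject a CONNECT whose time is not later than the saved one."""
        if (self.last_connect_time is not None
                and sent_time <= self.last_connect_time):
            return False
        self.last_connect_time = sent_time
        return True

    def accept_xid(self, xid: int) -> bool:
        """Reject a message whose XID is not greater than the last one."""
        if self.last_xid is not None and xid <= self.last_xid:
            return False
        self.last_xid = xid
        return True
```

A rejected message leaves the saved values unchanged, so a replayed
message cannot move the window backwards.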
6282
6283 A server MAY operate with arbitrary time skew between servers (see
6284 section 5.10), but when using a shared secret administrators MAY wish
6285 to configure a maximum allowable time skew between a failover server
6286 and its partner(s). Servers SHOULD allow an administrator to config-
6287 ure a maximum allowable time skew between two failover partners.
6288
6289 11.2. TLS
6290
6291 TLS, Transport Layer Security, as specified in [RFC 2246] MAY be
6292 used. The use of TLS would be similar to the way it is used with
6293 SMTP [RFC 2487] and IMAP/POP3/ACAP [RFC 2595].
6294
To request the use of TLS, the primary MUST send the TLS-request
option as part of the CONNECT message. The secondary receiving the
TLS-request option MUST respond with a TLS-reply option in its
CONNECTACK message indicating its acceptance or rejection of the
TLS-request.

If the CONNECTACK message contained a TLS-reply of 1, then both
servers immediately begin TLS negotiation.
6303
6304 Upon completion of this negotiation, the primary server sends another
6305 CONNECT message without any TLS-request option, and must wait for a
6306 corresponding CONNECTACK.
6307
6308 Implementation of the TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA [RFC 2246]
6309 cipher suite is REQUIRED in Failover servers supporting TLS. This is
6310 important as it assures that any two compliant implementations can be
6311 configured to interoperate.
6312
6313 12. Failover Options
6314
6315 This section lists all of the options that are currently defined to
6316 be used with the failover protocol. See section 6.2 for details con-
6317 cerning time values.
6318
6335
6336 12.1. addresses-transferred
6337
6338 A 32 bit unsigned long in network byte order. Reports the number of
6339 addresses transferred by the primary to the secondary server
6340 (addresses to be used for the secondary server's private address
6341 pool).
6342
6343 Code Len Number of Addresses
6344 +-----+-----+-----+-----+----+-----+-----+-----+
6345 | 0 | 1 | 0 | 4 | n1 | n2 | n3 | n4 |
6346 +-----+-----+-----+-----+----+-----+-----+-----+
6347
6348
6349 12.2. assigned-IP-address
6350
6351 The DHCP managed IP address to which this message refers.
6352
6353 Code Len Address
6354 +-----+-----+-----+-----+----+-----+-----+-----+
6355 | 0 | 2 | 0 | 4 | a1 | a2 | a3 | a4 |
6356 +-----+-----+-----+-----+----+-----+-----+-----+
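
The option diagrams in this section all share one layout: a two-octet
option code, a two-octet length, and then the option data, all in
network byte order. A minimal sketch of encoding and decoding that
layout follows; the function names are hypothetical and the address
192.0.2.7 is only an example value.

```python
import ipaddress
import struct
from typing import List, Tuple


def encode_option(code: int, data: bytes) -> bytes:
    """Prefix option data with its 16-bit code and 16-bit length."""
    return struct.pack("!HH", code, len(data)) + data


def decode_options(buf: bytes) -> List[Tuple[int, bytes]]:
    """Split a buffer of back-to-back options into (code, data) pairs."""
    options = []
    offset = 0
    while offset < len(buf):
        if offset + 4 > len(buf):
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", buf, offset)
        offset += 4
        if offset + length > len(buf):
            raise ValueError("truncated option data")
        options.append((code, buf[offset:offset + length]))
        offset += length
    return options


# addresses-transferred (code 1): a 32-bit count of addresses.
addresses_transferred = encode_option(1, struct.pack("!I", 10))

# assigned-IP-address (code 2): the four octets of an IPv4 address.
assigned_ip = encode_option(2, ipaddress.IPv4Address("192.0.2.7").packed)
```

Decoding the two example options back to back returns them in order,
each with its four octets of data.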
6357
6358
6359 12.3. binding-status
6360
6361 This option is used to convey the current state of a binding.
6362
6363 Code Len Type
6364 +-----+-----+-----+-----+-----+
6365 | 0 | 3 | 0 | 1 | 1-7 |
6366 +-----+-----+-----+-----+-----+
6367
6368 Legal values for this option are:
6369
6370 Value Binding Status
6371 ----- ------------------------------------------------
6372 1 FREE Lease is currently available to the primary
6373 2 ACTIVE Lease is assigned to a client
6374 3 EXPIRED Lease has expired
6375 4 RELEASED Lease has been released by client
6376 5 ABANDONED A server, or client flagged address as unusable
6377 6 RESET Lease was freed by some external agent
6378 7 BACKUP Lease belongs to secondary's private address pool
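
The legal values above map naturally onto an enumeration. The sketch
below is illustrative; the names BindingStatus, encode_binding_status
and decode_binding_status are hypothetical.

```python
import struct
from enum import IntEnum


class BindingStatus(IntEnum):
    FREE = 1
    ACTIVE = 2
    EXPIRED = 3
    RELEASED = 4
    ABANDONED = 5
    RESET = 6
    BACKUP = 7


def encode_binding_status(status: BindingStatus) -> bytes:
    """Encode option code 3 with a one-octet status value."""
    return struct.pack("!HHB", 3, 1, status)


def decode_binding_status(data: bytes) -> BindingStatus:
    """Decode the one-octet option data; reject illegal values."""
    if len(data) != 1:
        raise ValueError("binding-status data must be one octet")
    # BindingStatus() raises ValueError for values outside 1-7.
    return BindingStatus(data[0])
```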
6379
6391
6392 12.4. client-identifier
6393
This is the client-identifier for the client associated with a
binding. The client-identifier data is subject to the same
conventions as DHCP option 61 [RFC 2132].
6397
6398 Code Len Client Identifier
6399 +-----+-----+-----+-----+----+-----+---
6400 | 0 | 4 | 0 | n | i1 | i2 | ...
6401 +-----+-----+-----+-----+----+-----+--
6402
6403
6404 12.5. client-hardware-address
6405
This is the hardware address for the client associated with a
binding. Byte t1 (type) MUST be set to the proper ARP hardware
address code, as defined in the ARP section of RFC 1700 (it MUST NOT
be zero).
6410
6411 Code Len htype chaddr
6412 +-----+-----+-----+-----+----+-----+-----+---
6413 | 0 | 5 | 0 | n | t1 | c1 | c2 | ...
6414 +-----+-----+-----+-----+----+-----+-----+---
6415
6416
6417 12.6. client-last-transaction-time
6418
6419 The time at which this server last received a DHCP request from a
6420 particular client expressed as an absolute time (see section 6.2).
6421
6422
6423 Code Len client last transaction time
6424 +-----+-----+-----+-----+----+-----+-----+-----+
6425 | 0 | 6 | 0 | 4 | t1 | t2 | t3 | t4 |
6426 +-----+-----+-----+-----+----+-----+-----+-----+
6427
6447
6448 12.7. client-reply-options
6449
6450 This option contains options from a DHCP server's reply to a DHCP
6451 client request. It is sent in a BNDUPD message. The first 4 bytes
6452 of the option contain the "magic number" of the option area from
6453 which the DHCP reply options were taken and serves to define the
6454 format of the rest of the sub-options contained in this option.
6455 After the magic number, the options included are in the normal
6456 options format appropriate for that magic number.
6457
6458 A server SHOULD NOT include all of the options in a DHCP server's
6459 reply to a client's request in this option, but rather a server
6460 SHOULD include only those options which are of likely interest to its
6461 partner server. See section 7.1 for details.
6462
6463 Code Len Magic Number Embedded options
6464 +-----+-----+-----+-----+----+----+----+----+----+----+--
6465 | 0 | 7 | 0 | n | m1 | m2 | m3 | m4 | b1 | b2 | ...
6466 +-----+-----+-----+-----+----+----+----+----+----+----+--
6467
6468
6469 12.8. client-request-options
6470
6471 This option contains options from a DHCP client's request. It is
6472 sent in a BNDUPD message. The first 4 bytes of the option contain
6473 the "magic number" of the option area from which the DHCP client's
6474 request options were taken and serves to define the format of the
6475 rest of the sub-options contained in this option. After the magic
6476 number, the options included are in the normal options format
6477 appropriate for that magic number.
6478
6479 A server SHOULD NOT include all of the options in a DHCP client
6480 request in this option, but rather a server SHOULD include only those
6481 options which are of likely interest to its partner server. See
6482 section 7.1 for details.
6483
6484 Code Len Magic Number Embedded options
6485 +-----+-----+-----+-----+----+----+----+----+----+----+--
6486 | 0 | 8 | 0 | n | m1 | m2 | m3 | m4 | b1 | b2 | ...
6487 +-----+-----+-----+-----+----+----+----+----+----+----+--
6488
6503
6504 12.9. DDNS
6505
6506 If an implementation supports Dynamic DNS updates, this option is
6507 used to communicate the status of the DDNS update associated with a
6508 particular lease binding. The Flags field conveys the types of DNS
6509 RRs that are to be updated by the DHCP server, and the status of the
6510 DDNS update. The Domain Name field conveys the DNS FQDN that the
6511 DHCP server is using to refer to the client, in DNS encoding as
6512 specified in [RFC 1035].
6513
6514 Code Len Flags Domain Name
6515 +-----+-----+-----+-----+-----+------+------+-----+------
6516 | 0 | 9 | 0 | n | flags | d1 | d2 | ...
6517 +-----+-----+-----+-----+-----+------+------+-----+------
6518
6519 The Flags field is a 16-bit field; several bit positions are
6520 specified here.
6521
6522 1 1 1 1 1 1
6523 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
6524 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6525 |C|A|D|P| MBZ |
6526 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6527
6528 The bits (numbered from the least-significant bit in network
6529 byte-order) are used as follows:
6530
6531 0 (C): name to address (such as A RR) update successfully completed
6532 1 (A): Server is controlling A RR on behalf of the client
6533 2 (D): address to name (such as PTR RR) update successfully completed (Done)
6534 3 (P): Server is controlling PTR RR on behalf of the client
6535 4-15 : Must be zero
6536
6537 All of the unspecified bit positions SHOULD be set to 0 by servers
6538 sending the Failover-DDNS option, and they MUST be ignored by servers
6539 receiving the option.
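
A receiver can separate the Flags field from the Domain Name and
ignore the unspecified bits as sketched below. The sketch takes the
text above at face value and treats bit 0 as the least-significant
bit of the 16-bit field; an implementation that numbers bits from the
other end would need different masks. The names DdnsFlags and
parse_ddns_data are hypothetical.

```python
import struct
from enum import IntFlag
from typing import Tuple


class DdnsFlags(IntFlag):
    C = 1 << 0  # name to address update completed
    A = 1 << 1  # server controls the A RR
    D = 1 << 2  # address to name update completed
    P = 1 << 3  # server controls the PTR RR


KNOWN_BITS = 0x000F


def parse_ddns_data(data: bytes) -> Tuple[DdnsFlags, bytes]:
    """Split DDNS option data into flags and the encoded domain name."""
    if len(data) < 2:
        raise ValueError("DDNS option data too short")
    (raw,) = struct.unpack_from("!H", data, 0)
    # Unspecified bit positions are ignored on receipt.
    flags = DdnsFlags(raw & KNOWN_BITS)
    return flags, data[2:]
```

The domain name is returned still in DNS encoding; decoding it per
[RFC 1035] is left out of this sketch.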
6540
6559
6560 12.10. delayed-service-parameter
6561
6562 The delayed-service-parameter is an optional load balancing tuning
6563 parameter, defined in [RFC 3074]. If it is used, it MUST be sent in
6564 the same message as the hash-bucket-assignment option (see section
6565 12.11).
6566
6567 Format :
6568
6569
6570 Code Len Seconds
6571 +-----+-----+-----+-----+----+
6572 | 0 | 10 | 0 | 1 | S |
6573 +-----+-----+-----+-----+----+
6574
6575 S is a one byte value, 1..255.
6576
6577
6578 12.11. hash-bucket-assignment
6579
6580 A set of load balancing hash values for the secondary server. A one
6581 bit in the hash buckets indicates that the secondary is to service
6582 that set of clients. See section 5.3 for more information on how
6583 this option is used. This option is only sent from the primary to
6584 the secondary.
6585
6586 The format and usage of the data in this option is defined in [RFC
6587 3074].
6588
6589 Code Len Hash Buckets
6590 +-----+-----+-----+-----+-----+-----+-----+-----+
6591 | 0 | 11 | 0 | 32 | b1 | b2 | ... | b32 |
6592 +-----+-----+-----+-----+-----+-----+-----+-----+
6593
6615
6616 12.12. IP-flags
6617
6618 This option is used to convey the current flags of the assigned-IP-
6619 address option preceding it.
6620
6621 Code Len IP Flags
6622 +-----+-----+-----+-----+-----+-----+
6623 | 0 | 12 | 0 | 1 | f1 | f2 |
6624 +-----+-----+-----+-----+-----+-----+
6625
6626 The IP-flags field is a 16-bit field; two bit positions are
6627 specified here.
6628
6629 1 1 1 1 1 1
6630 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
6631 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6632 |R|B| MBZ |
6633 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6634
6635 The bits (numbered from the least-significant bit in network
6636 byte-order) are used as follows:
6637
6638 0 (R): RESERVED (this bit allocated and in use and named "RESERVED")
6639 Bit 0 MUST be set to 1 whenever the IP address in the preceding
6640 assigned-IP-address option is reserved on the server sending the
6641 packet.
6642 1 (B): BOOTP
Bit 1 MUST be set to 1 whenever the IP address in the preceding
assigned-IP-address option is an IP address which has been
allocated due to an interaction with a BOOTP client (as opposed
to a DHCP client).
6647 2-15 : Must be zero
6648
6671
6672 12.13. lease-expiration-time
6673
6674 The lease expiration time is the lease interval that a DHCP server
6675 has ACKed to a DHCP client added to the time at which that ACK was
6676 transmitted -- expressed as an absolute time (see section 6.2).
6677
6678
6679 Code Len Time
6680 +-----+-----+-----+-----+----+-----+-----+-----+
6681 | 0 | 13 | 0 | 4 | t1 | t2 | t3 | t4 |
6682 +-----+-----+-----+-----+----+-----+-----+-----+
6683
6684
6685 12.14. max-unacked-bndupd
6686
The maximum number of BNDUPD messages that this server is prepared to
accept over the TCP connection without causing the TCP connection to
block. A 32 bit unsigned integer value, in network byte order.
6690
6691
6692 Code Len Maximum Unacked BNDUPD
6693 +-----+-----+-----+-----+----+-----+-----+-----+
6694 | 0 | 14 | 0 | 4 | n1 | n2 | n3 | n4 |
6695 +-----+-----+-----+-----+----+-----+-----+-----+
6696
6697
6698 12.15. MCLT
6699
6700 Maximum Client Lead Time, an interval, in seconds. A 32 bit unsigned
6701 integer value, in network byte order.
6702
6703 Code Len Time
6704 +-----+-----+-----+-----+----+-----+-----+-----+
6705 | 0 | 15 | 0 | 4 | t1 | t2 | t3 | t4 |
6706 +-----+-----+-----+-----+----+-----+-----+-----+
6707
6727
6728 12.16. message
6729
6730 This option is used to supply a human readable message text. It may
6731 be used in association with the Reject Reason Code to provide a human
6732 readable error message for the reject.
6733
6734
6735 Code Len Text
6736 +-----+-----+-----+-----+------+-----+--
6737 | 0 | 16 | 0 | n | c1 | c2 | ...
6738 +-----+-----+-----+-----+------+-----+--
6739
6740
6741 12.17. message-digest
6742
6743 The message digest for this message.
6744
6745 This option consists of a variable number of bytes which contain the
6746 message digest of the message prior to the inclusion of this option.
6747
6748 When this option appears in a message, it MUST appear as the first
6749 option in the message. It MUST appear in every message if message
6750 digests are required. The Type MUST be configurable (once additional
6751 types are defined). When additional types are defined, they MUST be
6752 specified as either optional (MAY be supported) or required (MUST be
6753 supported). See the section on IANA considerations for more details.
6754
6755 Code Len Type Message Digest
6756 +-----+-----+-----+-----+-----+-----+-----+--
6757 | 0 | 17 | 0 | n | t | d1 | d2 | ...
6758 +-----+-----+-----+-----+-----+-----+-----+--
6759
6760
6761 Type: 0 Not Allowed
6762 1 HMAC-MD5
6763 2-255 Not Allowed
6764
6783
6784 12.18. potential-expiration-time
6785
6786 The potential expiration time is the time that one server tells
6787 another server that it may wish to grant in a lease to a DHCP client.
6788 It is an absolute time. See section 6.2.
6789
6790
6791 Code Len Time
6792 +-----+-----+-----+-----+----+-----+-----+-----+
6793 | 0 | 18 | 0 | 4 | t1 | t2 | t3 | t4 |
6794 +-----+-----+-----+-----+----+-----+-----+-----+
6795
6796
6797 12.19. receive-timer
6798
6799 The number of seconds (an interval) within which the server must
6800 receive a message from its partner, or it will assume that
6801 communications with the partner is not ok. An unsigned 32 bit
6802 integer in network byte order.
6803
6804 Code Len Receive Timer
6805 +-----+-----+-----+-----+----+-----+-----+-----+
6806 | 0 | 19 | 0 | 4 | s1 | s2 | s3 | s4 |
6807 +-----+-----+-----+-----+----+-----+-----+-----+
6808
6809
6810 12.20. protocol-version
6811
6812 The protocol version being used by the server. It is only sent in the
6813 CONNECT and CONNECTACK messages. The current value for the version
6814 is 1.
6815
6816     Code         Len    Version
6817 +-----+-----+-----+-----+-----+
6818 |  0  | 20  |  0  |  1  |  1  |
6819 +-----+-----+-----+-----+-----+
6820
6821
6840 12.21. reject-reason
6841
6842 This option is used to selectively reject binding updates. It MAY be
6843 used in a BNDACK message or a CONNECTACK message, always associated
6844 with an assigned-IP-address option, which contains the IP address of
6845 the update being rejected.
6846
6847     Code         Len    Reason Code
6848 +-----+-----+-----+-----+-----+
6849 |  0  | 21  |  0  |  1  | R1  |
6850 +-----+-----+-----+-----+-----+
6851
6852 Reason codes (section where referenced in parentheses):
6853
6854 0 Reserved
6855 1 Illegal IP address (not part of any address pool). (7.1.3)
6856 2 Fatal conflict exists: address in use by other client. (7.1.3)
6857 3 Missing binding information. (7.1.3)
6858 4 Connection rejected, time mismatch too great. (7.8.2)
6859 5 Connection rejected, invalid MCLT. (7.8.2)
6860 6 Connection rejected, unknown reason. (not specifically referenced)
6861 7 Connection rejected, duplicate connection. (unused)
6862 8 Connection rejected, invalid failover partner. (7.8.2)
6863 9 TLS not supported. (7.8.2)
6864 10 TLS supported but not configured. (7.8.2)
6865 11 TLS required but not supported by partner. (7.8.2)
6866 12 Message digest not supported. (11.1)
6867 13 Message digest not configured. (11.1)
6868 14 Protocol version mismatch. (7.8.2)
6869 15 Outdated binding information. (7.1.3)
6870 16 Less critical binding information. (7.1.3)
6871 17 No traffic within sufficient time. (8.6)
6872 18 Hash bucket assignment conflict. (7.8.2)
6873 19 IP not reserved on this server. (7.1.3)
6874 20 Message digest failed to compare. (7.8.2)
6875 21 Missing message digest. (7.1.3)
6876 22-253 Reserved.
6877 254 Unknown: Error occurred but does not match any reason code.
6878 255 Reserved for code expansion.
6879
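The reason codes listed above map naturally onto an enumeration. In the
following sketch the member names are hypothetical shorthand for the
descriptions in the list; only the numeric values come from this
section. Returning the raw number for reserved codes is a design choice
of the sketch.

```python
import enum
import struct

REJECT_REASON = 21  # option code from the diagram above


class RejectReason(enum.IntEnum):
    ILLEGAL_IP_ADDRESS = 1
    FATAL_CONFLICT = 2
    MISSING_BINDING_INFO = 3
    TIME_MISMATCH = 4
    INVALID_MCLT = 5
    UNKNOWN_REASON = 6
    DUPLICATE_CONNECTION = 7
    INVALID_FAILOVER_PARTNER = 8
    TLS_NOT_SUPPORTED = 9
    TLS_NOT_CONFIGURED = 10
    TLS_REQUIRED = 11
    DIGEST_NOT_SUPPORTED = 12
    DIGEST_NOT_CONFIGURED = 13
    VERSION_MISMATCH = 14
    OUTDATED_BINDING_INFO = 15
    LESS_CRITICAL_BINDING_INFO = 16
    NO_TRAFFIC = 17
    HASH_BUCKET_CONFLICT = 18
    IP_NOT_RESERVED = 19
    DIGEST_FAILED = 20
    MISSING_DIGEST = 21
    UNKNOWN_ERROR = 254


def decode_reject_reason(option: bytes):
    """Return a RejectReason, or the raw integer for a reserved code."""
    code, length = struct.unpack("!HH", option[:4])
    if code != REJECT_REASON or length != 1 or len(option) < 5:
        raise ValueError("not a reject-reason option")
    value = option[4]
    try:
        return RejectReason(value)
    except ValueError:
        return value  # reserved code (0, 22-253, 255): keep the raw number
```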
6880
6896 12.22. relationship-name
6897
6898 A string which is a unique identifier for the failover relationship.
6899
6900     Code         Len     Relationship Name
6901 +-----+-----+-----+-----+----+-----+---
6902 |  0  | 22  |  0  |  n  | c1 | c2  | ...
6903 +-----+-----+-----+-----+----+-----+---
6904
6905
6906 12.23. server-flags
6907
6908 This option is used to convey the current flags of the failover
6909 endpoint in the sending server.
6910
6911     Code         Len   Server Flags
6912 +-----+-----+-----+-----+-------+
6913 |  0  | 23  |  0  |  1  | flags |
6914 +-----+-----+-----+-----+-------+
6915
6916 The flags field is an 8-bit field; one bit position is
6917 specified here.
6918
6919
6920  0 1 2 3 4 5 6 7
6921 +-+-+-+-+-+-+-+-+
6922 |S|     MBZ     |
6923 +-+-+-+-+-+-+-+-+
6924
6925 The bits (numbered from the least-significant bit in network
6926 byte-order) are used as follows:
6927
6928 0 (S): STARTUP,
6929 Bit 0 MUST be set to 1 whenever the server is in STARTUP state,
6930 and set to 0 otherwise. (Note that when in STARTUP state, the
6931 state transmitted in the server-state option is usually the last
6932 recorded state from stable storage, but see section 9.3 for
6933 details.)
6934 1-7 : Must be zero
6935
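A small sketch of handling the flags byte. It follows the text in
numbering bits from the least-significant bit, so the STARTUP bit is
taken to be mask 0x01; since the diagram draws bit 0 on the left, that
reading is an assumption. The function names are hypothetical, and
rejecting a byte with any of bits 1-7 set is one possible receiver
policy rather than something this section requires.

```python
import struct

SERVER_FLAGS = 23    # option code from the diagram above
FLAG_STARTUP = 0x01  # bit 0, counted from the least-significant bit
MBZ_MASK = 0xFE      # bits 1-7 must be zero


def encode_server_flags(in_startup: bool) -> bytes:
    """Encode a server-flags option with only the STARTUP bit in use."""
    flags = FLAG_STARTUP if in_startup else 0
    return struct.pack("!HHB", SERVER_FLAGS, 1, flags)


def decode_server_flags(option: bytes) -> bool:
    """Return True if the sender reports that it is in STARTUP state."""
    code, length, flags = struct.unpack("!HHB", option[:5])
    if code != SERVER_FLAGS or length != 1:
        raise ValueError("not a server-flags option")
    if flags & MBZ_MASK:
        raise ValueError("bits 1-7 must be zero")
    return bool(flags & FLAG_STARTUP)
```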
6936
6952 12.24. server-state
6953
6954 This option is used to convey the current state of the failover
6955 endpoint in the sending server.
6956
6957     Code         Len   Server State
6958 +-----+-----+-----+-----+-----+
6959 |  0  | 24  |  0  |  1  | 1-11|
6960 +-----+-----+-----+-----+-----+
6961
6962 Legal values for this option are:
6963
6964 Value  Server State
6965 -----  -------------------------------------------------------------
6966   0    reserved
6967   1    STARTUP                     Startup state (1)
6968   2    NORMAL                      Normal state
6969   3    COMMUNICATIONS-INTERRUPTED  Communication interrupted (safe)
6970   4    PARTNER-DOWN                Partner down (unsafe mode)
6971   5    POTENTIAL-CONFLICT          Synchronizing
6972   6    RECOVER                     Recovering bindings from partner
6973   7    PAUSED                      Shutting down for a short period.
6974   8    SHUTDOWN                    Shutting down for an extended period.
6976   9    RECOVER-DONE                Interlock state prior to NORMAL
6977  10    RESOLUTION-INTERRUPTED      Comm. failed during resolution
6978  11    CONFLICT-DONE               Primary has resolved its conflicts
6979
6980 (1) The STARTUP state is never sent to the partner server; it is
6981 indicated by the STARTUP bit in the server-flags option (see section
6982 12.23).
6983
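The table of legal values can be sketched as an enumeration, with the
encoder refusing to send STARTUP as note (1) describes. The Python
names are hypothetical renderings of the state names in the table.

```python
import enum
import struct

SERVER_STATE = 24  # option code from the diagram above


class ServerState(enum.IntEnum):
    STARTUP = 1
    NORMAL = 2
    COMMUNICATIONS_INTERRUPTED = 3
    PARTNER_DOWN = 4
    POTENTIAL_CONFLICT = 5
    RECOVER = 6
    PAUSED = 7
    SHUTDOWN = 8
    RECOVER_DONE = 9
    RESOLUTION_INTERRUPTED = 10
    CONFLICT_DONE = 11


def encode_server_state(state: ServerState) -> bytes:
    """Encode a server-state option for any state except STARTUP."""
    if state == ServerState.STARTUP:
        # Per note (1), STARTUP is signalled through server-flags instead.
        raise ValueError("STARTUP is never sent in server-state")
    return struct.pack("!HHB", SERVER_STATE, 1, state)


def decode_server_state(option: bytes) -> ServerState:
    """Decode a server-state option; unknown values raise ValueError."""
    code, length, value = struct.unpack("!HHB", option[:5])
    if code != SERVER_STATE or length != 1:
        raise ValueError("not a server-state option")
    return ServerState(value)
```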
6984
6985 12.25. start-time-of-state
6986
6987 This option is used for different states in different messages. In a
6988 BNDUPD message it represents the start time of the state of the lease
6989 in the BNDUPD message. In a STATE message, it represents the start
6990 time of the partner server's failover state. In all cases it is an
6991 absolute time.
6992
6993
6994     Code         Len      Start Time of State
6995 +-----+-----+-----+-----+----+-----+-----+-----+
6996 |  0  | 25  |  0  |  4  | t1 | t2  | t3  | t4  |
6997 +-----+-----+-----+-----+----+-----+-----+-----+
6998
6999
7008 12.26. TLS-reply
7009
7010 This option contains information relating to TLS security
7011 negotiation. It is sent in a CONNECTACK message.
7012
7013 A t1 value of 0 indicates no TLS operation; a value of 1 indicates
7014 that TLS operation is required.
7015
7016     Code         Len      TLS
7017 +-----+-----+-----+-----+-----+
7018 |  0  | 26  |  0  |  1  | t1  |
7019 +-----+-----+-----+-----+-----+
7020
7021
7022 12.27. TLS-request
7023
7024 This option contains information relating to TLS security
7025 negotiation. It is sent in a CONNECT message.
7026
7027 The t1 byte is the TLS request from the primary server. A value of 0
7028 indicates no TLS operation (to communicate, the secondary server MUST
7029 NOT require TLS). A value of 1 indicates that TLS operation is
7030 desired but not required (to communicate, the secondary server MAY
7031 utilize TLS). A value of 2 indicates that TLS operation is required
7032 to establish communications with the primary server (to communicate,
7033 the secondary server MUST utilize TLS).
7034
7035     Code         Len      TLS
7036 +-----+-----+-----+-----+-----+
7037 |  0  | 27  |  0  |  1  | t1  |
7038 +-----+-----+-----+-----+-----+
7039
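One way to picture how the TLS-request and TLS-reply values relate is
the sketch below, written from the secondary server's side. The single
`tls_available` flag and the function name are hypothetical. The sketch
does not model a secondary whose own policy requires TLS, and it leaves
open how an unsatisfiable request is reported, since that is specified
elsewhere in the document.

```python
from typing import Optional

TLS_REQUEST_NONE = 0      # secondary MUST NOT require TLS
TLS_REQUEST_DESIRED = 1   # secondary MAY utilize TLS
TLS_REQUEST_REQUIRED = 2  # secondary MUST utilize TLS

TLS_REPLY_NONE = 0        # no TLS operation
TLS_REPLY_REQUIRED = 1    # TLS operation is required


def choose_tls_reply(request: int, tls_available: bool) -> Optional[int]:
    """Pick a TLS-reply value for a received TLS-request value.

    Returns None when the request cannot be satisfied; reporting that
    case is outside this sketch.
    """
    if request == TLS_REQUEST_NONE:
        return TLS_REPLY_NONE
    if request == TLS_REQUEST_DESIRED:
        # TLS is optional here, so use it only if it is available.
        return TLS_REPLY_REQUIRED if tls_available else TLS_REPLY_NONE
    if request == TLS_REQUEST_REQUIRED:
        return TLS_REPLY_REQUIRED if tls_available else None
    raise ValueError("unknown TLS-request value")
```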
7040
7041 12.28. vendor-class-identifier
7042
7043 A string which identifies the vendor of the failover protocol
7044 implementation.
7045
7046     Code         Len     vendor class string
7047 +-----+-----+-----+-----+----+-----+---
7048 |  0  | 28  |  0  |  n  | c1 | c2  | ...
7049 +-----+-----+-----+-----+----+-----+---
7050
7051
7064 12.29. vendor-specific-options
7065
7066 This option is used to convey options specific to a particular
7067 vendor's implementation. The vendor class identifier is used to
7068 specify which option space the embedded options are drawn from.
7069 Every message that uses vendor specific options MUST have a vendor-
7070 class-identifier option in it.
7071
7072 It functions similarly to the vendor class identifier and vendor
7073 specific options in the DHCP protocol.
7074
7075 This option contains other options in the same two-byte code,
7076 two-byte length format. If this option appears in a message without
7077 a corresponding vendor class identifier, it MUST be ignored.
7078
7079     Code         Len     Embedded options
7080 +-----+-----+-----+-----+----+-----+---
7081 |  0  | 29  |  0  |  n  | c1 | c2  | ...
7082 +-----+-----+-----+-----+----+-----+---
7083
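Because the embedded options reuse the two-byte code, two-byte length
format, one parser can handle both levels. The sketch below is
illustrative: the function names are hypothetical, it takes the option
portion of a message as a plain byte string, and it keeps only the last
occurrence if a top-level option code is repeated.

```python
import struct
from typing import List, Optional, Tuple

VENDOR_CLASS_IDENTIFIER = 28
VENDOR_SPECIFIC_OPTIONS = 29

Option = Tuple[int, bytes]


def parse_options(data: bytes) -> List[Option]:
    """Split a byte string into (code, value) pairs."""
    options = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if len(data) - offset < length:
            raise ValueError("truncated option value")
        options.append((code, data[offset:offset + length]))
        offset += length
    return options


def vendor_options(message_options: bytes) -> Optional[Tuple[bytes, List[Option]]]:
    """Return (vendor class, embedded options), or None if there are none
    or if they must be ignored."""
    top = dict(parse_options(message_options))
    if VENDOR_SPECIFIC_OPTIONS not in top:
        return None
    if VENDOR_CLASS_IDENTIFIER not in top:
        return None  # no vendor class identifier: the option MUST be ignored
    embedded = parse_options(top[VENDOR_SPECIFIC_OPTIONS])
    return top[VENDOR_CLASS_IDENTIFIER], embedded
```

The embedded codes are only meaningful within the option space named by
the vendor class identifier, which is why the two values are returned
together.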
7084
7085
7086
7087 13. IANA Considerations
7088
7089 This document defines several number spaces (failover options,
7090 failover message types, message digest types, and failover reject
7091 reason codes). For all of these number spaces, certain values are
7092 defined in this specification. New values may only be defined by IETF
7093 Consensus, as described in [RFC 2434]. Basically, this means that they
7094 are defined by RFCs approved by the IESG.
7095
7096
7097 14. Acknowledgments
7098
7099 Ralph Droms started it all, by sketching out an initial interserver
7100 draft that embodied ideas from several past IETF meetings. In that
7101 draft, he acknowledged contributions by Jeff Mogul, Greg Minshall,
7102 Rob Stevens, Walt Wimer, Ted Lemon, and the DHC working group.
7103
7104 Kim Kinnear and Bob Cole each extended that draft, separately and
7105 then together, until they created an interserver draft that supported
7106 any number of servers. The complexity of that approach was just too
7107 great, and that draft wasn't greeted with enthusiasm by many,
7108 including its authors.
7109
7110 It did however lead to a much simpler approach embodied in the first
7119 Failover draft by Greg Rabil, Mike Dooley, Arun Kapur and Ralph
7120 Droms. This draft posited only two servers -- a primary and a
7121 secondary.
7122
7123 Kim Kinnear then wrote the Safe Failover draft to layer on top of the
7124 Failover Draft and increase its robustness in the face of certain
7125 rare network failures.
7126
7127 At the spring 1998 IETF meeting in LA, the DHC working group said
7128 that they wanted a merged Failover and Safe Failover draft. Steve
7129 Gonczi and Bernie Volz stepped up and produced the raw material for
7130 such a merged draft, along with a new message format designed around
7131 DHCP options and other extensions and clarifications. Kim Kinnear
7132 edited their work into draft format and made other changes in time
7133 for the Summer Chicago IETF meeting.
7134
7135 Many people have reviewed the various earlier drafts that went into
7136 this result. At American Internet, ideas were contributed by Brad
7137 Parker. At Cisco Systems, Paul Fox and Ellen Garvey contributed to
7138 the design of the protocol.
7139
7140 During the summer and fall of 1998, two groups worked on separate
7141 implementations of the UDP failover draft. Bernie Volz and Steve
7142 Gonczi constituted one group, and Kim Kinnear, Mark Stapp and Paul
7143 Fox made up the other. These two groups worked together to produce
7144 considerable changes and simplifications of the protocol during that
7145 period, and Steve Gonczi and Kim Kinnear edited those changes into the
7146 -03 draft in time for submission to the December 1998 Orlando IETF
7147 meeting.
7148
7149 In February of 1999 Kim Kinnear and Mark Stapp hosted a meeting of
7150 people interested in the failover draft. During that meeting a
7151 general agreement was reached to recast the failover protocol to use
7152 TCP instead of UDP. In addition, the group together brainstormed a
7153 workable load-balancing technique. Kim Kinnear rewrote the entire draft
7154 to include the changes made at that meeting as well as to restructure
7155 the draft along guidelines suggested by Thomas Narten. The result
7156 was the -04 draft, submitted prior to the Oslo IETF meeting.
7157
7158 The initial idea for a hash-based load balancing approach was offered
7159 by Ted Lemon, and the determination of an algorithm and its
7160 integration into the draft was done by Steve Gonczi. The security section
7161 was spearheaded by Bernie Volz. Both contributed considerably to the
7162 ideas and text in the rest of the draft with several reviews.
7163
7164 In early October of 1999, three conference calls were held to discuss
7165 the -04 draft. The -05 draft includes changes as a result of those
7166 calls, perhaps the largest of which was to move the load balancing
7175 approach into a separate draft. Thanks to all of the many people
7176 who participated in the conference calls. Changes were made because
7177 of contributions by: Ted Lemon, David Erdmann, Richard Jones, Rob
7178 Stevens, Thomas Narten, Diana Lane, and Andre Kostur.
7179
7180 Another conference call was held in mid-January of 2000, and the -06
7181 draft was produced to tighten up the -05 draft both technically
7182 and editorially.
7183
7184 The -07 draft was edited by Kim Kinnear and was based in part on
7185 reviews by Richard Jones, Bernie Volz, and Steve Gonczi. It embodies
7186 several technical updates as well as numerous editorial revisions
7187 that enhanced both correctness and clarity.
7188
7189 The -08 draft was edited by Kim Kinnear and was based on the results
7190 of two conference calls held in October and November of 2000. It
7191 includes the correct second port number, a new state to synchronize
7192 conflict resolution with load balancing, a generally accepted
7193 approach to secondary pool allocation, and many other updates based
7194 on both operational and implementation experience.
7195
7196 The -09 draft was edited by Kim Kinnear based on discussions held at
7197 the Minneapolis IETF in December of 2000, as well as issues raised by
7198 Ted Lemon based on implementation and deployment. The specific
7199 changes were mailed to the dhcp-v4 list.
7200
7201 The -10 draft differed from the -09 draft in that figure 9.8.3-1 was
7202 correctly relabeled figure 9.10.3-1, and it was updated to include
7203 the CONFLICT-DONE message. One of the authors' affiliations was also
7204 updated.
7205
7206 This, the -11 draft, differs only slightly from the -10 draft in
7207 correcting another author affiliation.
7208
7209 These most recent changes have not been widely circulated among the
7210 other authors prior to submission to the IETF.
7211
7212 Glenn Waters of Nortel Networks contributed ideas and enthusiasm to
7213 make a Failover protocol that was both "safe" and "lazy".
7214
7215
7216 15. References
7217
7218
7219 [DHCID] Stapp, M., Lemon, T., Gustafsson, A., "draft-ietf-dnsext-
7220 dhcid-rr-02.txt", March, 2001.
7221
7222 [DNSRES] Stapp, M., "draft-ietf-dhc-dns-resolution-01.txt", March,
7231 2001.
7232
7233 [FQDN] Rekhter, Y., Stapp, M., "draft-ietf-dhc-fqdn-option-01.txt",
7234 March, 2001.
7235
7236 [RFC 1035] Mockapetris, P., "Domain Names - Implementation and
7237 Specification", November, 1987.
7238
7239 [RFC 1534] Droms, R., "Interoperation between DHCP and BOOTP", RFC
7240 1534, October 1993.
7241
7242 [RFC 2104] Krawczyk, H., Bellare, M., and Canetti, R., "HMAC: Keyed
7243 Hashing for Message Authentication", RFC 2104, IBM T.J. Watson
7244 Research Center, University of California at San Diego, February
7245 1997.
7246
7247 [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
7248 Requirement Levels", RFC 2119, March 1997.
7249
7250 [RFC 2131] Droms, R., "Dynamic Host Configuration Protocol", RFC
7251 2131, March 1997.
7252
7253 [RFC 2132] Alexander, S., Droms, R., "DHCP Options and BOOTP Vendor
7254 Extensions", Internet RFC 2132, March 1997.
7255
7256 [RFC 2136] P. Vixie, S. Thomson, Y. Rekhter, J. Bound, "Dynamic
7257 Updates in the Domain Name System (DNS UPDATE)", RFC 2136, April
7258 1997.
7259
7260 [RFC 2139] Rigney, C., "RADIUS Accounting", RFC 2139, Livingston
7261 Enterprises, April 1997.
7262
7263 [RFC 2246] Dierks, T., "The TLS Protocol, Version 1.0", RFC 2246,
7264 January 1999.
7265
7266 [RFC 2401] Kent, S., Atkinson, R., "Security Architecture for the
7267 Internet Protocol", RFC 2401, November 1998.
7268
7269 [RFC 2434] Alvestrand, H. and T. Narten, "Guidelines for Writing an
7270 IANA Considerations Section in RFCs", BCP 26, RFC 2434, October
7271 1998.
7272
7273 [RFC 2487] Hoffman, P., "SMTP Service Extension for Secure SMTP over
7274 TLS", RFC 2487, January 1999.
7275
7276 [RFC 2595] Newman, C., "Using TLS with IMAP, POP3, and ACAP", RFC
7277 2595, June 1999.
7278
7287 [RFC 3004] Stump, G., Droms, R., Gu, Y., Vyaghrapuri, R., Demirtjis,
7288 A., Privat, J., "The User Class Option for DHCP", November 2000.
7289
7290 [RFC 3011] Waters, G., "The IPv4 Subnet Selection Option for DHCP",
7291 November 2000.
7292
7293 [RFC 3046] Patrick, M., "DHCP Relay Agent Information Option", RFC
7294 3046, January 2001.
7295
7296 [RFC 3074] Volz, B., Gonczi, S., Lemon, T., Stevens, R., "DHC Load-
7297 balancing Algorithm", February, 2001.
7298
7299 16. Authors' information
7300
7301 Ralph Droms
7302 Kim Kinnear
7303 Mark Stapp
7304 Cisco Systems
7305 250 Apollo Drive
7306 Chelmsford, MA 01824
7307
7308 Phone: (978) 497-0000
7309
7310 EMail: rdroms@cisco.com
7311 kkinnear@cisco.com
7312 mjs@cisco.com
7313
7314
7315
7316 Bernie Volz
7317 Ericsson
7318 959 Concord St.
7319 Framingham, MA 01701
7320
7321 Phone: (508) 875-3162
7322
7323 EMail: bernie.volz@ericsson.com
7324
7325
7326 Steve Gonczi
7327 Relicore, Inc.
7328 One Wall Street
7329 Burlington, MA 01803
7330
7331 Phone: (781) 229-1122
7332
7333 EMail: steve@relicore.com
7334
7335
7343 Greg Rabil
7344 Lucent Technologies
7345 400 Lapp Road
7346 Malvern, PA 19355
7347
7348 Phone: (800) 208-2747
7349
7350 EMail: grabil@lucent.com
7351
7352
7353
7354
7355 Michael Dooley
7356 Diamond IP Technologies
7357 One E Uwchlan Ave, Suite 112
7358 Exton, PA 19341
7359
7360 EMail: mdooley@diamondip.com
7361
7362
7363
7364
7365 Arun Kapur
7366 K5 Networks
7367 2 Toll House Lane
7368 Colts Neck, NJ 07722
7369
7370 Phone: (732) 817-9475
7371
7372 17. Full Copyright Statement
7373
7374 Copyright (C) The Internet Society (2003). All Rights Reserved.
7375
7376 This document and translations of it may be copied and furnished to oth-
7377 ers, and derivative works that comment on or otherwise explain it or
7378 assist in its implementation may be prepared, copied, published and dis-
7379 tributed, in whole or in part, without restriction of any kind, provided
7380 that the above copyright notice and this paragraph are included on all
7381 such copies and derivative works. However, this document itself may not
7382 be modified in any way, such as by removing the copyright notice or
7383 references to the Internet Society or other Internet organizations,
7384 except as needed for the purpose of developing Internet standards in
7385 which case the procedures for copyrights defined in the Internet Stan-
7386 dards process must be followed, or as required to translate it into
7387 languages other than English.
7388
7389 The limited permissions granted above are perpetual and will not be
7390 revoked by the Internet Society or its successors or assigns.
7391
7399 This document and the information contained herein is provided on an "AS
7400 IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
7401 FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
7402 LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
7403 INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FIT-
7404 NESS FOR A PARTICULAR PURPOSE.