``ha``: High Availability Outage Resilience for Kea Servers
===========================================================
-This hook library can be
-loaded on a pair of DHCPv4 or DHCPv6 servers, to increase the reliability
-of the DHCP service in the event of an outage of one of the servers.
-This library was previously only available to ISC's paid subscribers,
-but is now part of the open source Kea, available to all users.
+This hook library can be loaded on a pair of DHCPv4 or DHCPv6 servers, to
+increase the reliability of the DHCP service in the event of an outage of one of
+the servers. This library was previously only available to ISC's paid
+subscribers, but is now part of the open source Kea, available to all users.
.. note::
- This library may only be loaded by the ``kea-dhcp4`` or
- ``kea-dhcp6`` process.
-
-High Availability (HA) of the DHCP service is provided by running
-multiple cooperating server instances. If any of these instances becomes
-unavailable for any reason (DHCP software crash, Control Agent software
-crash, power outage, hardware failure), a surviving server instance can
-continue providing reliable service to clients. Many DHCP server
-implementations include the "DHCP Failover" protocol, whose most
-significant features are communication between the servers, partner
-failure detection, and lease synchronization between the servers.
-However, the DHCPv4 failover standardization process was never completed
-by the IETF. The DHCPv6 failover standard (RFC 8156) was published, but
-it is complex, difficult to use, has significant operational
-constraints, and is different than its v4 counterpart. Although it may
-be useful for some users to use a "standard" failover protocol, it seems
-that most Kea users are simply interested in a working solution which
-guarantees high availability of the DHCP service. Therefore, the Kea HA
-hook library derives major concepts from the DHCP failover protocol but
-uses its own solutions for communication and configuration. It offers
-its own state machine, which greatly simplifies its implementation and
-generally fits better into Kea, and it provides the same features in
-both DHCPv4 and DHCPv6. This document intentionally uses the term "high
-availability" rather than "failover" to emphasize that it is not the
-failover protocol implementation.
-
-The following sections describe the configuration and operation of the
-Kea HA hook library.
+ This library may only be loaded by the ``kea-dhcp4`` or ``kea-dhcp6`` process.
+
+High Availability (HA) of the DHCP service is provided by running multiple
+cooperating server instances. If any of these instances becomes unavailable for
+any reason (DHCP software crash, Control Agent software crash, power outage,
+hardware failure), a surviving server instance can continue providing reliable
+service to clients. Many DHCP server implementations include the "DHCP Failover"
+protocol, whose most significant features are communication between the servers,
+partner failure detection, and lease synchronization between the servers.
+However, the DHCPv4 failover standardization process was never completed by the
+IETF. The DHCPv6 failover standard (RFC 8156) was published, but it is complex,
+difficult to use, has significant operational constraints, and is different than
+its v4 counterpart. Although it may be useful for some users to use a "standard"
+failover protocol, it seems that most Kea users are simply interested in a
+working solution which guarantees high availability of the DHCP service.
+Therefore, the Kea HA hook library derives major concepts from the DHCP failover
+protocol but uses its own solutions for communication and configuration. It
+offers its own state machine, which greatly simplifies its implementation and
+generally fits better into Kea, and it provides the same features in both DHCPv4
+and DHCPv6. This document intentionally uses the term "high availability" rather
+than "failover" to emphasize that it is not the failover protocol implementation.
+
+The following sections describe the configuration and operation of the Kea HA
+hook library.
.. _ha-supported-configurations:
Supported Configurations
~~~~~~~~~~~~~~~~~~~~~~~~
-The Kea HA hook library supports three configurations, also known as HA
-modes: ``load-balancing``, ``hot-standby``, and ``passive-backup``. In the
-load-balancing mode, two servers respond to DHCP requests. The
-load-balancing function is implemented as described in `RFC
-3074 <https://tools.ietf.org/html/rfc3074>`__, with each server
-responding to half the received DHCP queries. When one of the servers
-allocates a lease for a client, it notifies the partner server over the
-control channel (via the RESTful API), so the partner can save the lease
-information in its own database. If the communication with the partner
-is unsuccessful, the DHCP query is dropped and the response is not
-returned to the DHCP client. If the lease update is successful, the
-response is returned to the DHCP client by the server which has
-allocated the lease. By exchanging lease updates, both servers get a
-copy of all leases allocated by the entire HA setup, and either server
-can be switched to handle the entire DHCP traffic if its partner becomes
-unavailable.
-
-In the load-balancing configuration, one of the servers must be
-designated as ``primary`` and the other as ``secondary``. Functionally,
-there is no difference between the two during normal operation. However, this
-distinction is required when the two servers are started at (nearly) the
-same time and have to synchronize their lease databases. The primary
-server synchronizes the database first. The secondary server waits for
-the primary server to complete the lease database synchronization before
-it starts the synchronization.
-
-In the hot-standby configuration, one of the servers is designated
-as "primary" and the other as "standby." However, during normal
-operation, the primary server is the only one that responds to DHCP
-requests. The standby server receives lease updates from the primary
-over the control channel; however, it does not respond to any DHCP
-queries as long as the primary is running or, more accurately,
-until the standby considers the primary to be offline. If the standby
-server detects the failure of the primary, it starts responding to all
+The Kea HA hook library supports three configurations, also known as HA modes:
+``load-balancing``, ``hot-standby``, and ``passive-backup``. In the
+``load-balancing`` mode, two servers respond to DHCP requests. The
+``load-balancing`` function is implemented as described in `RFC
+3074 <https://tools.ietf.org/html/rfc3074>`__, with each server responding to
+half the received DHCP queries. When one of the servers allocates a lease for a
+client, it notifies the partner server over the control channel (via the RESTful
+API), so the partner can save the lease information in its own database. If the
+communication with the partner is unsuccessful, the DHCP query is dropped and
+the response is not returned to the DHCP client. If the lease update is
+successful, the response is returned to the DHCP client by the server which has
+allocated the lease. By exchanging lease updates, both servers get a copy of all
+leases allocated by the entire HA setup, and either server can be switched to
+handle the entire DHCP traffic if its partner becomes unavailable.
+
+In the ``load-balancing`` configuration, one of the servers must be designated
+as ``primary`` and the other as ``secondary``. Functionally, there is no
+difference between the two during normal operation. However, this distinction is
+required when the two servers are started at (nearly) the same time and have to
+synchronize their lease databases. The primary server synchronizes the database
+first. The secondary server waits for the primary server to complete the lease\
+database synchronization before it starts the synchronization.
+
+In the ``hot-standby`` configuration, one of the servers is designated as
+``primary`` and the other as ``standby``. However, during normal operation, the
+primary server is the only one that responds to DHCP requests. The standby
+server receives lease updates from the primary over the control channel; however,
+it does not respond to any DHCP queries as long as the primary is running or,
+more accurately, until the standby considers the primary to be offline. If the
+standby server detects the failure of the primary, it starts responding to all
DHCP queries.
.. note::
mode. The ``load-balancing`` has the benefit of splitting the DHCP load
between two instances, reducing the traffic processed by each of them.
However, it is not always clear to the operators that using the
- ``load-balancing`` mode requires manually splitting the address pools
- between two Kea instances using client classification, to preclude
- both servers from allocating the same address to different clients.
+ ``load-balancing`` mode requires manually splitting the address pools between
+ two Kea instances using client classification, to preclude both servers from
+ allocating the same address to different clients.
Such a split is not needed in the ``hot-standby`` mode. Thus, the benefit
- of using the ``hot-standby`` over the ``load-balancing`` mode is that the former
- has a simpler configuration. Conversely, ``load-balancing`` has higher
+ of using the ``hot-standby`` over the ``load-balancing`` mode is that the
+ former has a simpler configuration. Conversely, ``load-balancing`` has higher
performance potential at the cost of more complex configuration.
- See :ref:`ha-load-balancing-config` for details on how to split the
- pools using client classification.
+ See :ref:`ha-load-balancing-config` for details on how to split the pools
+ using client classification.
In the configurations described above, both the primary and secondary/standby
-are referred to as ``"active"`` servers, because they receive lease
-updates and can automatically react to the partner's failures by
-responding to the DHCP queries which would normally be handled by the
-partner. The HA hook library supports another server type/role: ``backup``.
-The use of a backup server is optional, and can be implemented in both
-load-balancing and hot-standby setup, in addition to the active servers.
-There is no limit on the number of backup servers in the HA setup;
-however, the presence of backup servers may increase the latency
-of DHCP responses, because not only do active servers send lease updates
-to each other, but also to the backup servers. The active
-servers do not expect acknowledgments from the backup servers
-before responding to the DHCP clients, so the overhead of sending
-lease updates to the backup servers is minimized.
-
-In the last supported configuration, ``passive-backup``, there is only one active
-server and typically one or more backup servers. A passive-backup
-configuration with no backup servers is also accepted, but it is no
-different than running a single server with no HA function at all.
-
-The passive-backup configuration is used in situations when an administrator
-wants to take advantage of the backup servers as an additional storage
-for leases without a need for running the full-blown failover setup.
-In this case, if the primary server fails, the DHCP service is lost;
-it requires the administrator to manually start the primary to resume
-DHCP service. The administrator may also configure one of the
-backup servers to provide DHCP service to the clients, as these
-servers should have accurate or nearly accurate information about the
-allocated leases. The major advantage of the passive-backup mode is that
-it provides some redundancy of the lease information but with better
-performance of the primary server responding to the DHCP queries.
-The primary server does not have to wait for
-acknowledgments to the lease updates from the backup servers before it
-sends a response to the DHCP client. This reduces the response time
-compared to the load-balancing and hot-standby cases, in which the
-server responding to the DHCP query has to wait for the acknowledgment
-from the other active server before it can respond to the client.
+are referred to as ``active`` servers, because they receive lease updates and
+can automatically react to the partner's failures by responding to the DHCP
+queries which would normally be handled by the partner. The HA hook library
+supports another server type/role: ``backup``. The use of a backup server is
+optional, and can be implemented in both ``load-balancing`` and ``hot-standby``
+setup, in addition to the active servers. There is no limit on the number of
+backup servers in the HA setup; however, the presence of backup servers may
+increase the latency of DHCP responses, because not only do active servers send
+lease updates to each other, but also to the backup servers. The active servers
+do not expect acknowledgments from the backup servers before responding to the
+DHCP clients, so the overhead of sending lease updates to the backup servers is
+minimized.
+
+In the last supported configuration, ``passive-backup``, there is only one
+active server and typically one or more backup servers. A ``passive-backup``
+configuration with no backup servers is also accepted, but it is no different
+than running a single server with no HA function at all.
+
+The ``passive-backup`` configuration is used in situations when an administrator
+wants to take advantage of the backup servers as an additional storage for
+leases without a need for running the full-blown failover setup. In this case,
+if the primary server fails, the DHCP service is lost; it requires the
+administrator to manually start the primary to resume DHCP service. The
+administrator may also configure one of the backup servers to provide DHCP
+service to the clients, as these servers should have accurate or nearly accurate
+information about the allocated leases. The major advantage of the
+``passive-backup`` mode is that it provides some redundancy of the lease
+information but with better performance of the primary server responding to the
+DHCP queries.
+The primary server does not have to wait for acknowledgments to the lease
+updates from the backup servers before it sends a response to the DHCP client.
+This reduces the response time compared to the load-balancing and ``hot-standby``
+cases, in which the server responding to the DHCP query has to wait for the
+acknowledgment from the other active server before it can respond to the client.
.. note::
An interesting use case for a single active server running in the
- passive-backup mode is a notification service, in which software
- pretending to be a backup server receives live notifications about
- allocated and deleted leases from the primary server and can display
- them on a monitoring screen, trigger alerts, etc.
+ ``passive-backup`` mode is a notification service, in which software
+ pretending to be a backup server receives live notifications about allocated
+ and deleted leases from the primary server and can display them on a
+ monitoring screen, trigger alerts, etc.
Clocks on Active Servers
~~~~~~~~~~~~~~~~~~~~~~~~
Synchronized clocks are essential for the HA setup to operate reliably.
The servers share lease information - via lease updates and during
-synchronization of the databases - including the
-time when the lease was allocated and when it expires. Some clock
-skew between the servers participating in the HA setup usually exists;
-this is acceptable as long as the clock skew is relatively low, compared
-to the lease lifetimes. However, if the clock skew becomes too high, the
-different lease expiration times on different servers may cause the HA
-system to malfunction. For example, one server may consider a lease to
-be expired when it is actually still valid. The lease reclamation
-process may remove a name associated with this lease from the DNS,
-causing problems when the client later attempts to renew the lease.
-
-Each active server monitors the clock skew by comparing its current time
-with the time returned by its partner in response to the heartbeat
-command. This gives a good approximation of the clock skew, although it
-does not take into account the time between the partner sending the response
-and the receipt of this response by the server which sent the
-heartbeat command. If the clock skew exceeds 30 seconds, a warning log
-message is issued. The administrator may correct this problem by
-synchronizing the clocks (e.g. using NTP); the servers should notice the
-clock skew correction and stop issuing the warning.
-
-If the clock skew is not corrected and exceeds 60 seconds, the HA
-service on each of the servers is terminated, i.e. the state machine
-enters the ``terminated`` state. The servers will continue to respond to
-DHCP clients (as in the load-balancing or hot-standby mode), but will
-exchange neither lease updates nor heartbeats and their lease databases
-will diverge. In this case, the administrator should synchronize the
-clocks and restart the servers.
+synchronization of the databases - including the time when the lease was
+allocated and when it expires. Some clock skew between the servers participating
+in the HA setup usually exists; this is acceptable as long as the clock skew is
+relatively low, compared to the lease lifetimes. However, if the clock skew
+becomes too high, the different lease expiration times on different servers may
+cause the HA system to malfunction. For example, one server may consider a lease
+to be expired when it is actually still valid. The lease reclamation process may
+remove a name associated with this lease from the DNS, causing problems when the
+client later attempts to renew the lease.
+
+Each active server monitors the clock skew by comparing its current time with
+the time returned by its partner in response to the heartbeat command. This
+gives a good approximation of the clock skew, although it does not take into
+account the time between the partner sending the response and the receipt of
+this response by the server which sent the heartbeat command. If the clock skew
+exceeds 30 seconds, a warning log message is issued. The administrator may
+correct this problem by synchronizing the clocks (e.g. using NTP); the servers
+should notice the clock skew correction and stop issuing the warning.
+
+If the clock skew is not corrected and exceeds 60 seconds, the HA service on
+each of the servers is terminated, i.e. the state machine enters the
+``terminated`` state. The servers will continue to respond to DHCP clients (as
+in the ``load-balancing`` or ``hot-standby`` mode), but will exchange neither
+lease updates nor heartbeats and their lease databases will diverge. In this
+case, the administrator should synchronize the clocks and restart the servers.
.. note::
.. note::
- The clock skew is only assessed between two active servers, and
- only the active servers enter the ``terminated`` state if the skew is
- too high. The clock skew between active and
- backup servers is not assessed, because active servers do
- not exchange heartbeat messages with backup servers.
+ The clock skew is only assessed between two active servers, and only the
+ active servers enter the ``terminated`` state if the skew is too high. The
+ clock skew between active and backup servers is not assessed, because active
+ servers do not exchange heartbeat messages with backup servers.
.. _ha-https-support:
HTTPS Support
~~~~~~~~~~~~~
-Since Kea 1.9.7, the High Availability hook library supports HTTPS
-via TLS, as described in :ref:`tls`.
+Since Kea 1.9.7, the High Availability hook library supports HTTPS via TLS, as
+described in :ref:`tls`.
The HTTPS configuration parameters are:
-- ``trust-anchor`` - specifies the name of a file or directory
- where the certification authority certificate of a Control Agent can
- be found.
+- ``trust-anchor`` - specifies the name of a file or directory where the
+ certification authority certificate of a Control Agent can be found.
-- ``cert-file`` - specifies the name of the file containing
- the end-entity certificate to use.
+- ``cert-file`` - specifies the name of the file containing the end-entity
+ certificate to use.
-- ``key-file`` - specifies the private key of the end-entity
- certificate to use.
+- ``key-file`` - specifies the private key of the end-entity certificate to use.
-These parameters can be configured at the global and peer
-levels. When configured at both levels the peer value is used, allowing
-common values to be shared.
+These parameters can be configured at the global and peer levels. When
+configured at both levels the peer value is used, allowing common values to be
+shared.
-The three parameters must be either all not specified (HTTPS disabled)
-or all specified (HTTPS enabled). Specification of the empty string is
-considered not specified; this can be used, for instance, to disable
-HTTPS for a particular peer when it is enabled at the global level.
+The three parameters must be either all not specified (HTTPS disabled) or all
+specified (HTTPS enabled). Specification of the empty string is considered not
+specified; this can be used, for instance, to disable HTTPS for a particular
+peer when it is enabled at the global level.
As the High Availability hook library is an HTTPS client, there is no
``cert-required`` parameter in this hook configuration.
-This parameter can be set in Control Agent to require and verify
-client certificate in Client-Server communication. It does not affect
-communication between HA peers at the client side, see below for
-the server side.
+This parameter can be set in Control Agent to require and verify client
+certificate in Client-Server communication. It does not affect communication
+between HA peers at the client side, see below for the server side.
-Before Kea 2.1.7 using HTTPS in HA setup requires use of Control Agent
-on all peers. (See :ref:`tls` for Control Agent TLS configuration).
+Before Kea 2.1.7 using HTTPS in HA setup requires use of Control Agent on all
+peers. (See :ref:`tls` for Control Agent TLS configuration).
Since Kea 2.1.7 the HTTPS server side is supported:
-- the peer entry for the server name is used for the TLS setting.
+- the peer entry for the server name is used for the TLS setting.
-- the new ``require-client-certs`` specifies if client certificates
- are required and verified, i.e. like ``cert-required``. It defaults
- to true and is a HA config (vs peer config) parameter.
+- the new ``require-client-certs`` specifies if client certificates are
+ required and verified, i.e. like ``cert-required``. It defaults to true and
+ is a HA config (vs peer config) parameter.
-Kea 2.1.7 adds to a new security feature with the ``restrict-commands``
-HA config parameter: when it is configured to ``true`` (default is
-``false``) commands which are not used by the hook are rejected.
+Kea 2.1.7 adds to a new security feature with the ``restrict-commands`` HA
+config parameter: when it is configured to ``true`` (default is ``false``)
+commands which are not used by the hook are rejected.
-Following is an example of HA server pair and Control Agent
-configuration for Hot-Standby with TLS.
+Following is an example of HA server pair and Control Agent configuration for
+``hot-standby`` with TLS.
Server 1:
::
Server States
~~~~~~~~~~~~~
-A DHCP server operating within an HA setup runs a state machine, and the
-state of the server can be retrieved by its peers using the
-``ha-heartbeat`` command sent over the RESTful API. If the partner
-server does not respond to the ``ha-heartbeat`` command within the
-specified amount of time, the communication is considered interrupted
-and the server may, depending on the configuration, use additional
-measures (described later in this document) to verify that the partner
-is still operating. If it finds that the partner is not operating, the
-server transitions to the ``partner-down`` state to handle all the
-DHCP traffic directed to the system.
-
-In this case, the surviving server continues to send the
-``ha-heartbeat`` command to detect when the partner wakes up. At that
-time, the partner synchronizes the lease database. When it is again
-ready to operate, the surviving server returns to normal operation, i.e.
-the ``load-balancing`` or ``hot-standby`` state.
+A DHCP server operating within an HA setup runs a state machine, and the state
+of the server can be retrieved by its peers using the ``ha-heartbeat`` command
+sent over the RESTful API. If the partner server does not respond to the
+``ha-heartbeat`` command within the specified amount of time, the communication
+is considered interrupted and the server may, depending on the configuration,
+use additional measures (described later in this document) to verify that the
+partner is still operating. If it finds that the partner is not operating, the
+server transitions to the ``partner-down`` state to handle all the DHCP traffic
+directed to the system.
+
+In this case, the surviving server continues to send the ``ha-heartbeat``
+command to detect when the partner wakes up. At that time, the partner
+synchronizes the lease database. When it is again ready to operate, the
+surviving server returns to normal operation, i.e. the ``load-balancing`` or
+``hot-standby`` state.
The following is the list of all possible server states:
-- ``backup`` - normal operation of the backup server. In this state it
- receives lease updates from the active server(s).
-
-- ``communication-recovery`` - an active server running in load-balancing
- mode may transition to this state when it experiences communication
- issues with a partner server over the control channel. This is an
- intermediate state between the ``load-balancing`` and ``partner-down``
- states. In this state the server continues to respond to DHCP queries
- but does not send lease updates to the partner; lease updates are
- queued and are sent when normal communication is resumed. If
- communication does not resume within the time specified, the primary server
- then transitions to the
- ``partner-down`` state. The ``communication-recovery`` state was
- introduced to ensure reliable DHCP service when both active servers
- remain operational but the communication between them is interrupted
- for a prolonged period of time. Either server can be configured to never
- enter this state by setting the ``delayed-updates-limit`` to 0. (Please refer to
+- ``backup`` - normal operation of the backup server. In this state it receives
+ lease updates from the active server(s).
+
+- ``communication-recovery`` - an active server running in ``load-balancing``
+ mode may transition to this state when it experiences communication issues
+ with a partner server over the control channel. This is an intermediate state
+ between the ``load-balancing`` and ``partner-down`` states. In this state the
+ server continues to respond to DHCP queries but does not send lease updates
+ to the partner; lease updates are queued and are sent when normal
+ communication is resumed. If communication does not resume within the time
+ specified, the primary server then transitions to the ``partner-down`` state.
+ The ``communication-recovery`` state was introduced to ensure reliable DHCP
+ service when both active servers remain operational but the communication
+ between them is interrupted for a prolonged period of time. Either server can
+ be configured to never enter this state by setting the
+ ``delayed-updates-limit`` to 0. (Please refer to
:ref:`ha-load-balancing-config`, later in this chapter, for details on this
- parameter.) Disabling entry into the ``communication-recovery`` state
- causes the server to begin testing for the ``partner-down`` state
- as soon as the server is unable to communicate with its partner.
+ parameter). Disabling entry into the ``communication-recovery`` state causes
+ the server to begin testing for the ``partner-down`` state as soon as the
+ server is unable to communicate with its partner.
.. note::
- In Kea 1.9.4, with the introduction of ``delayed-updates-limit``,
- the default server's behavior
- in load-balancing mode changed. When a server experiences
- communication issues with its partner, it now enters the ``communication-recovery``
- state and queues lease updates until communication is resumed. Prior to
- Kea 1.9.4, a server that could not communicate with its partner in
- ``load-balancing`` mode would immediately begin the transition to
+ In Kea 1.9.4, with the introduction of ``delayed-updates-limit``, the default
+ server's behavior in ``load-balancing`` mode changed. When a server
+ experiences communication issues with its partner, it now enters the
+ ``communication-recovery`` state and queues lease updates until communication
+ is resumed. Prior to Kea 1.9.4, a server that could not communicate with its
+ partner in ``load-balancing`` mode would immediately begin the transition to
the ``partner-down`` state.
-- ``hot-standby`` - normal operation of the active server running in
- the hot-standby mode; both the primary and the standby server are in
- this state during their normal operation. The primary server responds
- to DHCP queries and sends lease updates to the standby server and to
- any backup servers that are present.
+- ``hot-standby`` - normal operation of the active server running in the
+ ``hot-standby`` mode; both the primary and the standby server are in this
+ state during their normal operation. The primary server responds to DHCP
+ queries and sends lease updates to the standby server and to any backup
+ servers that are present.
-- ``load-balancing`` - normal operation of the active server running in
- the load-balancing mode; both the primary and the secondary server
- are in this state during their normal operation. Both servers respond
- to DHCP queries and send lease updates to each other and to any
- backup servers that are present.
+- ``load-balancing`` - normal operation of the active server running in the
+ ``load-balancing`` mode; both the primary and the secondary server are in
+ this state during their normal operation. Both servers respond to DHCP
+ queries and send lease updates to each other and to any backup servers that
+ are present.
- ``in-maintenance`` - an active server transitions to this state as a result
- of being notified by its partner that the administrator requested
- maintenance of the HA setup. The administrator requests the maintenance
- by sending the ``ha-maintenance-start`` command to the server which is supposed
- to take over the responsibility for responding to the DHCP clients while
- the other server is taken offline for maintenance. If the server is
- in the ``in-maintenance`` state it can be safely shut down. The partner
- is in the ``partner-in-maintenance`` state, from which it transitions
- to the ``partner-down`` state immediately after it discovers that the
- server in maintenance has been shut down.
-
-- ``partner-down`` - an active server transitions to this state after
- detecting that its partner (another active server) is offline. The
- server does not transition to this state if only a backup server is
- unavailable. In the ``partner-down`` state the active server responds
- to all DHCP queries, including those queries which are normally
- handled by the server that is now unavailable.
+ of being notified by its partner that the administrator requested maintenance
+ of the HA setup. The administrator requests the maintenance by sending the
+ ``ha-maintenance-start`` command to the server which is supposed to take over
+ the responsibility for responding to the DHCP clients while the other server
+ is taken offline for maintenance. If the server is in the ``in-maintenance``
+ state it can be safely shut down. The partner is in the to the
+ ``partner-down`` state immediately after it discovers that the server in
+ maintenance has been shut down.
+
+- ``partner-down`` - an active server transitions to this state after detecting
+ that its partner (another active server) is offline. The server does not
+ transition to this state if only a backup server is unavailable. In the
+ ``partner-down`` state the active server responds to all DHCP queries,
+ including those queries which are normally handled by the server that is now
+ unavailable.
- ``partner-in-maintenance`` - an active server transitions to this state
- after receiving a ``ha-maintenance-start`` command from the
- administrator. The server in this state becomes responsible
- for responding to all DHCP requests. The server sends a
- ``ha-maintenance-notify`` command to the partner, which should
- enter the ``in-maintenance`` state. The server
- remaining in the ``partner-in-maintenance`` state keeps sending lease
- updates to the partner until it finds that the partner has stopped
- responding to those lease updates, heartbeats, or any other commands.
- In this case, the server in the ``partner-in-maintenance`` state
- transitions to the ``partner-down`` state and keeps responding to
- the queries, but no longer sends lease updates.
-
-- ``passive-backup`` - a primary server running in the passive-backup HA
- mode transitions to this state immediately after it boots up. The
- primary server in this state responds to all DHCP traffic
- and sends lease updates to the backup servers it is connected to. By
- default, the primary server does not wait for acknowledgments from
- the backup servers and responds to a DHCP query right after sending
- lease updates to all backup servers. If any of the lease updates
- fail, a backup server misses the lease update but the DHCP client
- is still provisioned. This default configuration can be changed by
- setting the ``wait-backup-ack`` configuration parameter to ``true``,
- in which case the primary server always waits for the acknowledgements
- and drops the DHCP query if sending any of the corresponding lease
- updates fails. This improves lease database consistency between the
- primary and the secondary. However, if a communication failure between
- the active server and any of the backups occurs, it effectively causes
- the failure of the DHCP service from the DHCP clients' perspective.
-
-- ``ready`` - an active server transitions to this state after
- synchronizing its lease database with an active partner. This state
- indicates to the partner - which may be in the ``partner-down`` state
- - that it should return to normal operation. If and when it does, the
- server in the ``ready`` state also starts normal operation.
-
-- ``syncing`` - an active server transitions to this state to fetch
- leases from the active partner and update the local lease database.
- When in this state, the server issues the ``dhcp-disable`` command to
- disable the DHCP service of the partner from which the leases are
- fetched. The DHCP service is disabled for a maximum time of 60
- seconds, after which it is automatically re-enabled, in case the
- syncing partner was unable to re-enable the service. If the
- synchronization completes successfully, the synchronizing server
- issues the ``ha-sync-complete-notify`` command to notify the partner.
- In most states, the partner re-enables its DHCP service to continue
- responding to the DHCP queries. In the ``partner-down`` state, the
- partner first ensures that communication between the servers
- is re-established before enabling the DHCP service.
- The syncing operation is synchronous; the server waits for an answer
- from the partner and does nothing else while the lease
- synchronization takes place. A server that is configured not to
- synchronize the lease database with its partner, i.e. when the
- ``sync-leases`` configuration parameter is set to ``false``, will
- never transition to this state. Instead, it transitions directly
- from the ``waiting`` state to the ``ready`` state.
-
-- ``terminated`` - an active server transitions to this state when the
- High Availability hook library is unable to further provide reliable
- service and a manual intervention of the administrator is required to
- correct the problem. Various issues with the HA setup may cause the
- server to transition to this state. While in this state, the server
- continues responding to DHCP clients based on the HA mode selected
- (load-balancing or hot-standby), but lease updates are not
- exchanged and heartbeats are not sent. Once a server has entered
- the ``terminated`` state, it remains in this state until it is
+ after receiving a ``ha-maintenance-start`` command from the administrator.
+ The server in this state becomes responsible for responding to all DHCP
+ requests. The server sends a ``ha-maintenance-notify`` command to the partner,
+ which should enter the ``in-maintenance`` state. The server remaining in the
+ ``partner-in-maintenance`` state keeps sending lease updates to the partner
+ until it finds that the partner has stopped responding to those lease updates,
+ heartbeats, or any other commands. In this case, the server in the
+ ``partner-in-maintenance`` state transitions to the ``partner-down`` state
+ and keeps responding to the queries, but no longer sends lease updates.
+
+- ``passive-backup`` - a primary server running in the ``passive-backup`` HA
+ mode transitions to this state immediately after it boots up. The primary
+ server in this state responds to all DHCP traffic and sends lease updates to
+ the backup servers it is connected to. By default, the primary server does
+ not wait for acknowledgments from the backup servers and responds to a DHCP
+ query right after sending lease updates to all backup servers. If any of the
+ lease updates fail, a backup server misses the lease update but the DHCP
+ client is still provisioned. This default configuration can be changed by
+ setting the ``wait-backup-ack`` configuration parameter to ``true``, in which
+ case the primary server always waits for the acknowledgements and drops the
+ DHCP query if sending any of the corresponding lease updates fails. This
+ improves lease database consistency between the primary and the secondary.
+ However, if a communication failure between the active server and any of the
+ backups occurs, it effectively causes the failure of the DHCP service from
+ the DHCP clients' perspective.
+
+- ``ready`` - an active server transitions to this state after synchronizing
+ its lease database with an active partner. This state indicates to the
+ partner (which may be in the ``partner-down`` state) that it should return to
+ normal operation. If and when it does, the server in the ``ready`` state also
+ starts normal operation.
+
+- ``syncing`` - an active server transitions to this state to fetch leases from
+ the active partner and update the local lease database. When in this state,
+ the server issues the ``dhcp-disable`` command to disable the DHCP service of
+ the partner from which the leases are fetched. The DHCP service is disabled
+ for a maximum time of 60 seconds, after which it is automatically re-enabled,
+ in case the syncing partner was unable to re-enable the service. If the
+ synchronization completes successfully, the synchronizing server issues the
+ ``ha-sync-complete-notify`` command to notify the partner. In most states,
+ the partner re-enables its DHCP service to continue responding to the DHCP
+ queries. In the ``partner-down`` state, the partner first ensures that
+ communication between the servers is re-established before enabling the DHCP
+ service. The syncing operation is synchronous; the server waits for an answer
+ from the partner and does nothing else while the lease synchronization takes
+ place. A server that is configured not to synchronize the lease database with
+ its partner, i.e. when the ``sync-leases`` configuration parameter is set to
+ ``false``, will never transition to this state. Instead, it transitions
+ directly from the ``waiting`` state to the ``ready`` state.
+
+- ``terminated`` - an active server transitions to this state when the High
+ Availability hook library is unable to further provide reliable service and a
+ manual intervention of the administrator is required to correct the problem.
+ Various issues with the HA setup may cause the server to transition to this
+ state. While in this state, the server continues responding to DHCP clients
+ based on the HA mode selected (``load-balancing`` or ``hot-standby``), but
+ lease updates are not exchanged and heartbeats are not sent. Once a server
+ has entered the ``terminated`` state, it remains in this state until it is
restarted. The administrator must correct the issue which caused this
situation prior to restarting the server (e.g. synchronize the clocks);
- otherwise, the server will return to the "terminated" state once it
- finds that the issue persists.
-
-- ``waiting`` - each started server instance enters this state. A
- backup server transitions directly from this state to the ``backup``
- state. An active server sends a heartbeat to its partner to check its
- state; if the partner appears to be unavailable, the server
- transitions to the ``partner-down`` state. If the partner is
- available, the server transitions to the ``syncing`` or ``ready``
- state, depending on the setting of the ``sync-leases`` configuration
- parameter. If both servers appear to be in the ``waiting`` state
- (concurrent startup), the primary server transitions to the next
- state first. The secondary or standby server remains in the
- ``waiting`` state until the primary transitions to the ``ready``
- state.
+ otherwise, the server will return to the "terminated" state once it finds
+ that the issue persists.
+
+- ``waiting`` - each started server instance enters this state. A backup server
+ transitions directly from this state to the ``backup`` state. An active
+ server sends a heartbeat to its partner to check its state; if the partner
+ appears to be unavailable, the server transitions to the ``partner-down``
+ state. If the partner is available, the server transitions to the ``syncing``
+ or ``ready`` state, depending on the setting of the ``sync-leases``
+ configuration parameter. If both servers appear to be in the ``waiting``
+ state (concurrent startup), the primary server transitions to the next state
+ first. The secondary or standby server remains in the ``waiting`` state until
+ the primary transitions to the ``ready`` state.
.. note::
- Currently, restarting the HA service from the ``terminated`` state
- requires restarting the DHCP server or reloading its configuration.
+ Currently, restarting the HA service from the ``terminated`` state requires
+ restarting the DHCP server or reloading its configuration.
-Whether the server responds to DHCP queries and which queries it
-responds to is a matter of the server's state, if no administrative
-action is performed to configure the server otherwise. The following
-table provides the default behavior for various states.
+Whether the server responds to DHCP queries and which queries it responds to is
+a matter of the server's state, if no administrative action is performed to
+configure the server otherwise. The following table provides the default
+behavior for various states.
-The ``DHCP Service Scopes`` denote which group of received DHCP queries
-the server responds to in the given state. The HA configuration
-must specify a unique name for each server within the HA setup. This
-document uses the following convention within the provided examples:
-"server1" for a primary server, "server2" for the secondary or
-standby server, and "server3" for the backup server. In real life any
-names can be used as long as they remain unique.
+The ``DHCP Service Scopes`` denote which group of received DHCP queries the
+server responds to in the given state. The HA configuration must specify a
+unique name for each server within the HA setup. This document uses the
+following convention within the provided examples: "server1" for a primary
+server, "server2" for the secondary or standby server, and "server3" for the
+backup server. In real life any names can be used as long as they remain unique.
An in-depth explanation of the scopes can be found below.
.. table:: Default behavior of the server in various HA states
- +------------------------+-----------------+-----------------+-----------------+
- | State | Server Type | DHCP Service | DHCP Service |
- | | | | Scopes |
- +========================+=================+=================+=================+
- | backup | backup server | disabled | none |
- +------------------------+-----------------+-----------------+-----------------+
- | communication-recovery | primary or | enabled | "HA_server1" |
- | | secondary | | or |
- | | (load-balancing | | "HA_server2" |
- | | mode only) | | |
- +------------------------+-----------------+-----------------+-----------------+
- | hot-standby | primary or | enabled | "HA_server1" |
- | | standby | | if primary, |
- | | (hot-standby | | none otherwise |
- | | mode) | | |
- +------------------------+-----------------+-----------------+-----------------+
- | load-balancing | primary or | enabled | "HA_server1" |
- | | secondary | | or |
- | | (load-balancing | | "HA_server2" |
- | | mode) | | |
- +------------------------+-----------------+-----------------+-----------------+
- | in-maintenance | active server | disabled | none |
- +------------------------+-----------------+-----------------+-----------------+
- | partner-down | active server | enabled | all scopes |
- +------------------------+-----------------+-----------------+-----------------+
- | partner-in-maintenance | active server | enabled | all scopes |
- +------------------------+-----------------+-----------------+-----------------+
- | passive-backup | active server | enabled | all scopes |
- +------------------------+-----------------+-----------------+-----------------+
- | ready | active server | disabled | none |
- +------------------------+-----------------+-----------------+-----------------+
- | syncing | active server | disabled | none |
- +------------------------+-----------------+-----------------+-----------------+
- | terminated | active server | enabled | same as in the |
- | | | | load-balancing |
- | | | | or hot-standby |
- | | | | state |
- +------------------------+-----------------+-----------------+-----------------+
- | waiting | any server | disabled | none |
- +------------------------+-----------------+-----------------+-----------------+
-
-In the load-balancing mode there are two scopes specified for the active
-servers: "HA_server1" and "HA_server2". The DHCP queries
-load-balanced to ``server1`` belong to the "HA_server1" scope and the
-queries load-balanced to ``server2`` belong to the "HA_server2" scope.
-If either server is in the ``partner-down`` state, the active
-partner is responsible for serving both scopes.
-
-In the hot-standby mode, there is only one scope - "HA_server1" -
-because only ``server1`` is responding to DHCP queries. If that server
-becomes unavailable, ``server2`` becomes responsible for this scope.
-
-The backup servers do not have their own scopes. In some cases they can
-be used to respond to queries belonging to the scopes of the active
-servers. Also, a backup server which is neither in the partner-down state nor
-in normal operation serves no scopes.
-
-The scope names can be used to associate pools, subnets, and networks
-with certain servers, so that only these servers can allocate addresses or
-prefixes from those pools, subnets, or networks. This is done via the
-client classification mechanism (see :ref:`ha-load-balancing-advanced-config`
-for more details).
+ +------------------------+-----------------+-----------------+----------------+
+ | State | Server Type | DHCP Service | DHCP Service |
+ | | | | Scopes |
+ +========================+=================+=================+================+
+ | backup | backup server | disabled | none |
+ +------------------------+-----------------+-----------------+----------------+
+ | communication-recovery | primary or | enabled | "HA_server1" |
+ | | secondary | | or |
+ | | (load-balancing | | "HA_server2" |
+ | | mode only) | | |
+ +------------------------+-----------------+-----------------+----------------+
+ | hot-standby | primary or | enabled | "HA_server1" |
+ | | standby | | if primary, |
+ | | (hot-standby | | none otherwise |
+ | | mode) | | |
+ +------------------------+-----------------+-----------------+----------------+
+ | load-balancing | primary or | enabled | "HA_server1" |
+ | | secondary | | or |
+ | | (load-balancing | | "HA_server2" |
+ | | mode) | | |
+ +------------------------+-----------------+-----------------+----------------+
+ | in-maintenance | active server | disabled | none |
+ +------------------------+-----------------+-----------------+----------------+
+ | partner-down | active server | enabled | all scopes |
+ +------------------------+-----------------+-----------------+----------------+
+ | partner-in-maintenance | active server | enabled | all scopes |
+ +------------------------+-----------------+-----------------+----------------+
+ | passive-backup | active server | enabled | all scopes |
+ +------------------------+-----------------+-----------------+----------------+
+ | ready | active server | disabled | none |
+ +------------------------+-----------------+-----------------+----------------+
+ | syncing | active server | disabled | none |
+ +------------------------+-----------------+-----------------+----------------+
+ | terminated | active server | enabled | same as in the |
+ | | | | load-balancing |
+ | | | | or hot-standby |
+ | | | | state |
+ +------------------------+-----------------+-----------------+----------------+
+ | waiting | any server | disabled | none |
+ +------------------------+-----------------+-----------------+----------------+
+
+In the ``load-balancing`` mode there are two scopes specified for the active
+servers: "HA_server1" and "HA_server2". The DHCP queries load-balanced to
+``server1`` belong to the "HA_server1" scope and the queries load-balanced to
+``server2`` belong to the "HA_server2" scope. If either server is in the
+``partner-down`` state, the active partner is responsible for serving both
+scopes.
+
+In the ``hot-standby`` mode, there is only one scope - "HA_server1" - because
+only ``server1`` is responding to DHCP queries. If that server becomes
+unavailable, ``server2`` becomes responsible for this scope.
+
+The backup servers do not have their own scopes. In some cases they can be used
+to respond to queries belonging to the scopes of the active servers. Also, a
+backup server which is neither in the partner-down state nor in normal operation
+serves no scopes.
+
+The scope names can be used to associate pools, subnets, and networks with
+certain servers, so that only these servers can allocate addresses or prefixes
+from those pools, subnets, or networks. This is done via the client
+classification mechanism (see :ref:`ha-load-balancing-advanced-config` for more
+details).
.. _ha-scope-transition:
Scope Transition in a Partner-Down Case
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-When one of the servers finds that its partner is unavailable, it starts
-serving clients from both its own scope and the scope of the unavailable
-partner. This is straightforward for new clients, i.e. those sending
-DHCPDISCOVER (DHCPv4) or Solicit (DHCPv6), because those requests are
-not sent to any particular server. The available server responds to
-all such queries when it is in the ``partner-down`` state.
-
-When a client renews a lease, it sends its DHCPREQUEST (DHCPv4) or Renew
-(DHCPv6) message directly to the server which has allocated the lease
-being renewed. If this server is no longer available, the client will
-get no response. In that case, the client continues to use its lease and
-attempts to renew until the rebind timer (T2) elapses. The client then
-enters the rebinding phase, in which it sends a DHCPREQUEST (DHCPv4) or
-Rebind (DHCPv6) message to any available server. The surviving server
-receives the rebinding request and typically extends the
-lifetime of the lease. The client then continues to contact that new
+When one of the servers finds that its partner is unavailable, it starts serving
+clients from both its own scope and the scope of the unavailable partner. This
+is straightforward for new clients, i.e. those sending DHCPDISCOVER (DHCPv4) or
+Solicit (DHCPv6), because those requests are not sent to any particular server.
+The available server responds to all such queries when it is in the
+``partner-down`` state.
+
+When a client renews a lease, it sends its DHCPREQUEST (DHCPv4) or Renew (DHCPv6)
+message directly to the server which has allocated the lease being renewed. If
+this server is no longer available, the client will get no response. In that
+case, the client continues to use its lease and attempts to renew until the
+rebind timer (T2) elapses. The client then enters the rebinding phase, in which
+it sends a DHCPREQUEST (DHCPv4) or Rebind (DHCPv6) message to any available
+server. The surviving server receives the rebinding request and typically
+extends the lifetime of the lease. The client then continues to contact that new
server to renew its lease as appropriate.
-If and when the other server once again becomes available, both active
-servers will eventually transition to the load-balancing or
-hot-standby state, in which they will again be responsible for their
-own scopes. Some clients belonging to the scope of the restarted server
-will try to renew their leases via the surviving server, but this server
-will no longer respond to them; the client will eventually transition
-back to the correct server via the rebinding mechanism.
+If and when the other server once again becomes available, both active servers
+will eventually transition to the ``load-balancing`` or ``hot-standby`` state,
+in which they will again be responsible for their own scopes. Some clients
+belonging to the scope of the restarted server will try to renew their leases
+via the surviving server, but this server will no longer respond to them; the
+client will eventually transition back to the correct server via the rebinding
+mechanism.
.. _ha-load-balancing-config:
Load-Balancing Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The following is the configuration snippet to enable high availability
-on the primary server within the load-balancing configuration. The same
-configuration should be applied on the secondary and backup servers,
-with the only difference that ``this-server-name`` should be set to
-"server2" and "server3" on those servers, respectively.
+The following is the configuration snippet to enable high availability on the
+primary server within the load-balancing configuration. The same configuration
+should be applied on the secondary and backup servers, with the only difference
+that ``this-server-name`` should be set to "server2" and "server3" on those
+servers, respectively.
.. note::
Remember that ``load-balancing`` mode requires the address pools and
- delegated prefix pools to be split between the active servers. During
- normal operation, the servers use non-overlapping pools to avoid
- allocating the same lease to different clients by both instances.
- A server will only use the pool fragments owned by the partner when
- the partner is not running. See the notes in
- :ref:`ha-supported-configurations` highlighting differences between
- the ``load-balancing`` and ``hot-standby`` modes. The semantics of pool
+ delegated prefix pools to be split between the active servers. During normal
+ operation, the servers use non-overlapping pools to avoid allocating the same
+ lease to different clients by both instances. A server will only use the pool
+ fragments owned by the partner when the partner is not running. See the notes
+ in :ref:`ha-supported-configurations` highlighting differences between the
+ ``load-balancing`` and ``hot-standby`` modes. The semantics of pool
partitioning is explained further in this section.
The :ref:`ha-load-balancing-advanced-config` section provides advanced
pool-partitioning examples.
}]
}
-Two hook libraries must be loaded to enable HA:
-``libdhcp_lease_cmds.so`` and ``libdhcp_ha.so``. The latter implements
-the HA feature, while the former enables control commands required by HA
-to fetch and manipulate leases on the remote servers. In the example
-provided above, it is assumed that Kea libraries are installed in the
-``/usr/lib`` directory. If Kea is not installed in the /usr directory,
-the hook libraries locations must be updated accordingly.
+Two hook libraries must be loaded to enable HA: ``libdhcp_lease_cmds.so`` and
+``libdhcp_ha.so``. The latter implements the HA feature, while the former
+enables control commands required by HA to fetch and manipulate leases on the
+remote servers. In the example provided above, it is assumed that Kea libraries
+are installed in the ``/usr/lib`` directory. If Kea is not installed in the
+``/usr`` directory, the hook libraries locations must be updated accordingly.
The HA configuration is specified within the scope of ``libdhcp_ha.so``.
-Note that while the top-level parameter ``high-availability`` is a list,
-only a single entry is currently supported.
-
-The following are the global parameters which control the server's
-behavior with respect to HA:
-
-- ``this-server-name`` - is a unique identifier of the server within
- this HA setup. It must match with one of the servers specified within
- the ``peers`` list.
-
-- ``mode`` - specifies an HA mode of operation. The currently supported
- modes are ``load-balancing`` and ``hot-standby``.
-
-- ``heartbeat-delay`` - specifies a duration in milliseconds between
- sending the last heartbeat (or other command sent to the partner) and
- the next heartbeat. Heartbeats are sent periodically to gather
- the status of the partner and to verify whether the partner is still
- operating. The default value of this parameter is 10000 ms.
-
-- ``max-response-delay`` - specifies a duration in milliseconds since
- the last successful communication with the partner, after which the
- server assumes that communication with the partner is interrupted.
- This duration should be greater than the ``heartbeat-delay``; typically
- it should be a multiple of ``heartbeat-delay``.
- When the server detects that communication is interrupted, it
- may transition to the ``partner-down`` state (when
- ``max-unacked-clients`` is 0) or trigger the failure-detection
- procedure using the values of the two parameters below. The default
- value of this parameter is 60000 ms.
+Note that while the top-level parameter ``high-availability`` is a list, only a
+single entry is currently supported.
+
+The following are the global parameters which control the server's behavior with
+respect to HA:
+
+- ``this-server-name`` - is a unique identifier of the server within this HA
+ setup. It must match with one of the servers specified within the ``peers``
+ list.
+
+- ``mode`` - specifies an HA mode of operation. The currently supported modes
+ are ``load-balancing`` and ``hot-standby``.
+
+- ``heartbeat-delay`` - specifies a duration in milliseconds between sending
+ the last heartbeat (or other command sent to the partner) and the next
+ heartbeat. Heartbeats are sent periodically to gather the status of the
+ partner and to verify whether the partner is still operating. The default
+ value of this parameter is 10000 ms.
+
+- ``max-response-delay`` - specifies a duration in milliseconds since the last
+ successful communication with the partner, after which the server assumes
+ that communication with the partner is interrupted. This duration should be
+ greater than the ``heartbeat-delay``; typically it should be a multiple of
+ ``heartbeat-delay``. When the server detects that communication is
+ interrupted, it may transition to the ``partner-down`` state (when
+ ``max-unacked-clients`` is 0) or trigger the failure-detection procedure
+ using the values of the two parameters below. The default value of this
+ parameter is 60000 ms.
- ``max-ack-delay`` - is one of the parameters controlling partner
- failure-detection. When communication with the partner is
- interrupted, the server examines the values of the ``secs`` field
- (DHCPv4) or ``Elapsed Time`` option (DHCPv6), which denote how long
- the DHCP client has been trying to communicate with the DHCP server.
- This parameter specifies the maximum time in milliseconds for the
- client to try to communicate with the DHCP server, after which this
- server assumes that the client failed to communicate with the DHCP
- server (is unacknowledged or "unacked"). The default value of this parameter is 10000.
-
-- ``max-unacked-clients`` - specifies how many "unacked" clients are
- allowed (see ``max-ack-delay``) before this server assumes that the
- partner is offline and transitions to the ``partner-down`` state. The
- special value of 0 is allowed for this parameter, which disables the
- failure-detection mechanism. In this case, a server that cannot
- communicate with its partner over the control channel assumes that
- the partner server is down and transitions to the ``partner-down``
- state immediately. The default value of this parameter is 10.
-
-- ``delayed-updates-limit`` - specifies the maximum number of lease
- updates which can be queued while the server is in the
- ``communication-recovery`` state. This parameter was introduced in
- Kea 1.9.4. The special value of 0 configures the server to
- never transition to the ``communication-recovery`` state and the
- server behaves as in earlier Kea versions, i.e. if the server
+ failure-detection. When communication with the partner is interrupted, the
+ server examines the values of the ``secs`` field (DHCPv4) or ``Elapsed Time``
+ option (DHCPv6), which denote how long the DHCP client has been trying to
+ communicate with the DHCP server. This parameter specifies the maximum time
+ in milliseconds for the client to try to communicate with the DHCP server,
+ after which this server assumes that the client failed to communicate with
+ the DHCP server (is unacknowledged or "unacked"). The default value of this
+ parameter is 10000.
+
+- ``max-unacked-clients`` - specifies how many "unacked" clients are allowed
+ (see ``max-ack-delay``) before this server assumes that the partner is
+ offline and transitions to the ``partner-down`` state. The special value of 0
+ is allowed for this parameter, which disables the failure-detection mechanism.
+ In this case, a server that cannot communicate with its partner over the
+ control channel assumes that the partner server is down and transitions to
+ the ``partner-down`` state immediately. The default value of this parameter
+ is 10.
+
+- ``delayed-updates-limit`` - specifies the maximum number of lease updates
+ which can be queued while the server is in the ``communication-recovery``
+ state. This parameter was introduced in Kea 1.9.4. The special value of 0
+ configures the server to never transition to the ``communication-recovery``
+ state and the server behaves as in earlier Kea versions, i.e. if the server
cannot reach its partner, it goes straight into the ``partner-down`` state.
The default value of this parameter is 100.
-The values of ``max-ack-delay`` and ``max-unacked-clients`` must be
-selected carefully, taking into account the specifics of the network in
-which the DHCP servers are operating. The server in question
-may not respond to some DHCP clients following administrative policy, or the server
-may drop malformed queries from clients. Therefore, selecting too
-low a value for the ``max-unacked-clients`` parameter may result in a
-transition to the ``partner-down`` state even though the partner is
-still operating. On the other hand, selecting too high a value may
-result in never transitioning to the ``partner-down`` state if the DHCP
-traffic in the network is very low (e.g. at night), because the number
-of distinct clients trying to communicate with the server could be lower
-than the ``max-unacked-clients`` setting.
-
-In some cases it may be useful to disable the failure-detection
-mechanism altogether, if the servers are located very close to each
-other and network partitioning is unlikely, i.e. failure to respond to
-heartbeats is only possible when the partner is offline. In such cases,
-set ``max-unacked-clients`` to 0.
-
-The ``delayed-updates-limit`` parameter
-is used to enable or disable the ``communication-recovery``
-procedure, and controls the server's behavior in the ``communication-recovery``
-state. This parameter can only be used in the load-balancing mode.
-
-If a server in the ``load-balancing`` state experiences
-communication issues with its partner (a heartbeat or lease update fail),
-the server transitions to the ``communication-recovery`` state. In this
-state, the server keeps responding to DHCP queries but does not send
-lease updates to the partner. The lease updates are queued until
-communication is re-established, to ensure that DHCP service
-remains available even in the event of the communication loss between
-the partners. There may appear to be communication loss when either
-one of the servers has terminated, or when both servers remain available
-but cannot communicate with each other. In the former case, the surviving server will
-follow the normal procedure and should eventually transition to
-the ``partner-down`` state. In the latter case, both servers should
-transition to the ``communication-recovery`` state and should never
-transition to the ``partner-down`` state (if ``max-unacked-clients``
-is set to a non-zero value), because all DHCP queries are answered and
-neither server would see any unacked DHCP queries.
-
-Introduction of the ``communication-recovery`` procedure was
-motivated by issues which may appear when two servers remain online
-but the communication between them remains interrupted for a
-period of time. In earlier Kea versions, the servers having communication
-issues used to drop DHCP packets before transitioning to the
-``partner-down`` state. In some cases they both transitioned to the
-``partner-down`` state, which could potentially result in allocations
-of the same IP addresses or delegated prefixes to different clients
-by the respective servers. By entering the intermediate ``communication-recovery``
-state, these problems are avoided.
-
-If a server in the ``communication-recovery`` state re-establishes
-communication with its partner, it tries to send the partner all
-of the outstanding lease updates it has queued. This is done
-synchronously and may take a considerable amount of time before the server
-transitions to the ``load-balancing`` state and resumes normal operation.
+The values of ``max-ack-delay`` and ``max-unacked-clients`` must be selected
+carefully, taking into account the specifics of the network in which the DHCP
+servers are operating. The server in question may not respond to some DHCP
+clients following administrative policy, or the server may drop malformed
+queries from clients. Therefore, selecting too low a value for the
+``max-unacked-clients`` parameter may result in a transition to the
+``partner-down`` state even though the partner is still operating. On the other
+hand, selecting too high a value may result in never transitioning to the
+``partner-down`` state if the DHCP traffic in the network is very low (e.g. at
+night), because the number of distinct clients trying to communicate with the
+server could be lower than the ``max-unacked-clients`` setting.
+
+In some cases it may be useful to disable the failure-detection mechanism
+altogether, if the servers are located very close to each other and network
+partitioning is unlikely, i.e. failure to respond to heartbeats is only possible
+when the partner is offline. In such cases, set ``max-unacked-clients`` to 0.
+
+The ``delayed-updates-limit`` parameter is used to enable or disable the
+``communication-recovery`` procedure, and controls the server's behavior in the
+``communication-recovery`` state. This parameter can only be used in the
+``load-balancing`` mode.
+
+If a server in the ``load-balancing`` state experiences communication issues
+with its partner (a heartbeat or lease update fail), the server transitions to
+the ``communication-recovery`` state. In this state, the server keeps responding
+to DHCP queries but does not send lease updates to the partner. The lease
+updates are queued until communication is re-established, to ensure that DHCP
+service remains available even in the event of the communication loss between
+the partners. There may appear to be communication loss when either one of the
+servers has terminated, or when both servers remain available but cannot
+communicate with each other. In the former case, the surviving server will
+follow the normal procedure and should eventually transition to the
+``partner-down`` state. In the latter case, both servers should transition to
+the ``communication-recovery`` state and should never transition to the
+``partner-down`` state (if ``max-unacked-clients`` is set to a non-zero value),
+because all DHCP queries are answered and neither server would see any unacked
+DHCP queries.
+
+Introduction of the ``communication-recovery`` procedure was motivated by issues
+which may appear when two servers remain online but the communication between
+them remains interrupted for a period of time. In earlier Kea versions, the
+servers having communication issues used to drop DHCP packets before
+transitioning to the ``partner-down`` state. In some cases they both
+transitioned to the ``partner-down`` state, which could potentially result in
+allocations of the same IP addresses or delegated prefixes to different clients
+by the respective servers. By entering the intermediate
+``communication-recovery`` state, these problems are avoided.
+
+If a server in the ``communication-recovery`` state re-establishes communication
+with its partner, it tries to send the partner all of the outstanding lease
+updates it has queued. This is done synchronously and may take a considerable
+amount of time before the server transitions to the ``load-balancing`` state and
+resumes normal operation.
The maximum number of lease updates which can be queued in the
``communication-recovery`` state is controlled by ``delayed-updates-limit``.
-If the limit is exceeded, the server stops queuing lease updates and
-performs a full database synchronization after re-establishing the
-connection with the partner, instead of sending outstanding lease updates
-before transitioning to the ``load-balancing`` state. Even if the limit is
-exceeded, the server in the ``communication-recovery`` state remains
-responsive to DHCP clients.
+If the limit is exceeded, the server stops queuing lease updates and performs a
+full database synchronization after re-establishing the connection with the
+partner, instead of sending outstanding lease updates before transitioning to
+the ``load-balancing`` state. Even if the limit is exceeded, the server in the
+``communication-recovery`` state remains responsive to DHCP clients.
It may be preferable to set higher values of ``delayed-updates-limit`` when
-there is a risk of prolonged communication interruption between the
-servers and when the lease database is large, to avoid costly
-lease-database synchronization. On the other hand, if the lease
-database is small, the time required to send outstanding lease updates
-may be longer than the lease-database synchronization. In such cases it
-may be better to use a lower value, e.g. 10. The default value of 100
-is a reasonable compromise and should work well in
-most deployments with moderate traffic.
+there is a risk of prolonged communication interruption between the servers and
+when the lease database is large, to avoid costly lease-database synchronization.
+On the other hand, if the lease database is small, the time required to send
+outstanding lease updates may be longer than the lease-database synchronization.
+In such cases it may be better to use a lower value, e.g. 10. The default value
+of 100 is a reasonable compromise and should work well in most deployments with
+moderate traffic.
.. note::
This parameter is new and values for it that work well in some environments
- may not work well in others. Feedback from users will help us build a
- better working set of recommendations.
+ may not work well in others. Feedback from users will help us build a better
+ working set of recommendations.
The ``peers`` parameter contains a list of servers within this HA setup.
-This configuration must contain at least one primary and one secondary
-server. It may also contain an unlimited number of backup servers. In
-this example, there is one backup server which receives lease updates
-from the active servers.
+This configuration must contain at least one primary and one secondary server.
+It may also contain an unlimited number of backup servers. In this example,
+there is one backup server which receives lease updates from the active servers.
Since Kea version 1.9.0, basic HTTP authentication is available
to protect the Kea control agent against local attackers.
- ``name`` - specifies a unique name for the server.
-- ``url`` - specifies the URL to be used to contact this server over
- the control channel. Other servers use this URL to send control
- commands to that server.
+- ``url`` - specifies the URL to be used to contact this server over the
+ control channel. Other servers use this URL to send control commands to that
+ server.
-- ``basic-auth-user`` - specifies the user ID for basic HTTP
- authentication. If not specified or specified as an empty string,
- no authentication header will be added to HTTP transactions.
- It must not contain the colon (:) character.
+- ``basic-auth-user`` - specifies the user ID for basic HTTP authentication. If
+ not specified or specified as an empty string, no authentication header will
+ be added to HTTP transactions. It must not contain the colon (:) character.
- ``basic-auth-password`` - specifies the password for basic HTTP
- authentication. This parameter is ignored when the user ID is not specified or is empty.
- The password is optional; if not specified, an empty password is used.
+ authentication. This parameter is ignored when the user ID is not specified
+ or is empty. The password is optional; if not specified, an empty password is
+ used.
- ``basic-auth-password-file`` - is an alternative to ``basic-auth-password``:
- instead of presenting the password in the configuration file it is
- specified in the file indicated by this parameter.
+ instead of presenting the password in the configuration file it is specified
+ in the file indicated by this parameter.
-- ``role`` - denotes the role of the server in the HA setup. The
- following roles are supported in the load-balancing configuration:
- ``primary``, ``secondary``, and ``backup``. There must be exactly one
- primary and one secondary server in the load-balancing setup.
+- ``role`` - denotes the role of the server in the HA setup. The following
+ roles are supported in the load-balancing configuration: ``primary``,
+ ``secondary``, and ``backup``. There must be exactly one primary and one
+ secondary server in the load-balancing setup.
-- ``auto-failover`` - a boolean value which denotes whether a server
- detecting a partner's failure should automatically start serving the
- partner's clients. The default value of this parameter is ``true``.
+- ``auto-failover`` - a boolean value which denotes whether a server detecting
+ a partner's failure should automatically start serving the partner's clients.
+ The default value of this parameter is ``true``.
-In our example configuration above, both active servers can allocate leases
-from the subnet "192.0.3.0/24". This subnet contains two address pools:
+In our example configuration above, both active servers can allocate leases from
+the subnet "192.0.3.0/24". This subnet contains two address pools:
"192.0.3.100 - 192.0.3.150" and "192.0.3.200 - 192.0.3.250", which are
-associated with HA server scopes using client classification. When
-``server1`` processes a DHCP query, it uses the first pool for lease
-allocation. Conversely, when ``server2`` processes a DHCP query it uses
-the second pool. If either of the servers is in the ``partner-down``
-state, the other can serve leases from both pools; it selects the pool which
-is appropriate for the received query. In other words, if the query
-would normally be processed by ``server2`` but this server is not
-available, ``server1`` allocates the lease from the pool of
-"192.0.3.200 - 192.0.3.250". The Kea control agent in front of
-``server3`` requires basic HTTP authentication, and authorizes the
-user ID "foo" with the password "bar".
+associated with HA server scopes using client classification. When ``server1``
+processes a DHCP query, it uses the first pool for lease allocation. Conversely,
+when ``server2`` processes a DHCP query it uses the second pool. If either of
+the servers is in the ``partner-down`` state, the other can serve leases from
+both pools; it selects the pool which is appropriate for the received query. In
+other words, if the query would normally be processed by ``server2`` but this
+server is not available, ``server1`` allocates the lease from the pool of
+"192.0.3.200 - 192.0.3.250". The Kea control agent in front of ``server3``
+requires basic HTTP authentication, and authorizes the user ID "foo" with the
+password "bar".
.. note::
- The ``url`` schema can be ``http`` or ``https``, but since Kea
- version 1.9.6 the ``https`` schema requires a TLS setup.
- The hostname part must be an IPv4 address or an IPv6 address between square
- brackets, e.g. ``http://[2001:db8::1]:8080/``. Names are not
- accepted.
+ The ``url`` schema can be ``http`` or ``https``, but since Kea version 1.9.6
+ the ``https`` schema requires a TLS setup. The hostname part must be an IPv4
+ address or an IPv6 address between square brackets, e.g.
+ ``http://[2001:db8::1]:8080/``. Names are not accepted.
.. _ha-load-balancing-advanced-config:
Load Balancing with Advanced Classification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In the previous section, we provided an example of a load-balancing
-configuration with client classification limited to the "HA_server1"
-and "HA_server2" classes, which are dynamically assigned to the
-received DHCP queries. In many cases, HA is needed in deployments
-which already use some other client classification.
-
-Suppose there is a system which classifies devices into two groups:
-phones and laptops, based on some classification criteria specified in the
-Kea configuration file. Both types of devices are allocated leases from
-different address pools. Introducing HA in load-balancing mode
-results in a further split of each of those pools, as each server
-allocates leases for some phones and some laptops. This requires each of
-the existing pools to be split between "HA_server1" and
-"HA_server2", so we end up with the following classes:
+In the previous section, we provided an example of a ``load-balancing``
+configuration with client classification limited to the "HA_server1" and
+"HA_server2" classes, which are dynamically assigned to the received DHCP
+queries. In many cases, HA is needed in deployments which already use some other
+client classification.
+
+Suppose there is a system which classifies devices into two groups: "phones" and
+"laptops", based on some classification criteria specified in the Kea
+configuration file. Both types of devices are allocated leases from different
+address pools. Introducing HA in ``load-balancing`` mode results in a further
+split of each of those pools, as each server allocates leases for some phones
+and some laptops. This requires each of the existing pools to be split between
+"HA_server1" and "HA_server2", so we end up with the following classes:
- "phones_server1"
- "laptops_server1"
- "phones_server2"
- "laptops_server2"
-The corresponding server configuration, using advanced classification
-(and the ``member`` expression), is provided below. For brevity's sake, the
-HA hook library configuration has been removed from this example.
+The corresponding server configuration, using advanced classification (and the
+``member`` expression), is provided below. For brevity's sake, the HA hook
+library configuration has been removed from this example.
::
}],
}
-The configuration provided above splits the address range into four
-pools: two pools dedicated to "HA_server1" and two to "HA_server2". Each server
-can assign leases to both phones and laptops. Both groups of devices are
-assigned addresses from different pools. The "HA_server1" and
-"HA_server2" classes are built-in (see
-:ref:`classification-using-vendor`) and do not need to be declared.
+The configuration provided above splits the address range into four pools: two
+pools dedicated to "HA_server1" and two to "HA_server2". Each server can assign
+leases to both phones and laptops. Both groups of devices are assigned addresses
+from different pools. The "HA_server1" and "HA_server2" classes are built-in
+(see :ref:`classification-using-vendor`) and do not need to be declared.
They are assigned dynamically by the HA hook library as a result of the
-load-balancing algorithm. "phones_*" and "laptop_*" evaluate to
-``true`` when the query belongs to a given combination of other classes,
-e.g. "HA_server1" and "phones". The pool is selected accordingly as
-a result of such an evaluation.
+``load-balancing`` algorithm. "phones_*" and "laptop_*" evaluate to ``true``
+when the query belongs to a given combination of other classes, e.g. "HA_server1"
+and "phones". The pool is selected accordingly as a result of such an evaluation.
-Consult :ref:`classify` for details on how to use the ``member``
-expression and class dependencies.
+Consult :ref:`classify` for details on how to use the ``member`` expression and
+class dependencies.
.. _ha-hot-standby-config:
~~~~~~~~~~~~~~~~~~~~~~~~~
The following is an example configuration of the primary server in a
-hot-standby configuration:
+``hot-standby`` configuration:
::
}]
}
-This configuration is very similar to the load-balancing configuration
-described in :ref:`ha-load-balancing-config`, with a few notable
-differences.
+This configuration is very similar to the load-balancing configuration described
+in :ref:`ha-load-balancing-config`, with a few notable differences.
-The ``mode`` is now set to ``hot-standby``, in which only one server
-responds to DHCP clients. If the primary server is online, it responds
-to all DHCP queries. The ``standby`` server takes over all DHCP traffic
-only if it discovers that the primary is unavailable.
+The ``mode`` is now set to ``hot-standby``, in which only one server responds to
+DHCP clients. If the primary server is online, it responds to all DHCP queries.
+The ``standby`` server takes over all DHCP traffic only if it discovers that the
+primary is unavailable.
-In this mode, the non-primary active server is called ``standby`` and
-that is its role.
+In this mode, the non-primary active server is called ``standby`` and that is
+its role.
Finally, because there is always only one server responding to DHCP queries,
-there is only one scope - "HA_server1" - in use within pool
-definitions. In fact, the ``client-class`` parameter could be removed
-from this configuration without harm, because there can be no conflicts
-in lease allocations by different servers as they do not allocate leases
-concurrently. The ``client-class`` remains in this example mostly for
-demonstration purposes, to highlight the differences between the
-hot-standby and load-balancing modes of operation.
+there is only one scope - "HA_server1" - in use within pool definitions. In fact,
+the ``client-class`` parameter could be removed from this configuration without
+harm, because there can be no conflicts in lease allocations by different
+servers as they do not allocate leases concurrently. The ``client-class``
+remains in this example mostly for demonstration purposes, to highlight the
+differences between the ``hot-standby`` and ``load-balancing`` modes of
+operation.
.. _ha-passive-backup-config:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following is an example configuration file for the primary server in a
-passive-backup configuration:
+``passive-backup`` configuration:
::
}]
}
-The configurations of three peers are included: one for the primary and
-two for the backup servers.
+The configurations of three peers are included: one for the primary and two for
+the backup servers.
-Many of the parameters present in the load-balancing
-and hot-standby configuration examples are not relevant in the passive-backup
-mode, thus they are not specified here. For example: ``heartbeat-delay``,
+Many of the parameters present in the ``load-balancing`` and ``hot-standby``
+configuration examples are not relevant in the ``passive-backup`` mode, thus
+they are not specified here. For example: ``heartbeat-delay``,
``max-unacked-clients``, and others related to the automatic failover mechanism
-should not be specified in the passive-backup mode.
-
-``wait-backup-ack``
-is a boolean parameter not present in previous examples. It defaults to ``false`` and
-must not be modified in the load-balancing and hot-standby modes. In the passive-backup
-mode this parameter can be set to ``true``, which causes the primary server to expect
-acknowledgments to the lease updates from the backup servers prior to responding
-to the DHCP client. It ensures that the lease has propagated to all servers before
-the client is given the lease, but it poses a risk of losing a DHCP service if
-there is a communication problem with one of the backup servers. This setting
-also increases the latency of the DHCP response, because of the time that the
-primary spends waiting for the acknowledgements. We recommend that the
-``wait-backup-ack`` setting be left at its default value (``false``) if the DHCP service reliability
-is more important than consistency of the lease information between the
-primary and the backups, and in all cases when the DHCP service latency should
-be minimal.
+should not be specified in the ``passive-backup`` mode.
+
+The ``wait-backup-ack`` is a boolean parameter not present in previous examples.
+It defaults to ``false`` and must not be modified in the ``load-balancing`` and
+``hot-standby`` modes. In the ``passive-backup`` mode this parameter can be set
+to ``true``, which causes the primary server to expect acknowledgments to the
+lease updates from the backup servers prior to responding to the DHCP client. It
+ensures that the lease has propagated to all servers before the client is given
+the lease, but it poses a risk of losing a DHCP service if there is a
+communication problem with one of the backup servers. This setting also
+increases the latency of the DHCP response, because of the time that the primary
+spends waiting for the acknowledgements. We recommend that the
+``wait-backup-ack`` setting be left at its default value (``false``) if the DHCP
+service reliability is more important than consistency of the lease information
+between the primary and the backups, and in all cases when the DHCP service
+latency should be minimal.
.. note::
- Currently, active servers place lease updates to be sent to peers onto internal
- queues (one queue per peer/URL). In passive-backup mode, active servers do not
- wait for lease updates to be acknowledged; thus during times of heavy client
- traffic it is possible for the number of lease updates queued for transmission
- to accumulate faster than they can be delivered. As client traffic lessens the
- queues begin to empty. Since Kea 2.0.0, active servers monitor the size of
- these queues and emit periodic warnings (see HTTP_CILENT_QUEUE_SIZE_GROWING
- in :ref:`kea-messages`)
- if they perceive a queue as growing too quickly. The warnings cease once
- the queue size begins to shrink. These messages are intended as a bellwether
- and seeing them sporadically during times of heavy traffic load does not
- necessarily indicate a problem. If, however, they occur continually during
- times of routine traffic load, they likely indicate potential mismatches in
- server capabilities and/or configuration; this should be investigated, as
- the size of the queues may eventually impair an active server's ability to
- respond to clients in a timely manner.
+ Currently, active servers place lease updates to be sent to peers onto
+ internal queues (one queue per peer/URL). In ``passive-backup`` mode, active
+ servers do not wait for lease updates to be acknowledged; thus during times
+ of heavy client traffic it is possible for the number of lease updates queued
+ for transmission to accumulate faster than they can be delivered. As client
+ traffic lessens the queues begin to empty. Since Kea 2.0.0, active servers
+ monitor the size of these queues and emit periodic warnings (see
+ HTTP_CILENT_QUEUE_SIZE_GROWING in :ref:`kea-messages`) if they perceive a
+ queue as growing too quickly. The warnings cease once the queue size begins
+ to shrink. These messages are intended as a bellwether and seeing them
+ sporadically during times of heavy traffic load does not necessarily indicate
+ a problem. If, however, they occur continually during times of routine
+ traffic load, they likely indicate potential mismatches in server
+ capabilities and/or configuration; this should be investigated, as the size
+ of the queues may eventually impair an active server's ability to respond to
+ clients in a timely manner.
.. _ha-sharing-lease-info:
Lease Information Sharing
~~~~~~~~~~~~~~~~~~~~~~~~~
-An HA-enabled server informs its active partner about allocated or
-renewed leases by sending appropriate control commands, and the partner
-updates the lease information in its own database. When the server
-starts up for the first time or recovers after a failure, it
-synchronizes its lease database with its partner. These two mechanisms
-guarantee consistency of the lease information between the servers and
-allow the designation of one of the servers to handle the entire DHCP
-traffic load if the other server becomes unavailable.
-
-In some cases, though, it is desirable to disable lease updates and/or
-database synchronization between the active servers, if the exchange of
-information about the allocated leases is performed using some other
-mechanism. Kea supports various database types that can be used to store
-leases, including MySQL and PostgreSQL. Those databases include built-in
-solutions for data replication which are often used by Kea administrators
-to provide redundancy.
-
-The HA hook library supports such scenarios by disabling lease updates
-over the control channel and/or lease-database synchronization, leaving
-the server to rely on the database replication mechanism. This is
-controlled by the two boolean parameters ``send-lease-updates`` and
-``sync-leases``, whose values default to ``true``:
+An HA-enabled server informs its active partner about allocated or renewed
+leases by sending appropriate control commands, and the partner updates the
+lease information in its own database. When the server starts up for the first
+time or recovers after a failure, it synchronizes its lease database with its
+partner. These two mechanisms guarantee consistency of the lease information
+between the servers and allow the designation of one of the servers to handle
+the entire DHCP traffic load if the other server becomes unavailable.
+
+In some cases, though, it is desirable to disable lease updates and/or database
+synchronization between the active servers, if the exchange of information about
+the allocated leases is performed using some other mechanism. Kea supports
+various database types that can be used to store leases, including MySQL and
+PostgreSQL. Those databases include built-in solutions for data replication
+which are often used by Kea administrators to provide redundancy.
+
+The HA hook library supports such scenarios by disabling lease updates over the
+control channel and/or lease-database synchronization, leaving the server to
+rely on the database replication mechanism. This is controlled by the two
+boolean parameters ``send-lease-updates`` and ``sync-leases``, whose values
+default to ``true``:
::
}
-In the most typical use case, both parameters are set to the same value,
-i.e. both are ``false`` if database replication is in use, or both are
-``true`` otherwise. Introducing two separate parameters to control lease
-updates and lease-database synchronization is aimed at possible special
-use cases; for example, when synchronization is performed by copying a
-lease file (therefore ``sync-leases`` is set to ``false``), but lease
-updates should be conducted as usual (``send-lease-updates`` is set to
-``true``). It should be noted that Kea does not natively support such
-use cases, but users may develop their own scripts and tools around Kea
-to provide such mechanisms. The HA hook library configuration is
-designed to maximize flexibility of administration.
+In the most typical use case, both parameters are set to the same value, i.e.
+both are ``false`` if database replication is in use, or both are ``true``
+otherwise. Introducing two separate parameters to control lease updates and
+lease-database synchronization is aimed at possible special use cases; for
+example, when synchronization is performed by copying a lease file (therefore
+``sync-leases`` is set to ``false``), but lease updates should be conducted as
+usual (``send-lease-updates`` is set to ``true``). It should be noted that Kea
+does not natively support such use cases, but users may develop their own
+scripts and tools around Kea to provide such mechanisms. The HA hook library
+configuration is designed to maximize flexibility of administration.
.. _ha-syncing-page-limit:
Controlling Lease-Page Size Limit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-An HA-enabled server initiates synchronization of the lease database
-after downtime or upon receiving the ``ha-sync`` command. The server
-uses commands described in :ref:`command-lease4-get-page` and
-:ref:`command-lease6-get-page` to fetch
-leases from its partner server (lease queries). The size of the results
-page (the maximum number of leases to be returned in a single response
-to one of these commands) can be controlled via configuration of the HA hook
-library. Increasing the page size decreases the number of lease
-queries sent to the partner server, but it causes the partner server to
-generate larger responses, which lengthens transmission time as well as
-increases memory and CPU utilization on both servers. Decreasing the
-page size helps to decrease resource utilization, but requires more
-lease queries to be issued to fetch the entire lease database.
-
-The default value of the ``sync-page-limit`` command controlling the
-page size is 10000. This means that the entire lease database can be
-fetched with a single command if the size of the database is equal to or
-less than 10000 lines.
+An HA-enabled server initiates synchronization of the lease database after
+downtime or upon receiving the ``ha-sync`` command. The server uses commands
+described in :ref:`command-lease4-get-page` and :ref:`command-lease6-get-page`
+to fetch leases from its partner server (lease queries). The size of the results
+page (the maximum number of leases to be returned in a single response to one of
+these commands) can be controlled via configuration of the HA hook library.
+Increasing the page size decreases the number of lease queries sent to the
+partner server, but it causes the partner server to generate larger responses,
+which lengthens transmission time as well as increases memory and CPU
+utilization on both servers. Decreasing the page size helps to decrease resource
+utilization, but requires more lease queries to be issued to fetch the entire
+lease database.
+
+The default value of the ``sync-page-limit`` command controlling the page size
+is 10000. This means that the entire lease database can be fetched with a single
+command if the size of the database is equal to or less than 10000 lines.
.. _ha-syncing-timeouts:
~~~~~~~~
In deployments with a large number of clients connected to the network,
-lease-database synchronization after a server failure may be a
-time-consuming operation. The synchronizing server must gather all
-leases from its partner, which yields a large response over the RESTful
-interface. The server receives leases using the paging mechanism
-described in :ref:`ha-syncing-page-limit`. Before the page of leases is fetched,
-the synchronizing server sends a ``dhcp-disable`` command to disable the
-DHCP service on the partner server. If the service is already disabled,
-this command resets the timeout for the DHCP service being disabled,
-which by default is set to 60 seconds. If fetching a single
-page of leases takes longer than the specified time, the partner server
-assumes that the synchronizing server has died and resumes its DHCP
-service. The connection of the synchronizing server with its partner is
-also protected by the timeout. If the synchronization of a single page
-of leases takes longer than the specified time, the synchronizing server
-terminates the connection and the synchronization fails. Both timeout
-values are controlled by a single configuration parameter,
-``sync-timeout``. The following configuration snippet demonstrates how
-to modify the timeout for automatic re-enabling of the DHCP service on
-the partner server and how to increase the timeout for fetching a single
-page of leases from 60 seconds to 90 seconds:
+lease-database synchronization after a server failure may be a time-consuming
+operation. The synchronizing server must gather all leases from its partner,
+which yields a large response over the RESTful interface. The server receives
+leases using the paging mechanism described in :ref:`ha-syncing-page-limit`.
+Before the page of leases is fetched, the synchronizing server sends a
+``dhcp-disable`` command to disable the DHCP service on the partner server. If
+the service is already disabled, this command resets the timeout for the DHCP
+service being disabled, which by default is set to 60 seconds. If fetching a
+single page of leases takes longer than the specified time, the partner server
+assumes that the synchronizing server has died and resumes its DHCP service. The
+connection of the synchronizing server with its partner is also protected by the
+timeout. If the synchronization of a single page of leases takes longer than the
+specified time, the synchronizing server terminates the connection and the
+synchronization fails. Both timeout values are controlled by a single
+configuration parameter, ``sync-timeout``. The following configuration snippet
+demonstrates how to modify the timeout for automatic re-enabling of the DHCP
+service on the partner server and how to increase the timeout for fetching a
+single page of leases from 60 seconds to 90 seconds:
::
}
-It is important to note that extending this ``sync-timeout`` value may
-sometimes be insufficient to prevent issues with timeouts during
-lease-database synchronization. The control commands travel via the
-Control Agent, which also monitors incoming (with a synchronizing
-server) and outgoing (with a DHCP server) connections for timeouts. The
-DHCP server also monitors the connection from the Control Agent for
-timeouts. Those timeouts cannot currently be modified via configuration;
-extending these timeouts is only possible by modifying them in the Kea
-code and recompiling the server. The relevant constants are located in
+It is important to note that extending this ``sync-timeout`` value may sometimes
+be insufficient to prevent issues with timeouts during lease-database
+synchronization. The control commands travel via the Control Agent, which also
+monitors incoming (with a synchronizing server) and outgoing (with a DHCP server)
+connections for timeouts. The DHCP server also monitors the connection from the
+Control Agent for timeouts. Those timeouts cannot currently be modified via
+configuration; extending these timeouts is only possible by modifying them in
+the Kea code and recompiling the server. The relevant constants are located in
the Kea source at: ``src/lib/config/timeouts.h``.
.. _ha-pause-state-machine:
Pausing the HA State Machine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The ``high-availability`` state machine includes many different states
-described in detail in :ref:`ha-server-states`. The server
-enters each state when certain conditions are met, most often taking
-into account the partner server's state. In some states the server
-performs specific actions, e.g. synchronization of the lease database in
-the ``syncing`` state, or responding to DHCP queries according to the
-configured mode of operation in the ``load-balancing`` and
-``hot-standby`` states.
-
-By default, transitions between the states are performed automatically
-and the server administrator has no direct control over when the transitions
-take place; in most cases, the administrator does not need such control.
-In some situations, however, the administrator may want to "pause" the
-HA state machine in a selected state to perform some additional
-administrative actions before the server transitions to the next state.
-
-Consider a server failure which results in the loss of the entire lease
-database. Typically, the server rebuilds its lease database when it
-enters the ``syncing`` state by querying the partner server for leases,
-but it is possible that the partner was also experiencing a failure and
-lacks lease information. In this case, it may be required to reconstruct
-lease databases on both servers from some external source, e.g. a backup
-server. If the lease database is to be reconstructed via the RESTful API,
-the servers should be started in the initial, i.e. ``waiting``, state
-and remain in this state while leases are being added. In particular,
-the servers should not attempt to synchronize their lease databases nor
-start serving DHCP clients.
-
-The HA hook library provides configuration parameters and a command to
-control pausing and resuming the HA state machine. The
-following configuration causes the HA state machine to pause in the
-``waiting`` state after server startup.
+The ``high-availability`` state machine includes many different states described
+in detail in :ref:`ha-server-states`. The server enters each state when certain
+conditions are met, most often taking into account the partner server's state.
+In some states the server performs specific actions, e.g. synchronization of the
+lease database in the ``syncing`` state, or responding to DHCP queries according
+to the configured mode of operation in the ``load-balancing`` and ``hot-standby``
+states.
+
+By default, transitions between the states are performed automatically and the
+server administrator has no direct control over when the transitions take place;
+in most cases, the administrator does not need such control. In some situations,
+however, the administrator may want to "pause" the HA state machine in a
+selected state to perform some additional administrative actions before the
+server transitions to the next state.
+
+Consider a server failure which results in the loss of the entire lease database.
+Typically, the server rebuilds its lease database when it enters the ``syncing``
+state by querying the partner server for leases, but it is possible that the
+partner was also experiencing a failure and lacks lease information. In this
+case, it may be required to reconstruct lease databases on both servers from
+some external source, e.g. a backup server. If the lease database is to be
+reconstructed via the RESTful API, the servers should be started in the initial,
+i.e. ``waiting``, state and remain in this state while leases are being added.
+In particular, the servers should not attempt to synchronize their lease
+databases nor start serving DHCP clients.
+
+The HA hook library provides configuration parameters and a command to control
+pausing and resuming the HA state machine. The following configuration causes
+the HA state machine to pause in the ``waiting`` state after server startup.
::
}
-The ``pause`` parameter value ``once`` denotes that the state machine
-should be paused upon the first transition to the ``waiting`` state;
-later transitions to this state will not cause the state machine to
-pause. Two other supported values of the ``pause`` parameter are
-``always`` and ``never``. The latter is the default value for each
-state, which instructs the server never to pause the state machine.
+The ``pause`` parameter value ``once`` denotes that the state machine should be
+paused upon the first transition to the ``waiting`` state; later transitions to
+this state will not cause the state machine to pause. Two other supported values
+of the ``pause`` parameter are ``always`` and ``never``. The latter is the
+default value for each state, which instructs the server never to pause the
+state machine.
-In order to "unpause" the state machine, the ``ha-continue`` command
-must be sent to the paused server. This command does not take any
-arguments. See :ref:`ha-control-commands` for details about commands
-specific to the HA hook library.
+In order to "unpause" the state machine, the ``ha-continue`` command must be
+sent to the paused server. This command does not take any arguments. See
+:ref:`ha-control-commands` for details about commands specific to the HA hook
+library.
-It is possible to configure the state machine to pause in more than one
-state. Consider the following configuration:
+It is possible to configure the state machine to pause in more than one state.
+Consider the following configuration:
::
}
-This configuration instructs the server to pause the state machine every
-time it transitions to the ``ready`` state and upon the first transition
-to the ``partner-down`` state.
+This configuration instructs the server to pause the state machine every time it
+transitions to the ``ready`` state and upon the first transition to the
+``partner-down`` state.
-Refer to :ref:`ha-server-states` for a complete list of
-server states. The state machine can be paused in any of the supported
-states; however, it is not practical to pause in the ``backup`` or
-``terminated`` states because the server never transitions out of these
-states anyway.
+Refer to :ref:`ha-server-states` for a complete list of server states. The state
+machine can be paused in any of the supported states; however, it is not
+practical to pause in the ``backup`` or ``terminated`` states because the server
+never transitions out of these states anyway.
.. note::
- In the ``syncing`` state the server is paused before it makes an
- attempt to synchronize the lease database with a partner. To pause
- the state machine after lease-database synchronization, use the
- ``ready`` state instead.
+ In the ``syncing`` state the server is paused before it makes an attempt to
+ synchronize the lease database with a partner. To pause the state machine
+ after lease-database synchronization, use the ``ready`` state instead.
.. note::
- The state of the HA state machine depends on the state of the
- cooperating server. Therefore,
- pausing the state machine of one server may affect the operation of
- the partner server. For example: if the primary server is paused in
- the ``waiting`` state, the partner server will also remain in the
- ``waiting`` state until the state machine of the primary server is
- resumed and that server transitions to the ``ready`` state.
+ The state of the HA state machine depends on the state of the cooperating
+ server. Therefore, pausing the state machine of one server may affect the
+ operation of the partner server. For example: if the primary server is paused
+ in the ``waiting`` state, the partner server will also remain in the
+ ``waiting`` state until the state machine of the primary server is resumed
+ and that server transitions to the ``ready`` state.
.. _ha-ctrl-agent-config:
Control Agent Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~
-:ref:`kea-ctrl-agent` describes in detail the Kea daemon, which
-provides a RESTful interface to control the Kea servers. The same
-functionality is used by the High Availability hook library to establish
-communication between the HA peers. Therefore, the HA library requires
-that the Control Agent (CA) be started for each DHCP instance within the
-HA setup. If the Control Agent is not started, the peers cannot
-communicate with a particular DHCP server (even if the DHCP
-server itself is online) and may eventually consider this server to be
-offline.
+The :ref:`kea-ctrl-agent` describes in detail the Kea daemon, which provides a
+RESTful interface to control the Kea servers. The same functionality is used by
+the High Availability hook library to establish communication between the HA
+peers. Therefore, the HA library requires that the Control Agent (CA) be started
+for each DHCP instance within the HA setup. If the Control Agent is not started,
+the peers cannot communicate with a particular DHCP server (even if the DHCP
+server itself is online) and may eventually consider this server to be offline.
The following is an example configuration for the CA running on the same
machine as the primary server. This configuration is valid for both the
Multi-Threaded Configuration (HA+MT)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-HA peer communication consists of specialized API commands sent between
-HA peers. Prior to Kea 1.9.7, each peer had to be paired with a local
-instance of ``kea-ctrl-agent`` in order to exchange commands. The agent received
-HA commands via HTTP, communicated via Linux socket with the local peer to
-carry out the command, and then sent the response back to the requesting
-peer via HTTP. To send HA commands, each peer opened its own HTTP client
-connection to the URL of each of its peers.
+HA peer communication consists of specialized API commands sent between HA peers.
+Prior to Kea 1.9.7, each peer had to be paired with a local instance of
+``kea-ctrl-agent`` in order to exchange commands. The agent received HA commands
+via HTTP, communicated via Linux socket with the local peer to carry out the
+command, and then sent the response back to the requesting peer via HTTP. To
+send HA commands, each peer opened its own HTTP client connection to the URL of
+each of its peers.
-In Kea 1.9.7 and newer, it is possible to configure HA to use direct multi-
-threaded communication between peers. We refer to this mode as HA+MT.
+In Kea 1.9.7 and newer, it is possible to configure HA to use direct
+multi-threaded communication between peers. We refer to this mode as HA+MT.
With HA+MT enabled, each peer runs its own dedicated, internal HTTP listener
-(i.e. server) which receives and responds to commands directly, thus
-eliminating the need for an agent to carry out HA protocol between
-peers. In addition, both the listener and client components use multi-
-threading to support multiple, concurrent connections between peers. By
-eliminating the agent and executing multiple command exchanges in parallel,
-HA throughput between peers should improve considerably in most situations.
+(i.e. server) which receives and responds to commands directly, thus eliminating
+the need for an agent to carry out HA protocol between peers. In addition, both
+the listener and client components use multi-threading to support multiple,
+concurrent connections between peers. By eliminating the agent and executing
+multiple command exchanges in parallel, HA throughput between peers should
+improve considerably in most situations.
The following parameters have been added to the HA configuration, to support
HA+MT operation:
-- ``enable-multi-threading`` - enables or disables multi-threading HA
- peer communication (HA+MT). Kea core multi-threading
- must be enabled for HA+MT to operate. When ``false`` (the default),
- the server operates as in earlier versions, relying on ``kea-ctrl-agent`` and using
- single-threaded HTTP client processing.
-
-- ``http-dedicated-listener`` - enables or disables the creation of a
- dedicated, internal HTTP listener through which the server receives HA
- messages from its peers. The internal listener replaces the role of
- ``kea-ctrl-agent`` traffic, allowing peers to send their HA commands directly
- to each other. The listener listens on the peer's ``url``. When
- false (the default), the server relies on ``kea-ctrl-agent``. This parameter
- has been provided largely for flexibility and testing; running HA+MT without
- dedicated listeners enabled will substantially limit HA throughput.
-
-- ``http-listener-threads`` - indicates the maximum number of threads the dedicated listener
- should use. A value of 0 instructs the server to use the same number of threads
- that the Kea core is using for DHCP multi-threading. The default is 0.
-
-- ``http-client-threads`` - indicates the maximum number of threads that should be used
- to send HA messages to its peers. A value of 0 instructs the server to use
- the same number of threads that the Kea core is using for DHCP multi-threading.
+- ``enable-multi-threading`` - enables or disables multi-threading HA peer
+ communication (HA+MT). Kea core multi-threading must be enabled for HA+MT to
+ operate. When ``false`` (the default), the server operates as in earlier
+ versions, relying on ``kea-ctrl-agent`` and using single-threaded HTTP client
+ processing.
+
+- ``http-dedicated-listener`` - enables or disables the creation of a dedicated,
+ internal HTTP listener through which the server receives HA messages from its
+ peers. The internal listener replaces the role of ``kea-ctrl-agent`` traffic,
+ allowing peers to send their HA commands directly to each other. The listener
+ listens on the peer's ``url``. When ``false`` (the default), the server
+ relies on ``kea-ctrl-agent``. This parameter has been provided largely for
+ flexibility and testing; running HA+MT without dedicated listeners enabled
+ will substantially limit HA throughput.
+
+- ``http-listener-threads`` - indicates the maximum number of threads the
+ dedicated listener should use. A value of 0 instructs the server to use the
+ same number of threads that the Kea core is using for DHCP multi-threading.
The default is 0.
+- ``http-client-threads`` - indicates the maximum number of threads that should
+ be used to send HA messages to its peers. A value of 0 instructs the server
+ to use the same number of threads that the Kea core is using for DHCP
+ multi-threading. The default is 0.
+
These parameters are grouped together under a map element, ``multi-threading``,
as illustrated below:
// This is the configuration of this server instance.
{
"name": "server1",
- // This specifies the URL of our server instance. Since the
- // HA+MT uses direct connection, the DHCPv4 server open its own
- // socket. Note it must be different than the one used by the
- // CA (typically 8000). In this example, 8001 is used.
+ // This specifies the URL of our server instance.
+ // Since the HA+MT uses direct connection, the
+ // DHCPv4 server open its own socket. Note it must
+ // be different than the one used by the CA
+ // (typically 8000). In this example, 8001 is used.
"url": "http://192.0.2.1:8001/",
- // This server is primary. The other one must be secondary.
+ // This server is primary. The other one must be
+ // secondary.
"role": "primary"
},
// This is the configuration of our HA peer.
{
"name": "server2",
- // This specifies the URL of our server instance. Since the
- // HA+MT uses direct connection, the DHCPv4 server open its own
- // socket. Note it must be different than the one used by the
- // CA (typically 8000). In this example, 8001 is used.
+ // This specifies the URL of our server instance.
+ // Since the HA+MT uses direct connection, the
+ // DHCPv4 server open its own socket. Note it must
+ // be different than the one used by the CA
+ // (typically 8000). In this example, 8001 is used.
"url": "http://192.0.2.2:8001/",
// The partner is a secondary. Our is primary.
"role": "secondary"
...
-In the example above, HA+MT is enabled with four threads for the listener
-and four threads for the client.
+In the example above, HA+MT is enabled with four threads for the listener and
+four threads for the client.
.. note::
- It is essential to configure the ports correctly. One common mistake
- is to configure CA to listen on port 8000 and also configure dedicated listeners on port 8000.
- In such a configuration, the communication will still work over CA,
+ It is essential to configure the ports correctly. One common mistake is to
+ configure CA to listen on port 8000 and also configure dedicated listeners on
+ port 8000. In such a configuration, the communication will still work over CA,
but it will be slow and the DHCP server will fail to bind sockets.
- Administrators should ensure that dedicated listeners use a different
- port (8001 is a suggested alternative); if ports are misconfigured
- or the ports dedicated to CA are used, the performance bottlenecks
- caused by the single-threaded nature of CA and the sequential nature of
- the UNIX socket that connects CA to DHCP servers will nullify any performance gains offered by HA+MT.
+ Administrators should ensure that dedicated listeners use a different port
+ (8001 is a suggested alternative); if ports are misconfigured or the ports
+ dedicated to CA are used, the performance bottlenecks caused by the
+ single-threaded nature of CA and the sequential nature of the UNIX socket
+ that connects CA to DHCP servers will nullify any performance gains offered
+ by HA+MT.
.. _ha-parked-packet-limit:
lease, it will "park" the client response until the peer acknowledges the lease
update. At that point, the server will "unpark" the response and send it to the
client. This applies to client queries which cause lease changes, such as
-DHCPREQUEST for DHCPv4 and Request, Renew, and Rebind for DHCPv6. It does not apply
-to DHPCDISCOVERs (v4) or Solicits (v6).
+DHCPREQUEST for DHCPv4 and Request, Renew, and Rebind for DHCPv6. It does not
+apply to DHPCDISCOVERs (v4) or Solicits (v6).
There is a global parameter, ``parked-packet-limit``, that may be used to limit
the number of responses that may be parked at any given time. This acts as a
-form of congestion handling and protects the server from being swamped when
-the volume of client queries is outpacing the server's ability to respond. Once
-the limit is reached, the server emits a log and drops any new responses
-until parking spaces are available.
+form of congestion handling and protects the server from being swamped when the
+volume of client queries is outpacing the server's ability to respond. Once the
+limit is reached, the server emits a log and drops any new responses until
+parking spaces are available.
In general, smaller values for the parking lot limit are likely to cause more
drops but with shorter response times. Larger values are likely to result in
.. warning::
- Using too small a value may result in an unnecessarily high drop rate,
- while using too large a value may lead to response times that are
- simply too long to be useful. A value of 0, while allowed, disables the
- limit altogether, but this is highly discouraged as it may lead to Kea servers
- becoming unresponsive to clients. Choosing the best value is very
- site-specific; we recommend users initially leave it at the default value of 256 and observe
+ Using too small a value may result in an unnecessarily high drop rate, while
+ using too large a value may lead to response times that are simply too long
+ to be useful. A value of 0, while allowed, disables the limit altogether, but
+ this is highly discouraged as it may lead to Kea servers becoming
+ unresponsive to clients. Choosing the best value is very site-specific; we
+ recommend users initially leave it at the default value of 256 and observe
how the system behaves over time with varying load conditions.
::
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Having a pair of servers providing High Availability allows for controlled
-shutdown and maintenance of those servers without disrupting the DHCP
-service. For example, an administrator can perform an upgrade of one of
-the servers while the other one continues to respond to DHCP queries.
-When the first server is upgraded and back online, the upgrade can be performed for
-the second server.
-
-A typical problem reported with early versions
-of the High Availability hook library was that the administrator did not
-have direct control over the state of the DHCP server. Shutting down
-one of the servers for maintenance did not necessarily cause the other
-server to start responding to all DHCP queries, because the
-failure-detection algorithm described in :ref:`ha-scope-transition` requires that
-the partner not respond for a configured period of time and,
-depending on the configuration, may also require that a number of DHCP
-requests not be responded to for a specified period of time. The
-maintenance procedure, however, requires that the administrator be able
-to instruct one of the servers to instantly start serving all DHCP clients,
-and the other server to instantly stop serving any DHCP clients, so it
-can be safely shut down.
-
-The maintenance feature of the High Availability hook library addresses
-this situation. The ``ha-maintenance-start`` command was introduced to allow
-the administrator to put the pair of the active servers in states in which
-one of them is responding to all DHCP queries and the other one is awaiting
-shutdown.
-
-Suppose that the HA setup includes two active servers, ``server1``
-and ``server2``, and the latter needs to be shut down for maintenance.
+shutdown and maintenance of those servers without disrupting the DHCP service.
+For example, an administrator can perform an upgrade of one of the servers while
+the other one continues to respond to DHCP queries. When the first server is
+upgraded and back online, the upgrade can be performed for the second server.
+
+A typical problem reported with early versions of the High Availability hook
+library was that the administrator did not have direct control over the state of
+the DHCP server. Shutting down one of the servers for maintenance did not
+necessarily cause the other server to start responding to all DHCP queries,
+because the failure-detection algorithm described in :ref:`ha-scope-transition`
+requires that the partner not respond for a configured period of time and,
+depending on the configuration, may also require that a number of DHCP requests
+not be responded to for a specified period of time. The maintenance procedure,
+however, requires that the administrator be able to instruct one of the servers
+to instantly start serving all DHCP clients, and the other server to instantly
+stop serving any DHCP clients, so it can be safely shut down.
+
+The maintenance feature of the High Availability hook library addresses this
+situation. The ``ha-maintenance-start`` command was introduced to allow the
+administrator to put the pair of the active servers in states in which one of
+them is responding to all DHCP queries and the other one is awaiting shutdown.
+
+Suppose that the HA setup includes two active servers, ``server1`` and
+``server2``, and the latter needs to be shut down for maintenance.
The administrator can send the ``ha-maintenance-start`` command to ``server1``,
-as this is the server which is going to handle the DHCP traffic while the
-other one is offline. ``server1`` responds with an error if its state
-or the partner's state does not allow for a maintenance shutdown: for example,
-if maintenance is not supported for the backup server or if the server is
-in the ``terminated`` state. Also, an error is returned if the ``ha-maintenance-start``
-request was already sent to the other server.
-
-Upon receiving the ``ha-maintenance-start`` command, ``server1``
-sends the ``ha-maintenance-notify`` command to ``server2`` to put it
-in the ``in-maintenance`` state. If ``server2`` confirms, ``server1``
-transitions to the ``partner-in-maintenance`` state. This is similar
-to the ``partner-down`` state, except that in the ``partner-in-maintenance``
-state ``server1`` continues to send lease updates to ``server2`` until
-the administrator shuts down ``server2``. ``server1`` now responds to all
-DHCP queries.
-
-The administrator can now safely shut down ``server2`` in the
-``in-maintenance`` state and perform any necessary maintenance actions. While
-``server2`` is offline, ``server1`` will obviously not be able to communicate
-with its partner, so it will immediately transition to the ``partner-down``
-state; it will continue to respond to all DHCP queries but will
-no longer send lease updates to ``server2``. Restarting ``server2`` after
-the maintenance will trigger normal state negotiation, lease-database
-synchronization, and, ultimately, a transition to the normal ``load-balancing`` or
-``hot-standby`` state. Maintenance can then be performed on ``server1``,
-after sending the ``ha-maintenance-start`` command to ``server2``.
-
-If the ``ha-maintenance-start`` command was sent to the server and the
-server has transitioned to the ``partner-in-maintenance`` state, it is
-possible to transition both it and its partner back to their previous states
-to resume the normal operation of the HA pair. This is achieved by
-sending the ``ha-maintenance-cancel`` command to the server that is
-in the ``partner-in-maintenance`` state. However, if the server has
-already transitioned to the ``partner-down`` state as a result of
-detecting that the partner is offline, canceling the maintenance
-is no longer possible. In that case, it is necessary to restart the other server
-and allowing it to complete its normal state negotiation process.
+as this is the server which is going to handle the DHCP traffic while the other
+one is offline. ``server1`` responds with an error if its state or the partner's
+state does not allow for a maintenance shutdown: for example, if maintenance is
+not supported for the backup server or if the server is in the ``terminated``
+state. Also, an error is returned if the ``ha-maintenance-start`` request was
+already sent to the other server.
+
+Upon receiving the ``ha-maintenance-start`` command, ``server1`` sends the
+``ha-maintenance-notify`` command to ``server2`` to put it in the
+``in-maintenance`` state. If ``server2`` confirms, ``server1`` transitions to
+the ``partner-in-maintenance`` state. This is similar to the ``partner-down``
+state, except that in the ``partner-in-maintenance`` state ``server1`` continues
+to send lease updates to ``server2`` until the administrator shuts down
+``server2``. ``server1`` now responds to all DHCP queries.
+
+The administrator can now safely shut down ``server2`` in the ``in-maintenance``
+state and perform any necessary maintenance actions. While ``server2`` is
+offline, ``server1`` will obviously not be able to communicate with its partner,
+so it will immediately transition to the ``partner-down`` state; it will
+continue to respond to all DHCP queries but will no longer send lease updates to
+``server2``. Restarting ``server2`` after the maintenance will trigger normal
+state negotiation, lease-database synchronization, and, ultimately, a transition
+to the normal ``load-balancing`` or ``hot-standby`` state. Maintenance can then
+be performed on ``server1``, after sending the ``ha-maintenance-start`` command
+to ``server2``.
+
+If the ``ha-maintenance-start`` command was sent to the server and the server
+has transitioned to the ``partner-in-maintenance`` state, it is possible to
+transition both it and its partner back to their previous states to resume the
+normal operation of the HA pair. This is achieved by sending the
+``ha-maintenance-cancel`` command to the server that is in the
+``partner-in-maintenance`` state. However, if the server has already
+transitioned to the ``partner-down`` state as a result of detecting that the
+partner is offline, canceling the maintenance is no longer possible. In that
+case, it is necessary to restart the other server and allowing it to complete
+its normal state negotiation process.
Upgrading from Older HA Versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To upgrade from an older HA hook library to the current version, the
-administrator must shut down one of the servers and rely on the
-failover mechanism to force the online server to transition to the
-``partner-down`` state and start serving all DHCP clients. Once the hook
-library on the first server is upgraded to a current version, the
-``ha-maintenance-start`` command can be used to upgrade the second server.
-
-In such a case, shut down the server running the old version. Next,
-send the ``ha-maintenance-start`` command to the server that has been
-upgraded. This server should
-immediately transition to the ``partner-down`` state as it cannot
-communicate with its offline partner. In the ``partner-down``
-state the first (upgraded) server will respond to all DHCP requests, allowing the
-administrator to perform the upgrade on the second server.
+administrator must shut down one of the servers and rely on the failover
+mechanism to force the online server to transition to the ``partner-down`` state
+and start serving all DHCP clients. Once the hook library on the first server is
+upgraded to a current version, the ``ha-maintenance-start`` command can be used
+to upgrade the second server.
+
+In such a case, shut down the server running the old version. Next, send the
+``ha-maintenance-start`` command to the server that has been upgraded. This
+server should immediately transition to the ``partner-down`` state as it cannot
+communicate with its offline partner. In the ``partner-down`` state the first
+(upgraded) server will respond to all DHCP requests, allowing the administrator
+to perform the upgrade on the second server.
.. note::
- Do not send the ``ha-maintenance-start`` command while the server
- running the old hook library is still online. The server receiving
- this command will return an error.
+ Do not send the ``ha-maintenance-start`` command while the server running the
+ old hook library is still online. The server receiving this command will
+ return an error.
.. _ha-control-commands:
Control Commands for High Availability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Even though the HA hook library is designed to automatically resolve
-issues with DHCP service interruptions by redirecting the DHCP traffic
-to a surviving server and synchronizing the lease database as
-needed, it may be useful for the administrator to have more control
-over both servers' behavior. In particular, it may be useful to be able to
-trigger lease-database synchronization on demand, or
-to manually set the HA scopes that are being served.
+Even though the HA hook library is designed to automatically resolve issues with
+DHCP service interruptions by redirecting the DHCP traffic to a surviving server
+and synchronizing the lease database as needed, it may be useful for the
+administrator to have more control over both servers' behavior. In particular,
+it may be useful to be able to trigger lease-database synchronization on demand,
+or to manually set the HA scopes that are being served.
-The backup server can sometimes be used to handle DHCP traffic
-if both active servers are down. The backup server does not perform the
-failover function automatically; thus, in order to use the backup server
-to respond to DHCP queries, the server administrator must enable this
-function manually.
+The backup server can sometimes be used to handle DHCP traffic if both active
+servers are down. The backup server does not perform the failover function
+automatically; thus, in order to use the backup server to respond to DHCP
+queries, the server administrator must enable this function manually.
-The following sections describe commands supported by the HA hook
-library which are available for the administrator.
+The following sections describe commands supported by the HA hook library which
+are available for the administrator.
.. _command-ha-sync:
The ``ha-sync`` Command
-----------------------
-The ``ha-sync`` command instructs the server to synchronize its local
-lease database with the selected peer. The server fetches all leases
-from the peer and updates any locally stored leases which are older
-than those fetched. It also creates new leases when any of those fetched
-do not exist in the local database. All leases that are not returned by
-the peer but are in the local database are preserved. The database
-synchronization is unidirectional; only the database on the server to
-which the command has been sent is updated. To synchronize the
+The ``ha-sync`` command instructs the server to synchronize its local lease
+database with the selected peer. The server fetches all leases from the peer and
+updates any locally stored leases which are older than those fetched. It also
+creates new leases when any of those fetched do not exist in the local database.
+All leases that are not returned by the peer but are in the local database are
+preserved. The database synchronization is unidirectional; only the database on
+the server to which the command has been sent is updated. To synchronize the
peer's database, a separate ``ha-sync`` command must be issued to that peer.
-Database synchronization may be triggered for both active and backup
-server types. The ``ha-sync`` command has the following structure
-(in a DHCPv4 example):
+Database synchronization may be triggered for both active and backup server
+types. The ``ha-sync`` command has the following structure (in a DHCPv4 example):
::
}
}
-When the server receives this command it first disables the DHCP service
-of the server from which it will be fetching leases, by sending the
-``dhcp-disable`` command to that server. The ``max-period`` parameter
-specifies the maximum duration (in seconds) for which the DHCP service
-should be disabled. If the DHCP service is successfully disabled, the
-synchronizing server fetches leases from the remote server by issuing
-one or more ``lease4-get-page`` commands. When the lease-database
-synchronization is complete, the synchronizing server sends the
+When the server receives this command it first disables the DHCP service of the
+server from which it will be fetching leases, by sending the ``dhcp-disable``
+command to that server. The ``max-period`` parameter specifies the maximum
+duration (in seconds) for which the DHCP service should be disabled. If the DHCP
+service is successfully disabled, the synchronizing server fetches leases from
+the remote server by issuing one or more ``lease4-get-page`` commands. When the
+lease-database synchronization is complete, the synchronizing server sends the
``dhcp-enable`` command to the peer to re-enable its DHCP service.
-The ``max-period`` value should be sufficiently long to guarantee that
-it does not elapse before the synchronization is completed. Otherwise,
-the DHCP server will automatically enable its DHCP function while the
-synchronization is still in progress. If the DHCP server subsequently
-allocates any leases during the synchronization, those new (or updated)
-leases will not be fetched by the synchronizing server, leading to
-database inconsistencies.
+The ``max-period`` value should be sufficiently long to guarantee that it does
+not elapse before the synchronization is completed. Otherwise, the DHCP server
+will automatically enable its DHCP function while the synchronization is still
+in progress. If the DHCP server subsequently allocates any leases during the
+synchronization, those new (or updated) leases will not be fetched by the
+synchronizing server, leading to database inconsistencies.
.. _command-ha-scopes:
The ``ha-scopes`` Command
-------------------------
-This command allows an administrator to modify the HA scopes being
-served. Consult :ref:`ha-load-balancing-config` and
-:ref:`ha-hot-standby-config` to learn which scopes are
-available for the different HA modes of operation. The ``ha-scopes`` command
-has the following structure (in a DHCPv4 example):
+This command allows an administrator to modify the HA scopes being served.
+Consult :ref:`ha-load-balancing-config` and :ref:`ha-hot-standby-config` to
+learn which scopes are available for the different HA modes of operation. The
+``ha-scopes`` command has the following structure (in a DHCPv4 example):
::
}
}
-This command configures the server to handle traffic from both the
-"HA_server1" and "HA_server2" scopes. To disable all scopes
-specify an empty list:
+This command configures the server to handle traffic from both the "HA_server1"
+and "HA_server2" scopes. To disable all scopes specify an empty list:
::
The ``ha-continue`` Command
---------------------------
-This command is used to resume the operation of the paused HA state
-machine, as described in :ref:`ha-pause-state-machine`. It takes no arguments, so the
+This command is used to resume the operation of the paused HA state machine, as
+described in :ref:`ha-pause-state-machine`. It takes no arguments, so the
command structure is simply:
::
The ``ha-heartbeat`` Command
----------------------------
-The :ref:`ha-server-states` section describes how the ``ha-heartbeat`` command is
-used by a pair of active HA servers to detect one partner's failure. This command, however,
-can also be sent by the system administrator to one or both servers to check their
-HA state. This allows a monitoring
-system to be deployed on the HA enabled servers to periodically check whether they are operational
-or whether any manual intervention is required. The ``ha-heartbeat`` command takes no
-arguments:
+The :ref:`ha-server-states` section describes how the ``ha-heartbeat`` command
+is used by a pair of active HA servers to detect one partner's failure. This
+command, however, can also be sent by the system administrator to one or both
+servers to check their HA state. This allows a monitoring system to be deployed
+on the HA enabled servers to periodically check whether they are operational or
+whether any manual intervention is required. The ``ha-heartbeat`` command takes
+no arguments:
::
}
}
-The returned ``state`` value should be one of the values listed in :ref:`ha-server-states`.
-In the example above, the ``partner-down`` state is returned, which indicates that
-the server which responded to the command believes that its partner is offline;
-thus, it is serving all DHCP requests sent to the servers. To ensure that
-the partner is indeed offline, the administrator should send the ``ha-heartbeat``
-command to the second server. If sending the command fails, e.g. due to an inability
-to establish a TCP connection to the Control Agent, or if the Control Agent reports
-issues with communication with the DHCP server, it is very likely that the server
-is not running.
+The returned ``state`` value should be one of the values listed in
+:ref:`ha-server-states`. In the example above, the ``partner-down`` state is
+returned, which indicates that the server which responded to the command
+believes that its partner is offline; thus, it is serving all DHCP requests sent
+to the servers. To ensure that the partner is indeed offline, the administrator
+should send the ``ha-heartbeat`` command to the second server. If sending the
+command fails, e.g. due to an inability to establish a TCP connection to the
+Control Agent, or if the Control Agent reports issues with communication with
+the DHCP server, it is very likely that the server is not running.
The ``date-time`` parameter conveys the server's notion of time.
-The ``unsent-update-count`` value is a cumulative count of all unsent lease updates
-since the server was booted; its value is set to 0 when the server is started.
-It is never reset to 0 during the server's operation, even after the partner
-synchronizes the database. It is incremented by the partner sending the heartbeat
-response when it cannot send the lease update. For example, suppose the failure is a result of a
-temporary communication interruption. In that case, the partner receiving the
-``partner-down`` heartbeat response tracks the value changes and can determine, once communication
-is reestablished, whether there are
-any new lease updates that it did not receive. If the values on both servers do not match,
-it is an indication that the partner should synchronize its lease database.
-A non-zero value itself is not an indication of any present
-issues with lease updates, but a constantly incrementing value is.
+The ``unsent-update-count`` value is a cumulative count of all unsent lease
+updates since the server was booted; its value is set to 0 when the server is
+started. It is never reset to 0 during the server's operation, even after the
+partner synchronizes the database. It is incremented by the partner sending the
+heartbeat response when it cannot send the lease update. For example, suppose
+the failure is a result of a temporary communication interruption. In that case,
+the partner receiving the ``partner-down`` heartbeat response tracks the value
+changes and can determine, once communication is reestablished, whether there
+are any new lease updates that it did not receive. If the values on both servers
+do not match, it is an indication that the partner should synchronize its lease
+database. A non-zero value itself is not an indication of any present issues
+with lease updates, but a constantly incrementing value is.
The typical response returned by one server when both are
operational is:
}
}
-In most cases, the ``ha-heartbeat`` command should be sent to both
-HA-enabled servers to verify the state of the entire HA setup. In particular,
-if one of the servers indicates that it is in the
-``load-balancing`` state, it means that this server is operating as if
-its partner is functional. When a partner goes down, it takes some
-time for the surviving server to realize it. The :ref:`ha-scope-transition`
-section describes the algorithm which the surviving server follows before
-it transitions to the ``partner-down`` state. If the ``ha-heartbeat`` command
-is sent during the time window between the failure of one of the servers and the
-transition of the surviving server to the ``partner-down`` state, the response
-from the surviving server does not reflect the failure. Resending the command
-detects the failure once the surviving server has entered the ``partner-down``
-state.
+In most cases, the ``ha-heartbeat`` command should be sent to both HA-enabled
+servers to verify the state of the entire HA setup. In particular, if one of the
+servers indicates that it is in the ``load-balancing`` state, it means that this
+server is operating as if its partner is functional. When a partner goes down,
+it takes some time for the surviving server to realize it. The
+:ref:`ha-scope-transition` section describes the algorithm which the surviving
+server follows before it transitions to the ``partner-down`` state. If the
+``ha-heartbeat`` command is sent during the time window between the failure of
+one of the servers and the transition of the surviving server to the
+``partner-down`` state, the response from the surviving server does not reflect
+the failure. Resending the command detects the failure once the surviving server
+has entered the ``partner-down`` state.
.. note:
- Always send the ``ha-heartbeat`` command to both active HA servers
- to check the state of the entire HA setup. Sending it to only one of the
- servers may not reflect issues with one of the servers that just began.
+ Always send the ``ha-heartbeat`` command to both active HA servers to check
+ the state of the entire HA setup. Sending it to only one of the servers may
+ not reflect issues with one of the servers that just began.
.. _command-ha-status-get:
--------------------------
``status-get`` is a general-purpose command supported by several Kea daemons,
-not only the DHCP servers. However, when sent to a DHCP server with HA enabled, it
-can be used to get insight into the details of the HA-specific server status.
-Not only does the response contain
-the status information of the server receiving this command, but also the
-information about its partner if it is available.
+not only the DHCP servers. However, when sent to a DHCP server with HA enabled,
+it can be used to get insight into the details of the HA-specific server status.
+Not only does the response contain the status information of the server
+receiving this command, but also the information about its partner if it is
+available.
The following is an example response to the ``status-get`` command, including
the HA status of two load-balancing servers:
}
}
-The ``high-availability`` argument is a list which currently comprises
-only one element.
-
-The ``ha-servers`` map contains two structures: ``local`` and ``remote``. The former
-contains the status information of the server which received the command, while the
-latter contains the status information known to the local server about the
-partner. The ``role`` of the partner server is gathered from the local
-configuration file, and thus should always be available. The remaining
-status information, such as ``last-scopes`` and ``last-state``, is not available
-until the local server communicates with the remote by successfully sending
-the ``ha-heartbeat`` command. If at least one such communication has taken place,
-the returned value of the ``in-touch`` parameter is set to ``true``. By examining
-this value, the command's sender can determine whether the information about
-the remote server is reliable.
+The ``high-availability`` argument is a list which currently comprises only one
+element.
+
+The ``ha-servers`` map contains two structures: ``local`` and ``remote``. The
+former contains the status information of the server which received the command,
+while the latter contains the status information known to the local server about
+the partner. The ``role`` of the partner server is gathered from the local
+configuration file, and thus should always be available. The remaining status
+information, such as ``last-scopes`` and ``last-state``, is not available until
+the local server communicates with the remote by successfully sending the
+``ha-heartbeat`` command. If at least one such communication has taken place,
+the returned value of the ``in-touch`` parameter is set to ``true``. By
+examining this value, the command's sender can determine whether the information
+about the remote server is reliable.
The ``last-scopes`` and ``last-state`` parameters contain information about the
-HA scopes served by the partner and its state. This information
-is gathered during the heartbeat command exchange, so it may not be
-accurate if a communication problem occurs between the partners and this
-status information is not refreshed. In such a case, it may be useful to
-send the ``status-get`` command to the partner server directly to check
-its current state. The ``age`` parameter specifies the age
-of the information from the partner, in seconds.
+HA scopes served by the partner and its state. This information is gathered
+during the heartbeat command exchange, so it may not be accurate if a
+communication problem occurs between the partners and this status information is
+not refreshed. In such a case, it may be useful to send the ``status-get``
+command to the partner server directly to check its current state. The ``age``
+parameter specifies the age of the information from the partner, in seconds.
The ``communication-interrupted`` boolean value indicates whether the server
receiving the ``status-get`` command (the local server) has been unable to
communicate with the partner longer than the duration specified as
-``max-response-delay``. In such a situation, the active servers are
-considered to be in the ``communication-interrupted`` state. At this point,
-the local server may start monitoring
-the DHCP traffic directed to the partner to see if the partner is
-responding to this traffic. More about the failover procedure can be found
-in :ref:`ha-load-balancing-config`.
-
-The ``connecting-clients``, ``unacked-clients``, ``unacked-clients-left``,
-and ``analyzed-packets`` parameters were introduced along with the
-``communication-interrupted`` parameter and they
-convey useful information about the state of the DHCP traffic monitoring
-in the ``communication-interrupted`` state. Once the server leaves the
-``communication-interrupted`` state, these parameters are all reset to 0.
+``max-response-delay``. In such a situation, the active servers are considered
+to be in the ``communication-interrupted`` state. At this point, the local
+server may start monitoring the DHCP traffic directed to the partner to see if
+the partner is responding to this traffic. More about the failover procedure can
+be found in :ref:`ha-load-balancing-config`.
+
+The ``connecting-clients``, ``unacked-clients``, ``unacked-clients-left``, and
+``analyzed-packets`` parameters were introduced along with the
+``communication-interrupted`` parameter and they convey useful information about
+the state of the DHCP traffic monitoring in the ``communication-interrupted``
+state. Once the server leaves the ``communication-interrupted`` state, these
+parameters are all reset to 0.
These parameters have the following meaning in the ``communication-interrupted``
state:
-- ``connecting-clients`` - this is the number of different clients which have attempted
- to get a lease from the remote server. These clients are differentiated by
- their MAC address and client identifier (in DHCPv4) or DUID (in DHCPv6).
- This number includes "unacked" clients (for which the "secs" field or
- "elapsed time" value exceeded the ``max-response-delay``).
-
-- ``unacked-clients`` - this is the number of different clients which have been considered
- "unacked", i.e. the clients which have been trying to get the lease longer
- than the value of the "secs" field, or for which the "elapsed time" exceeded the
- ``max-response-delay`` setting.
-
-- ``unacked-clients-left`` - this indicates the number of additional clients which have to be
- considered "unacked" before the server enters the ``partner-down`` state.
- This value decreases when the ``unacked-clients`` value increases. The
- local server enters the ``partner-down`` state when this value
- decreases to 0.
-
-- ``analyzed-packets`` - this is the total number of packets directed to the partner
- server and analyzed by the local server since entering the communication
- interrupted state. It includes retransmissions from the same clients.
-
-Monitoring these values helps to predict when the local server will
-enter the ``partner-down`` state or to understand why the server has not yet entered this
+- ``connecting-clients`` - this is the number of different clients which have
+ attempted to get a lease from the remote server. These clients are
+ differentiated by their MAC address and client identifier (in DHCPv4) or DUID
+ (in DHCPv6). This number includes "unacked" clients (for which the "secs"
+ field or "elapsed time" value exceeded the ``max-response-delay``).
+
+- ``unacked-clients`` - this is the number of different clients which have been
+ considered "unacked", i.e. the clients which have been trying to get the
+ lease longer than the value of the "secs" field, or for which the
+ "Elapsed Time" exceeded the ``max-response-delay`` setting.
+
+- ``unacked-clients-left`` - this indicates the number of additional clients
+ which have to be considered "unacked" before the server enters the
+ ``partner-down`` state. This value decreases when the ``unacked-clients``
+ value increases. The local server enters the ``partner-down`` state when this
+ value decreases to 0.
+
+- ``analyzed-packets`` - this is the total number of packets directed to the
+ partner server and analyzed by the local server since entering the
+ communication interrupted state. It includes retransmissions from the same
+ clients.
+
+Monitoring these values helps to predict when the local server will enter the
+``partner-down`` state or to understand why the server has not yet entered this
state.
-The ``ha-mode`` parameter returns the HA mode of operation selected using the ``mode`` parameter
-in the configuration file. It can hold one of the following values:
-``load-balancing``, ``hot-standby``, or ``passive-backup``.
+The ``ha-mode`` parameter returns the HA mode of operation selected using the
+``mode`` parameter in the configuration file. It can hold one of the following
+values: ``load-balancing``, ``hot-standby``, or ``passive-backup``.
The ``status-get`` response has the format described above only in the
-``load-balancing`` and ``hot-standby`` modes. In the ``passive-backup``
-mode the ``remote`` map is not included in the response because in this
-mode there is only one active server (local). The response includes no
-information about the status of the backup servers.
+``load-balancing`` and ``hot-standby`` modes. In the ``passive-backup`` mode the
+``remote`` map is not included in the response because in this mode there is
+only one active server (local). The response includes no information about the
+status of the backup servers.
.. _command-ha-maintenance-start:
The ``ha-maintenance-start`` Command
------------------------------------
-This command is used to initiate transition of the server's partner into
-the ``in-maintenance`` state and the transition of the server receiving the
-command into the ``partner-in-maintenance`` state. See the
-:ref:`ha-maintenance` section for details.
+This command is used to initiate transition of the server's partner into the
+``in-maintenance`` state and the transition of the server receiving the command
+into the ``partner-in-maintenance`` state. See the :ref:`ha-maintenance` section
+for details.
::
The ``ha-maintenance-cancel`` Command
-------------------------------------
-This command is used to cancel the maintenance previously initiated using
-the ``ha-maintenance-start`` command. The server receiving this command
-will first send ``ha-maintenance-notify``, with the ``cancel`` flag set
-to ``true``, to its partner. Next, the server reverts from the
-``partner-in-maintenance`` state to its previous state. See the
-:ref:`ha-maintenance` section for details.
+This command is used to cancel the maintenance previously initiated using the
+``ha-maintenance-start`` command. The server receiving this command will first
+send ``ha-maintenance-notify``, with the ``cancel`` flag set to ``true``, to its
+partner. Next, the server reverts from the ``partner-in-maintenance`` state to
+its previous state. See the :ref:`ha-maintenance` section for details.
::
The ``ha-maintenance-notify`` Command
-------------------------------------
-This command is sent by the server receiving the ``ha-maintenance-start``
-or the ``ha-maintenance-cancel`` command to its partner, to cause the
-partner to transition to the ``in-maintenance`` state or to revert from this
-state to a previous state. See the :ref:`ha-maintenance` section for details.
+This command is sent by the server receiving the ``ha-maintenance-start`` or the
+``ha-maintenance-cancel`` command to its partner, to cause the partner to
+transition to the ``in-maintenance`` state or to revert from this state to a
+previous state. See the :ref:`ha-maintenance` section for details.
::
.. warning::
- The ``ha-maintenance-notify`` command is not meant to be used by
- system administrators. It is used for internal communication between
- a pair of HA-enabled DHCP servers. Direct use of this command is not
- supported and may produce unintended consequences.
+ The ``ha-maintenance-notify`` command is not meant to be used by system
+ administrators. It is used for internal communication between a pair of
+ HA-enabled DHCP servers. Direct use of this command is not supported and may
+ produce unintended consequences.
.. _command-ha-reset:
The ``ha-reset`` Command
------------------------
-This command causes the server to reset its High Availability state machine
-by transitioning it to the ``waiting`` state. A partner in the
+This command causes the server to reset its High Availability state machine by
+transitioning it to the ``waiting`` state. A partner in the
``communication-recovery`` state may send this command to cause the server
-to synchronize its lease database. Database synchronization is required
-when the partner has failed to send all lease database updates after
-re-establishing connection after a temporary connection failure. It is also
-required when the ``delayed-updates-limit`` is exceeded, when the server is
-in the ``communication-recovery`` state.
+to synchronize its lease database. Database synchronization is required when the
+partner has failed to send all lease database updates after re-establishing
+connection after a temporary connection failure. It is also required when the
+``delayed-updates-limit`` is exceeded, when the server is in the
+``communication-recovery`` state.
-A server administrator may send the command to reset a misbehaving state
-machine.
+A server administrator may send the command to reset a misbehaving state machine.
This command includes no arguments:
"text": "HA state machine reset."
}
-If the server receiving this command is already in the ``waiting`` state,
-the command has no effect.
+If the server receiving this command is already in the ``waiting`` state, the
+command has no effect.
.. _command-ha-sync-complete-notify:
---------------------------------------
A server sends this command to its partner to signal that it has completed
-lease-database synchronization. The partner may enable its DHCP service if
-it can allocate new leases in its current state. The partner does not enable
-the DHCP service in the ``partner-down`` state until it sends a successful
-heartbeat test to its partner server. If the connection is still
-unavailable, the server in the ``partner-down`` state enables its own DHCP service
-to continue responding to clients.
+lease-database synchronization. The partner may enable its DHCP service if it
+can allocate new leases in its current state. The partner does not enable the
+DHCP service in the ``partner-down`` state until it sends a successful heartbeat
+test to its partner server. If the connection is still unavailable, the server
+in the ``partner-down`` state enables its own DHCP service to continue
+responding to clients.
This command includes no arguments:
.. warning::
- The ``ha-sync-complete-notify`` command is not meant to be used by
- system administrators. It is used for internal communication between
- a pair of HA-enabled DHCP servers. Direct use of this command is not
- supported and may produce unintended consequences.
+ The ``ha-sync-complete-notify`` command is not meant to be used by system
+ administrators. It is used for internal communication between a pair of
+ HA-enabled DHCP servers. Direct use of this command is not supported and may
+ produce unintended consequences.