Unattended terminated state and a reboot
Consider the following case. The clocks on two HA-enabled servers diverge and the clock skew eventually exceeds 60 seconds. As a result, both servers transition to the terminated state. In this state, the servers continue serving DHCP clients but exchange neither lease updates nor heartbeats. An administrator neglects to correct the clocks and one of the servers reboots. The server enters the waiting state and remains there, hoping that the other server is restarted so that they can continue the lease database synchronization and start normal operation. However, the server is unaware that its reboot was not triggered in the course of fixing the clocks, so it will wait for the partner endlessly (or until the administrator comes to work in the morning). The waiting server does not respond to DHCP traffic until then.
This situation should not occur in a setup where NTP has been enabled. It also should not occur if proper monitoring detects early enough that the clocks are diverging. However, it can still happen when both of these measures are neglected.
The proposed solution is to apply a timeout (possibly from several minutes up to 10 minutes long) to a server in the waiting state. If its partner does not leave the terminated state before this timeout elapses, the server in the waiting state transitions back to the terminated state and continues serving the clients. The waiting server MUST NOT transition to the terminated state immediately after it detects that its partner is in the terminated state; the delay gives the administrator enough time to reboot the servers sequentially after correcting the clocks.
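The proposed timeout rule can be sketched as a small decision function. This is only an illustration of the idea, not the actual server code: the function name, the state strings, and the 10-minute default are assumptions made for this example, and what happens once the partner leaves the terminated state is deliberately left out.

```python
# Hypothetical state names used only for this sketch.
WAITING = "waiting"
TERMINATED = "terminated"

# Assumed default: the text only suggests "several to 10 minutes".
WAITING_TIMEOUT_SECONDS = 10 * 60


def next_state_while_waiting(partner_state, seconds_in_waiting,
                             timeout=WAITING_TIMEOUT_SECONDS):
    """Decide what a server in the waiting state should do next.

    Returns WAITING or TERMINATED, or None when the partner has left the
    terminated state and the regular startup logic (not modelled here)
    takes over.
    """
    if partner_state != TERMINATED:
        # The partner was restarted; this sketch does not cover what follows.
        return None
    if seconds_in_waiting >= timeout:
        # The partner never left the terminated state: resume serving clients.
        return TERMINATED
    # Do not give up immediately; the administrator may still be
    # rebooting the servers one after another.
    return WAITING
```

With these assumptions, `next_state_while_waiting(TERMINATED, 30)` keeps the server in the waiting state, while the same call with `seconds_in_waiting=600` sends it back to the terminated state so that it serves clients again.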