Author better guidance on HA and HA + backup configurations (in the Kea ARM, as a Wiki piece, or as a KB article)
This is a follow-on from customer questions and requests for advice on setting up and managing resilient Kea HA environments.
See discussions and material in Support ticket #15378 and Support ticket #15334.
Although the Kea Administrator Reference Manual covers how to configure HA as load-balancing or primary/standby (as well as 'and you can add a backup server too'), there is no overall discussion of the different strategies for deploying resilient server configurations: how to test and monitor them, what to consider, and what to actually do in a failure scenario.
It would be excellent to have a document (or paper, or series of KB articles) that covers things like:
- Different resilient configurations
- Pros and Cons of each
- Implications for routers/relays
- Disaster scenarios and recovery from them - including what to do with relays, routers, server addressing, route advertisements (for wheeling in a replacement server at another location)
- How to promote a backup server to primary/sole server if the HA pair 'vanishes'
- How to change HA server states manually (in a controlled manner, with guaranteed state change) for maintenance/upgrades
- How to test your HA environment
- Where/how anycast might fit in
- Where/how load-balancers fit in - and specific considerations for those
- Server environment replication - entire VMs
- Lease file replication
- Multiple Kea servers sharing the same leases back-end
- Multiple Kea servers sharing the same reservations back-end
- All the 'what if?'s
And so on...
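
As a starting point for the "change HA server states manually" and "test your HA environment" items above, here is a minimal Python sketch of what such guidance might include. It only builds the JSON bodies for commands that the Kea HA hook library documents (`ha-heartbeat`, `ha-maintenance-start`, `ha-maintenance-cancel`) and parses a heartbeat reply. The helper names (`build_command`, `parse_heartbeat`, `needs_attention`), the assumption that replies arrive wrapped in a list as the Control Agent returns them, and the choice of which states count as "normal" are illustrative assumptions, not ISC recommendations. How the commands are delivered (URL, authentication, TLS) is site-specific and left out.

```python
import json

# Illustrative monitoring policy (an assumption, not an ISC recommendation):
# states in which an HA pair is operating as designed.
NORMAL_STATES = {"load-balancing", "hot-standby"}


def build_command(name, service="dhcp4", arguments=None):
    """Return the JSON body for a Kea control command sent via the Control Agent."""
    body = {"command": name, "service": [service]}
    if arguments is not None:
        body["arguments"] = arguments
    return json.dumps(body)


def parse_heartbeat(response_text):
    """Extract the HA state from an ha-heartbeat reply.

    Assumes the reply is a JSON list with one entry per service,
    which is how the Control Agent wraps responses.
    """
    results = json.loads(response_text)
    first = results[0]
    if first.get("result") != 0:
        raise RuntimeError(first.get("text", "ha-heartbeat failed"))
    return first["arguments"]["state"]


def needs_attention(state):
    """Return True when the reported state is outside the normal set above."""
    return state not in NORMAL_STATES


if __name__ == "__main__":
    # Body to poll a server's HA state.
    print(build_command("ha-heartbeat"))
    # Body to begin a controlled maintenance window; per the ARM this goes to
    # the server that should keep serving while its partner is taken down.
    print(build_command("ha-maintenance-start"))
    # Body to back out of maintenance if the plan changes.
    print(build_command("ha-maintenance-cancel"))
```

A test plan could poll each server with the `ha-heartbeat` body, feed the reply to `parse_heartbeat`, and alert whenever `needs_attention` returns true for longer than an agreed interval.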