Disabling high-availability hook leaves "DHCP service is globally disabled" after reconfiguration
Describe the bug
Running "config-set" on Kea to disable HA hook can leave DHCP service in a disabled state, where the server refuses to handle any DHCP requests.
To Reproduce
Steps to reproduce the behavior:
- Start kea-dhcpv4 with HA hook enabled and configured to have primary/secondary role, which will transition it to WAITING state, disabling DHCP service until partner host can be contacted.
- Run "config-set" API command to update Kea configuration with HA hook disabled in that new configuration.
- Try testing DHCP service using any client.
- Verbose Kea logs should report something like "DHCP service is globally disabled".
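For reference, the "config-set" step can be sketched roughly as below. This is a minimal sketch, not the actual script used here: the helper names (strip_ha_hook, build_config_set, send_command) and the socket path are hypothetical. It assumes Kea's unix control socket is enabled and that the server closes the connection after sending its reply.

```python
import copy
import json
import socket


def strip_ha_hook(dhcp4_config):
    """Return a copy of a Dhcp4 config dict with the HA hook library removed."""
    cfg = copy.deepcopy(dhcp4_config)
    cfg["hooks-libraries"] = [
        lib for lib in cfg.get("hooks-libraries", [])
        if "libdhcp_ha" not in lib.get("library", "")
    ]
    return cfg


def build_config_set(dhcp4_config):
    """Wrap a Dhcp4 config dict in a config-set command."""
    return {"command": "config-set", "arguments": {"Dhcp4": dhcp4_config}}


def send_command(socket_path, command):
    """Send a JSON command over the unix control socket, return the parsed reply.

    Assumption: the server closes the connection once the reply is sent,
    so reading until EOF yields the whole response.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(json.dumps(command).encode("utf-8"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))


# Usage (hypothetical socket path, not executed here):
#   new_cfg = strip_ha_hook(current_dhcp4_config)
#   reply = send_command("/run/kea/kea4-ctrl-socket", build_config_set(new_cfg))
```

The only difference between the two configurations is the HA entry in "hooks-libraries", which is what strip_ha_hook removes.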
Expected behavior
- HA hook does not leave DHCP service in a disabled state after unloading.
- DHCP service working normally after reconfiguration that disables HA hook.
Environment:
- Kea version: 1.8.2
- OS: Alpine Linux 3.13
- Built and used with memfile backend.
Additional Information
Running this as a test setup in Docker containers, with a Python script configuring Kea.
Not sure if issue is 100% reproducible, but happened to me at least a couple of times already.
Attached logs:
Full configuration loaded each time should be available in the attached kea-ha-service-disabling.debug-all.log file, in DHCP4_CONFIG_RECEIVED lines. I don't think it has anything special or very relevant, aside from the HA hook being present in one configuration and removed in the next one.
Attached kea-ha-service-disabling.debug-brief.log is a filtered version of the "debug-all" file, which I believe illustrates the issue well without any extra noise. An even shorter gist:
2021-02-06T18:39:01.624 a DEBUG kea-dhcp4.dhcp4 DHCP4_CONFIG_RECEIVED ...
2021-02-06T18:39:01.658 a INFO kea-dhcp4.ha-hooks HA_LOCAL_DHCP_DISABLE local DHCP service is disabled while the a is in the WAITING state
2021-02-06T18:39:01.658 a INFO kea-dhcp4.ha-hooks HA_SERVICE_STARTED started high availability service in load-balancing mode as secondary server
...
2021-02-06T18:39:01.820 a INFO kea-dhcp4.commands COMMAND_RECEIVED Received command 'config-set'
2021-02-06T18:39:01.820 a DEBUG kea-dhcp4.dhcp4 DHCP4_CONFIG_RECEIVED ...
2021-02-06T18:39:01.839 a INFO kea-dhcp4.ha-hooks HA_DEINIT_OK unloading High Availability hooks library successful
...
2021-02-06T18:40:05.371 a DEBUG kea-dhcp4.packets DHCP4_BUFFER_RECEIVED received buffer from 0.0.0.0:68 to 255.255.255.255:67 over interface ens3
2021-02-06T18:40:05.371 a DEBUG kea-dhcp4.bad-packets DHCP4_PACKET_DROP_0008 [hwtype=1 ], cid=[no info], tid=0x0: DHCP service is globally disabled
Also, I've seen a note on the new dhcp-enable/dhcp-disable API commands in a recent ChangeLog entry:
1858. [bug] razvan
The DHCP service can be independently enabled or disabled by
the user command, by the database connection mechanics or
by the HA library. The DHCP service is disabled when any
of those originators disables the service, and it is enabled
when all those who previously disabled the service enable it.
But "when all those who previously disabled the service enable it" seems to imply that these commands won't be a good workaround for this bug if HA hook indeed does not re-enable the service when removed. The service would stay disabled regardless of the API command, or the command would have to masquerade as if it came from HA hook, which seems suboptimal in a number of ways.
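To illustrate the concern, here is a toy model of the semantics as I read them from that ChangeLog wording. It is not Kea's actual implementation, and the class and origin names ("ha", "user") are made up for the example.

```python
class ServiceState:
    """Toy model: service is enabled only when no origin holds it disabled."""

    def __init__(self):
        self._disabled_by = set()

    def disable(self, origin):
        self._disabled_by.add(origin)

    def enable(self, origin):
        # An origin can only lift its own "disable", not someone else's.
        self._disabled_by.discard(origin)

    @property
    def enabled(self):
        return not self._disabled_by


state = ServiceState()
state.disable("ha")    # HA hook disables the service in WAITING state
# ... HA hook is unloaded here without calling enable("ha") ...
state.enable("user")   # a user-issued dhcp-enable
print(state.enabled)   # False: the "ha" origin still holds the service disabled
```

If the real logic works anything like this, nothing short of an enable attributed to the HA origin would bring the service back after the hook is gone.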
Extra Question
Can someone knowledgeable with Kea internals think of a good workaround with a current stable Kea version?
I.e. is there a good way to reliably disable HA hook at an arbitrary time without risk of leaving DHCP in a broken state?
One idea is to first transition HA hook into passive-backup or some similar state, from which it explicitly enables DHCP on deinit (assuming that it does, I didn't look at the code to confirm), but maybe there's a simpler hack/workaround?
Thanks in advance for any suggestions.
Contacting you
Will try to monitor replies here.