Stork Server Continually Disconnects from Agents
I have a Stork server monitoring two Kea servers (no BIND involved). Everything works, except that the Stork server constantly reports communication-interruption events for the Kea daemons on both servers:
https://stork-server/api/events:
{
"items": [
{
"createdAt": "2022-06-29T14:32:11.782Z",
"id": 51365,
"level": 1,
"text": "Communication with <daemon id=\"4\" name=\"dhcp4\" appId=\"2\" appType=\"kea\"> of <app id=\"2\" name=\"Secondary\" type=\"kea\" version=\"1.8.2\"> resumed"
},
{
"createdAt": "2022-06-29T14:32:11.763Z",
"id": 51364,
"level": 1,
"text": "Communication with <daemon id=\"2\" name=\"dhcp4\" appId=\"1\" appType=\"kea\"> of <app id=\"1\" name=\"Primary\" type=\"kea\" version=\"1.8.2\"> resumed"
},
{
"createdAt": "2022-06-29T14:31:51.615Z",
"id": 51363,
"level": 2,
"text": "Communication with <daemon id=\"4\" name=\"dhcp4\" appId=\"2\" appType=\"kea\"> of <app id=\"2\" name=\"Secondary\" type=\"kea\" version=\"1.8.2\"> failed"
},
{
"createdAt": "2022-06-29T14:31:51.599Z",
"id": 51362,
"level": 2,
"text": "Communication with <daemon id=\"2\" name=\"dhcp4\" appId=\"1\" appType=\"kea\"> of <app id=\"1\" name=\"Primary\" type=\"kea\" version=\"1.8.2\"> failed"
},
{
"createdAt": "2022-06-29T14:31:11.677Z",
"id": 51361,
"level": 1,
"text": "Communication with <daemon id=\"4\" name=\"dhcp4\" appId=\"2\" appType=\"kea\"> of <app id=\"2\" name=\"Secondary\" type=\"kea\" version=\"1.8.2\"> resumed"
},
{
"createdAt": "2022-06-29T14:31:11.658Z",
"id": 51360,
"level": 1,
"text": "Communication with <daemon id=\"2\" name=\"dhcp4\" appId=\"1\" appType=\"kea\"> of <app id=\"1\" name=\"Primary\" type=\"kea\" version=\"1.8.2\"> resumed"
},
{
"createdAt": "2022-06-29T14:30:51.574Z",
"id": 51359,
"level": 2,
"text": "Communication with <daemon id=\"4\" name=\"dhcp4\" appId=\"2\" appType=\"kea\"> of <app id=\"2\" name=\"Secondary\" type=\"kea\" version=\"1.8.2\"> failed"
},
{
"createdAt": "2022-06-29T14:30:51.561Z",
"id": 51358,
"level": 2,
"text": "Communication with <daemon id=\"2\" name=\"dhcp4\" appId=\"1\" appType=\"kea\"> of <app id=\"1\" name=\"Primary\" type=\"kea\" version=\"1.8.2\"> failed"
},
{
"createdAt": "2022-06-29T14:30:11.578Z",
"id": 51357,
"level": 1,
"text": "Communication with <daemon id=\"4\" name=\"dhcp4\" appId=\"2\" appType=\"kea\"> of <app id=\"2\" name=\"Secondary\" type=\"kea\" version=\"1.8.2\"> resumed"
},
{
"createdAt": "2022-06-29T14:30:11.561Z",
"id": 51356,
"level": 1,
"text": "Communication with <daemon id=\"2\" name=\"dhcp4\" appId=\"1\" appType=\"kea\"> of <app id=\"1\" name=\"Primary\" type=\"kea\" version=\"1.8.2\"> resumed"
}
],
"total": 51365
}
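Notice that the timing is predictable: each daemon is reported as failed about 40 seconds after it resumes, and as resumed about 20 seconds after it fails, so the cycle repeats every 60 seconds. I don't know whether this is because Stork polls every 20 seconds or whether it's something else.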
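To measure those intervals rather than eyeball them, here is a minimal Python sketch. It assumes the response has the same shape as the JSON pasted above (an "items" list whose entries carry "createdAt" and "text"). The function names are my own for illustration and are not part of Stork.

```python
import json
import re
from datetime import datetime

# Captures the daemon id and the trailing state word from an event's "text".
EVENT_RE = re.compile(r'<daemon id="(\d+)".*\b(failed|resumed)$')


def parse_ts(ts):
    """Parse a createdAt value such as 2022-06-29T14:32:11.782Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def state_durations(events_json):
    """Return (daemon_id, old_state, new_state, seconds) for each state change."""
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    items = sorted(json.loads(events_json)["items"], key=lambda e: e["createdAt"])
    last = {}      # daemon id -> (state, timestamp) of its previous event
    changes = []
    for event in items:
        match = EVENT_RE.search(event["text"])
        if match is None:
            continue  # not a communication failed/resumed event
        daemon_id, state = match.groups()
        when = parse_ts(event["createdAt"])
        if daemon_id in last:
            old_state, old_when = last[daemon_id]
            seconds = round((when - old_when).total_seconds(), 3)
            changes.append((daemon_id, old_state, state, seconds))
        last[daemon_id] = (state, when)
    return changes
```

Run against the events above, this gives roughly 40 seconds for every resumed-to-failed transition and roughly 20 seconds for every failed-to-resumed transition, for both daemons.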
While the agents are reported as "unreachable," I can still cURL them on port 8080 (I don't get any interesting data, of course, but the connection succeeds and doesn't time out). The logs on the server from journalctl -xeu isc-stork-server don't show anything useful except the events pasted earlier, along with warnings about reservation-get-page being unsupported. On the agent, I see no issues except an occasional "Problem connecting to dhcp daemon: forwarding socket is not configured for the server type dhcp6" - but we don't use DHCPv6.
As for the web interface, I see:
The Kea logs themselves show the commands, but no errors that I can find:
2022-06-29 07:50:50.674 INFO [kea-dhcp4.commands/443501.139831042103424] COMMAND_RECEIVED Received command 'version-get'
2022-06-29 07:50:50.675 INFO [kea-dhcp4.commands/443501.139831042103424] COMMAND_RECEIVED Received command 'status-get'
2022-06-29 07:50:50.677 INFO [kea-dhcp4.commands/443501.139831042103424] COMMAND_RECEIVED Received command 'config-get'
2022-06-29 07:50:52.273 INFO [kea-dhcp4.commands/443501.139831042103424] COMMAND_RECEIVED Received command 'reservation-get-page'
2022-06-29 07:50:52.275 ERROR [kea-dhcp4.callouts/443501.139831042103424] HOOKS_CALLOUT_ERROR error returned by callout on hook 3 registered by library with index $reservation_get_page (callout address 0x7f2cebb0b540) (callout duration 1.420 ms)
2022-06-29 07:50:57.627 INFO [kea-dhcp4.commands/443501.139831042103424] COMMAND_RECEIVED Received command 'statistic-get-all'
2022-06-29 07:50:57.636 INFO [kea-dhcp4.commands/443501.139831042103424] COMMAND_RECEIVED Received command 'subnet4-list'
2022-06-29 07:52:32.466 INFO [kea-dhcp4.commands/443501.139831042103424] COMMAND_RECEIVED Received command 'stat-lease4-get'
2022-06-29 07:52:32.467 INFO [kea-dhcp4.stat-cmds-hooks/443501.139831042103424] STAT_CMDS_LEASE4_GET stat-lease4-get command successful, parameters: [all subnets] rows found: 2
...
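To check a longer log for anything other than INFO entries, and to see which command preceded each one, a small parser along these lines could help. It is only a sketch: it assumes every line follows the layout shown in the excerpt above (timestamp, severity, bracketed logger, message ID, text), and the function name is hypothetical.

```python
import re

# timestamp, severity, [logger/pid.tid], message id, free text
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (\w+)\s+\[([^\]]+)\] (\S+) (.*)$"
)
CMD_RE = re.compile(r"Received command '([^']+)'")


def non_info_entries(lines):
    """Yield (timestamp, severity, message_id, last_command) for non-INFO lines."""
    last_command = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        timestamp, severity, _logger, message_id, text = match.groups()
        if message_id == "COMMAND_RECEIVED":
            command = CMD_RE.search(text)
            if command:
                last_command = command.group(1)
        if severity != "INFO":
            yield timestamp, severity, message_id, last_command
```

On the excerpt above, the only entry it reports is the HOOKS_CALLOUT_ERROR at 07:50:52.275, immediately after the reservation-get-page command.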
All servers are on the same subnet. The Stork agents are listening and responding on port 8080, bound to their LAN IPv4 address. Port 8080 is open on their firewalls and I can cURL it from the Stork server.
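To line up reachability with the event timestamps, the manual cURL check can be replaced by a timestamped TCP probe run from the Stork server. This is a rough sketch: the host names in the usage note are placeholders, and a successful TCP connect only shows that the port accepts connections. It does not exercise the agent protocol itself.

```python
import socket
import time
from datetime import datetime, timezone


def tcp_check(host, port, timeout=3.0):
    """Try one TCP connection; return (succeeded, seconds_taken)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False, time.monotonic() - start


def probe(hosts, port=8080, rounds=3, interval=5.0):
    """Check every host once per round and print a UTC-timestamped result."""
    results = []
    for round_no in range(rounds):
        for host in hosts:
            ok, elapsed = tcp_check(host, port)
            stamp = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
            results.append((stamp, host, ok, elapsed))
            print(f"{stamp} {host}:{port} {'ok' if ok else 'FAILED'} {elapsed:.3f}s")
        if round_no < rounds - 1:
            time.sleep(interval)
    return results
```

For example, probe(["kea-primary", "kea-secondary"], rounds=60, interval=5.0) would cover five minutes. If every probe succeeds while Stork still logs failures on the 60-second cycle, plain network reachability is unlikely to be the cause.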
Clients:
Version: isc-stork-agent-1.4.0.220531122317-1.x86_64
OS: RHEL 8.5
Server:
Installation type: https://stork.readthedocs.io/en/v1.4.0/install.html#installing-on-centos-rhel-fedora
Version: isc-stork-server-1.4.0.220531122323-1.x86_64
OS: AlmaLinux 9.0
How can I find the cause of these disconnections?