Kea issues — https://gitlab.isc.org/isc-projects/kea/-/issues (feed updated 2024-03-28T10:34:28Z)

---
**Kea primary server in "passive backup" freeze/crash on receiving ha-sync**
https://gitlab.isc.org/isc-projects/kea/-/issues/3276 — Marcin Godzina, updated 2024-03-28

A Kea HA server set as `primary` freezes after receiving the `ha-sync` command with proper arguments.
The backup server does NOT crash.
Freeze occurs only in `passive-backup` mode.
The problem exists in both v4 and v6, and with Memfile, MySQL, and PostgreSQL lease databases.
**Kea versions tested:**
- 2.5.7-git 8c1f22e3fb65225a0279606a8a65962850a5f881
- 2.4.0 release tarball
**Tested systems:**
- Fedora 38 in VM on my local setup.
- Ubuntu 22.04, Alpine 3.16, Fedora 36 on Jenkins build farm.
**To Reproduce**
Steps to reproduce the behavior:
1. Run Kea HA servers in **Passive backup** configuration (tested configuration provided)
2. Wait for servers to connect.
3. Optionally add leases (crashes either way)
4. Send the `ha-sync` command with proper arguments to the primary server (`"server-name": "server2"` for the provided configuration). Invalid arguments produce an error response instead of a freeze.
The primary server freezes after receiving the response to the `dhcp-disable` command, which it sends automatically to the backup server. It then does not respond to the kea-ctrl-agent, keyboard interrupts, or SIGHUP.
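For completeness, the command can also be reproduced without the agent by writing JSON straight to the UNIX control socket. A minimal sketch (the `build_ha_sync` helper is illustrative; the socket path matches the configuration below):

```python
import json
import socket

def build_ha_sync(server_name, max_period=None):
    """Build an ha-sync command payload (illustrative helper)."""
    args = {"server-name": server_name}
    if max_period is not None:
        args["max-period"] = max_period
    return {"command": "ha-sync", "arguments": args}

def send_command(socket_path, command, bufsize=65536):
    """Send one JSON command over Kea's UNIX control socket, return the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(json.dumps(command).encode())
        return json.loads(sock.recv(bufsize).decode())

# e.g. (path taken from the configuration below):
# send_command("/home/mgodzina/installed/keadev/var/run/kea/control_socket",
#              build_ha_sync("server2"))
```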
<details><summary>Commands tested to freeze provided config:</summary>
```
{
    "command": "ha-sync",
    "arguments": {
        "server-name": "server2"
    }
}
```
```
{
    "command": "ha-sync",
    "arguments": {
        "server-name": "server1"
    }
}
```
```
{
    "command": "ha-sync",
    "arguments": {
        "server-name": "server2",
        "max-period": 60
    }
}
```
</details>
**Configuration**
<details><summary>Primary</summary>
```
{
  "Dhcp4": {
    "option-data": [],
    "hooks-libraries": [
      {
        "library": "/home/mgodzina/installed/keadev/lib/kea/hooks/libdhcp_ha.so",
        "parameters": {
          "high-availability": [
            {
              "peers": [
                {
                  "auto-failover": true,
                  "name": "server1",
                  "role": "primary",
                  "url": "http://192.168.56.102:8003/"
                },
                {
                  "auto-failover": true,
                  "name": "server2",
                  "role": "backup",
                  "url": "http://192.168.56.103:8003/"
                }
              ],
              "state-machine": {
                "states": []
              },
              "mode": "passive-backup",
              "this-server-name": "server1",
              "multi-threading": {
                "enable-multi-threading": true,
                "http-dedicated-listener": true,
                "http-listener-threads": 0,
                "http-client-threads": 0
              }
            }
          ]
        }
      },
      {
        "library": "/home/mgodzina/installed/keadev/lib/kea/hooks/libdhcp_lease_cmds.so"
      }
    ],
    "shared-networks": [],
    "subnet4": [
      {
        "subnet": "192.168.50.0/24",
        "pools": [
          {
            "pool": "192.168.50.1-192.168.50.200"
          }
        ],
        "interface": "enp0s9"
      }
    ],
    "interfaces-config": {
      "interfaces": [
        "enp0s9"
      ]
    },
    "control-socket": {
      "socket-type": "unix",
      "socket-name": "/home/mgodzina/installed/keadev/var/run/kea/control_socket"
    },
    "renew-timer": 1000,
    "rebind-timer": 2000,
    "valid-lifetime": 4000,
    "loggers": [
      {
        "name": "kea-dhcp4",
        "output-options": [
          {
            "output": "/home/mgodzina/installed/keadev/var/log/kea.log"
          }
        ],
        "severity": "DEBUG",
        "debuglevel": 99
      }
    ],
    "lease-database": {
      "type": "memfile"
    }
  }
}
```
</details>
<details><summary>Backup</summary>
```
{
  "Dhcp4": {
    "option-data": [],
    "hooks-libraries": [
      {
        "library": "/home/mgodzina/installed/keadev/lib/kea/hooks/libdhcp_ha.so",
        "parameters": {
          "high-availability": [
            {
              "peers": [
                {
                  "auto-failover": true,
                  "name": "server1",
                  "role": "primary",
                  "url": "http://192.168.56.102:8003/"
                },
                {
                  "auto-failover": true,
                  "name": "server2",
                  "role": "backup",
                  "url": "http://192.168.56.103:8003/"
                }
              ],
              "state-machine": {
                "states": []
              },
              "mode": "passive-backup",
              "this-server-name": "server2",
              "multi-threading": {
                "enable-multi-threading": true,
                "http-dedicated-listener": true,
                "http-listener-threads": 0,
                "http-client-threads": 0
              }
            }
          ]
        }
      },
      {
        "library": "/home/mgodzina/installed/keadev/lib/kea/hooks/libdhcp_lease_cmds.so"
      }
    ],
    "shared-networks": [],
    "subnet4": [
      {
        "subnet": "192.168.50.0/24",
        "pools": [
          {
            "pool": "192.168.50.1-192.168.50.200"
          }
        ],
        "interface": "enp0s9"
      }
    ],
    "interfaces-config": {
      "interfaces": [
        "enp0s9"
      ]
    },
    "control-socket": {
      "socket-type": "unix",
      "socket-name": "/home/mgodzina/installed/keadev/var/run/kea/control_socket"
    },
    "renew-timer": 1000,
    "rebind-timer": 2000,
    "valid-lifetime": 4000,
    "loggers": [
      {
        "name": "kea-dhcp4",
        "output-options": [
          {
            "output": "/home/mgodzina/installed/keadev/var/log/kea.log"
          }
        ],
        "severity": "DEBUG",
        "debuglevel": 99
      }
    ],
    "lease-database": {
      "type": "memfile"
    }
  }
}
```
</details>
**Logs**
<details><summary>Primary server log tail</summary>
```
2024-02-28 16:20:13.417 DEBUG [kea-dhcp4.commands/2096.139741364354944] COMMAND_SOCKET_CONNECTION_OPENED Opened socket 38 for incoming command connection
2024-02-28 16:20:13.417 DEBUG [kea-dhcp4.commands/2096.139741364354944] COMMAND_SOCKET_READ Received 127 bytes over command socket 38
2024-02-28 16:20:13.417 INFO [kea-dhcp4.commands/2096.139741364354944] COMMAND_RECEIVED Received command 'ha-sync'
2024-02-28 16:20:13.417 DEBUG [kea-dhcp4.callouts/2096.139741364354944] HOOKS_CALLOUTS_BEGIN begin all callouts for hook $ha_sync
2024-02-28 16:20:13.417 DEBUG [kea-dhcp4.http/2096.139741364354944] HTTP_CLIENT_REQUEST_SEND sending HTTP request POST / HTTP/1.1 to http://192.168.56.103:8003/
2024-02-28 16:20:13.417 DEBUG [kea-dhcp4.http/2096.139741364354944] HTTP_CLIENT_REQUEST_SEND_DETAILS detailed information about request sent to http://192.168.56.103:8003/:
POST / HTTP/1.1
Host: 192.168.56.103
Content-Length: 86
Content-Type: application/json
{ "arguments": { "origin": 2000 }, "command": "dhcp-disable", "service": [ "dhcp4" ] }
2024-02-28 16:20:13.417 INFO [kea-dhcp4.ha-hooks/2096.139741364354944] HA_SYNC_START server1: starting lease database synchronization with server2
2024-02-28 16:20:13.417 DEBUG [kea-dhcp4.http/2096.139741364354944] HTTP_SERVER_RESPONSE_RECEIVED received HTTP response from http://192.168.56.103:8003/
2024-02-28 16:20:13.417 DEBUG [kea-dhcp4.http/2096.139741364354944] HTTP_SERVER_RESPONSE_RECEIVED_DETAILS detailed information about well-formed response received from http://192.168.56.103:8003/:
HTTP/1.1 200 OK
Content-Length: 54
Content-Type: application/json
Date: Wed, 28 Feb 2024 15:20:13 GMT
[ { "result": 0, "text": "DHCPv4 service disabled" } ]
```
</details>
<details><summary>Backup server log snippet with timeout:</summary>
```
2024-02-28 16:20:13.413 DEBUG [kea-dhcp4.http/20519.140151306917568] HTTP_REQUEST_RECEIVE_START start receiving request from 192.168.56.102 with timeout 10
2024-02-28 16:20:13.413 DEBUG [kea-dhcp4.http/20519.140151306917568] HTTP_DATA_RECEIVED received 179 bytes from 192.168.56.102
2024-02-28 16:20:13.413 DEBUG [kea-dhcp4.http/20519.140151306917568] HTTP_CLIENT_REQUEST_RECEIVED received HTTP request from 192.168.56.102
2024-02-28 16:20:13.413 DEBUG [kea-dhcp4.http/20519.140151306917568] HTTP_CLIENT_REQUEST_RECEIVED_DETAILS detailed information about well-formed request received from 192.168.56.102:
POST / HTTP/1.1
Host: 192.168.56.103
Content-Length: 86
Content-Type: application/json
{ "arguments": { "origin": 2000 }, "command": "dhcp-disable", "service": [ "dhcp4" ] }
2024-02-28 16:20:13.413 INFO [kea-dhcp4.commands/20519.140151306917568] COMMAND_RECEIVED Received command 'dhcp-disable'
2024-02-28 16:20:13.413 DEBUG [kea-dhcp4.callouts/20519.140151306917568] HOOKS_CALLOUTS_BEGIN begin all callouts for hook command_processed
2024-02-28 16:20:13.413 DEBUG [kea-dhcp4.callouts/20519.140151306917568] HOOKS_CALLOUT_CALLED hooks library with index 1 has called a callout on hook command_processed that has address 0x7f778767ffe0 (callout duration: 0.000 ms)
2024-02-28 16:20:13.413 DEBUG [kea-dhcp4.callouts/20519.140151306917568] HOOKS_CALLOUTS_COMPLETE completed callouts for hook command_processed (total callouts duration: 0.000 ms)
2024-02-28 16:20:13.413 DEBUG [kea-dhcp4.http/20519.140151306917568] HTTP_SERVER_RESPONSE_SEND sending HTTP response HTTP/1.1 200 OK to 192.168.56.102
2024-02-28 16:20:13.413 DEBUG [kea-dhcp4.http/20519.140151306917568] HTTP_SERVER_RESPONSE_SEND_DETAILS detailed information about response sent to 192.168.56.102:
HTTP/1.1 200 OK
Content-Length: 54
Content-Type: application/json
Date: Wed, 28 Feb 2024 15:20:13 GMT
[ { "result": 0, "text": "DHCPv4 service disabled" } ]
2024-02-28 16:20:17.831 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION running operation for timer: reclaim-expired-leases
2024-02-28 16:20:17.831 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_LEASES_RECLAMATION_START starting reclamation of expired leases (limit = 100 leases or 250 milliseconds)
2024-02-28 16:20:17.831 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_MEMFILE_GET_EXPIRED4 obtaining maximum 101 of expired IPv4 leases
2024-02-28 16:20:17.832 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_LEASES_RECLAMATION_COMPLETE reclaimed 0 leases in 0.033 ms
2024-02-28 16:20:17.832 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_NO_MORE_EXPIRED_LEASES all expired leases have been reclaimed
2024-02-28 16:20:17.832 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_TIMERMGR_START_TIMER starting timer: reclaim-expired-leases
2024-02-28 16:20:21.840 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION running operation for timer: flush-reclaimed-leases
2024-02-28 16:20:21.840 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_RECLAIMED_LEASES_DELETE begin deletion of reclaimed leases expired more than 3600 seconds ago
2024-02-28 16:20:21.840 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_MEMFILE_DELETE_EXPIRED_RECLAIMED4 deleting reclaimed IPv4 leases that expired more than 3600 seconds ago
2024-02-28 16:20:21.840 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_RECLAIMED_LEASES_DELETE_COMPLETE successfully deleted 0 expired-reclaimed leases
2024-02-28 16:20:21.840 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_TIMERMGR_START_TIMER starting timer: flush-reclaimed-leases
2024-02-28 16:20:27.852 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION running operation for timer: reclaim-expired-leases
2024-02-28 16:20:27.852 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_LEASES_RECLAMATION_START starting reclamation of expired leases (limit = 100 leases or 250 milliseconds)
2024-02-28 16:20:27.852 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_MEMFILE_GET_EXPIRED4 obtaining maximum 101 of expired IPv4 leases
2024-02-28 16:20:27.852 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_LEASES_RECLAMATION_COMPLETE reclaimed 0 leases in 0.032 ms
2024-02-28 16:20:27.852 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_NO_MORE_EXPIRED_LEASES all expired leases have been reclaimed
2024-02-28 16:20:27.852 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_TIMERMGR_START_TIMER starting timer: reclaim-expired-leases
2024-02-28 16:20:37.891 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_TIMERMGR_RUN_TIMER_OPERATION running operation for timer: reclaim-expired-leases
2024-02-28 16:20:37.892 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_LEASES_RECLAMATION_START starting reclamation of expired leases (limit = 100 leases or 250 milliseconds)
2024-02-28 16:20:37.892 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_MEMFILE_GET_EXPIRED4 obtaining maximum 101 of expired IPv4 leases
2024-02-28 16:20:37.892 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_LEASES_RECLAMATION_COMPLETE reclaimed 0 leases in 0.027 ms
2024-02-28 16:20:37.892 DEBUG [kea-dhcp4.alloc-engine/20519.140151383601024] ALLOC_ENGINE_V4_NO_MORE_EXPIRED_LEASES all expired leases have been reclaimed
2024-02-28 16:20:37.892 DEBUG [kea-dhcp4.dhcpsrv/20519.140151383601024] DHCPSRV_TIMERMGR_START_TIMER starting timer: reclaim-expired-leases
2024-02-28 16:20:43.433 DEBUG [kea-dhcp4.http/20519.140151315310272] HTTP_IDLE_CONNECTION_TIMEOUT_OCCURRED closing persistent connection with 192.168.56.102 as a result of a timeout
2024-02-28 16:20:43.433 DEBUG [kea-dhcp4.http/20519.140151315310272] HTTP_CONNECTION_STOP stopping HTTP connection from 192.168.56.102
```
</details>
[gdb.txt](/uploads/de79e56462885f7947eab90267f7a658/gdb.txt)

(Milestone: kea2.5.8; assignee: Marcin Siodelski)

---
**Memory leak in HA scenario with backup server down**
https://gitlab.isc.org/isc-projects/kea/-/issues/2339 — Branimir Rajtar, updated 2023-09-07

---
name: Memory leak in HA scenario with backup server down
about: Memory loss is created on running instances
---
**Describe the bug**
HA mode is configured with three servers (primary, secondary, backup) and is serving clients. When the backup server becomes unavailable, the primary and secondary experience a continuous memory leak, manifested as a continuous increase in RSS memory use for the isc-kea-dhcp4-server process. The size of the leak correlates directly with the number of active clients: the more clients, the greater the leak. Once the backup server is deleted from the configuration or becomes active again, memory use stops increasing, but the leaked memory is not freed.
**To Reproduce**
Steps to reproduce the behavior:
1. Run Kea (DHCP4 only) in an HA scenario with two load-balancing servers (primary and secondary) and a single backup server
2. Start serving clients (40k in our scenario) and monitor RSS usage for the Kea server process
3. Disable backup server
4. Verify that RSS usage is increasing continuously
5. Enable backup server
6. Verify that RSS usage is stable
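To make the monitoring in step 2 concrete, RSS for the Kea process can be sampled from `/proc` (a Linux-only sketch; finding the PID is left to the reader):

```python
import re
import time

def parse_vmrss_kib(status_text):
    """Extract VmRSS (in KiB) from the contents of /proc/<pid>/status."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if not m:
        raise ValueError("VmRSS not found")
    return int(m.group(1))

def rss_kib(pid):
    """Current resident set size of a process, in KiB."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kib(f.read())

def watch(pid, interval=60, samples=10):
    """Print RSS periodically; a monotonically increasing series while the
    backup peer is down matches the behavior described above."""
    for _ in range(samples):
        print(rss_kib(pid))
        time.sleep(interval)
```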
**Expected behavior**
The servers should not have any memory leaks.
**Environment:**
- Kea version: 1.8.2, 2.0.2
- OS: Ubuntu 18.04
- Memfile
- libdhcp_lease_cmds, libdhcp_stat_cmds, libdhcp_ha
**Additional Information**
```
{
  "Dhcp4": {
    "dhcp-queue-control": {
      "enable-queue": true,
      "queue-type": "kea-ring4",
      "capacity": 256
    },
    "interfaces-config": {
      "interfaces": [
        "eth1"
      ],
      "dhcp-socket-type": "udp"
    },
    "control-socket": {
      "socket-type": "unix",
      "socket-name": "/tmp/kea-dhcp4-ctrl.sock"
    },
    "lease-database": {
      "type": "memfile",
      "persist": true,
      "name": "/var/lib/kea/dhcp4.leases",
      "lfc-interval": 3600,
      "port": 0
    },
    "expired-leases-processing": {
      "reclaim-timer-wait-time": 10,
      "flush-reclaimed-timer-wait-time": 25,
      "hold-reclaimed-time": 3600,
      "max-reclaim-leases": 100,
      "max-reclaim-time": 250,
      "unwarned-reclaim-cycles": 5
    },
    "renew-timer": 60,
    "rebind-timer": 100,
    "valid-lifetime": 120,
    "option-data": [],
    "hooks-libraries": [
      {
        "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_lease_cmds.so",
        "parameters": {}
      },
      {
        "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_stat_cmds.so"
      },
      {
        "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so",
        "parameters": {
          "high-availability": [
            {
              "this-server-name": "server3",
              "mode": "load-balancing",
              "heartbeat-delay": 3000,
              "max-response-delay": 7000,
              "max-ack-delay": 7000,
              "max-unacked-clients": 20,
              "peers": [
                {
                  "name": "server2",
                  "url": "http://<XXX>:8080/",
                  "role": "secondary",
                  "auto-failover": true
                },
                {
                  "name": "server1",
                  "url": "http://<YYY>:8080/",
                  "role": "primary",
                  "auto-failover": true
                },
                {
                  "name": "server3",
                  "url": "http://<ZZZ>:8080/",
                  "role": "backup",
                  "auto-failover": true
                }
              ]
            }
          ]
        }
      }
    ],
    "option-def": [
      {
        "name": "classless-static-route",
        "code": 121,
        "space": "dhcp4",
        "type": "record",
        "array": true,
        "record-types": "uint8, uint8"
      }
    ],
    "client-classes": [
      // anonymized
    ],
    "subnet4": [
      // anonymized
    ],
    "reservations": [],
    "loggers": [
      {
        "name": "kea-dhcp4",
        "output_options": [
          {
            "output": "syslog"
          }
        ],
        "severity": "error",
        "debuglevel": 0
      }
    ]
  }
}
```
**Contacting you**
Email/GitHub; telephone is available after contact.

(Milestone: next-stable-2.6)

---
**Ability to always-respond to all requests in HA active-active mode to support anycast DHCP**
https://gitlab.isc.org/isc-projects/kea/-/issues/1345 — Ewald van Geffen, updated 2021-01-22

My impression is that ISC Kea doesn't always respond to all requests. I think this is due to the 1/n split.
I run two Kea instances sharing a single BGP anycast /32 IP prefix. DHCP requests are routed via a DHCP relay to the closest Kea instance according to BGP. Load balancing is handled externally, so Kea should respond to every request it receives and not impose its own load-balancing logic.
I think this is where the magic happens [1].
From my understanding, `active_servers` needs to reflect the current server instance ID (primary or secondary).
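For readers unfamiliar with the 1/n split: in load-balancing mode each active server answers only the clients whose hashed identifier maps to it. The sketch below only illustrates the idea; the real algorithm lives in `query_filter.cc` and differs in detail:

```python
import hashlib

def serves_query(client_id: bytes, server_index: int, active_servers: int = 2) -> bool:
    """Illustrative 1/n split: each active server answers only the clients
    whose hashed identifier maps to its index. NOT Kea's actual hash."""
    bucket = hashlib.md5(client_id).digest()[0] % active_servers
    return bucket == server_index

# With anycast routing delivering each query to exactly one instance, such a
# filter makes that instance ignore roughly (n-1)/n of the queries it receives,
# which is the behavior this issue asks to be able to disable.
```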
[1] https://github.com/isc-projects/kea/blob/457111f9db051723ff9f8e7fb621872d0aa10363/src/hooks/dhcp/high_availability/query_filter.cc#L316

(Labels: outstanding)

---
**Unattended terminated state and a reboot**
https://gitlab.isc.org/isc-projects/kea/-/issues/3250 — Marcin Siodelski, updated 2024-03-27

Consider the following case. The clocks on two HA-enabled servers diverge and the clock skew eventually exceeds 60 seconds. As a result, both servers transition to the terminated state. In this state, the servers continue serving DHCP clients but exchange neither lease updates nor heartbeats. An administrator neglects to correct the clocks, and one of the servers reboots. The server enters the `waiting` state and remains there, hoping the other server will be restarted so they can synchronize the lease databases and resume normal operation. However, the server is unaware that its reboot was not part of fixing the clocks, so it waits for its partner endlessly (or until the administrator comes to work in the morning). The waiting server does not respond to DHCP traffic until then.
This situation should not occur in a setup where NTP is enabled. It also should not occur if there is proper monitoring that detects diverging clocks early enough. However, it may happen when all of this is neglected.
The proposed solution is to apply a timeout (it could even be several to ten minutes long) to a server in the waiting state. If the partner's transition does not occur before this timeout elapses, the server in the waiting state transitions back to the terminated state and continues serving clients. The waiting server MUST NOT transition to the terminated state immediately after it detects that its partner is in the terminated state, to allow the administrator enough time to reboot the servers sequentially after correcting the clocks.
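The proposed fallback could be sketched as a small guard around the state machine (state names follow the issue; the 600-second default and the `syncing` successor state are assumptions):

```python
import time

class WaitingStateGuard:
    """Sketch of the proposed behavior: a server in `waiting` falls back to
    `terminated` if its partner does not transition within a timeout."""

    def __init__(self, timeout_seconds=600, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.entered_waiting = None

    def enter_waiting(self):
        self.entered_waiting = self.clock()

    def next_state(self, partner_state):
        if partner_state != "terminated":
            return "syncing"      # partner recovered: proceed normally
        if self.clock() - self.entered_waiting >= self.timeout:
            return "terminated"   # give up waiting, keep serving clients
        return "waiting"          # keep waiting within the timeout window
```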
[SF1598](https://isc.lightning.force.com/lightning/r/Case/500S6000003jBs3IAE/view)

(Milestone: kea2.5.8)

---
**HA peer should drop leases not present on the partner during sync**
https://gitlab.isc.org/isc-projects/kea/-/issues/479 — Marcin Siodelski, updated 2022-11-02

Let's suppose there are two HA peers, A and B. Peer B dies. While B is offline, the admin sends the `lease4-del` command to A. Peer B starts up and synchronizes its lease database with A. It correctly adds new leases and updates existing leases based on the list received from A. However, it does not remove the lease deleted on A while it was offline. The server admin would need to send `lease4-del` to B to remove the lease.
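One way to find such stale leases on B, given address-ordered lease lists from both servers, is a linear merge (a sketch; real code would compare addresses numerically rather than as strings):

```python
def leases_to_delete(local, partner):
    """Given two lists of lease addresses sorted ascending (as Memfile,
    MySQL and Postgres can return them), yield local addresses absent
    from the partner — candidates for deletion after a sync."""
    i = j = 0
    while i < len(local):
        if j >= len(partner) or local[i] < partner[j]:
            yield local[i]        # present locally only: deleted on partner
            i += 1
        elif local[i] == partner[j]:
            i += 1                # present on both: keep
            j += 1
        else:
            j += 1                # partner-only lease: handled by normal sync
```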
To address this problem, we have to fetch all leases from B's backend and iterate over them to see whether they are also present on A. To do so, we will have to keep a local copy of the leases received from A. For Memfile, MySQL, and Postgres we could do this more efficiently by comparing ranges of leases, as they are ordered by IP address; after comparing a range of leases we could simply drop the local copy of that range. However, this won't work for Cassandra, which returns leases out of order; in the Cassandra case we will have to collect all leases returned by the peer.

(Labels: backlog; assignee: Marcin Siodelski)

---
**Propagate lease updates between HA peers**
https://gitlab.isc.org/isc-projects/kea/-/issues/382 — Marcin Siodelski, updated 2022-11-02

A High Availability setup includes at least two servers paired to provide reliable service. We have the lease_cmds hooks library, which the HA hooks library uses to send lease updates between the peers. Sometimes, though, an administrator may want to update lease information via the control channel, e.g. to remove a stale lease. Currently, they would need to send the appropriate command to all HA peers that (potentially) share the lease information. It would be useful to be able to send the command to only one of the HA peers and let it propagate it to the other servers. For that, an HA peer would need to somehow identify that the command was sent by the administrator rather than by an HA peer; otherwise it would trigger circular updates.
The details of how to implement it are TBD.

(Labels: backlog)

---
**Synchronize reservations between HA partners**
https://gitlab.isc.org/isc-projects/kea/-/issues/681 — Ghost User, updated 2022-11-02

I have bought the Kea premium hooks package and am using it for IP reservations, but I have a problem; I am not sure whether this is how it should work.
I am running Kea DHCP in HA (active/hot-standby). When I add a reservation on the active node, it does not get replicated to the hot-standby node. Because of this, I am unable to use my hot-standby node. Can you please have a look as soon as possible?
Also, what happens if I add a reservation while the other node is down? Will it be replicated when that node comes back online?
Kea DHCP 1.5

(Labels: outstanding)

---
**Update leases on 'dashboard server' without running HA**
https://gitlab.isc.org/isc-projects/kea/-/issues/76 — Ghost User, updated 2022-11-02

One of our GSoC students is working on a Kea dashboard based on the GLASS project, a dashboard for ISC DHCP. The dashboard requires access to a local lease file so it can continuously or frequently update stats about pool utilization, etc. It seems the ideal way to do this is to push lease-file updates to the dashboard server.
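As a sketch of what the dashboard side could do with a pushed memfile, the lease CSV can be reduced to the latest state per address (this assumes only that rows carry `address` and `expire` columns, which the memfile format provides, and that later rows supersede earlier ones, matching how memfile appends updates):

```python
import csv
import time

def active_lease_count(rows, now=None):
    """Count non-expired leases from memfile rows (dicts with `address`
    and `expire` keys; later rows for an address supersede earlier ones)."""
    now = time.time() if now is None else now
    latest = {}
    for row in rows:
        latest[row["address"]] = int(row["expire"])
    return sum(1 for exp in latest.values() if exp > now)

def active_lease_count_from_file(path, now=None):
    """Read a Kea memfile lease CSV (header row assumed, as memfile writes)."""
    with open(path, newline="") as f:
        return active_lease_count(csv.DictReader(f), now)
```

Dividing such a count by the pool size gives the utilization figure the dashboard needs.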
It seems we could use the 'backup server' feature of HA, but without the HA support. So we would want a mode that does not check for a valid HA configuration and an HA partner. We would also want this feature not to require the premium HA package.

(Labels: backlog)

---
**Clarify application of the ha-scopes command in actual deployments**
https://gitlab.isc.org/isc-projects/kea/-/issues/3290 — Marcin Godzina, updated 2024-03-14

The `ha-scopes` command can modify a server's scopes without changing its role or other HA parameters.
It can be a powerful tool, but its use can put the server in a state that will be very confusing for the Administrator.
I think this command requires more documentation and warnings about its usage.
For example: \
We have a hot standby pair and send the `ha-scopes` command to the `standby` server, enabling scopes of both servers.
This results in both the `primary` and `standby` servers replying to DHCP traffic, while the second server still reports being in the `standby` state.
This can lead to massive confusion for administrators.

(Milestone: kea2.6.0)

---
**HA multiple relationships and RADIUS reselect are incompatible**
https://gitlab.isc.org/isc-projects/kea/-/issues/3252 — Francis Dupont, updated 2024-03-27

Nothing trivial can be done to fix this other than dropping the first query (the RADIUS hook parks the query at the subnet-select callout and learns the right subnet when the RADIUS response is received). For other queries using cached RADIUS information, correctness relies on the order of the HA and RADIUS hooks (RADIUS before HA).

(Milestone: kea2.6.0)

---
**DHCPRELEASE and lease expiration in active-standby HA setup**
https://gitlab.isc.org/isc-projects/kea/-/issues/3246 — Peter Davies, updated 2024-03-27

DHCPRELEASE lease expiration in an active-standby HA setup:
Kea 2.5.5
When a client sends a DHCPRELEASE message to a Kea primary HA server, the expired
lease processing settings are honoured.
However, the primary updates the failover server with instructions to delete the
lease.
This leads to a divergence of lease data between the two servers.
[SF00001636](https://isc.lightning.force.com/lightning/r/Case/500S6000004XPRy/view)

(Milestone: kea2.5.8)

---
**HA lease updates do not create an accounting entry in v6**
https://gitlab.isc.org/isc-projects/kea/-/issues/3226 — Andrei Pavel, updated 2024-01-25

In v6, HA lease updates are done with the `lease6-bulk-apply` command, which is not handled in the `command_processed` RADIUS callout. This is unlike v4, which does create accounting entries for HA lease updates sent via `lease4-update`.

(Milestone: next-stable-2.6)

---
**subnet-get commands should fetch leases for selected subnets with pagination**
https://gitlab.isc.org/isc-projects/kea/-/issues/3206 — Marcin Siodelski, updated 2024-03-22

In HA, we use lease commands to synchronize the database. The lease commands fetch all leases with pagination. However, in the hub-and-spoke model it would be useful to fetch leases only for selected subnets, because the relationships are partitioned by subnet. Today, all leases have to be fetched by each relationship, and those that do not belong to the relationship are discarded. This is inefficient. One thing to consider is that subnet identifiers are listed explicitly in the commands.

(Milestone: next-stable-3.0)

---
**HA ignored packets cause DROP statistics counter increment**
https://gitlab.isc.org/isc-projects/kea/-/issues/3125 — Darren Ankney, updated 2024-03-27

HA_BUFFER6_RECEIVE_NOT_FOR_US increments drop counters.
- This happens at least with a load balancing configuration.
- I think maybe not with hot-standby since I don't think the service logs anything or cares about incoming client packets unless it loses contact with the HA peer?
- I cite BUFFER6 above but I'm sure the same is true for DHCPv4.
Possible solutions:
- Introduce a new drop status that could be discounted later, or counted as part of a different drop statistic?
- Introduce a new status indicating the packet was ignored or filtered, instead of dropped?
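The second option could look roughly like this (statistic names are illustrative, not Kea's actual statistics):

```python
from collections import Counter

# Sketch of the proposed split: queries that HA filters out for the partner
# are tallied under a separate "ignored" statistic instead of inflating the
# receive-drop counter.
stats = Counter()

def record_outcome(outcome):
    if outcome == "not-for-us":           # HA peer will answer this query
        stats["pkt4-ignored"] += 1
    elif outcome == "drop":               # genuinely dropped (malformed, ...)
        stats["pkt4-receive-drop"] += 1
    else:
        stats["pkt4-processed"] += 1
```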
[SF1374](https://isc.lightning.force.com/lightning/r/Case/5007V00002YkO0oQAF/view)

(Milestone: kea2.5.8)

---
**Kea HA issue with terminating connection**
https://gitlab.isc.org/isc-projects/kea/-/issues/2932 — Nick Hahn, updated 2023-11-10

We recently migrated our DHCP setup from dhcpd to Kea. It runs on
two servers with hot standby and a memfile backend for the leases. Kea
assigns IP addresses for around 7000 pools.
Over the past few months the HA connection terminated in random intervals.
From looking at the logs on the passive node I can see a lot of
'ResourceBusy: IP address ... could not be updated' warnings prior to
the connection terminating. Since multithreading is enabled I suspected
this may be due to the threads encountering a resource lock on the memfile.
I suppose after the lease update fails a few times, the connection is terminated.
Is the 'ResourceBusy' warning the cause for the terminating HA connection and
is there any way to fix the underlying issue? Any ideas on the issue are greatly
appreciated.
Here are the logs from the primary server:
```
Jun 12 15:04:31 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:04:31 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:04:31 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:04:39 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:04:39 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:04:39 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:04:45 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.ha-hooks.139625718580992] HA_LEASE_UPDATE_CONFLICT [hwtype=1], cid=[], tid=0x0: lease update to standby-dhcp (http://dhcp-2:8001/) returned conflict status code: ResourceBusy: IP address:123.123.123.123 could not be updated. (error code 4)
Jun 12 15:04:56 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:04:56 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:04:56 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:05:28 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:05:28 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:05:28 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:05:31 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625752151808] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:05:31 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625752151808] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:05:31 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.alloc-engine.139625752151808] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:05:39 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.ha-hooks.139625701795584] HA_LEASE_UPDATE_CONFLICT [hwtype=1], cid=[], tid=0x0: lease update to standby-dhcp (http://dhcp-2:8001/) returned conflict status code: ResourceBusy: IP address:123.123.123.123 could not be updated. (error code 4)
Jun 12 15:05:39 dhcp-1 kea-dhcp4[564812]: WARN [kea-dhcp4.ha-hooks.139625718580992] HA_LEASE_UPDATE_CONFLICT [hwtype=1], cid=[], tid=0x0: lease update to standby-dhcp (http://dhcp-2:8001/) returned conflict status code: ResourceBusy: IP address:123.123.123.123 could not be updated. (error code 4)
Jun 12 15:05:39 dhcp-1 kea-dhcp4[564812]: ERROR [kea-dhcp4.ha-hooks.139625718580992] HA_LEASE_UPDATE_REJECTS_CAUSED_TERMINATION too many rejected lease updates cause the HA service to terminate
Jun 12 15:05:39 dhcp-1 kea-dhcp4[564812]: ERROR [kea-dhcp4.ha-hooks.139625718580992] HA_TERMINATED HA service terminated due to an unrecoverable condition. Check previous error message(s), address the problem and restart!
```
Here are the logs from the standby server:
```
Mar 12 19:25:06 dhcp-2 kea-dhcp4[203037]: WARN [kea-dhcp4.lease-cmds-hooks.139670034884352] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678688706, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 2907, "valid-lft": 43200 }, reason: ResourceBusy: IP address:123.123.123.123 could not be updated.)
Mar 12 19:25:06 dhcp-2 kea-dhcp4[203037]: WARN [kea-dhcp4.lease-cmds-hooks.139670009706240] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678688706, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 2907, "valid-lft": 43200 }, reason: ResourceBusy: IP address:123.123.123.123 could not be updated.)
Mar 12 19:27:28 dhcp-2 kea-dhcp4[203037]: WARN [kea-dhcp4.lease-cmds-hooks.139670009706240] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678688848, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 3812, "valid-lft": 43200 }, reason: ResourceBusy: IP address:123.123.123.123 could not be updated.)
Mar 12 19:32:05 dhcp-2 kea-dhcp4[203037]: WARN [kea-dhcp4.lease-cmds-hooks.139670018098944] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678689125, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 274, "valid-lft": 43200 }, reason: ResourceBusy: IP address:123.123.123.123 could not be updated.)
Mar 12 19:32:34 dhcp-2 kea-dhcp4[203037]: WARN [kea-dhcp4.lease-cmds-hooks.139670009706240] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678689154, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 113, "valid-lft": 43200 }, reason: ResourceBusy: IP address:123.123.123.123 could not be updated.)
Mar 12 19:32:36 dhcp-2 kea-dhcp4[203037]: ERROR [kea-dhcp4.ha-hooks.139670104323840] HA_TERMINATED HA service terminated due to an unrecoverable condition. Check previous error message(s), address the problem and restart!
Mar 12 22:11:09 dhcp-2 kea-dhcp4[203037]: ERROR [kea-dhcp4.packets.139670138794688] DHCP4_BUFFER_RECEIVE_FAIL error on attempt to receive packet: Truncated DHCPv4 packet (len=0) received, at least 236 is expected.
```
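One knob that may be relevant here (availability depends on the Kea version in use; consult the ARM for the deployed release) is the HA hook's `max-rejected-lease-updates` parameter, which bounds how many conflicting lease updates are tolerated before `HA_LEASE_UPDATE_REJECTS_CAUSED_TERMINATION` ends the HA service. A sketch with a purely illustrative value:
```
{
    "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so",
    "parameters": {
        "high-availability": [{
            "this-server-name": "standby-dhcp",
            "mode": "hot-standby",
            "max-rejected-lease-updates": 50
        }]
    }
}
```
Raising the limit only masks the underlying conflicts; it buys time while the root cause of the ResourceBusy responses is investigated.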
The relevant config is the following on both hosts, differing only in the "this-server-name" property.
```
"hooks-libraries": [{
"library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_lease_cmds.so",
"parameters": {}
},
{
"library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_stat_cmds.so",
"parameters": {}
},
{
"library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so",
"parameters": {
"high-availability": [{
"this-server-name": "standby-dhcp",
"mode": "hot-standby",
"heartbeat-delay": 10000,
"max-response-delay": 60000,
"max-ack-delay": 5000,
"max-unacked-clients": 5,
"peers": [{
"name": "primary-dhcp",
"url": "http://dhcp-1:8001/",
"role": "primary",
"auto-failover": true
}, {
"name": "standby-dhcp",
"url": "http://dhcp-2:8001/",
"role": "standby",
"auto-failover": true
}]
}]
}
}]
```
next-stable-2.6

https://gitlab.isc.org/isc-projects/kea/-/issues/2897
Cross-check - server should check its HA partner config
2023-06-15T13:50:50Z
Tomek Mrugalski

Here's an idea for a new HA capability. On startup (or when an explicit command is called), the server retrieves its partner's configuration with `config-get` and checks it for consistency: whether the subnets and pools are defined the same way, whether the subnet-ids match, etc.
Right now the doc says those should be the same, with the only difference being server-name, but we don't check it.
What to do with spotted differences is to be determined. We could print a warning, refuse HA connection, shutdown, or even maybe the primary attempt to fix its partner's config.
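Until such a capability exists, the check can be approximated externally: retrieve each partner's configuration via `config-get` over the control channel and compare the trees while ignoring the fields documented to differ. A minimal sketch (the normalization covers only `this-server-name`; fetching the configs is left out):
```
import copy
import json

def configs_consistent(cfg_a: dict, cfg_b: dict) -> bool:
    """Compare two Dhcp4 configuration trees, ignoring 'this-server-name',
    the one field expected to differ between HA partners."""
    def normalized(cfg: dict) -> str:
        cfg = copy.deepcopy(cfg)
        for hook in cfg.get("hooks-libraries", []):
            for ha in hook.get("parameters", {}).get("high-availability", []):
                ha.pop("this-server-name", None)
        # Canonical serialization so key order does not matter.
        return json.dumps(cfg, sort_keys=True)
    return normalized(cfg_a) == normalized(cfg_b)
```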
This is merely an idea. If we like it, the first step would be to turn this into a more coherent design. Hence the ~design.
backlog

https://gitlab.isc.org/isc-projects/kea/-/issues/2775
HA hook's URLs should support DNS resolution with configurable re-resolution
2023-04-06T13:43:06Z
Tobias Florek

---
name: Feature request
about: Allow using DNS resolution in HA hook's URLs
---
**Some initial questions**
- Are you sure your feature is not already implemented in the latest Kea version? **yes**
- Are you sure what you would like to do is not possible using some other mechanisms? **not reasonable**
- Have you discussed your idea on kea-users or kea-dev mailing lists? **no**
**Is your feature request related to a problem? Please describe.**
I am deploying HA Kea on Kubernetes where (using SDNs) pod(/container) IPs are not constant. The hostname can be made persistent though.
I can create a Kubernetes Service per pod, which assigns a so-called cluster IP that is stable and gets redirected to the pod.
This works fine for HA communication through the control agent, but not when using the dedicated listener.
**Describe the solution you'd like**
Preferably allow using DNS (re-)resolution for HA hook's URLs. Or allow specifying the listener's bind-address.
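For illustration only, a sketch of what the requested behaviour might look like in configuration; the `dns-re-resolution-interval` parameter is invented here and does not exist in Kea, and the Kubernetes-style hostnames are assumptions:
```
"high-availability": [{
    "this-server-name": "server1",
    "mode": "hot-standby",
    // hypothetical parameter: re-resolve peer hostnames every 60 seconds
    "dns-re-resolution-interval": 60000,
    "peers": [{
        "name": "server2",
        "url": "http://kea-2.kea.svc.cluster.local:8001/",
        "role": "standby",
        "auto-failover": true
    }]
}]
```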
**Funding its development**
Kea is run by ISC, which is a small non-profit organization without any government funding or any permanent sponsorship organizations. Are you able and willing to participate financially in the development costs? **no**
**Participating in development**
Are you willing to participate in the feature development? The ISC team always tries to make a feature as generic as possible, so it can be used in a wide variety of situations. That means the proposed solution may be a bit different than you initially thought. Are you willing to take part in the design discussions? Are you willing to test unreleased engineering code? **yes**
**Contacting you**
preferably via gitlab.
backlog

https://gitlab.isc.org/isc-projects/kea/-/issues/2714
RFE: HA plugin ability to detect partner inability to receive client requests and transition it to 'partner-down'
2023-07-31T14:12:57Z
Kevin Fleming

---
name: Feature request
about: HA plugin ability to detect partner inability to receive client requests and transition it to 'partner-down'
---
**Some initial questions**
- Are you sure your feature is not already implemented in the latest Kea version? Yes
- Are you sure what you would like to do is not possible using some other mechanisms? Yes
- Have you discussed your idea on kea-users or kea-dev mailing lists? Yes
**Is your feature request related to a problem? Please describe.**
(This issue was created as a result of an extensive thread on kea-users)
When the HA plugin is being used in either hot-standby or load-balancing mode, Kea peers are able to notice some forms of communications failures and force the other peers to the 'partner-down' state in order to provide service to clients supported by the other peer.
However, in a situation where client requests are not being delivered to a peer, but it is otherwise fully operational, including the peer-to-peer communications link, clients supported by that peer will not be serviced, and the other peer(s) are unable to notice the issue and take action to correct it. This situation could arise when the Kea peers use separate network links for client traffic and HA traffic, or when the Kea peers receive client traffic via a DHCP relay and the relay configuration is incorrect.
**Describe the solution you'd like**
One (or more) opt-in mechanisms that the Kea admin can choose to enhance the ability to detect peer failures to service clients, even when the peer's Kea daemon is otherwise fully operational.
**Describe alternatives you've considered**
Some discussions about external monitoring solutions have occurred, and that is certainly an option which some admins could choose.
**Funding its development**
Kea is run by ISC, which is a small non-profit organization without any government funding or any permanent sponsorship organizations. Are you able and willing to participate financially in the development costs? Yes
**Participating in development**
Are you willing to participate in the feature development? The ISC team always tries to make a feature as generic as possible, so it can be used in a wide variety of situations. That means the proposed solution may be a bit different than you initially thought. Are you willing to take part in the design discussions? Are you willing to test unreleased engineering code? Yes
next-stable-2.6

https://gitlab.isc.org/isc-projects/kea/-/issues/2708
HA pool rebalancing
2023-02-02T14:23:33Z
Tomek Mrugalski

This idea is not new. It was recently brought up by @cathya in Porto (see [notes](https://pad.isc.org/p/porto2022-kea-features-for-stork#L58)). The overall concept is to design and implement a mechanism similar to the one in ISC DHCP. When there are two servers in load-balancing mode, it is possible that one of them will run out of addresses while the other one still has many.
A couple of random comments:
- The pool rebalancing would somehow make both partners negotiate the pools and rebalance them.
- Using a hysteresis approach with high/low thresholds would keep the mechanism from thrashing as addresses run out; we don't want it flip-flopping when only one or two addresses are left.
- The pool dynamism would add extra complexity, as the modified pool range would need to be stored somewhere that survives crashes, reboots, etc.
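The hysteresis idea above can be sketched as follows; the thresholds are illustrative assumptions, not proposed Kea parameters:
```
def needs_rebalance(free_ratio: float, rebalancing: bool,
                    low: float = 0.10, high: float = 0.25) -> bool:
    """Hysteresis decision: start rebalancing only when the free-address
    ratio drops below `low`, and keep going until it recovers past `high`.
    The gap between the two thresholds prevents oscillation around a
    single cutoff value."""
    if rebalancing:
        return free_ratio < high
    return free_ratio < low
```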
This requires a ~design. It's a complicated feature request with a high potential for endless tweaks, conflicting tuning requests etc.
We will do it one day, but this would require a lot of design, testing and tuning.
outstanding

https://gitlab.isc.org/isc-projects/kea/-/issues/2700
HA Load-Balancing Network issue detection between Relay and Kea
2023-01-26T15:22:15Z
Mathias Aichinger

Hi,
I have already tried to resolve this issue with the kea users community, but it seems not many are using HA Load Balancing.
I have the following problem.
Scenario:
Multiple DHCP-Relays at different sites with both KEA-Servers as DHCP-Servers. Both servers are available and the load balancing shifts the requests between the two servers.
Incident: Because of a network issue, Kea 1 is not reachable from the clients. The network connection between Kea 1 and Kea 2 still works, so there is no partner-down state.
Expected behaviour: Kea 2 sees the unacked clients of Kea 1, puts Kea 1 into the partner-down state, and handles all requests.
Experienced behaviour: Kea 2 still reports HA_BUFFER4_RECEIVE_NOT_FOR_US and does not handle the requests. Unacked clients are not counted.
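For context on HA_BUFFER4_RECEIVE_NOT_FOR_US: in load-balancing mode each server deterministically owns roughly half of the clients, based on a hash of the client identifier, and outside the partner-down state it drops packets that fall into its partner's scope. A toy illustration of that idea (this is NOT Kea's actual hash function):
```
def owning_scope(client_id: bytes) -> str:
    """Toy stand-in for Kea's load-balancing hash: map each client
    deterministically to one of the two servers' scopes."""
    bucket = sum(client_id) % 256
    return "server1" if bucket < 128 else "server2"
```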
Is there a misunderstanding or configuration mistake on my side?
```
{
"library": "/usr/local/lib/kea//hooks/libdhcp_ha.so",
"parameters": {
"high-availability": [
{
"this-server-name": "server2",
"mode": "load-balancing",
"heartbeat-delay": 10000,
"max-response-delay": 60000,
"max-ack-delay": 10000,
"max-unacked-clients": 1,
"delayed-updates-limit": 100,
"peers": [
{
"name": "server1",
"url": "http://192.168.248.1:8080/",
"role": "primary",
"auto-failover": true
},
{
"name": "server2",
"url": "http://192.168.248.2:8080/",
"role": "secondary",
"auto-failover": true
}
]
}
]
}
}
```
Thank you,
Mathias
backlog