Shared network with separate subnets for HRs and pool - a client matching an HR is issued a lease from the pool but with the wrong subnet ID
The scenario is a shared network that contains two subnets.
One subnet has no pool addresses, but has a number of host reservations for clients. These clients are matched on the basis of information added by their relay.
The other subnet has the pool addresses for unreserved clients.
Something goes wrong with the mechanism for identifying clients, so the same relay-info ends up associated with more than one client. This means that sometimes, when a client needs a lease, it is matched to a host reservation by the information added by the relay, but it can't be allocated the address in that host reservation because the address is already in use.
The Kea server instead OFFERs a pool address from the other subnet in the shared network. The lease is then written to the lease database, but with the subnet ID of the host-reservation subnet, not that of the subnet containing the pool. The client operates normally, and the Kea server itself does not appear to object (although I anticipate that there might be a problem when restarting and loading these leases whose subnet and address do not match). But in an HA environment, the lease update is rejected by the other servers because the subnet ID is incorrect for the address of the lease.
(Note that the above is a production environment issue, but other circumstances could also lead to the address in an HR that matches a 'new' client not actually being available to be granted, so I think we should look at this more widely than just the scenario presented above. There is also an additional scenario, which I think would take a different code path, where a client is offered the HR address but then sends back DHCPDECLINE because it detects that the address is already in use locally, even though the Kea server did not issue that lease. Please consider this scenario too when looking at reasons why an address associated with an HR cannot be OFFERed.)
Here's the logging by HA when the lease update is rejected:
2022-04-28 22:42:39.410 ERROR [kea-dhcp4.callouts/9787.140701841197248] HOOKS_CALLOUT_ERROR error returned by callout on hook $lease4_update registered by library with index 1 (callout address 0x7ff7aa57bf30) (callout duration 0.074 ms)
2022-04-28 22:42:41.546 ERROR [kea-dhcp4.callouts/9787.140701841197248] HOOKS_CALLOUT_ERROR error returned by callout on hook $lease4_update registered by library with index 1 (callout address 0x7ff7aa57bf30) (callout duration 0.066 ms)
2022-04-28 22:30:07.013 WARN [kea-dhcp4.ha-hooks/26493.139690055428288] HA_LEASE_UPDATE_FAILED [hwtype=1 ce:47:47:XX:XX:XX], cid=[01:ce:47:47:XX:XX:XX], tid=0x5fc5fba0: lease update to SERVER-NAME-REDACTED (http://XXX.XXX.XX.XX:8080) failed: The address 10.1.XX.XX does not belong to subnet 192.2.XX.XX/24, subnet-id=6, error code 1
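To make the mismatch concrete, here is a minimal Python sketch of the kind of consistency check that the "does not belong to subnet" error implies: a lease is only acceptable if its address lies inside the prefix of the subnet whose ID it carries. This is not Kea's code. The subnet IDs, the prefixes (documentation ranges) and the function names are all made up for illustration.

```python
import ipaddress

# Hypothetical subnet table: subnet-id -> prefix. IDs and prefixes are
# illustrative only and are not taken from the affected deployment.
SUBNETS = {
    6: ipaddress.ip_network("192.0.2.0/24"),     # host reservations only, no pool
    7: ipaddress.ip_network("198.51.100.0/24"),  # pool for unreserved clients
}


def lease_matches_subnet(address: str, subnet_id: int) -> bool:
    """Return True if the lease address lies inside the prefix of subnet_id."""
    network = SUBNETS.get(subnet_id)
    if network is None:
        # Unknown subnet ID: treat the lease as inconsistent.
        return False
    return ipaddress.ip_address(address) in network


def find_mismatched(leases):
    """Return the (address, subnet_id) pairs whose address is outside the subnet."""
    return [(addr, sid) for addr, sid in leases if not lease_matches_subnet(addr, sid)]


leases = [
    ("192.0.2.10", 6),     # reserved address, correct subnet ID
    ("198.51.100.50", 7),  # pool address, correct subnet ID
    ("198.51.100.51", 6),  # pool address stored with the reservation subnet's ID
]

print(find_mismatched(leases))  # [('198.51.100.51', 6)]
```

A check like this could be run over an export of the lease database to find which leases were written with the wrong subnet ID, assuming the export provides each lease's address and subnet ID.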
And here's the logging on the active server when it hits the issue with the host reservation because another client has already been issued the address associated with the matching HR:
2022-04-28 22:36:41.489 WARN [kea-dhcp4.alloc-engine/26493.139690055428288] ALLOC_ENGINE_V4_DISCOVER_ADDRESS_CONFLICT [hwtype=1 08:9b:b9:XX:XX:XX], cid=[01:08:9b:b9:XX:XX:XX], tid=0xdcec1449: conflicting reservation for address 10.1.XX.XX with existing lease Address: 10.1.XX.XX
Version 2.0.0 (package)