During traffic spikes that exceed Kea's throughput capacity, handle backlog more effectively
The current Kea implementation processes the inbound socket buffer as a simple first-in, first-out queue. When the server is under pressure and cannot handle client packets as fast as they arrive, a backlog builds up.
If the situation persists for long enough, the packets the server is handling will already have timed out on the client side, so processing them is wasted effort; worse, the time wasted on these old packets prevents the server from reaching newer ones until they too have timed out. Effectively, the server stops responding to active clients because it never works through the backlog fast enough to reach the most recent arrivals.
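One way to avoid burning cycles on packets the client has already given up on is to timestamp each packet on receipt and discard anything older than the client's retransmission timeout before processing it. A minimal sketch of the idea; the names, the `TimedPacket` wrapper, and the timeout value are all illustrative, not existing Kea code:

```cpp
#include <chrono>
#include <cstddef>
#include <deque>

using Clock = std::chrono::steady_clock;

// Hypothetical wrapper: a raw packet plus the time it was pulled off the socket.
struct TimedPacket {
    Clock::time_point received;
    // ... raw packet payload would live here ...
};

// Drop packets that have sat in the backlog longer than max_age; the client
// has almost certainly timed out and retransmitted (or given up) by now.
// Returns the number of packets discarded.
inline std::size_t discardStale(std::deque<TimedPacket>& backlog,
                                Clock::duration max_age,
                                Clock::time_point now) {
    std::size_t dropped = 0;
    while (!backlog.empty() && now - backlog.front().received > max_age) {
        backlog.pop_front();
        ++dropped;
    }
    return dropped;
}
```

Because the queue is FIFO, only the front needs checking: once a packet young enough to keep is found, everything behind it is younger still.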
Even once the initial traffic spike has subsided, the degraded performance can change client behaviour: clients retransmit and/or revert to initial discovery, adding yet more packets to the backlog and making recovery unlikely without restarting the server to clear things down.
We need to handle this situation better so that even when swamped, Kea servers are able to process a proportion of recently received client packets, rather than none at all because they are stuck working through the oldest ones.
Suggestions mooted so far centre on an independent socket-reading thread (or process) to manage inbound traffic, pulling packets off the sockets/interfaces on which the Kea server is listening. This would prevent the UDP buffers from overflowing and allow the socket reader to apply better logic to:
- discarding the oldest client packets in favour of the most recently received
- managing the 'waiting' buffers appropriately to the throughput capacity of the server
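The first bullet can be illustrated with a bounded "drop oldest" buffer between the socket reader and the worker: when the buffer is full, the newest packet displaces the oldest rather than being lost at the socket. This is only a sketch of the policy under assumed names, not Kea code, and a real reader thread would additionally need a mutex or lock-free structure around it:

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical bounded queue with a drop-oldest overflow policy: the socket
// reader pushes as fast as packets arrive; when the server falls behind,
// the oldest (most likely already timed-out) packets are sacrificed first,
// so the worker always sees the most recently received traffic.
template <typename Packet>
class DropOldestQueue {
public:
    explicit DropOldestQueue(std::size_t capacity) : capacity_(capacity) {}

    // Enqueue a packet; returns true if an old packet was evicted to make room.
    bool push(Packet p) {
        bool evicted = false;
        if (queue_.size() >= capacity_) {
            queue_.pop_front();   // discard the stalest packet
            evicted = true;
        }
        queue_.push_back(std::move(p));
        return evicted;
    }

    // Dequeue the oldest remaining packet; returns false if the queue is empty.
    bool pop(Packet& out) {
        if (queue_.empty()) {
            return false;
        }
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::size_t capacity_;
    std::deque<Packet> queue_;
};
```

The eviction count is worth surfacing as a statistic: a steadily climbing figure is the operator's signal that the server is running beyond capacity.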
Maximum per-server throughput will depend heavily on both configuration and the choice of back-end (e.g. a database or memfile, and if a database, how and where it is hosted) - so it would be good for the I/O handler to be tunable too: a fast server should not discard packets prematurely, and so on.
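To make the tunability point concrete, the knobs might look like the struct below. These are hypothetical parameters invented for illustration, not existing Kea configuration, and the sizing rule is only a rough heuristic: a buffer deeper than what the server can drain within the maximum packet age just queues packets that are destined to be dropped anyway.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical tuning knobs for the socket-reader thread; none of these are
// existing Kea parameters. A fast back-end (e.g. memfile) could tolerate a
// deeper buffer and shorter max age than a remote database back-end.
struct ReaderTuning {
    std::size_t buffer_packets = 256;               // drop-oldest buffer depth
    std::chrono::milliseconds max_packet_age{2000}; // discard if older than this
};

// Rough sizing heuristic: the deepest buffer worth having is the number of
// packets the server can drain within max_age at its measured throughput.
inline std::size_t maxUsefulBuffer(double packets_per_second,
                                   std::chrono::milliseconds max_age) {
    return static_cast<std::size_t>(
        packets_per_second * static_cast<double>(max_age.count()) / 1000.0);
}
```

For example, a deployment measured at 1,000 leases/second with a 2-second packet-age cutoff gains nothing from a buffer deeper than about 2,000 packets.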
There is no clear operational mitigation for this, other than provisioning sufficient headroom so that no peak in client traffic can exceed the servers' maximum capacity.
(Notably, increasing the inbound UDP buffers is likely to make the situation worse rather than better: a larger buffer merely allows a longer queue of already-stale packets to accumulate ahead of the recent ones.)