The `reset` function clears the permutation state and allows the allocator to start offering the same addresses again.
|
|
|
|
|
The permutations will be maintained in the pools' allocation states. The random allocator will call their `next` functions to get the next available addresses for the respective pools until the permutations exhaust their addresses. The allocator will randomly pick a pool from those whose permutations still hold unoffered addresses. When all permutations are exhausted, the allocator will reset them and start giving out the same addresses again.
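The pool-picking step described above can be sketched as follows. This is an illustrative simplification, not the actual Kea API: the `PoolState` structure and `pickPool` function are hypothetical, and we assume each pool's state exposes a count of addresses its permutation has not yet offered.

```c++
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for a pool's allocation state: how many addresses
// the pool's permutation has not yet offered.
struct PoolState {
    uint32_t remaining;
};

// Returns the index of a randomly selected pool that still has unoffered
// addresses, or -1 when all permutations are exhausted (the caller would
// then reset the permutations and start over).
int pickPool(const std::vector<PoolState>& pools, std::mt19937& rng) {
    std::vector<int> candidates;
    for (size_t i = 0; i < pools.size(); ++i) {
        if (pools[i].remaining > 0) {
            candidates.push_back(static_cast<int>(i));
        }
    }
    if (candidates.empty()) {
        return -1;  // all permutations exhausted
    }
    std::uniform_int_distribution<size_t> dist(0, candidates.size() - 1);
    return candidates[dist(rng)];
}
```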
|
|
|
|
|
|
|
|
|
## Free Lease Queue Allocator
|
|
|
|
|
|
The free lease queue (FLQ) is a structure holding information about available leases in various IP address/prefix ranges. This structure has the following properties:
|
|
|
- holds the information about free leases for all configured pools for which `free-leases-queue` is `true`,

- can be used as a double-ended queue for an IP range, i.e., free leases are appended to the end of the queue and picked from its front,

- is optimized for appending a reclaimed lease at the end of the queue for the address range to which this lease belongs, i.e., the appropriate queue should be quickly found for a reclaimed lease and the lease should be appended to this queue.
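The shape described by these properties can be sketched as a map from a range's first address to a double-ended queue of free addresses. This is a simplified illustration, not the Kea implementation: the `SimpleFlq` type is hypothetical, a plain integer stands in for `asiolink::IOAddress`, and range upper bounds are omitted for brevity.

```c++
#include <cstdint>
#include <deque>
#include <map>

using Address = uint32_t;  // stand-in for asiolink::IOAddress

struct SimpleFlq {
    // Key: lower bound of a configured range; value: free addresses in it.
    std::map<Address, std::deque<Address>> queues_;

    // Find the queue for the range containing the address: the greatest
    // range start that is <= address (upper-bound checks omitted here).
    std::deque<Address>* findQueue(Address addr) {
        auto it = queues_.upper_bound(addr);
        if (it == queues_.begin()) {
            return nullptr;  // no range starts at or below this address
        }
        return &std::prev(it)->second;
    }

    // Reclaimed leases go to the back; allocations come from the front.
    bool append(Address addr) {
        auto* q = findQueue(addr);
        if (!q) {
            return false;
        }
        q->push_back(addr);
        return true;
    }
};
```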
|
|
|
### The Concept
|
|
|
|
|
|
Free leases are populated into the appropriate queues during the server's startup or reconfiguration. The server iterates over the entire IP range (or optionally a part of it) and checks, for each address or delegated prefix, whether it is free (there is no valid lease for it). Each free lease is appended to the FLQ. The order in which the leases are appended to the FLQ is determined by the `IPRangePermutation` instance for the given pool.
|
|
|
The FLQ allocator state holds a data structure with available leases. The server populates this data structure during the startup or reconfiguration. It is fair to say that this allocator moves the checks of whether a lease is available for assignment from the DHCP packet processing stage to the server's reconfiguration stage. The server's boot time can be significantly longer, but the lease allocation should, in some cases, be considerably faster. The performance should especially improve for highly utilized pools (i.e., pools from which many addresses have already been allocated) or fragmented pools (pools with non-contiguously assigned addresses).
|
|
|
|
|
|
The process of populating free leases is expensive and may significantly delay the server's reconfiguration. Thus, the server should compare the pools before and after reconfiguration and populate free leases only for new or modified pools. If no pools have been added or modified, no leases are populated. The free leases will always be populated when the server is (re)started.
|
|
|
The free lease queues are populated for each pool when necessary (e.g., when the pool is created or modified). The allocator (more precisely, its allocation state) sends a query to the lease backend to retrieve all leases in the pool range. For each address or delegated prefix (resource) in the pool, it checks whether the backend returned a valid lease. If there is no lease, the resource is inserted into the free lease queue belonging to the pool's state. The free lease queue can shuffle inserted leases using the permutations implemented for the random allocator.
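The population step for a single pool can be sketched as follows. This is a hedged illustration, not Kea code: the set of leased addresses stands in for the backend's answer to a single range query, and the `permuted_order` vector stands in for the `IPRangePermutation` output; `populateFreeLeases` is a name we made up for the sketch.

```c++
#include <cstdint>
#include <deque>
#include <set>
#include <vector>

using Address = uint32_t;  // stand-in for asiolink::IOAddress

// Build the free lease queue for one pool: walk the pool's addresses in
// permuted order and keep those for which the backend returned no valid lease.
std::deque<Address> populateFreeLeases(Address first, Address last,
                                       const std::set<Address>& leased,
                                       const std::vector<Address>& permuted_order) {
    std::deque<Address> free_queue;
    for (Address addr : permuted_order) {
        if (addr < first || addr > last) {
            continue;  // outside this pool's range
        }
        if (leased.count(addr) == 0) {
            free_queue.push_back(addr);  // no valid lease: resource is free
        }
    }
    return free_queue;
}
```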
|
|
|
|
|
|
The free lease queue implementation for a pool internally comprises two containers. Initially, all free leases are inserted into the first container and the second container is empty. When the allocation engine requests a lease assignment, the allocator returns one from the first queue and moves it to the second container. The lease is now "offered" - a transient state between "free" and "allocated".

The `FreeLeaseQueue` class includes, but is not limited to, the following calls for managing and retrieving free leases:
|
|
|
|
|
|
```c++
// Used in lease reclamation to append a reclaimed lease to the end of the queue.
bool append(const asiolink::IOAddress& address);
bool append(const asiolink::IOAddress& prefix, const uint8_t delegated_length);

// Used when free leases are populated to append a free lease to the range.
void append(const AddressRange& range, const asiolink::IOAddress& address);
void append(const PrefixRange& range, const asiolink::IOAddress& prefix);

// Used during lease allocation to indicate that the given lease is being
// allocated and should be removed from the FLQ.
bool use(const AddressRange& range, const asiolink::IOAddress& address);
bool use(const PrefixRange& range, const asiolink::IOAddress& prefix);

// Returns the next candidate lease without removing it from the FLQ.
template<typename RangeType>
asiolink::IOAddress next(const RangeType& range);

// Returns the next candidate lease and removes it from the FLQ.
template<typename RangeType>
asiolink::IOAddress pop(const RangeType& range);
```
|
|
|
The server can allocate a lease in many different ways: in a 4-way exchange (e.g., DORA), by a `lease_cmds` hook library command, by an HA lease update (also via the `lease_cmds` hook library), or by database replication when multiple servers share the database. When an allocated lease expires, the server should make it available for allocation again. Allocations and expirations occur continuously in the server, making some leases unavailable for allocation and others available.
|
|
|
|
|
|
Currently, the allocator does not directly participate in the lease allocation and reclamation. This works for the iterative and random allocators because they are stateless. The FLQ allocator is stateful: it must track the allocations and reclamations to ensure consistency between the free lease queues and the lease database.
|
|
|
|
|
|
In all cases, except the database replication, leases are assigned and reclaimed using the lease manager. Introducing events into the lease manager allows for tracking lease changes. The FLQ allocator can subscribe to the events emitted during lease creation, update, and deletion. For example, when the server allocates a lease, the allocator receives an event and removes the lease from the free lease queue. When the lease is deleted or reclaimed, the allocator puts it back into the free lease queue. This solution will also work for leases managed by the `lease_cmds` hook.
|
|
|
|
|
|
The events mechanism can use simple callbacks. The `EventfulLeaseMgr` should derive from the abstract `LeaseMgr` and provide the mechanics to install callbacks for specific lease manager calls. Functions such as `addLease4` and `addLease6` should have default implementations in the `EventfulLeaseMgr` but remain pure virtual (C++ permits a pure virtual function to have a body), so it remains mandatory to implement them in the concrete backend implementations. The concrete implementations should call the default implementations to emit events when callbacks are installed.
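The callback mechanics can be sketched as below. This is a simplified, hypothetical illustration: the `Lease` structure, the callback signature, and the `EventfulLeaseMgrSketch` class are ours; the real `EventfulLeaseMgr` would wrap the `LeaseMgr` calls such as `addLease4` and `addLease6`.

```c++
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Lease {
    std::string address;
};

class EventfulLeaseMgrSketch {
public:
    using Callback = std::function<void(const Lease&)>;

    // Subscribers (e.g., the FLQ allocator) install callbacks here.
    void subscribeOnAdd(Callback cb) { on_add_.push_back(std::move(cb)); }

    // A concrete backend would perform the actual insert, then call this
    // default implementation to emit the event to installed callbacks.
    void addLease(const Lease& lease) {
        for (const auto& cb : on_add_) {
            cb(lease);
        }
    }

private:
    std::vector<Callback> on_add_;
};
```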
|
|
|
|
|
|
Not only are the lease manager events useful for FLQ, but they can also have a variety of other applications. For example, they could be used to log lease changes so that they can be tracked in Stork.
|
|
|
|
|
|
### Common Database and Database Replication Case
|
|
|
|
|
|
Users often deploy multiple Kea servers connected to a single database instance or a cluster of databases sharing the lease information for redundancy. FLQ can work in this setup, but, like other allocators, it is prone to lease allocation conflicts. Each server maintains its own free lease queue populated when the server starts up. However, the lease queues are not updated when the partner servers allocate leases. The information about the lease is only available in the database. Hence, the allocators will sometimes offer leases that are already allocated by the other servers. This is the same situation we presently have with the iterative allocator.
|
|
|
|
|
|
One way to mitigate this problem is by introducing signaling between the allocation engine and the allocator to notify about failed allocation attempts due to lease conflict. In this case, the allocator could remove such a lease from its queue.
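The proposed signaling can be sketched in a few lines. The `markConflicted` function is a hypothetical name for the notification handler; when the allocation engine finds that a candidate lease is already taken in the shared database, the allocator drops the conflicting entry from its queue.

```c++
#include <algorithm>
#include <cstdint>
#include <deque>

using Address = uint32_t;  // stand-in for asiolink::IOAddress

// Remove a lease from the free lease queue after the allocation engine
// reports a conflict (another server allocated it in the shared database).
void markConflicted(std::deque<Address>& free_queue, Address conflicted) {
    auto it = std::find(free_queue.begin(), free_queue.end(), conflicted);
    if (it != free_queue.end()) {
        free_queue.erase(it);
    }
}
```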
|
|
|
|
|
|
### New Lease API Calls
|
|
|
|
|
|
|
|
|
When populating free leases for an IP range the server must check if the lease exists for each address or delegated prefix in this range. This has significant performance implications for SQL lease database backends. It is unacceptable to make a separate database query for each address or delegated prefix. Instead, the backends should provide the API calls for fetching all valid leases within a given IP range. Then, the server can iterate over all addresses or delegated prefixes within the IP range and use the in-memory copy of the leases to decide which address/delegated prefix is free and should be recorded within the FLQ. This reduces the number of database queries to one per address range.
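The shape of such a range-fetch call can be sketched as follows. The `FakeBackend` and `getLeasesInRange` names are illustrative only (not a proposed Kea signature); an in-memory map stands in for the SQL backend, and the query counter shows that one call covers the whole range.

```c++
#include <cstdint>
#include <map>
#include <vector>

using Address = uint32_t;  // stand-in for asiolink::IOAddress

struct FakeBackend {
    std::map<Address, bool> leases_;  // address -> valid lease present
    int query_count_ = 0;

    // One query returns every lease in [lower, upper], instead of one
    // query per address or delegated prefix.
    std::vector<Address> getLeasesInRange(Address lower, Address upper) {
        ++query_count_;
        std::vector<Address> result;
        for (auto it = leases_.lower_bound(lower);
             it != leases_.end() && it->first <= upper; ++it) {
            result.push_back(it->first);
        }
        return result;
    }
};
```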
|
|
|
|
|
|
### Lease Reclamation
|
|
|
|
In the case of the FLQ allocation strategy, lease reuse will almost never occur.
|
|
|
|
|
When the server is started and the FLQ is in use, the server must make it clear to the administrator (by issuing a log message) that periodic lease reclamation must be enabled. The frequency of the lease reclamation is also important: if reclamation runs too rarely, the FLQ may become exhausted, causing allocations to fail even though there may be expired leases available for reuse.
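For illustration, the reclamation frequency is controlled by the `expired-leases-processing` parameters in the Kea configuration. The parameter names below exist in Kea; the values are examples only, not recommendations:

```json
{
    "Dhcp4": {
        "expired-leases-processing": {
            "reclaim-timer-wait-time": 10,
            "flush-reclaimed-timer-wait-time": 25,
            "max-reclaim-leases": 100,
            "max-reclaim-time": 250
        }
    }
}
```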
|
|
|
|
|
|
|
|
|
### Optimizations for Large Pools
|
|
|
|
|
|
The FLQ mechanism is aimed at dealing with performance degradation when pools are nearly exhausted. This is rarely the case for very large pools, and it is assumed that for these pools administrators will rather disable the FLQ. However, it is hard to state definitively when a pool is too large for the FLQ and when using the FLQ is beneficial with an acceptable server startup time. It depends on many factors, and it may require some operational experience before well-informed decisions can be made. Some administrators may want to leave the FLQ enabled for all pools initially. Others can make configuration errors and leave the FLQ enabled for a pool as large as a /64. The server should handle these cases gracefully and must not freeze on startup.
|
|
|
|
The server stops refilling a given pool with leases when the `IPRangePermutation` for this pool is exhausted.
|
|
|
|
|
The presented solution is compatible with all supported lease database backends. It can also be used with proprietary lease database backends, should they be created. The backends must implement new calls fetching valid leases for an IP range, described earlier in this design.
|
|
|
|
|
|
Because the server maintains the FLQ in its memory and free leases are populated during startup, reconfiguration, or while refilling the FLQ, there is a possibility that the information within the FLQ and the lease database diverge. For example, this is the case when two or more Kea servers share the lease database and independently allocate leases. One of the servers "thinks" that a lease is free, but the other server has allocated this lease to another client. This design does not deal with such conflicts in any special way. It relies on the existing allocation engine implementation, which already resolves such conflicts. If the engine finds that a lease exists for the candidate address or delegated prefix, it simply gets the next candidate from the allocator. The process ends after as many attempts as the size of the pool. In the case of a shared database, this certainly increases the average number of calls to the FLQ to get a free lease; on the other hand, the risk of such collisions should be partially mitigated by the use of permutations.