This is a draft document and we are actively working on it. It is subject to significant changes.
Tracking DHCP Leases Design
Users would like to use Stork to monitor DHCP leases assigned to clients by different Kea servers. They would like to see which clients the leases are assigned to, troubleshoot issues with particular leases, search for leases belonging to a given client, and more. Leases are at the center of interest for a DHCP server operator because a DHCP server's dominant role is to assign them. Failures in lease assignments cause network subscribers to be out of service. An operator needs to restore the service for them as quickly as possible, which gets harder in large networks with many subnets and many leases. Stork should provide tools to isolate issues with lease allocations, e.g. search for a culprit lease, browse lease information, check who uses the lease and when it expires. Specific use cases are described in detail in further sections.
A lease database may grow very large even for a single DHCP server. Lease database sizes range from hundreds to millions of leases. Kea servers provide an API to perform lease queries. Using this API directly is often convenient when looking for some existing information, e.g. finding a specific lease or finding all leases from a selected subnet. The use of the API also has the advantage that the Kea servers return up-to-date lease information. However, excessive use of the API commands may impact the Kea server operation when generating large responses. For example, dumping all leases periodically via the API can significantly increase memory consumption. In some cases, it may also impact the DHCP server's performance due to SQL table locking or pausing the DHCP operation while the server is processing a command in single-threaded mode. Finally, some information is not directly available in the lease database and may only be produced by analyzing lease allocations over time and, possibly, correlating the leases with some other data. The Kea API is not useful in these cases.
Gathering lease information from Kea during allocations makes it available for processing outside of Kea, depending on the needs. It reduces the load on the Kea servers that no longer have to process the commands, avoids costly conversions to JSON, and provides flexibility in how and where to process the data. However, gathering the lease information puts a constant load on the Kea servers to output the lease information. Also, it requires Stork to process a potentially large amount of data periodically, which may significantly impact Stork's performance.
In large deployments, an administrator may need to find a compromise between the amount of collected data and the selection of utilities available in Stork. It should be fully configurable.
This section lists different use cases for gathering and analyzing the lease information from the Kea servers that appeared in various conversations so far.
Network Bird's-Eye View - An administrator wants to see a graphical view of all lease allocations in the monitored network. The address space is divided into boxes representing 1-16 IP addresses depending on the subnet size. The box color intensity represents address utilization for the given range of addresses.
Client's Lease State - An administrator would like to see what is happening with a given client's lease. They would typically like to see the complete information about the allocated lease.
Client's Lease History - In some cases, an administrator may want to know the history of the client's lease allocations or allocation attempts.
Host Reservation Status - An administrator would like to check if a given reservation is in use.
Host Reservation Last Usage - An administrator would like to check when a given reservation was last used.
Host Reservations Usage History - An administrator may want to see which reservations have never been used. They may also want to see which host reservations have been used for the first time within the last 24 hours.
Subnet Leases Heatmap - An administrator would like to see how often the clients renew the leases within a subnet and generate a list of chatty clients excessively renewing the leases.
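The bird's-eye view use case above boils down to counting leased addresses per fixed-size block of a subnet. The following is a minimal sketch of that computation, not part of the design itself. The function name `block_utilization` and the fixed block size of 16 are assumptions made for illustration, and the loop over blocks is only practical for IPv4-sized subnets.

```python
import ipaddress
from collections import Counter


def block_utilization(subnet, leased_addresses, block_size=16):
    """Map the first address of each block to the fraction of the block that is leased.

    Hypothetical helper: the design only says that a box covers 1-16 addresses.
    """
    net = ipaddress.ip_network(subnet)
    counts = Counter()
    # A set guards against the same address being reported more than once.
    for addr in set(leased_addresses):
        ip = ipaddress.ip_address(addr)
        if ip not in net:
            continue  # ignore leases that belong to other subnets
        offset = int(ip) - int(net.network_address)
        counts[offset // block_size] += 1

    num_blocks = -(-net.num_addresses // block_size)  # ceiling division
    result = {}
    for index in range(num_blocks):
        first = net.network_address + index * block_size
        # The last block may be shorter than block_size.
        size = min(block_size, net.num_addresses - index * block_size)
        result[str(first)] = counts[index] / size
    return result
```

The returned fractions could drive the box color intensity: 0.0 for an unused block and 1.0 for a fully leased one.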
This design strives to optimize transferring leases from Kea servers to Stork. In many cases, having the lease data in Stork rather than Kea is the only way to enable certain use cases, or enabling the use cases would be awkward otherwise. However, some operators may not want to install an additional hook to gather the lease information or consider constantly collecting the lease data an unacceptable overhead. For example, searching for a single lease by IP or MAC address can be achieved by utilizing the Kea command channel. Checking if a lease exists for the particular reservation may also be achieved that way.
Generally, if collecting the lease information is acceptable in the particular deployment, all monitored Kea servers should be configured to dump the lease information via the Stork hook. If collecting the lease information is not desired, none of the Kea servers should dump the lease information. However, we also envisage future requirements for only selected Kea servers to dump the lease information, i.e. a hybrid model. The hybrid model is a future enhancement, and it is out of scope for this document.
The use of direct mode requires that the participating Kea servers have the lease_cmds hooks library loaded. To find information about the particular lease or a set of leases, Stork will have to send a lease command to all monitored Kea servers and aggregate the results. Stork can temporarily cache this information in its database. The following use cases will be enabled using the direct mode:
- find a lease by IP address or MAC address
- check if the reservation is in use
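As an illustration of the direct mode, the sketch below builds a DHCPv4 lease query and merges the answers from several servers. It assumes the `lease4-get` and `lease4-get-by-hw-address` commands of the lease_cmds hooks library and a result code of 0 for a successful lookup. The helper names, the way responses are collected per server, and the omitted transport layer are hypothetical.

```python
def build_lease4_query(ip_address=None, hw_address=None):
    """Build a lease_cmds command for a lookup by IP address or by MAC address."""
    if ip_address is not None:
        command, arguments = "lease4-get", {"ip-address": ip_address}
    elif hw_address is not None:
        command, arguments = "lease4-get-by-hw-address", {"hw-address": hw_address}
    else:
        raise ValueError("either ip_address or hw_address is required")
    # "service" routes the command to the DHCPv4 server when it is sent
    # through the Kea Control Agent.
    return {"command": command, "service": ["dhcp4"], "arguments": arguments}


def aggregate_leases(responses):
    """Merge per-server responses into one list of leases.

    `responses` is assumed to map a server name to the already decoded
    response of that server.
    """
    found = []
    for server, response in responses.items():
        if response.get("result") != 0:
            continue  # an error, or no matching lease on this server
        arguments = response.get("arguments") or {}
        if not arguments:
            continue
        # Lookups by MAC address return a list; lookups by IP return one lease.
        leases = arguments.get("leases", [arguments])
        for lease in leases:
            found.append({"server": server, **lease})
    return found
```

An empty result from `aggregate_leases` for a reserved address would suggest that the reservation is not currently in use on any monitored server.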
Collecting Lease Information (Cached Mode)
Many use cases listed above require collecting some lease information. For example, to track the host reservations' usage history, the Stork server must periodically gather and process the lease information from agents. Next, it must update the host reservation information in the local database. This section describes this process.
The next picture shows a Kea server using a database as lease storage and sharing the lease information with Stork via a lease file.
An operator sends requests via Stork Server to refresh lease information, or the server refreshes this information periodically. It uses the gRPC channel to ask the agent to provide the latest information. The agent should track what data it has already provided to the server. If this is the first time that the server asks for lease updates, e.g. because it is a new agent, the agent sends all the information it has about the leases. Later on, the agent will only be sending the lease updates since the server's last query.
The Lease File (IPC) is a file with the same structure as the Kea lease file. "IPC" stands for Inter-Process Communication because Kea uses this file to provide the lease updates to the Stork Agent, which runs as a different process. Kea is the producer, and the agent is the consumer of the information stored in this file.
Kea stores the leases in the database, but it also outputs them to the IPC file in the same format as the Memfile backend. A new Kea hook library, Stork, outputs the lease updates to that file. The lease updates include new allocations, renewals, and releases.
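To make the file format more concrete, the sketch below parses a few lease updates from CSV text. The column names in `SAMPLE` follow the DHCPv4 Memfile layout as an assumption; the exact set of columns may differ between Kea versions, so the parser reads the header rather than relying on column positions. `LeaseUpdate` and `parse_lease_updates` are hypothetical names.

```python
import csv
import io
from dataclasses import dataclass

# Assumed DHCPv4 lease file layout; illustrative values only.
SAMPLE = """address,hwaddr,client_id,valid_lifetime,expire,subnet_id,fqdn_fwd,fqdn_rev,hostname,state,user_context
192.0.2.10,08:00:27:00:00:01,,3600,1700003600,1,0,0,,0,
192.0.2.10,08:00:27:00:00:01,,3600,1700005400,1,0,0,,0,
192.0.2.11,08:00:27:00:00:02,,0,1700001000,1,0,0,,0,
"""


@dataclass
class LeaseUpdate:
    address: str
    hwaddr: str
    valid_lifetime: int
    expire: int
    subnet_id: int


def parse_lease_updates(csv_text):
    """Parse lease updates, keeping only the fields this sketch cares about."""
    updates = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        updates.append(
            LeaseUpdate(
                address=row["address"],
                hwaddr=row["hwaddr"],
                valid_lifetime=int(row["valid_lifetime"]),
                expire=int(row["expire"]),
                subnet_id=int(row["subnet_id"]),
            )
        )
    return updates
```

Note that the same address may appear more than once, for example after a renewal, which is why later processing steps have to consider the order of the entries.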
The Stork hooks library outputs only the leases allocated or otherwise modified during the server's operation. Leases already present in the database are not written to the IPC file unless the agent requests them. The agent sends such a request when it loses track of leases or when it is a new instance. In that case, the hook library dumps the lease table as CSV. Kea already includes mechanisms to dump the lease database into a file. We may adapt these existing mechanisms to our needs.
Note: We may use the CSV file dump from the database to improve the HA synchronization efficiency in the future. Rather than synchronize the databases using the command channel, the partner servers could exchange the IPC files.
Before the server reads the leases, it sends a new command implemented in the Stork hooks library requesting Kea to move the Lease File (IPC) to a Lease File Copy. Kea creates a new Lease File (IPC) and writes all future updates to this file. The Lease File Copy now contains a snapshot of the most recent lease updates. Reading from the copied file rather than from the original file has two advantages:
- reading this file does not interfere with Kea writing new lease updates to the lease file,
- Stork can delete the copied file after transferring the lease updates to the Stork Server.
Juggling the lease files helps keep them small because they only contain the lease updates issued since the last transfer to the Stork Server.
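The file juggling described above can be modeled with a rename followed by the creation of a fresh file. The sketch below is only an illustration under simplifying assumptions: in the design, Kea performs the move in response to a command and the agent consumes the copy, whereas here both steps are plain functions with hypothetical names.

```python
import os


def rotate_lease_file(ipc_path, copy_path, header):
    """Move the IPC file aside and start a new, empty one with the same header.

    Assumes that a previous copy, if any, has already been consumed;
    os.replace overwrites an existing file at copy_path.
    """
    os.replace(ipc_path, copy_path)  # atomic when both paths are on one filesystem
    with open(ipc_path, "w", newline="") as new_file:
        new_file.write(header + "\n")
    return copy_path


def consume_snapshot(copy_path):
    """Read the lease updates from the copy and delete it afterwards."""
    with open(copy_path, newline="") as snapshot:
        lines = snapshot.read().splitlines()
    os.remove(copy_path)
    return lines[1:]  # skip the header line
```

Because the producer only ever appends to the current IPC file and the consumer only ever reads the copy, the two never operate on the same file at the same time.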
Transferring Large Lease Databases
Stork transfers lease information from the agents in the CSV format. The transferred lease file may generally contain multiple entries for the same lease. The server must first remove all duplicated entries from the file, only leaving the last entry for each lease, to reduce the database load. Also, ensuring that there are no duplicated entries in the processed file allows for simplifying SQL queries inserting the data.
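Removing the duplicated entries can be sketched as a single pass that remembers only the last row seen for each lease. The example assumes that a lease is identified by its address alone and uses a hypothetical helper name; the real key may include more fields.

```python
import csv
import io


def keep_last_entry_per_lease(csv_text, key_fields=("address",)):
    """Return CSV text that contains only the last entry for each lease."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    latest = {}
    for row in reader:
        key = tuple(row[field] for field in key_fields)
        # Drop the older entry first so that the newest one moves to the end.
        latest.pop(key, None)
        latest[key] = row

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(latest.values())
    return out.getvalue()
```

The output keeps the header and preserves the relative order of the most recent updates, so it can be loaded into the database without further sorting.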
The database must have two new tables: lease and lease_update. The latter temporarily holds the lease information copied from the CSV file using the following queries:
DELETE FROM lease_update WHERE app_id=%d;
COPY lease_update(address, hw_address, client_id, valid_lifetime, app_id) FROM '%s' WITH CSV HEADER
Next, the server must copy the temporary data to the lease table using this query:
INSERT INTO lease (address, hw_address, client_id, valid_lifetime, app_id) SELECT * FROM lease_update WHERE app_id = %d ON CONFLICT(address, app_id) DO UPDATE SET valid_lifetime = EXCLUDED.valid_lifetime ...
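The staging-table pattern can be tried out end to end with an in-memory database. The sketch below deliberately swaps PostgreSQL for SQLite so that it is self-contained: `COPY` is replaced with `executemany`, and the table definitions, the unique constraint, and the set of columns updated on conflict are assumptions, since the original query elides them.

```python
import sqlite3

# Assumed schema; the design does not specify column types or constraints.
SCHEMA = """
CREATE TABLE lease (
    address TEXT NOT NULL, hw_address TEXT, client_id TEXT,
    valid_lifetime INTEGER NOT NULL, app_id INTEGER NOT NULL,
    UNIQUE (address, app_id)
);
CREATE TABLE lease_update (
    address TEXT NOT NULL, hw_address TEXT, client_id TEXT,
    valid_lifetime INTEGER NOT NULL, app_id INTEGER NOT NULL
);
"""


def load_lease_updates(conn, app_id, rows):
    """Stage deduplicated rows for one app, then upsert them into lease."""
    cur = conn.cursor()
    cur.execute("DELETE FROM lease_update WHERE app_id = ?", (app_id,))
    cur.executemany(
        "INSERT INTO lease_update (address, hw_address, client_id, valid_lifetime, app_id) "
        "VALUES (?, ?, ?, ?, ?)",
        [(*row, app_id) for row in rows],
    )
    cur.execute(
        """
        INSERT INTO lease (address, hw_address, client_id, valid_lifetime, app_id)
        SELECT address, hw_address, client_id, valid_lifetime, app_id
        FROM lease_update WHERE app_id = ?
        ON CONFLICT(address, app_id) DO UPDATE SET
            hw_address = excluded.hw_address,
            client_id = excluded.client_id,
            valid_lifetime = excluded.valid_lifetime
        """,
        (app_id,),
    )
    conn.commit()
```

In PostgreSQL, an `INSERT ... ON CONFLICT DO UPDATE` statement fails if it would update the same row twice, which is one reason why the staged data should contain at most one entry per lease.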
In #508, we implemented a benchmark that tests the performance of populating the lease information using the following setup:
- 7 CSV files,
- each CSV file includes 1 million lease updates (leases)
- 7 goroutines, each processing a single CSV file
- initially empty database
- 6-core MacBook Pro with SSD
The benchmark loaded 7 million leases in 83s.
When there is a guarantee that the database is empty, the ON CONFLICT clause can be removed. In this case, the total time to populate 7 million leases drops to 42s.
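For a rough sense of scale, the two measurements can be converted into throughput figures. The snippet below is plain arithmetic on the numbers quoted above and does not reproduce the benchmark itself.

```python
TOTAL_LEASES = 7 * 1_000_000  # 7 CSV files with 1 million lease updates each


def throughput(total_leases, seconds):
    """Average number of leases loaded per second."""
    return total_leases / seconds


with_on_conflict = throughput(TOTAL_LEASES, 83)      # about 84,337 leases/s
without_on_conflict = throughput(TOTAL_LEASES, 42)   # about 166,667 leases/s
speedup = 83 / 42                                    # about 1.98x

print(round(with_on_conflict), round(without_on_conflict), round(speedup, 2))
```

In other words, dropping the ON CONFLICT clause roughly doubles the load rate in this setup.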
We also tested lease updates processing time for other CSV files' sizes, simulating incremental lease updates to an already populated database. The results are below.
|Leases per File|Total Leases|Time to Process|
New Hook Point
Kea does not implement appropriate hook points for the use case described in this design. The leases4_committed and leases6_committed hook points are only triggered during DHCP packet processing. If a command causes the lease update, the callouts belonging to these hook points aren't invoked.
It is better to create a new hook point triggered whenever a lease change in the database is applied, i.e., creating a new lease, lease update, or lease deletion. It corresponds to the same places where the current code appends the leases to the lease file. We will implement the new hook point in the Lease Manager. It will have at least one argument, a pointer to the lease. A deleted lease will have a valid lifetime set to 0.
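The intended behavior of the new hook point can be modeled outside of Kea. The sketch below is a Python model, not Kea code (Kea hooks are written in C++), and every name in it is hypothetical. It shows callouts being invoked for each lease change made through the lease manager, regardless of what triggered the change, with a deleted lease reported with a valid lifetime of 0.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Lease:
    address: str
    hw_address: str
    valid_lifetime: int


class LeaseManager:
    """Toy lease manager that notifies callouts about every lease change."""

    def __init__(self) -> None:
        self._leases: Dict[str, Lease] = {}
        self._callouts: List[Callable[[Lease], None]] = []

    def register_callout(self, callout: Callable[[Lease], None]) -> None:
        self._callouts.append(callout)

    def _notify(self, lease: Lease) -> None:
        for callout in self._callouts:
            callout(lease)

    def add_or_update(self, lease: Lease) -> None:
        # Called for new allocations and renewals, from packet processing
        # or from a command handler alike.
        self._leases[lease.address] = lease
        self._notify(lease)

    def delete(self, address: str) -> None:
        lease = self._leases.pop(address, None)
        if lease is not None:
            # A deleted lease is reported with a valid lifetime of 0.
            self._notify(replace(lease, valid_lifetime=0))
```

A callout registered this way could append each reported lease to the lease file used for communication with the agent.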
The following tasks are low-hanging fruit and can quickly enable some useful new functions in Stork. They don't require setting up an infrastructure to transfer the leases from Kea to Stork. They are good candidates for taking the first steps.
- Add a mechanism to send a query for a lease by IP or MAC address to multiple Kea servers.
- Add a search box and API calls to find a lease by IP or MAC in the UI.
- Add a host reservation view including the information if the reservation is in use.
The following tasks introduce the cached mode support. This is a foundation for future complex features involving lease processing. They will be implemented after the initial implementation of the direct mode.
- Create a new hooks library for Kea to dump lease information
- Update the Stork schema: new tables for storing leases
- Parse and process CSV lease files
- Add gRPC commands to fetch CSV files from monitored Kea servers and move them to the database
- Enable pullers that periodically fetch the leases from monitored Kea servers
- Track reservation usage history and display it in the UI
- Add various lease searches: by different lease fields, all declined leases etc.
A solution for identifying chatty clients based on the lease information is still to be designed, so this list lacks the tasks related to it.