This document describes a design for the Lease File Cleanup in Kea (abbreviated "LFC"). It accompanies the [wiki:LeaseFileCompressionRequirements LFC Requirements] document. Please refer to that requirements document for the problem statement and an explanation of why the LFC feature is important in Kea.
The most recent updates to this document are a result of reviews by Stephen Morris and Tomek Mrugalski, as well as a result of the discussions over the phone between Marcin and Shawn. The reviews can be found in the kea-dev mailing list archives:
LFC - Lease File Cleanup - A process of removing redundant entries from the file(s) containing information about leases.
''kea-lfc'' - A standalone application performing cleanup of the lease entries in lease files.
Source Files - Files containing redundant lease information, on which the LFC is performed. The source files are not modified by the LFC.
Current Lease File - A lease file being used by the DHCP server to record updates to leases as a result of processing DHCP messages from the client. This file is not cleaned up by the LFC process directly. The cleanup is always performed on the copy of this file, not on the original file.
One of the key [wiki:LeaseFileCompressionRequirements requirements] for the LFC is the ability to recover from an unexpected interruption. Therefore, the lease information must be preserved until the LFC completes successfully. If a subsequent LFC run discovers that the previous run was unsuccessful (e.g. as a result of a server crash or machine reboot), it restarts the cleanup process and discards the (possibly incomplete) files created during the unsuccessful run. This approach preserves the integrity of the lease information. As a consequence, lease information is spread across multiple files.
This section defines naming conventions for all lease files required by the LFC.
Note: the name of the file used by the DHCP server to record current updates to leases (''Lease File'' for short) is specified by the server administrator in the server configuration file. All other file names used by the LFC are based on this name, i.e. they are generated by appending a specific suffix to it. For example, if the Lease File is "kea-leases.csv", the name of the LFC Output File will be "kea-leases.csv.output".
Lease File (name specified in the Kea configuration)
A file being used by the server to record updates to the leases as a result of processing DHCP messages from the clients. The actual file name (including an absolute path) is specified in the Kea configuration file. This file may be created when the server starts up or is reconfigured. This file is re-created (without any leases) when the LFC begins. The LFC doesn't process this file. It is merely used by the server to write updates to leases.
Previous Lease File
Lease file holding the supplementary information about leases. It is a product of the previous LFC run, i.e. it holds the information from the LFC Output File created during the previous LFC. The lack of this file indicates that no LFC has been performed since the server started to use the Lease File.
Lease File Copy
This file is created by the DHCP server process at the beginning of the LFC, by moving (renaming) the Lease File to the same name with the ".1" suffix appended. It is used as one of the input files to the ''kea-lfc'' application performing the cleanup of lease files.
LFC Output File
This file is created by the ''kea-lfc'' application and it combines the cleaned-up lease information from the Previous Lease File and the Lease File Copy.
LFC Finish File
This file is created by the ''kea-lfc'' application by moving the LFC Output File (the file with the ".output" suffix) to a file with the ".completed" suffix. This move is performed right after the lease files are processed and it indicates that the cleanup has completed.
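The naming convention can be illustrated with a minimal C++ sketch. The `LfcFileNames` struct and the `makeLfcFileNames` function are hypothetical names introduced only for this illustration and are not part of Kea. Only the suffixes mentioned in this section (".1", ".output" and ".completed") are used; the suffix of the Previous Lease File is not stated here, so that file is left out.

```cpp
#include <string>

// Hypothetical container for the file names used during the LFC.
struct LfcFileNames {
    std::string lease_file;   // Lease File, as given in the Kea configuration
    std::string copy_file;    // Lease File Copy
    std::string output_file;  // LFC Output File
    std::string finish_file;  // LFC Finish File
};

// Derives the LFC file names by appending a suffix to the Lease File name.
LfcFileNames makeLfcFileNames(const std::string& lease_file) {
    return LfcFileNames{
        lease_file,
        lease_file + ".1",
        lease_file + ".output",
        lease_file + ".completed"
    };
}
```

For a Lease File named "kea-leases.csv" this yields "kea-leases.csv.1", "kea-leases.csv.output" and "kea-leases.csv.completed".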
LFC Process Overview
The LFC is initiated by the DHCP server process when a specific trigger (e.g. a timer) fires. The server checks for the presence of the Lease File Copy. If this file is not found, the server pauses DHCP message processing and creates the Lease File Copy by renaming the Lease File. The server then re-creates the Lease File with no lease entries (having only a header), under the same name that the server had been using before the LFC was started. The server resumes the processing of DHCP messages using this new Lease File. Note that the DHCP server doesn't rename the Lease File to the Lease File Copy when it finds that the Lease File Copy already exists, because its presence indicates that the previous LFC was not successful (perhaps it crashed) and replacing the existing file could cause data loss.
Regardless of whether the Lease File Copy existed or not, the server spawns a new process which executes the external program ''kea-lfc'' performing the actual "cleanup" of the Lease File Copy and the Previous Lease File (if it exists). Both files are processed by the application and the output is stored in the LFC Output File. When the program completes the cleanup of both files, it moves the LFC Output File to the LFC Finish File to indicate that both files have been processed. The program then removes both processed files, i.e. the Lease File Copy and the Previous Lease File, as all useful information from these files has been stored in the LFC Finish File. When the files are deleted, the LFC Finish File is moved to the Previous Lease File.
By checking for the presence or absence of specific files, the ''kea-lfc'' application and the DHCP server can determine the state in which the previously executed LFC terminated (including termination caused by a crash).
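The server-side step of the overview (creating the Lease File Copy unless one is left over from an earlier run) could look roughly like the sketch below. It is only an illustration under stated assumptions: `prepareLeaseFileCopy` is a hypothetical function name, the header is passed in as a plain string, and pausing and resuming DHCP message processing is outside the sketch.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper run by the server when the LFC trigger fires.
// Returns true if the Lease File was renamed to the Lease File Copy and a
// fresh Lease File was created; returns false if a Lease File Copy already
// existed, in which case nothing is touched to avoid losing data from an
// unfinished LFC.
bool prepareLeaseFileCopy(const fs::path& lease_file, const std::string& header) {
    fs::path copy_file = lease_file;
    copy_file += ".1";  // Lease File Copy: Lease File name plus ".1"

    if (fs::exists(copy_file)) {
        return false;
    }

    // Rename the Lease File to the Lease File Copy.
    fs::rename(lease_file, copy_file);

    // Re-create the Lease File under its original name, with only a header.
    std::ofstream fresh(lease_file);
    fresh << header << '\n';
    return true;
}
```

In both cases the caller would go on to start ''kea-lfc'', since an existing Lease File Copy still needs to be cleaned up.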
The cleanup of any file by ''kea-lfc'' is achieved by loading all lease entries from this file and then writing new lease entries into the output file. The lease entries are loaded in the same way as the Kea Memfile lease database backend loads them (perhaps using the same C++ data structures and functions). All later lease entries for a client replace the earlier lease entries (including those which are unexpired according to the valid lifetime) in the data structure held in the ''kea-lfc'' process memory. Once the leases are loaded into memory, ''kea-lfc'' iterates over all leases in the data structure and dumps the lease information for each of them into the output file.
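The "later entry replaces earlier entry" rule can be sketched with a keyed container. This is an illustration only, not the Memfile backend code: `LeaseEntry`, its `key` field (standing in for whatever identifies a lease) and `cleanupEntries` are hypothetical names, and the entries are assumed to arrive in file order.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified lease entry.
struct LeaseEntry {
    std::string key;  // identifies the lease (assumption for this sketch)
    long expire;      // expiration time, carried along unchanged
};

// Loads entries in file order so that a later entry for the same key
// replaces the earlier one, then dumps exactly one entry per key.
std::vector<LeaseEntry> cleanupEntries(const std::vector<LeaseEntry>& in_file_order) {
    std::map<std::string, LeaseEntry> latest;
    for (const LeaseEntry& entry : in_file_order) {
        latest.insert_or_assign(entry.key, entry);
    }

    std::vector<LeaseEntry> output;
    output.reserve(latest.size());
    for (const auto& item : latest) {
        output.push_back(item.second);
    }
    return output;
}
```

Because `std::map` is ordered, the output here is sorted by key rather than kept in file order; the design does not state any ordering requirement for the output file.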
The ''kea-lfc'' is a standalone application in the Kea code tree which exposes the following command line options:
[-4 | -6] indicates whether DHCPv4 or DHCPv6 lease file is being processed.
specifies a full path to the Previous Lease File
specifies a full path to the Lease File Copy
specifies a full path to the LFC Output File
specifies a full path to the LFC Finish File
specifies a full path to the configuration file used by the DHCP server
specifies a full path to the pid file used by the kea-lfc to determine if another instance cleaning up the same lease files is already running.
-v print version number
-V print extended version number
-h display help
-d debug/verbose mode
The configuration file usually contains the path to the current Lease File, and this information could be used by ''kea-lfc'' to generate names for the other files. This would eliminate the need for additional parameters. However, as a result of the design discussions it has been decided that the command line parameters will be implemented, as they give more flexibility. For example, it is easier to test the application without the need to generate a Kea configuration file. It is also easier for the administrator to run the LFC manually by just specifying the location of the files.
In the first versions of the LFC it will only be possible to specify the lease files using the command line options. In future versions, it will also be possible to use the Kea configuration file (and not specify other parameters), in which case ''kea-lfc'' will parse the lease file name from the configuration file and generate the other names internally. In that case, the values specified with the command line options will always take precedence over the values in the Kea configuration file. If a value is specified neither on the command line nor in the configuration file, the default value is used.
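The precedence rule (command line first, then configuration file, then default) is simple enough to show in a few lines. The function name `resolveValue` is hypothetical and chosen only for this sketch.

```cpp
#include <optional>
#include <string>

// Picks the effective value for one setting: a command line value wins over
// a configuration file value, and the default is used when neither is set.
std::string resolveValue(const std::optional<std::string>& from_command_line,
                         const std::optional<std::string>& from_config_file,
                         const std::string& default_value) {
    if (from_command_line) {
        return *from_command_line;
    }
    if (from_config_file) {
        return *from_config_file;
    }
    return default_value;
}
```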
The design discussions considered two approaches for ''kea-lfc'' to perform the lease file cleanup. In the first approach, the program walks through the lease entries in the input file and rewrites to the output file only those entries whose expiration time has not been reached (compared with the current system time). This approach wouldn't require loading all leases into memory before writing them to the output file. It would also allow for preserving "unparsable" information found in the processed file by rewriting this information to the output file as is. In the second approach, ''kea-lfc'' first loads all lease entries into memory and, for each client, replaces earlier entries for the lease with later ones. When all leases are loaded into memory, the program dumps them into the output file. The downside of this approach is that it consumes memory for holding all lease information during the LFC. It also makes it troublesome to preserve "unparsable" data found in the input file because, once all leases are loaded from the file into memory, the location of the unparsable data in the input file is lost. The benefit of the second approach is that it guarantees that exactly one (the most recent) lease entry is held for each client in the output file, which may significantly reduce the lease file size and makes the LFC process more effective. Therefore, this approach has been chosen.
The ''kea-lfc'' should be written in C++ and should make use of (link with) the Kea libraries to manage the lease files in the CSV format. The state diagram of the ''kea-lfc'' application is shown in the next section.
This section presents the state diagram for the ''kea-lfc'' program performing cleanup of the Previous Lease File and the Lease File Copy.
The corresponding (simplified) pseudo code for this state diagram is shown below:
The LFC is started periodically and it can be triggered by the timer expiration or any other trigger that may be implemented in the future, e.g. triggered manually by the server administrator using the Kea command control mechanism (TBI). Typically, the duration of the LFC is much shorter than the period between the startup of two subsequent LFCs. However, the duration of the LFC depends on several factors, e.g. hard drive speed, lease file size, number of expired leases in the lease file, frequency of renewals etc. In addition, the period between LFC startups is controlled by the administrator according to their needs. Therefore, it is possible that the LFC may be triggered while a previously started LFC is still in progress. It is possible to implement restrictions in the Kea configuration mechanism to prevent configuration of a "too short" period between LFC startups, but that doesn't address the case when two overlapping LFCs are started manually.
The ''kea-lfc'' implements locking using a PID file to prevent concurrent runs. When ''kea-lfc'' is started, it checks for the presence of the PID file. If the file exists, it checks if the process with a PID from the PID file is running. If the process is running, the application terminates. If the PID file doesn't exist or no process with a given PID is running, ''kea-lfc'' creates a new PID file with its own PID and continues to run.
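A minimal POSIX sketch of this check is shown below. It is an illustration only: the function names are hypothetical, an unreadable or empty PID file is treated as "no running process" (a simplification), and the check followed by the write is not atomic, so two instances started at exactly the same moment could both proceed.

```cpp
#include <cerrno>
#include <fstream>
#include <string>

#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

// Returns true if the PID file exists and names a process that is running.
bool pidFileProcessRunning(const std::string& pid_file) {
    std::ifstream in(pid_file);
    long pid = 0;
    if (!(in >> pid) || pid <= 0) {
        return false;  // no PID file, or no usable PID in it
    }
    // Signal 0 sends nothing; it only checks that the process exists.
    if (kill(static_cast<pid_t>(pid), 0) == 0) {
        return true;
    }
    // EPERM means the process exists but belongs to another user.
    return errno == EPERM;
}

// Returns false if another instance appears to be running; otherwise
// writes this process's PID to the PID file and returns true.
bool acquirePidFile(const std::string& pid_file) {
    if (pidFileProcessRunning(pid_file)) {
        return false;
    }
    std::ofstream out(pid_file, std::ios::trunc);
    out << getpid() << '\n';
    out.flush();
    return out.good();
}
```

Note that a PID can be reused by an unrelated process after the original ''kea-lfc'' has exited, in which case this check reports a running instance even though no LFC is in progress.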
The server instance which runs ''kea-lfc'' installs a SIGCHLD signal handler to monitor when ''kea-lfc'' terminates. The server doesn't spawn a new ''kea-lfc'' process as long as the previously started process is running. If the server is restarted while the ''kea-lfc'' process is running, the server loses the information about the state of its child processes. When the server starts up it is possible that the LFC process is (still) running, and thus it is possible that the LFC process is currently using files from which the server needs to load leases. The server must use the appropriate PID file to check the LFC process ID and make sure that the ''kea-lfc'' process has terminated before leases are loaded. The server has several options in this case:
kill the running kea-lfc process - which is dangerous in case some other process "hijacked" the PID used previously by the kea-lfc,
wait for the process to finish (no matter how long) and then load leases - which may cause an infinite hang of the server in case the process doesn't finish,
wait for the process to finish, with a timeout, then delete the PID file and load leases - which may potentially cause some conflicts, although this is unlikely,
log an error and let the administrator sort it out - the easiest approach, and one that avoids conflicts.
The last two approaches seem to be the most feasible. Ultimately, we should support the third approach, as it avoids manual intervention by the administrator. However, as a first step, the fourth approach is good enough.
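The third option (wait with a timeout) could be sketched as a polling loop. The function name `waitForLfcExit` and the `still_running` callback are hypothetical; in practice the callback would perform the PID file check, and what happens after a timeout (deleting the PID file and loading leases) is left to the caller.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Polls until the LFC process is gone or the timeout elapses.
// Returns true if the process finished in time, false on timeout.
bool waitForLfcExit(const std::function<bool()>& still_running,
                    std::chrono::milliseconds timeout,
                    std::chrono::milliseconds poll_interval =
                        std::chrono::milliseconds(100)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (still_running()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // gave up waiting
        }
        std::this_thread::sleep_for(poll_interval);
    }
    return true;
}
```

The fourth option corresponds to checking once and logging an error instead of waiting.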
The server startup and reconfiguration are the two times when there is a potential conflict between the DHCP server process loading leases from files and ''kea-lfc'' writing to these files or otherwise modifying them. In those cases, the server process is responsible for avoiding the conflicts, as described above. In all other cases, the server leaves it up to ''kea-lfc'' to check for other running instances and exit if required. This ensures that the concurrency checks are performed in the background and have no impact on the processing of DHCP messages by the server.
The ''kea-lfc'' must be able to log messages such as: start time, end time, number of all lease entries, number of entries removed, details of invalid entries found and other errors. The ''kea-lfc'' uses the Kea logging mechanism. The logger configuration for the application is specified in the Kea configuration file, and the path to this file is supplied by the server process on the command line when ''kea-lfc'' is started. Parsing the logger configuration and setting up the logger is an involved task, so the first implementation of ''kea-lfc'' will simply use the default logging settings: logging to syslog at the INFO logging level.
Required Updates to Kea
This section lists the updates to Kea required to support the LFC according to the design described in this document. The estimated time needed to implement each task is given in parentheses. The given times include the code review.
#3664: Create kea-lfc application with support for command line options, creating a PID file and checking whether another instance is running. (2d)
#3665: kea-lfc: Implement cleanup of the Previous Lease File and Lease File Copy for DHCPv4 leases. (2d)
#3666: kea-lfc: Implement cleanup of the Previous Lease File and Lease File Copy for DHCPv6 leases. (2d)
#3667: kea-lfc: Add logging support (syslog only, no configuration parsing). (1d)
#3668: Implement generic timers support which execute a custom handler function on timer expiration. (1d)
#3669: DHCPv4 server: install timer which triggers kea-lfc every X seconds and monitors the status of the child process. (1d)
#3670: DHCPv6 server: install timer which triggers kea-lfc every X seconds and monitors the status of the child process. (1d)
#3671: Memfile backend: Load leases from multiple (previous and current) files in a given order. (2d)
#3672: Update kea-guide: include examples of the LFC timers configuration. (1d)
#3674: Lease database backend(s): implement "released" flag for the lease in the lease database (2d)
Additional tickets were submitted after the original design was created:
#3687: Create PIDFile class - this work was split out from #3664.