Centralized Configuration Database with Netconf
WARNING: THIS IS AN ARCHIVAL EARLY PROPOSAL. It was written when Kea 1.5 was in its early planning stages. Kea 1.5 implemented YANG/NETCONF support using the kea-netconf daemon. The config backend interface was postponed until 1.6 and will not be coupled with netconf. Those will be two independent features.
This proposal describes two desired Kea capabilities: the ability to store configuration in a centralized database and the ability to manage Kea using netconf. Each of those features can be used on its own. However, there are deployment scenarios where using both of them together gives a unique advantage.
Note: '''This is a work in progress'''.
1. Centralized storage: Intro
Any modern DHCP server, such as Kea, deals with several types of data: leases, host reservations (or other per-device information), subnets, options and client classes. There is also an assorted list of additional parameters that change infrequently and are more likely to be considered server deployment configuration (logging, interface names, IP addresses to listen on, etc.).
These parameters could be split into the following categories:
- runtime state: leases, allocation engine state, HA state
- network configuration: host reservations, subnets, options, client classes
- server configuration: hook libs, database access, interface names, ip addresses to listen on, logging
One of the major long-term goals of the Kea project is to provide a generic stateless DHCP service. Stateless means that the server will not maintain any state locally, but rather will retrieve all configuration parameters from a remote database and will keep the state (e.g. leases) in a remote database as well.
The server's local configuration file will be almost trivial and will consist of little more than database access credentials. With this approach it will be trivial to spawn additional servers if needed for whatever reason (e.g. performance). Each new instance will simply connect to the database, retrieve its configuration and start serving DHCP traffic.
At the same time, with the [https://tools.ietf.org/html/rfc6241 Netconf] interface becoming an increasingly popular solution, the Kea project is interested in providing multiple management interfaces: both REST and Netconf.
This is a complex goal and its full implementation will take a long time. However, major parts of the desired functionality can be achieved much faster, if certain compromises are acceptable.
1.1 Design principles
While we already have interfaces for leases and host reservations, the usage patterns will be different for other configuration elements. Leases in particular, and to a large degree host reservations, are types of data that change frequently. There are scenarios where it makes sense to use the data directly, i.e. for Kea to query the DB every time an address is chosen as a candidate: to check whether it's available, whether there is any reservation for that particular address, or whether there's a reservation for the client currently being serviced. An admin or automated provisioning system may change this information, so it is reasonable for Kea not to cache it at all, but rather to query the DB every time to ensure data freshness.
On the other hand, other configuration elements, like subnets or client classes, are unlikely to change frequently. Depending on the deployment model, new subnets may be added, but it is a relatively uncommon event for an existing subnet or client class to disappear or change its parameters. At the same time, these types of data are needed for each incoming packet, sometimes more than once (a good example is subnet selection: it is next to impossible to select the right subnet with the first query). Querying multiple times for subnets, then for client classes, for each incoming packet would dramatically reduce performance. As such, a caching mechanism is needed. Such a caching mechanism MUST NOT apply to leases, MAY apply to host reservations (depending on the deployment, it may be helpful or harmful, so it must be possible to enable or disable it) and SHOULD be available for all other configuration elements.
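The MUST NOT / MAY / SHOULD rules above could be captured in a small policy table. A minimal Python sketch (all names here are illustrative, not actual Kea identifiers):

```python
from enum import Enum

class CachePolicy(Enum):
    NEVER = "never"        # always query the DB; data freshness is critical
    OPTIONAL = "optional"  # deployment-dependent; admin may enable or disable
    DEFAULT_ON = "on"      # cached by default

# Illustrative mapping of the rules described above
CACHE_RULES = {
    "leases": CachePolicy.NEVER,               # MUST NOT be cached
    "host-reservations": CachePolicy.OPTIONAL, # MAY be cached
    "subnets": CachePolicy.DEFAULT_ON,         # SHOULD be cached
    "client-classes": CachePolicy.DEFAULT_ON,  # SHOULD be cached
}

def may_cache(element: str, admin_enabled: bool = False) -> bool:
    """Return True if the given configuration element may be served from cache."""
    policy = CACHE_RULES.get(element, CachePolicy.DEFAULT_ON)
    if policy is CachePolicy.NEVER:
        return False
    if policy is CachePolicy.OPTIONAL:
        return admin_enabled
    return True
```

The key design point is that the lease path never consults the cache, while the host reservation path consults it only when the administrator has opted in.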
2. Configuration storage
The new mechanism will be complementary to the existing one of keeping the configuration in a local config file. Users who are not interested in DB storage will be able to continue using local config files. This is similar to what can be done with host reservations today: while most users store them in a DB, smaller deployments keep them in the config file.
The following milestones are envisaged:
Generic DB interface for configuration storage. In Kea 1.4 we already have code that is able to store leases, host reservations and options related to host reservations. Unfortunately, each DB interface was implemented independently and thus has a different API; its usage is not uniform and there is overhead. The new interface should provide the ability to reuse an existing DB connection (or establish a new one) and provide a CRUD interface (Create, Retrieve, Update, Delete). In the future, this approach will be extended to CRUDE, where E stands for Event: the mechanism will inform Kea internals that a given configuration element has changed. Event support is outside the scope of 1.5 planning.
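As a sketch of what such a uniform CRUD interface might look like (the class and method names below are hypothetical, not Kea's actual API; an in-memory backend stands in for MySQL/PostgreSQL/Cassandra):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional

class ConfigBackend(ABC):
    """Hypothetical uniform CRUD interface shared by all configuration backends."""

    @abstractmethod
    def create(self, element_type: str, data: Dict[str, Any]) -> int: ...
    @abstractmethod
    def retrieve(self, element_type: str, element_id: int) -> Optional[Dict[str, Any]]: ...
    @abstractmethod
    def update(self, element_type: str, element_id: int, data: Dict[str, Any]) -> bool: ...
    @abstractmethod
    def delete(self, element_type: str, element_id: int) -> bool: ...

class InMemoryBackend(ConfigBackend):
    """Trivial in-memory implementation, useful for illustrating the contract."""
    def __init__(self):
        self._store: Dict[str, Dict[int, Dict[str, Any]]] = {}
        self._next_id = 1

    def create(self, element_type, data):
        new_id = self._next_id
        self._next_id += 1
        self._store.setdefault(element_type, {})[new_id] = dict(data)
        return new_id

    def retrieve(self, element_type, element_id):
        return self._store.get(element_type, {}).get(element_id)

    def update(self, element_type, element_id, data):
        table = self._store.get(element_type, {})
        if element_id not in table:
            return False
        table[element_id].update(data)
        return True

    def delete(self, element_type, element_id):
        return self._store.get(element_type, {}).pop(element_id, None) is not None
```

The point of the abstraction is that subnets, classes and options all go through the same four verbs, regardless of which backend is configured.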
This interface will be able to operate in two modes: local and DB. Local mode means that locally provided information, e.g. received via Netconf or read from a local config file, will be stored in the DB, but information will not otherwise be retrieved from the DB. This mode will be useful for initial DB population by the first Kea instance. It may also be used by a dedicated node that retrieves information via the available configuration channels (REST, Netconf or local config file) and copies it to the DB. In other words, the local information will be treated as the ultimate source of truth.
The second mode is DB. It is expected that most Kea instances will operate in this mode. Kea will connect to the DB, retrieve whatever configuration is available and treat it as the ultimate source of truth. If it has any older or redundant copies, those old local copies may be overwritten with whatever is available in the DB.
Specific DB interface for subnets. This will provide the ability to store subnets with all configuration elements currently allowed in the configuration file: the subnet itself, its pools, options, relay information and others. However, it will not cover host reservations, as that functionality is already implemented.
Specific DB interface for classes. This is probably out of scope for 1.5 and looks like a 1.6 feature.
2.1 DB Layout
To be determined. At least Cassandra will be covered, but most likely MySQL and PostgreSQL as well. There will be separate (new) tables for subnet and pool information. The details will vary from one backend to another, but in general the existing dhcp4_options and dhcp6_options tables will be reused and extended to provide the ability to specify per-subnet and per-pool options.
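As an illustration only, one possible shape of such a layout, sketched with SQLite for brevity (all table and column names beyond dhcp4_options are assumptions; the real schema will differ per backend):

```python
import sqlite3

# Hypothetical schema sketch: new subnet and pool tables, plus scope columns
# added to dhcp4_options so an option can attach to a subnet or to a pool.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dhcp4_subnet (
    subnet_id INTEGER PRIMARY KEY,
    prefix TEXT NOT NULL,            -- e.g. '192.0.2.0/24'
    renew_timer INTEGER,
    rebind_timer INTEGER,
    valid_lifetime INTEGER
);
CREATE TABLE dhcp4_pool (
    pool_id INTEGER PRIMARY KEY,
    subnet_id INTEGER REFERENCES dhcp4_subnet(subnet_id),
    start_address TEXT NOT NULL,
    end_address TEXT NOT NULL
);
CREATE TABLE dhcp4_options (
    option_id INTEGER PRIMARY KEY,
    code INTEGER NOT NULL,
    value TEXT,
    subnet_id INTEGER,               -- NULL unless a per-subnet option
    pool_id INTEGER                  -- NULL unless a per-pool option
);
""")
conn.execute("INSERT INTO dhcp4_subnet VALUES (1, '192.0.2.0/24', 900, 1800, 3600)")
conn.execute("INSERT INTO dhcp4_pool VALUES (1, 1, '192.0.2.10', '192.0.2.200')")
conn.execute("INSERT INTO dhcp4_options (code, value, subnet_id) VALUES (3, '192.0.2.1', 1)")

# Retrieving a per-subnet option together with the subnet it is scoped to
row = conn.execute("""
    SELECT s.prefix, o.code, o.value FROM dhcp4_options o
    JOIN dhcp4_subnet s ON s.subnet_id = o.subnet_id
""").fetchone()
```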
3. kea-netconf: Netconf configuration interface
There are many ways the Netconf interface could be implemented. It is clear that some, but not all, users will want to use Netconf. As such, the best approach is to make it an optional daemon. Interested users will run that daemon and benefit from the Netconf interface being exposed. A non-working prototype of such a daemon was developed during the [https://www.isc.org/blogs/kea-and-netconf-towards-automated-configurability/ IETF101 hackathon] in London. This new daemon is called kea-netconf.
As of 1.3, Kea supports storing two configurations: staging (used when reconfiguring) and running (the actual configuration that is committed and used while running). This concept will eventually be mapped to the startup and running configurations in Netconf. Kea-netconf will be able to connect to a Sysrepo repository that itself exposes a Netconf interface. During startup, Kea will load the startup configuration from Netconf and apply it to the staging configuration in !CfgMgr. Once the whole configuration is applied, the staging configuration will be validated and, if validation succeeds, committed as the running configuration. Support for this ability is outside the scope of 1.5.
3.1 Scenarios covered by Netconf
The possible scenarios covered by Netconf are vast and very complex. It is not realistic to expect that everything Kea is currently capable of will be exposed via Netconf interface. As such, it is useful to define certain scenarios that are desired to be covered. Once the basic capabilities are available, the list is expected to grow to cover more complex scenarios.
- capability to manage (add/update/delete) IPv4 subnets and pools within those subnets. This must cover the ability to control timers: renewal-timer, rebind-timer and lease-lifetime.
- capability to manage (add/update/delete) host reservations within subnets. This does cover reserving an IP address. It does not cover per-client options or client classes.
- capability to manage (add/update/delete) client classes, with options and test expressions.
This does not cover all configuration elements kea-dhcp4 is capable of. Other aspects will be managed in the traditional way, i.e. loaded from a configuration file. In general, the recommended scenario is to load a configuration file with all other parameters specified (like interface names, logging, hook libraries, etc.) and provide subnets, pools, host reservations and classes via the netconf interface.
For kea-dhcp6, the corresponding scenarios are:
- capability to manage (add/update/delete) IPv6 subnets and pools within those subnets.
- capability to manage (add/update/delete) host reservations within subnets. This does cover per-client options.
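To illustrate the kind of JSON control command kea-netconf could emit for the host reservation scenarios above: the "reservation-add" name follows the convention of Kea's host commands hook, but the exact argument layout shown here is an assumption, not a confirmed wire format.

```python
import json

def reservation_add_command(subnet_id, hw_address, ip_address, options=None):
    """Build a hypothetical reservation-add control command as a JSON string."""
    reservation = {
        "subnet-id": subnet_id,
        "hw-address": hw_address,
        "ip-address": ip_address,
    }
    if options:
        # per-client options, e.g. [{"name": "domain-name-servers", "data": "192.0.2.53"}]
        reservation["option-data"] = options
    return json.dumps({"command": "reservation-add",
                       "arguments": {"reservation": reservation}})

cmd = reservation_add_command(1, "1a:1b:1c:1d:1e:1f", "192.0.2.55")
```

A command like this would travel over the UNIX control channel to kea-dhcp4, which answers with a JSON result object.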
3.2 The translator concept
Translating the whole YANG model at once is not realistic due to its complexity. Also, the whole configuration changes very infrequently, so it would be very inefficient to re-apply all parameters on every change. Therefore kea-netconf needs to offer finer granularity. To address this problem, the concept of translators has been introduced.
Once startup is complete, kea-netconf will install callbacks on specific parts of the YANG model to receive notifications if any configuration element changes. A class providing such a callback is called a Translator (as it translates parts of the YANG model into JSON commands understandable by Kea). kea-netconf will provide the ability to load an arbitrary number of such translators and will send the resulting commands over the UNIX control channel to kea-dhcp4 and kea-dhcp6. Translators will mostly be defined in hook libraries loaded by kea-netconf. This approach will provide maximum flexibility.
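A minimal Python sketch of the translator idea (the class names, the YANG path and the change representation are all hypothetical; the real kea-netconf translators will live in C++ hook libraries):

```python
from abc import ABC, abstractmethod

class Translator(ABC):
    """Watches one subtree of the YANG model and turns changes there into
    JSON commands understood by kea-dhcp4/kea-dhcp6. Illustrative only."""

    #: XPath-like prefix of the YANG subtree this translator is subscribed to
    subtree: str = ""

    @abstractmethod
    def on_change(self, change: dict) -> dict:
        """Translate a single datastore change into a Kea control command."""

class Subnet4Translator(Translator):
    subtree = "/kea-dhcp4:config/subnet4"

    def on_change(self, change):
        if change["operation"] == "created":
            return {"command": "subnet4-add",
                    "arguments": {"subnet4": [change["value"]]}}
        if change["operation"] == "deleted":
            return {"command": "subnet4-del",
                    "arguments": {"id": change["value"]["id"]}}
        return {"command": "subnet4-update",
                "arguments": {"subnet4": [change["value"]]}}

def dispatch(translators, xpath, change):
    """Find the translator registered for the changed subtree and invoke it."""
    for t in translators:
        if xpath.startswith(t.subtree):
            return t.on_change(change)
    return None
```

Because each translator owns one subtree, a change to a single subnet triggers one small command rather than a full reconfiguration.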
3.3 Sysrepo - a netconf backend
Kea will use the [http://www.sysrepo.org/ Sysrepo] software suite as a Netconf provider. Earlier attempts (see the [https://www.isc.org/blogs/kea-and-netconf-towards-automated-configurability/ 2018 hackathon in London] and the [https://www.isc.org/blogs/ietf-hackathon-in-berlin-kea-and-yangnetconf/ 2016 hackathon in Berlin]) indicate the Sysrepo project is able to provide a Netconf protocol interface, the capability to store YANG models and configuration that adheres to those models, as well as a programmable interface to Netconf.
3.4 Choosing Sysrepo
The Kea team has been looking at YANG and NETCONF for several years. Our first experience was in 2016 during the IETF hackathon in Berlin. At that time we were looking to do a fun experiment over a weekend; that's usually the goal of a hackathon - to try out something completely new and see what comes out of it. Back then the Sysrepo code was very unstable and was crashing left and right, but we met the people behind it and they were very responsive - fixing issues, explaining caveats and pointing us in generally the right direction.
Some years have passed, Sysrepo project grew and we got an official request from two customers to provide YANG/NETCONF for Kea. One of the customers was also co-funding the Sysrepo project and offered assistance from the Sysrepo engineering team if necessary.
Kea engineers are not experts in the YANG/NETCONF technology, so we tried to assess whether Sysrepo is a reasonable choice. It was appealing for several reasons:
- we knew the code and had a more or less working prototype
- there were many projects around Sysrepo - netopeer, libyang
- people involved in libyang and Sysrepo are active in the netconf and netmod working groups at the IETF. While ISC engineers don't closely follow those groups, involvement in the standardization process is a good sign regarding the code being kept up to date and the long-term stability of the project.
- our friends at FRR were adopting Sysrepo and they were talking about packaging Sysrepo dependencies
- we had good relationship with people behind Sysrepo project and could ask for help if there were some serious issues
- Sysrepo implements some pretty interesting concepts. For example: "One of the key features of sysrepo is no-single-point-of-failure design. Under normal conditions, Sysrepo client library communicates with sysrepo daemon, which handles all operations in the datastore. But for cases when sysrepo daemon is not running (e.g. it has not started / initialized yet, or it has crashed for whatever reason), the client library can do most of the data-access functionality by itself. With this approach we can guarantee, that any application will always be able to access its configuration."
There were also some drawbacks that we couldn't know about before we implemented the kea-netconf code:
- the dependencies for Sysrepo are notoriously difficult to install on older systems. Most of our developers used Ubuntu at that time, where the problem was not as apparent. The situation is expected to improve over time.
- Sysrepo does not seem to support RESTCONF yet. We are unsure whether RESTCONF will eventually replace NETCONF, whether the two will remain popular as similar management protocols, or whether RESTCONF will perhaps be forgotten. For the time being, the inquiries we have received indicate people are more interested in NETCONF, but our sample set is pretty small.
- Sysrepo changes its API frequently. The build system, defines and sometimes parameters used in the API are changed in backward- and forward-incompatible ways. From our limited experience we observe some code stabilization, but new compilation issues still appear when a new Sysrepo version emerges.
4. Deployment scenarios
In the context of implementing the config DB storage and netconf, at least five deployment scenarios can be considered.
4.1: Single instance with DB storage, managed via REST
This is the most basic scenario. There is one Kea instance that is managed using REST API. This is very similar to typical deployment of Kea 1.3 or 1.4, except the configuration is stored in a database. The benefits of this approach are:
- easier manageability in case of large configurations (e.g. 1000s of subnets)
- the configuration is stored in DB
- configuration can easily be backed up using DB tools
Although this deployment scenario has some merits of its own, it is typically considered a first stepping stone towards more robust deployments that feature stateless DHCP.
4.2: Single instance managed via Netconf
This scenario features a single Kea instance with Netconf capability. The Kea server (be it DHCPv4 or DHCPv6) exposes a Netconf interface. The configuration is generally managed in Netconf format; kea-netconf picks up any changes in the YANG configuration and applies them to kea-dhcp. The properties of this scenario are:
- All benefits of using Netconf become available (automated management, uniform configuration handling for all devices in a network, configuration validation against YANG model etc).
- the configuration is stored in YANG/netconf
- works well for a single instance, but quickly becomes hard to manage when deploying more servers
4.3: Single instance managed via Netconf with DB storage
This scenario is a hybrid between 4.1 and 4.2: it provides both Netconf management and DB config storage. This scenario will typically be considered a stepping stone towards more advanced deployments.
4.4: Multiple servers sharing DB configuration, REST managed
In this configuration there are several Kea instances that share the same configuration. The properties of this configuration are:
- major scalability and resiliency benefit: several instances can share the same configuration. It provides resiliency against Kea server failure.
- to maintain a consistent configuration, there has to be a dedicated node that conducts the configuration updates. Initially this will be a dedicated Kea instance that does not serve any traffic and whose only purpose is to manage the configuration. It will eventually be replaced with kea-config.
4.5: Multiple servers sharing DB configuration, Netconf managed
This scenario is similar to 4.4, but uses the Netconf interface. The properties of this deployment are:
- one unified Netconf interface that deploys configuration to all Kea servers.
- consistent configuration that is used by all servers.
5. Deploying configuration (PUSH vs PULL model)
In principle, the configuration changes can be managed in two ways. In the first, once the sysadmin decides to change the configuration, the changes are pushed to each server that is affected by the change. This approach is called the PUSH model. Its major benefit is that it is very efficient: the server does any configuration-related tasks only when there is an actual change to be applied. The disadvantage is that the changes need to be pushed to each server separately, and with multiple servers running there is a window of opportunity where some instances run the old configuration while others run the new one.
The opposite approach is to deploy the configuration in centralized DB storage and let the Kea instances detect the change and retrieve it on their own. This approach is called the PULL model. The obvious disadvantage of this method is that Kea instances need to spend additional cycles to determine whether the configuration has changed. Since a configuration change is a relatively infrequent event (especially when compared to a packet processing event), there will be some overhead for doing checks that in most cases come back negative. The major advantage is that Kea instances can pick up the changes on their own, thus simplifying configuration change procedures. Also, multiple servers can pick up the changes in parallel, effectively decreasing the amount of time needed to fully migrate to a new configuration.
5.1 PULL mechanism
It is too early to determine details of the configuration change mechanism, but the following idea has been proposed for consideration:
Each subnet is extended with an extra integer field called version. Any change to a subnet results in its version being increased by 1. The configuration is stored in the DB, and each Kea instance keeps a copy of the configuration in memory. When a packet comes in, Kea processes it until the subnet selection is done. Kea then retrieves the subnet details from the DB and compares the retrieved version with the copy available in memory. If the versions match, Kea has the latest configuration and proceeds normally. If there is a mismatch and the DB version is greater, Kea updates its subnet configuration and restarts packet processing.
This approach has several properties:
- the configuration changes are picked up automatically for more common cases
- there is one extra SQL query to be made. This is an overhead, but not a major one.
- this approach does not handle well situations where some other subnet has changed in a way that would now make it eligible during the subnet selection process.
- answering the question "what configuration are you running?" is not trivial. The answer could be a series of integers, one for each subnet.
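The version check described above can be sketched as follows (a Python illustration with the DB access stubbed out; all names are hypothetical, and the lambda stands in for the extra SQL query):

```python
class SubnetCache:
    """Holds the in-memory copies of subnet configurations, keyed by subnet id."""

    def __init__(self):
        self.subnets = {}   # subnet_id -> {"version": int, "config": dict}

    def check_and_refresh(self, subnet_id, fetch_subnet):
        """Compare the cached version against the DB copy. Returns True when
        the DB held a newer revision, meaning the caller should restart
        packet processing with the refreshed subnet configuration."""
        db_subnet = fetch_subnet(subnet_id)          # the one extra SQL query
        cached = self.subnets.get(subnet_id)
        if cached is not None and cached["version"] >= db_subnet["version"]:
            return False                             # cache is current, proceed
        self.subnets[subnet_id] = db_subnet          # pick up the change
        return True                                  # restart packet processing

# Minimal usage with a stubbed DB: the DB holds version 2, the cache version 1
db = {1: {"version": 2, "config": {"subnet": "192.0.2.0/24"}}}
cache = SubnetCache()
cache.subnets[1] = {"version": 1, "config": {"subnet": "192.0.2.0/24"}}
restarted = cache.check_and_refresh(1, lambda sid: db[sid])
```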
Since this mechanism has a performance impact, it would be disabled by default. It could be enabled globally with a configuration parameter.
6. kea-config tool
While in the initial stages the interface between REST, Netconf and database storage could be fulfilled with a dedicated Kea instance, in the long term it will be highly desired to have a separate tool for this. The tool will essentially be a control plane for both kea-dhcp4 and kea-dhcp6 with the data plane (DHCP packet processing routines) cut off.
- an extra tool means more validation effort is needed (slower test cycle, though unlikely to produce radically different results compared to kea-dhcpX);
- in the long term, the tool could evolve towards managing the configuration
- this tool could be used to upload a configuration to the centralized DB and retrieve it as a text file (e.g. to inspect a customer's configuration, to debug issues or to back up a specific configuration version)
- there is an overlap between this tool and kea-netconf. It is at least plausible that those tools will be merged some time in the future. However, due to time constraints it is easier to initially develop kea-netconf as a stand-alone tool.