|
|
# Design for coherent statistics when running multiple instances
|
|
|
|
|
|
|
|
|
**Contents:**
|
|
|
|
|
|
* Introduction
|
|
|
* 1 Approaches to Achieving Accurate Data
|
|
|
* 2 Considerations for each Approach
|
|
|
* 3 Approach Selection
|
|
|
* 4 Design Proposal
|
|
|
* Appendix A SQL for Tables and Triggers
|
|
|
* Appendix B Multi-threaded Improvement
|
|
|
* Appendix C Design Variant
|
|
|
|
|
|
# Introduction
|
|
|
|
|
|
The primary issue is that assigned and declined lease statistics are not
|
|
|
accurate when multiple Kea servers are using "shared lease storage" without
|
|
|
Kea's HA protocol, as the servers are unaware of each other's contributions.
|
|
|
Herein, where the term HA appears it refers to Kea's HA protocol.
|
|
|
|
|
|
Other lease-related statistics, such as reclaimed-declined and reclaimed, are
accurate on a per-server basis as they represent a running count of those
events since the last time that server reset its statistics.
|
|
|
|
|
|
Users are likely more interested in understanding current lease utilization;
therefore, this design focuses on the total, assigned, and declined leases per
|
|
|
subnet_id. There are broader issues and greater potential capabilities
|
|
|
regarding statistics, but this design is scoped to the immediate problem
|
|
|
hindering the usability of shared-lease-storage deployment.
|
|
|
|
|
|
Regardless of how data accuracy is achieved, virtually any solution calls for
|
|
|
a new hook library which either implements new commands specifically for
|
|
|
fetching the data or overloads the existing "statistic-get" command. This
|
|
|
will be discussed later on.
|
|
|
|
|
|
The first task, then, is to determine how to achieve accurate numbers. In
|
|
|
the following sections a number of approaches will be described and then
|
|
|
considerations for each approach examined.
|
|
|
|
|
|
## 1. Approaches to Achieving Accurate Data
|
|
|
|
|
|
### 1.1 Aggregate individual Kea server contributions
|
|
|
|
|
|
Within the Kea server code, split each lease statistic into a beginning
|
|
|
reference value and a delta-since-reference value. As long as all servers
|
|
|
have the same reference value then a sum of their deltas should yield the
|
|
|
current total for each statistic. This would require a central gathering
|
|
|
point, or aggregator, to query each server and aggregate the data. The
|
|
|
most natural place for such an activity would be a hook loaded by the
|
|
|
Control Agent, as it is already aware of all of the Kea servers.
|
|
|
(Users could actually opt to do the aggregation themselves.)
|
|
|
|
|
|
### 1.2 Explicit statistic propagation to common statistics storage
|
|
|
|
|
|
Modify Kea to support explicit propagation of specific statistics to a central
|
|
|
point, such as within the lease database. The updates would be cached within
|
|
|
each server and then pushed to central storage at a configurable interval.
|
|
|
This would be best accomplished in a separate thread that used its own
|
|
|
connection to storage, so as not to impede DHCP service. Statistics would be
|
|
|
considered accurate up to the resolution of the update interval.
|
|
|
|
|
|
### 1.3 Implicit statistic propagation to common statistics storage
|
|
|
|
|
|
Attach triggers to the lease tables in the lease back ends to update a
|
|
|
new statistics table(s) upon insert, update, or delete. This would ensure
|
|
|
complete agreement between the lease storage and the statistics, and as
|
|
|
with approach 1.2, it would yield very fast statistic query times.
|
|
|
|
|
|
### 1.4 Recalculate statistics on demand
|
|
|
|
|
|
Recalculate the statistics from the lease storage on demand. This would use a
|
|
|
query similar, if not identical, to that used to recount statistics now, with
|
|
|
the additional variants that support per subnet-id and/or subnet-id range where
|
|
|
clauses.
|
|
|
|
|
|
Next are some of the considerations of each of these approaches.
|
|
|
|
|
|
## 2 Considerations for each Approach
|
|
|
|
|
|
### 2.1 Aggregate individual Kea server contributions
|
|
|
|
|
|
This solution seems to involve the most moving pieces. It would be fairly
|
|
|
intrusive to the main repo code in terms of splitting statistics into
|
|
|
reference points and deltas. It would also require changes to individual
|
|
|
retrieval logic, necessary whether or not the server is participating in shared
database access, because for any given stat, retrieval is now "reference
|
|
|
+ delta". Not necessarily a bad thing, simply an observation.
|
|
|
|
|
|
The primary challenge with this approach is reference drift. Unless the
|
|
|
servers are routinely synced through reconfiguration, restart, or via a
|
|
|
new command to "recount", summing the deltas for a given statistic will
|
|
|
no longer be accurate. There is also the issue of what to do when one or
more of the servers are offline.
|
|
|
|
|
|
Additionally, because an aggregator must query each server, every server is
burdened on each request, and each individual response will be roughly the
same size as the aggregated result. In response to a "get all", the amount
of information sent to the aggregator on systems with large numbers of
subnets and servers would be large. The amount of time it would take each
Kea server to generate and ship its contribution might be significant.
|
|
|
|
|
|
Lastly, the aggregator must be aware of all of the participating servers,
|
|
|
currently something only a Control Agent or client admin would know.
|
|
|
|
|
|
### 2.2 Explicit statistic propagation to common statistics storage
|
|
|
|
|
|
The first hurdle for this approach is to devise an efficient means of
|
|
|
caching the updates and then periodically updating the common storage
|
|
|
without unduly burdening server throughput. Currently, statistic
|
|
|
updates are mostly std::map updates, and as such present little burden.
|
|
|
|
|
|
Next there would need to be a way to designate which statistics get cached,
|
|
|
and then the update statistics code modified to cache updates for designated
|
|
|
statistics. This would be pretty intrusive to StatsMgr code, and given the
frequency of statistics updates, introducing a hook point into the mix would
likely have serious negative effects on throughput.
|
|
|
|
|
|
Lastly, at some configurable interval the cached updates must be propagated
|
|
|
to the common storage. This would need to be done in a worker thread which
|
|
|
uses its own connection to the storage so as not to interfere with the server's
|
|
|
primary duties. For instance, if the common storage is the lease database
|
|
|
(a logical choice), this could not be done over the same connection used by
the LeaseMgr.
|
|
|
|
|
|
If the caching and updating can be accomplished efficiently, impact on
|
|
|
throughput should be minimal and queries against the central storage should
|
|
|
be quick, as they become simple selects without calculations.
|
|
|
|
|
|
### 2.3 Implicit statistic propagation to common statistics storage
|
|
|
|
|
|
The primary drawback to this approach is the potential degradation in database
|
|
|
throughput. To gauge this potential, some testing was conducted using the
command-line clients for both MySQL and Postgresql to insert rows from a
|
|
|
text file into the lease4 table, with and without the trigger installed.
|
|
|
The statistics tables and triggers SQL is shown in Appendix A.
|
|
|
|
|
|
The file format used was a CSV file where each row contains four columns:
|
|
|
lease address, hardware address, subnet_id, and lease state. For the initial
|
|
|
test, a file containing 5M records was used, spread over 1000 subnet_ids and
all three lease states (default, declined, expired-reclaimed), or roughly 1667 records
|
|
|
per state per subnet_id. The load times are shown below:
|
|
|
|
|
|
MySQL:
|
|
|
a. Without trigger: 0m 40.30s, or 125K inserts per second
|
|
|
b. With trigger: 3m 01.58s, or 27.5K inserts per second
|
|
|
|
|
|
Postgresql:
|
|
|
a. Without trigger: 1m 02.55s, or 79.9K inserts per second
|
|
|
b. With trigger: 3m 48.50s, or 21.8K inserts per second
|
|
|
|
|
|
As can be seen, the impact on raw insert throughput is significant. However, the
|
|
|
impact on a running Kea server may not even be visible, as our lease-per-second
|
|
|
rates are currently well below the rate for raw inserts with triggers cited
|
|
|
above. To gauge the impact, the following tests were conducted running
|
|
|
kea-dhcp4 and kea-dhcp6 on an Ubuntu 16.04 host, and perfdhcp on a Centos 7 VM
|
|
|
on the same host. For each test, the rate recorded was the highest rate that
|
|
|
did not result in dropped packets.
|
|
|
|
|
|
V4 Performance with and without insert trigger:
|
|
|
```
|
|
|
|
|
|
Memfile with persistence set to false (as a baseline):
|
|
|
Running: perfdhcp -4 -r 10000 -R 9999999 -p 20 178.0.0.1
|
|
|
Rate: 6689.37 4-way exchanges/second, expected rate: 10000
|
|
|
|
|
|
MySQL without trigger installed:
|
|
|
Running: perfdhcp -4 -r 200 -R 9999999 -p 20 178.0.0.1
|
|
|
Rate: 189.079 4-way exchanges/second, expected rate: 200
|
|
|
|
|
|
MySQL with trigger installed:
|
|
|
Running: perfdhcp -4 -r 200 -R 9999999 -p 20 178.0.0.1
|
|
|
Rate: 188.752 4-way exchanges/second, expected rate: 200
|
|
|
|
|
|
Postgresql without trigger installed:
|
|
|
Running: perfdhcp -4 -r 600 -R 9999999 -p 20 178.0.0.1
|
|
|
Rate: 551.868 4-way exchanges/second, expected rate: 600
|
|
|
|
|
|
Postgresql with trigger installed:
|
|
|
Running: perfdhcp -4 -r 600 -R 9999999 -p 20 178.0.0.1
|
|
|
Rate: 556.182 4-way exchanges/second, expected rate: 600
|
|
|
|
|
|
```
|
|
|
|
|
|
V6 Performance with and without insert triggers:
|
|
|
```
|
|
|
|
|
|
MEMFILE persistence = false:
|
|
|
Running: perfdhcp -6 -r 10000 -R 9999999 -p 20 -l enp0s10
|
|
|
Rate: 6890.85 4-way exchanges/second, expected rate: 10000
|
|
|
|
|
|
MySQL without trigger installed
|
|
|
Running: perfdhcp -6 -r 200 -R 9999999 -p 20 -l enp0s10
|
|
|
Rate: 188.239 4-way exchanges/second, expected rate: 200
|
|
|
|
|
|
MySQL with trigger installed
|
|
|
Running: perfdhcp -6 -r 200 -R 9999999 -p 20 -l enp0s10
|
|
|
Rate: 188.813 4-way exchanges/second, expected rate: 200
|
|
|
|
|
|
Postgresql without trigger installed
|
|
|
Running: perfdhcp -6 -r 400 -R 9999999 -p 20 -l enp0s10
|
|
|
Rate: 361.503 4-way exchanges/second, expected rate: 400
|
|
|
|
|
|
Postgresql with trigger installed
|
|
|
Running: perfdhcp -6 -r 400 -R 9999999 -p 20 -l enp0s10
|
|
|
Rate: 361.421 4-way exchanges/second, expected rate: 400
|
|
|
|
|
|
```
|
|
|
|
|
|
As can be seen, there is virtually no degradation of throughput with triggers
|
|
|
installed. This approach should also work for Cassandra though that testing
|
|
|
was not done as part of this initial design effort. It should be explored.
|
|
|
|
|
|
The appeal with this approach is accurate, commonly accessible data, with no
|
|
|
changes to the update statistics code and virtually immediate query response
|
|
|
times. Querying the lease4_stat table by executing a "select * from lease4_stat"
|
|
|
returned 3000 rows in 0.01 seconds or less for both MySQL and Postgresql.
|
|
|
|
|
|
### 2.4 Recalculate statistics on demand
|
|
|
|
|
|
The primary drawback to this approach is the query time required to recount the
|
|
|
statistics. Recounting statistics for all subnets in a system with thousands
|
|
|
of subnets could take several seconds.
|
|
|
|
|
|
Running the recount stats query against the same lease4 table contents created
|
|
|
during trigger testing returned 3000 rows in 1.73 seconds.
|
|
|
|
|
|
After adding an index on lease4 keyed on subnet_id and state, the query
|
|
|
returned that same 3000 row result, in 0.72 seconds.
|
|
|
|
|
|
To give the test some scale, the lease4 table was reloaded from a file
|
|
|
containing 50M records, spread over 10K subnets.
|
|
|
|
|
|
Running the recount stats query returned 30K rows in 8 seconds.
|
|
|
Running it with where clauses added:
|
|
|
|
|
|
a. For a single subnet-id, the query returns 3 rows in 0.01 secs
|
|
|
b. For a range spanning 100 subnets, it returns 303 rows in .11 secs
|
|
|
c. For a range spanning 900 subnets, it returns 2697 rows in .92 secs
|
|
|
|
|
|
The advantages of this approach are no modifications to the statistics update
code, 100% accuracy at the time of the query, and users are in direct
control over how much they impact the system through the frequency and breadth
of their queries. There is no impact unless a statistics query is being
performed.
|
|
|
|
|
|
## 3 Approach Selection
|
|
|
|
|
|
Of the approaches discussed, the most viable way forward seems to be offered
|
|
|
by Implicit Statistic Propagation (1.3), for the following reasons:
|
|
|
|
|
|
* It provides 100% accuracy of the lease statistics
|
|
|
|
|
|
* Leverages the global coordination of lease changes, and thus of the
statistics, provided by the lease storage. In other words, it uses databases
to do what they're meant to do.
|
|
|
|
|
|
* It provides a path by which we can produce a usable, near term solution
|
|
|
for users who are already deploying production systems with shared lease
|
|
|
storage. The solution is doable in the 1.4 time frame. Its implementation
|
|
|
is largely a matter of minor extensions to existing code and schemas.
|
|
|
|
|
|
* It should have negligible impact on a server's normal DHCP throughput,
|
|
|
whether it is responding to statistics fetch commands or not. Of course
|
|
|
there are reasonable limits to this. If a server is being bombarded with
|
|
|
constant requests for statistics its ability to do other work will be
|
|
|
hampered unless that work is offloaded to its own thread (this is discussed
|
|
|
in Appendix B).
|
|
|
|
|
|
* It should greatly reduce the statistics recount time for users of MySQL,
|
|
|
Postgresql, and Cassandra lease storage whether they're using more than one
|
|
|
Kea server or not.
|
|
|
|
|
|
* Providing HA deployments use the new lease statistics commands in the
StatLib hook library, the values will be correct. Currently, when an HA peer
gets lease updates from another peer, it uses the Lease Commands hook library
to add, update, and delete leases as needed. Those commands do not update
statistics via the StatsMgr; however, the new triggers will update the lease
stat tables correctly.
|
|
|
|
|
|
Approaches 1.1 and 1.2 are too invasive, have accuracy integrity issues, and
require a lot of new code. Approach 1.4, while perhaps requiring the least
amount of code, would place a considerable run-time burden on the responding
server, would not reduce recount times, and would do nothing for the accuracy
of HA lease statistics.
|
|
|
|
|
|
## 4 Design Proposal
|
|
|
|
|
|
The solution consists of two areas of change:
|
|
|
|
|
|
1. Changes to the main repo to add the necessary tables and triggers to
|
|
|
schemas, and extensions to the LeaseMgrs to execute new queries against the
|
|
|
new tables
|
|
|
|
|
|
2. StatLib, a new hook library which implements new commands
|
|
|
specifically for fetching lease statistics from lease storage
|
|
|
|
|
|
The implementation will not affect StatsMgr or any existing statistics
|
|
|
commands.
|
|
|
|
|
|
### 4.1 Changes to Main Repo
|
|
|
|
|
|
The changes to the main repo are relatively minor:
|
|
|
|
|
|
#### 4.1.1 New Tables and Triggers
|
|
|
|
|
|
Two new tables will need to be created in the schemas:
|
|
|
|
|
|
a. lease4_stat
|
|
|
|
|
|
subnet_id - ID of the subnet
|
|
|
state - state of the lease
|
|
|
leases - number of leases
|
|
|
primary key (subnet_id, state)
|
|
|
|
|
|
b. lease6_stat
|
|
|
|
|
|
subnet_id - ID of the subnet
|
|
|
type - lease type (IA_NA/IA_PD)
|
|
|
state - state of lease
|
|
|
leases - number of leases
|
|
|
primary key (subnet_id, type, state)
|
|
|
|
|
|
Note that the schema upgrade will need to include SQL to initially populate
|
|
|
these tables from existing lease4 and lease6 table content.
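As a sketch (assuming the table layouts above), this initial population could
be done with INSERT ... SELECT statements along these lines; the actual
upgrade scripts may differ:

```
INSERT INTO lease4_stat (subnet_id, state, leases)
    SELECT subnet_id, state, COUNT(*)
    FROM lease4
    GROUP BY subnet_id, state;

INSERT INTO lease6_stat (subnet_id, lease_type, state, leases)
    SELECT subnet_id, lease_type, state, COUNT(*)
    FROM lease6
    GROUP BY subnet_id, lease_type, state;
```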
|
|
|
|
|
|
The lease4 and lease6 tables will need three triggers each:
|
|
|
|
|
|
a. after-insert - updates or inserts the lease count for the new lease's
|
|
|
subnet-id/type/state
|
|
|
|
|
|
b. after-update - if the update alters the lease state, it must
decrement the lease count for the old state and increment the lease
count for the new state
|
|
|
|
|
|
c. after-delete - decrement the lease count for the deleted lease's
|
|
|
subnet-id/type/state
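For illustration, the after-update logic for lease4 might look something like
the following MySQL-style sketch. The trigger name and exact form here are
assumptions; the actual triggers (see Appendix A for the insert triggers) may
differ in detail:

```
DELIMITER $$
CREATE TRIGGER stat_lease4_update AFTER UPDATE ON lease4
    FOR EACH ROW
    BEGIN
        IF OLD.state != NEW.state THEN
            -- decrement the count for the old state
            UPDATE lease4_stat SET leases = leases - 1
            WHERE subnet_id = OLD.subnet_id AND state = OLD.state;

            -- increment (or create) the count for the new state
            UPDATE lease4_stat SET leases = leases + 1
            WHERE subnet_id = NEW.subnet_id AND state = NEW.state;

            IF ROW_COUNT() <= 0 THEN
                INSERT INTO lease4_stat VALUES (NEW.subnet_id, NEW.state, 1);
            END IF;
        END IF;
    END $$
DELIMITER ;
```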
|
|
|
|
|
|
#### 4.1.2 Add new SQL queries to LeaseMgr derivations
|
|
|
|
|
|
These new SQL queries would be select statements against the lease<4/6>_stat
|
|
|
tables, with variants to handle where clauses. For example, the v4 variants
|
|
|
would be as follows:
|
|
|
|
|
|
a. SELECT * FROM lease4_stat
|
|
|
ORDER BY subnet_id, state;
|
|
|
|
|
|
The query above, and its v6 form, will actually replace the queries used to
|
|
|
recount lease statistics now. This will dramatically reduce the time
|
|
|
needed to recount statistics at startup and after reconfiguration.
|
|
|
|
|
|
b. SELECT * FROM lease4_stat
|
|
|
WHERE subnet_id = ?
|
|
|
ORDER BY state;
|
|
|
|
|
|
c. SELECT * FROM lease4_stat
|
|
|
WHERE subnet_id >= ? AND subnet_id <= ?
|
|
|
ORDER BY subnet_id, state;
|
|
|
|
|
|
#### 4.1.3 Extend the LeaseStatsQuery class and derivations
|
|
|
|
|
|
LeaseStatsQuery is the abstract class used by LeaseMgr to glean the lease
state counts used during a statistics recount. In short, it executes the query
and then provides a cursor-like interface over which the returned rows may be
iterated. It would need to be extended to support running the new query
variants with selection criteria. This would be done by overloading the
LeaseStatsQuery constructor.
|
|
|
|
|
|
a. LeaseStatsQuery(subnet_id)

b. LeaseStatsQuery(subnet_id_beg, subnet_id_end)
|
|
|
|
|
|
#### 4.1.4 Overload LeaseMgr::startLeaseStatsQuery<4/6>
|
|
|
|
|
|
This function returns a pointer to a newly created LeaseStatsQuery instance.
|
|
|
The instance will have executed the query, with its "cursor" positioned at
|
|
|
the first row of the results. Variants will be needed to support the where
|
|
|
clause parameters:
|
|
|
|
|
|
a. startLeaseStatsQuery(subnet_id)

b. startLeaseStatsQuery(subnet_id_beg, subnet_id_end)
|
|
|
|
|
|
### 4.2 Implement StatLib, a new, non-premium hook library
|
|
|
|
|
|
Initially, the library would provide the following new commands:
|
|
|
```
|
|
|
|
|
|
{
|
|
|
"command": "statistic-lease4/6-get",
|
|
|
"arguments": {
|
|
|
"from_storage: true/false,
|
|
|
"subnet-id": x, // optional
|
|
|
"subnet-id-range": // optional
|
|
|
{
|
|
|
"start": x, // id >= x
|
|
|
"end": y // id <= x
|
|
|
}
|
|
|
}
|
|
|
}
|
|
|
|
|
|
```
|
|
|
Where:
|
|
|
```
|
|
|
from_storage: true - means fetch values from the lease back end.
|
|
|
This is the same query as recount-stats, with or
|
|
|
without a where clause for subnet id.
|
|
|
|
|
|
             false - means use local StatsMgr values.
|
|
|
                     This would be useful for single-server use, as it
                     provides a form of per-subnet-id stat gathering.
|
|
|
|
|
|
subnet-id: optional selection criteria that returns data for the
|
|
|
given subnet
|
|
|
|
|
|
subnet-id-range: optional selection criteria that returns data for a
                 range of subnets: start <= subnet_id <= end
|
|
|
```
|
|
|
|
|
|
The command handlers would be similar to LeaseMgr::recountLeaseStats<4/6>,
but rather than updating the statistics in StatsMgr, they would construct
the command response. The other difference would be honoring the
from_storage parameter. When true, they would use LeaseStatsQuery rows as
the data source; when false, they would iterate through StatsMgr data.
|
|
|
|
|
|
Response returned would be a "result-set", which mimics the SQL construct:
|
|
|
```
|
|
|
|
|
|
{
    "result": 0,
    "text": "<message>",
    "arguments": {
        "result-set": {
            "timestamp": "2018-03-22 09:43:30.815371",
            "columns": ["subnet_id", "type", "total", "assigned", "declined"],
            "rows": [
                [0, "IA_NA", 6, 4, 1]
            ]
        }
    }
}
|
|
|
|
|
|
```
|
|
|
This will make large responses much more compact and perhaps more readily
|
|
|
digested by clients than the individual observation format used by
|
|
|
existing statistics get commands.
|
|
|
|
|
|
Also note that regardless of the data approach used, total address counts
|
|
|
must come from the StatsMgr local values until such time as subnet
|
|
|
configuration is supported in the DB or the value is explicitly propagated
|
|
|
to common storage.
|
|
|
|
|
|
## Appendix A SQL for Tables and Triggers
|
|
|
|
|
|
The SQL to create the statistics tables and insert triggers for both MySQL
and Postgresql is shown below.
|
|
|
|
|
|
|
|
|
### A.1 MySQL v4 table and insert-trigger SQL
|
|
|
```
|
|
|
|
|
|
CREATE TABLE lease4_stat (
|
|
|
subnet_id INT UNSIGNED NOT NULL,
|
|
|
state INT UNSIGNED NOT NULL,
|
|
|
leases BIGINT,
|
|
|
PRIMARY KEY (subnet_id, state)
|
|
|
) ENGINE = INNODB;
|
|
|
|
|
|
DELIMITER $$
|
|
|
CREATE TRIGGER stat_lease4_insert AFTER INSERT ON lease4
|
|
|
FOR EACH ROW
|
|
|
BEGIN
|
|
|
UPDATE lease4_stat
|
|
|
SET leases = leases + 1
|
|
|
WHERE subnet_id = NEW.subnet_id AND state = NEW.state;
|
|
|
|
|
|
IF ROW_COUNT() <= 0 THEN
|
|
|
INSERT INTO lease4_stat VALUES (new.subnet_id, new.state, 1);
|
|
|
END IF;
|
|
|
END $$
|
|
|
DELIMITER ;
|
|
|
|
|
|
```
|
|
|
|
|
|
### A.2 MySQL v6 table and insert-trigger SQL
|
|
|
```
|
|
|
|
|
|
CREATE TABLE lease6_stat (
|
|
|
subnet_id INT UNSIGNED NOT NULL,
|
|
|
lease_type INT UNSIGNED NOT NULL,
|
|
|
state INT UNSIGNED NOT NULL,
|
|
|
leases BIGINT,
|
|
|
PRIMARY KEY (subnet_id, lease_type, state)
|
|
|
) ENGINE = INNODB;
|
|
|
|
|
|
DELIMITER $$
|
|
|
CREATE TRIGGER stat_lease6_insert AFTER INSERT ON lease6
|
|
|
FOR EACH ROW
|
|
|
BEGIN
|
|
|
UPDATE lease6_stat SET leases = leases + 1
|
|
|
WHERE
|
|
|
subnet_id = NEW.subnet_id AND lease_type = NEW.lease_type
|
|
|
AND state = NEW.state;
|
|
|
|
|
|
IF ROW_COUNT() <= 0 THEN
|
|
|
INSERT INTO lease6_stat
|
|
|
VALUES (NEW.subnet_id, NEW.lease_type, NEW.state, 1);
|
|
|
END IF;
|
|
|
END $$
|
|
|
DELIMITER ;
|
|
|
|
|
|
```
|
|
|
|
|
|
### A.3 Postgresql v4 Table and insert-trigger SQL
|
|
|
```
|
|
|
|
|
|
CREATE TABLE lease4_stat (
|
|
|
subnet_id BIGINT NOT NULL,
|
|
|
state BIGINT NOT NULL,
|
|
|
leases BIGINT,
|
|
|
PRIMARY KEY (subnet_id, state)
|
|
|
);
|
|
|
|
|
|
DROP FUNCTION IF EXISTS proc_stat_lease4_insert() CASCADE;
|
|
|
CREATE FUNCTION proc_stat_lease4_insert () RETURNS trigger AS $stat_lease4_insert$
|
|
|
BEGIN
|
|
|
UPDATE lease4_stat
|
|
|
SET leases = leases + 1
|
|
|
WHERE subnet_id = NEW.subnet_id AND state = NEW.state;
|
|
|
|
|
|
IF NOT FOUND THEN
|
|
|
INSERT INTO lease4_stat VALUES (new.subnet_id, new.state, 1);
|
|
|
END IF;
|
|
|
|
|
|
-- Return is ignored since this is an after insert
|
|
|
RETURN NULL;
|
|
|
END;
|
|
|
$stat_lease4_insert$ LANGUAGE plpgsql;
|
|
|
|
|
|
CREATE TRIGGER stat_lease4_insert
|
|
|
AFTER INSERT ON lease4
|
|
|
FOR EACH ROW EXECUTE PROCEDURE proc_stat_lease4_insert();
|
|
|
|
|
|
```
|
|
|
|
|
|
### A.4 Postgresql v6 Table and insert-trigger SQL
|
|
|
```
|
|
|
|
|
|
CREATE TABLE lease6_stat (
|
|
|
subnet_id BIGINT NOT NULL,
|
|
|
lease_type SMALLINT NOT NULL,
|
|
|
    state BIGINT NOT NULL,
|
|
|
leases BIGINT,
|
|
|
PRIMARY KEY (subnet_id, lease_type, state)
|
|
|
);
|
|
|
|
|
|
DROP FUNCTION IF EXISTS proc_stat_lease6_insert() CASCADE;
CREATE FUNCTION proc_stat_lease6_insert () RETURNS trigger AS $stat_lease6_insert$
|
|
|
BEGIN
|
|
|
UPDATE lease6_stat
|
|
|
SET leases = leases + 1
|
|
|
WHERE
|
|
|
subnet_id = NEW.subnet_id AND lease_type = NEW.lease_type
|
|
|
AND state = NEW.state;
|
|
|
|
|
|
IF NOT FOUND THEN
|
|
|
INSERT INTO lease6_stat
|
|
|
VALUES (NEW.subnet_id, NEW.lease_type, NEW.state, 1);
|
|
|
END IF;
|
|
|
|
|
|
-- Return is ignored since this is an after insert
|
|
|
RETURN NULL;
|
|
|
END;
|
|
|
$stat_lease6_insert$ LANGUAGE plpgsql;
|
|
|
|
|
|
CREATE TRIGGER stat_lease6_insert
|
|
|
AFTER INSERT ON lease6
|
|
|
FOR EACH ROW EXECUTE PROCEDURE proc_stat_lease6_insert();
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
|
Note that update and delete triggers were also written and tested for
|
|
|
functionality but not timing; they are omitted here for brevity.
|
|
|
|
|
|
## Appendix B Multi-threaded Improvement
|
|
|
|
|
|
A possible improvement for post 1.4 would be to offload the work of statistics
|
|
|
fetching to a separate thread in Kea servers. That would require the following:
|
|
|
|
|
|
### B.1 Separate connection(s) to Lease storage
|
|
|
|
|
|
StatLib would need to be able to clone the LeaseMgr, or at least use the lease
storage access string, to open a separate connection to lease storage. This
would allow queries to be executed without tying up the LeaseMgr's connection.
|
|
|
It would involve some refactoring in the !LeaseMgr class hierarchy and factory,
|
|
|
but this work could also be of benefit to other hook libs that desire
|
|
|
independent connections.
|
|
|
|
|
|
### B.2 Asynchronous Command Handling
|
|
|
|
|
|
Command handling would need to be modified to allow a command to be processed
asynchronously. Thus, when a statistics command is received, the StatLib
handler can process and respond to the command in a worker thread, while
control is returned to the server's main thread. This is certainly an ability
that could have widespread use. It would allow Kea servers to rather seamlessly
respond to statistics requests without hampering their ability to provide DHCP
service.
|
|
|
|
|
|
## Appendix C Design Variant
|
|
|
|
|
|
Should the use of triggers be ruled out, the design detailed above could be
|
|
|
based upon recalculating statistics on demand with a few minor alterations:
|
|
|
|
|
|
### C.1 SQL Queries
|
|
|
The SQL queries added would be variants of the existing recount statistics
|
|
|
queries which run against the lease tables. The v4 queries would be something
|
|
|
like the following:
|
|
|
|
|
|
a. SELECT subnet_id, state, count(state) as state_count
|
|
|
FROM lease4
|
|
|
WHERE subnet_id = ?
|
|
|
GROUP BY state
|
|
|
|
|
|
b. SELECT subnet_id, state, count(state) as state_count
|
|
|
FROM lease4
|
|
|
WHERE subnet_id >= ? AND subnet_id <= ?
|
|
|
GROUP BY subnet_id, state ORDER BY subnet_id
|
|
|
|
|
|
### C.2 Add indexes
|
|
|
|
|
|
In order to optimize query performance, additional indexing is recommended.
For the lease4 and lease6 tables, indexes on subnet_id/state and
subnet_id/type/state respectively are warranted.
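For example, the indexes might be created along these lines (the index names
here are illustrative, not from the actual schemas):

```
CREATE INDEX lease4_subnet_state_idx ON lease4 (subnet_id, state);
CREATE INDEX lease6_subnet_type_state_idx ON lease6 (subnet_id, lease_type, state);
```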
|
|
|
|
|
|
There would be no new tables or triggers. The rest of the changes outlined in
the design would still apply. However, the changes outlined in Appendix B take
on much greater importance in terms of reducing the impact on the server
responding to a statistics fetch.
|
|
|
|
|
|
This variant also does nothing to improve statistics accuracy for deployments using HA.
|
|
|
|
|
|
NOTE: Cassandra triggers may take a bit of work; should they prove problematic,
it would be feasible to use the above variant for the initial CQL implementation
and apply triggers later.