BIND issues: https://gitlab.isc.org/isc-projects/bind9/-/issues (updated 2023-11-02T16:43:59Z)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1287
[ISC-Support #7237] additional BIND stats/counters - DLZ invocations
Brian Conry | 2023-11-02 | Not planned

A customer has requested new stats/counters relating to DLZ, each of which would be per module: one to track the number of invocations and another to track the number of queries answered by the module.

https://support.isc.org/Ticket/Display.html?id=7237

https://gitlab.isc.org/isc-projects/bind9/-/issues/1286
[ISC-Support #7237] additional BIND stats/counters - NXDOMAIN redirection
Brian Conry | 2023-11-02 | Not planned

A customer has requested stats/counters for the number of times that NXDOMAIN redirection is performed.
Ideally, we should probably have a different counter for each redirection method.
https://support.isc.org/Ticket/Display.html?id=7237

https://gitlab.isc.org/isc-projects/bind9/-/issues/102
[RT#43428] Silence some 'expected' logging messages
Vicky Risk (vicky@isc.org) | 2018-02-23 | Not planned

The user is a high-volume hosting provider.
Under attack scenarios, the amount of logging BIND is doing can make a difference.
This user would like to be able to silence some sorts of frequently recurring messages that are 'expected' because they are basically probing behavior from prospective attackers. An example would be an unsuccessful AXFR from a client that is not permitted to AXFR.
Things that would ideally be logged at a higher level:
- Successful AXFR
- Terminated AXFR
- Unsuccessful AXFR from an authorized client

https://gitlab.isc.org/isc-projects/bind9/-/issues/2497
Add counters for DoH queries
Vicky Risk (vicky@isc.org) | 2023-12-12 | Not planned

Since you can support DoH in addition to Do53 on the same BIND instance, it would be really nice to have some way to tell how much DoH traffic you are getting.
- [ ] Add some metric that enables the operator to tell specifically how much DoH traffic they are receiving.
- [ ] Please be sure to document this metric in the ARM and explain what it is measuring as specifically as possible (because we have a problem in general with not having good enough documentation of our metrics).
- [ ] Also, check to ensure that we are updating the existing counters for TCP v4 and TCP v6 queries with the DoH queries (I am told we are not entirely confident that these are always updated for DoH queries in 9.17.10)

https://gitlab.isc.org/isc-projects/bind9/-/issues/3304
Consider dropping the JSON_C_TO_STRING_PRETTY flag used for generating JSON statistics
Michał Kępień | 2022-04-27 | Not planned

JSON is a machine-readable format, so there is no need to [generate
statschannel output in a "pretty" form][1]. Quick experiments show that
redundant whitespace adds up to about 40% of the JSON payload produced
by statschannel code.
Yes, `lib/isc/httpd.c` supports DEFLATE compression via zlib and that
enables massive savings in terms of payload size, but it requires
clients to send the `Accept-Encoding: deflate` HTTP header in order to
kick in, so IMHO HTTP-level compression and producing "minified" JSON
data are tangential mechanisms rather than exclusive alternatives.
Piping "minified" JSON through `jq` allows one to get the "pretty" form
without the extra bandwidth cost.
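Minification in this sense only strips insignificant whitespace outside string literals, which is what `jq -c` does to the pretty-printed payload. As a toy illustration (this is not jq or BIND code, just a sketch of the transformation):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy JSON minifier for illustration only: drops whitespace outside
 * string literals, which is all that "minified" JSON removes relative
 * to the pretty-printed form. `out` must be at least as large as `in`. */
static void minify(const char *in, char *out) {
    bool in_string = false;
    for (; *in != '\0'; in++) {
        if (in_string) {
            *out++ = *in;
            if (*in == '\\' && in[1] != '\0') {
                *out++ = *++in; /* copy the escaped character verbatim */
            } else if (*in == '"') {
                in_string = false;
            }
        } else if (*in == '"') {
            in_string = true;
            *out++ = *in;
        } else if (strchr(" \t\n\r", *in) == NULL) {
            *out++ = *in; /* keep everything that is not whitespace */
        }
    }
    *out = '\0';
}
```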
Some semi-random measurements:

- ~"v9.19", 4 logical CPU cores

        $ curl -s http://localhost:8080/json | wc -c
        71386
        $ curl -s http://localhost:8080/json | jq -c | wc -c
        41881
        $ curl -s -H "Accept-Encoding: deflate" http://localhost:8080/json | wc -c
        4150

- ~"v9.19", 32 logical CPU cores

        $ curl -s http://localhost:8080/json | wc -c
        391217
        $ curl -s http://localhost:8080/json | jq -c | wc -c
        227837
        $ curl -s -H "Accept-Encoding: deflate" http://localhost:8080/json | wc -c
        16721

- ~"v9.16", 4 logical CPU cores

        $ curl -s http://localhost:8080/json | wc -c
        896972
        $ curl -s http://localhost:8080/json | jq -c | wc -c
        529643
        $ curl -s -H "Accept-Encoding: deflate" http://localhost:8080/json | wc -c
        25880

- ~"v9.16", 32 logical CPU cores

        $ curl -s http://localhost:8080/json | wc -c
        6954362
        $ curl -s http://localhost:8080/json | jq -c | wc -c
        4106064
        $ curl -s -H "Accept-Encoding: deflate" http://localhost:8080/json | wc -c
        174557
[1]: https://gitlab.isc.org/isc-projects/bind9/-/blob/fcab10a26ece6419c2f53a2ad82499b4b5ba75c5/bin/named/statschannel.c#L3264-3265

https://gitlab.isc.org/isc-projects/bind9/-/issues/2948
DNSSEC signing statistics do not account for cross-algorithm key ID collisions
Michał Kępień | 2023-11-02 | Not planned

In https://gitlab.isc.org/isc-private/bind9/-/jobs/2033550, two signing keys for different signing algorithms have the same key ID:
```
>>> 11-Oct-2021 21:30:08.790 keymgr: keyring: manykeys/RSASHA256/51742 (policy manykeys)
11-Oct-2021 21:30:08.790 keymgr: keyring: manykeys/ECDSAP384SHA384/951 (policy manykeys)
11-Oct-2021 21:30:08.790 keymgr: keyring: manykeys/RSASHA256/37386 (policy manykeys)
>>> 11-Oct-2021 21:30:08.790 keymgr: keyring: manykeys/ECDSAP256SHA256/51742 (policy manykeys)
11-Oct-2021 21:30:08.790 keymgr: keyring: manykeys/ECDSAP256SHA256/23421 (policy manykeys)
11-Oct-2021 21:30:08.790 keymgr: keyring: manykeys/ECDSAP384SHA384/8256 (policy manykeys)
```
While this situation is not considered a key ID collision (because
different algorithms are involved), it messes up the XML/JSON statistics
because these are not keyed by the `<algorithm, ID>` tuple but rather
just by the key ID. In the `statschannel` system test, the
`zones-{json,xml}.pl` helper scripts only process *unique* key IDs,
leaving duplicate entries out of their output files. In this specific
example, this led to:
```diff
$ diff -u zones.expect.8 zones.out.x8
--- zones.expect.8 2021-10-11 23:30:21.000000000 +0200
+++ zones.out.x8 2021-10-11 23:30:23.000000000 +0200
@@ -1,12 +1,10 @@
dnssec-refresh operations 23421: 1
dnssec-refresh operations 37386: 10
dnssec-refresh operations 51742: 1
-dnssec-refresh operations 51742: 10
dnssec-refresh operations 8256: 1
dnssec-refresh operations 951: 10
dnssec-sign operations 23421: 1
dnssec-sign operations 37386: 10
dnssec-sign operations 51742: 1
-dnssec-sign operations 51742: 10
dnssec-sign operations 8256: 1
dnssec-sign operations 951: 10
```

https://gitlab.isc.org/isc-projects/bind9/-/issues/3379
Expose general health indication in stats
Tony Finch | 2023-12-12 | Not planned

A suggestion from Chris Siebenmann's blog:
https://utcc.utoronto.ca/~cks/space/blog/sysadmin/HaveGeneralHealthMetric
> If your system is reasonably decent sized, it probably has some sort of logging framework that categorizes log messages by both subsystem and broad level of alarmingness. Add a hook into your logging system so that you track the last time a message was emitted for a given subsystem at a given priority level, and expose these times (with level and subsystem) as metrics. Then people like me can put together monitoring for things like 'the Prometheus TSDB has logged warnings or above within the last five minutes'.

https://gitlab.isc.org/isc-projects/bind9/-/issues/1793
failed query to a `forward only` forwarder increments `serverquota` counter (spilled due to server quota)
Cathy Almond | 2023-11-02 | Not planned

As observed in [Support ticket #16297](https://support.isc.org/Ticket/Display.html?id=16297)
I was inspecting the stats output and was very surprised to see this:
` 13779 spilled due to server quota`
The server in question does not have `fetches-per-server` configured, so it defaults to zero (unlimited). And yet...
Looking at the code, I suspect there's a failure mode that drops through the 'out' block in `fctx_getaddresses()` without resetting `all_spilled` (which starts out 'true'):
```c
static isc_result_t
fctx_getaddresses(fetchctx_t *fctx, bool badcache) {
dns_rdata_t rdata = DNS_RDATA_INIT;
isc_result_t result;
dns_resolver_t *res;
isc_stdtime_t now;
unsigned int stdoptions = 0;
dns_forwarder_t *fwd;
dns_adbaddrinfo_t *ai;
bool all_bad;
dns_rdata_ns_t ns;
bool need_alternate = false;
bool all_spilled = true;
```
...
```c
/*
* If all of the addresses found were over the
* fetches-per-server quota, return the configured
* response.
*/
if (all_spilled) {
result = res->quotaresp[dns_quotatype_server];
inc_stats(res, dns_resstatscounter_serverquota);
}
```
This is a server that is using global forwarding, so we skip case 'normal_nses', which is where 'all_spilled' is normally reset from true to false during processing:
```c
if (fctx->fwdpolicy == dns_fwdpolicy_only)
goto out;
```
So I'm guessing that what's been 'counted' and reported here is failures to get responses back from any of the global forwarders (which tallies quite nicely with the problem I'm investigating, even though this wasn't a counter I was expecting to see in the stats!).
The assumption seems to be that if it's a failure for any reason other than fetch limits, something else will reset the 'all_spilled' flag; that assumption appears to be flawed for some configurations and situations. Could someone have a look at this, please? It should be an easy one to fix.
I note that this has also been noticed before on bind-users:
https://lists.isc.org/pipermail/bind-users/2016-June/097011.html
I observed this in 9.11.15-S1, but the code path looks the same still on master.
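The suspected control flow can be modeled in a few lines. This is a hypothetical toy (the function and enum names are illustrative, not the real `fctx_getaddresses()`), showing how a forward-only failure falls through to `out` with `all_spilled` still true, and how resetting the flag on that path would change the reported outcome:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the suspected control flow (names are invented for
 * illustration): all_spilled starts out true and is only cleared while
 * iterating NS addresses, so the forward-only path that jumps straight
 * to `out` reports a forwarder failure as a server-quota spill. */
enum toy_result { R_SUCCESS, R_QUOTA, R_FAILURE };

static enum toy_result
toy_getaddresses(bool forward_only, bool forwarders_ok, bool apply_fix) {
    bool all_spilled = true;
    if (forward_only) {
        if (forwarders_ok) {
            return R_SUCCESS;
        }
        if (apply_fix) {
            all_spilled = false; /* proposed: quota was never the cause */
        }
        goto out;
    }
    /* ... normal NS processing clears all_spilled per usable address ... */
    all_spilled = false;
out:
    /* R_QUOTA is the path where the serverquota counter would be bumped. */
    return all_spilled ? R_QUOTA : R_FAILURE;
}
```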
Requested changes:
- [ ] fix serverquota counter
- [ ] add a new counter specifically for the situation when all forwarders have failed

https://gitlab.isc.org/isc-projects/bind9/-/issues/3464
Histograms for timing and memory statistics
Tony Finch | 2023-11-02 | Not planned | Assignee: Tony Finch

BIND needs to be able to record statistics covering a wide range of possible values (several decimal orders of magnitude):
* latency times, from submillisecond queries on the same LAN to multi-minute zone transfers
* memory usage, for zones from a handful of records to tens of millions
* message sizes, from 64 bytes to 64 kilobytes
In this issue I'm outlining a possible design for a general-purpose histogram data structure,
that could be added to `libisc` for collecting statistics efficiently in several places in BIND.
## existing histograms in BIND
The statistics channel has histograms for request and response sizes, which use buckets that
are defined manually with some tediously repetitive code. These could be replaced by the
proposed self-tuning histograms, although the bucketing will be somewhat different.
## examples of general-purpose histograms
It's possible to record histograms of values covering a wide range, with bucket sizes chosen automatically to provide a particular level of accuracy (e.g. 1% or 10%), and without using more than a few KiB for each histogram. Existing examples are:
* [circllhist, Circonus log-linear histogram](https://github.com/openhistogram/libcircllhist),
aka [OpenHistogram](https://openhistogram.io/)
Uses decimal floating point with two digits of mantissa and a 1 byte exponent,
to record values with 1% accuracy.
* [DDSketch from DataDog](https://www.datadoghq.com/blog/engineering/computing-accurate-percentiles-with-ddsketch/)
Uses the floating-point logarithm to a base derived from the required accuracy, rounded to an integer to make a bucket index.
Has an alternative "fast" mode more like HdrHistogram.
* [HdrHistogram, high dynamic range histogram](http://www.hdrhistogram.org/)
Uses low-precision floating point numbers as bucket indexes.
* [hg64, 64-bit histograms](https://github.com/fanf2/hg64)
My prototype implementation intended for use in BIND.
The DataDog blog article has a nice overview, and compares a quantile sketch implementation (which is designed for a particular rank error) with a histogram (designed for a particular value error). From my reading on this topic I concluded that histograms are easier to understand, simpler to implement, and have similar or better CPU and memory usage compared to rank-error-based quantile sketches.
## key idea
The histogram counts how many measurements (time or space) have a particular `uint64_t` value
or fall within a range of values, according to the histogram's configured precision (e.g. 1% or 10%).
Each range of values corresponds to a bucket or counter.
My prototype `hg64` uses a log-linear bucket spacing, which has two parts:
* a logarithm of the value to cover a large dynamic range with a few bits;
specifically, the log base 2 of a `uint64_t` varies from 0 to 63, which fits in 6 bits.
* linear, evenly spaced buckets between logarithms, to provide more precision
than you can get from just a power of 2 or 10. 4 buckets per log are enough
for 10% precision; 32 buckets per log gives 1% precision.
This log-linear bucketing is the same thing as decimal scientific notation,
like 1e9 (1 significant digit, 10% precision) or 2.2e8 (2 significant digits, 1% precision).
It's also the same as a (low-precision) binary floating point number:
the FP exponent is the logarithmic part, and the FP mantissa is the linear part.
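As a concrete sketch of this log-linear bucketing (my own illustration, not the actual hg64 source), the bucket index combines the exponent, obtained from the position of the top set bit, with the next `grain` mantissa bits. The mapping is monotonic and gap-free, and every value below `2^grain` gets its own bucket:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative log-linear bucketing (not the hg64 source). `grain` is the
 * number of mantissa bits per power of two: grain = 2 gives 4 buckets per
 * octave (~10% precision), grain = 5 gives 32 (~1%). Uses the GCC/Clang
 * __builtin_clzll intrinsic to find the exponent. */
static unsigned bucket_index(uint64_t value, unsigned grain) {
    if (value < (1ULL << grain)) {
        return (unsigned)value; /* small values each get their own bucket */
    }
    unsigned exponent = 63 - (unsigned)__builtin_clzll(value);
    unsigned mantissa =
        (unsigned)(value >> (exponent - grain)) & ((1u << grain) - 1);
    return ((exponent - grain + 1) << grain) | mantissa;
}
```

With `grain = 5`, values 0 through 63 each map to their own bucket index, after which multiple values start sharing buckets.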
## measurements and values
When counting time measurements, it makes sense for the `uint64_t` value to be the time measured in nanoseconds. This allows the histogram to count any time measurements we are likely to need, from submicrosecond up to a few centuries. There is no point using lower-precision time measurements because the histogram bucketing algorithm will reduce the precision as required.
Unlike nanosecond measurements, whose values are towards the logarithmic mid-range of `uint64_t`, memory measurements tend to cluster around zero. The `hg64` bucketing algorithm provides one counter for each distinct small integer; for instance, with 1% precision `hg64` has a counter for each value from 0 to 63, above which multiple values share each counter. To make the best use of these small-value counters, it makes sense to divide a memory measurement to get the desired resolution. For example, if the allocator quantum is 16 bytes, divide an allocation size by 16 before using it as a histogram value.
## incrementing counters quickly
It is very cheap to turn a `uint64_t` value into a bucket number, using CLZ to get the logarithm
with some bit shuffling to move things into place. The basic principle is
roughly the same as used by HdrHistogram and fast-mode DDSketch.
[Paul Khuong encouraged me to use his algorithm](https://twitter.com/pkhuong/status/1571831293335277573)
which is smaller and faster than the version I developed for my proof-of-concept.
As in BIND's existing statistics code, we use a relaxed atomic increment to update a counter.
When the histogram is in cache and uncontended, the whole operation (calculating the bucket
number and incrementing the counter) takes less than 2.5ns in my prototype code.
## efficient storage
The `hg64` bucket keys are small, e.g. 8 bits for 10% precision, or 11 bits for 1% precision.
We could store the buckets as a simple array of counters, which would use 2 KiB for 10%
precision, or 16 KiB for 1% precision. However a large fraction of that space will be
unused, because the values we are recording do not cover anywhere near 20 orders of
magnitude.
My prototype code has a 64-entry top-level array (one for each possible exponent)
and allocates each sub-array on demand (with a counter for each possible mantissa).
Most of the sub-arrays will remain unused. This layout supports lock-free multithreading.
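A minimal sketch of that two-level layout (illustrative, not the hg64 source; allocation-failure handling is omitted): a fixed top level indexed by exponent, with sub-arrays installed on first use via a compare-exchange so that concurrent writers race benignly instead of taking a lock:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define MANTISSA_BITS 5 /* 32 buckets per power of two, ~1% precision */

typedef _Atomic uint64_t counter_t;

typedef struct {
    /* one slot per possible exponent; each points at 2^MANTISSA_BITS
     * counters, allocated lazily because most exponents never occur */
    _Atomic(counter_t *) sub[64];
} histogram_t;

static counter_t *get_sub(histogram_t *hg, unsigned exponent) {
    counter_t *sub = atomic_load(&hg->sub[exponent]);
    if (sub == NULL) {
        counter_t *fresh = calloc(1u << MANTISSA_BITS, sizeof(counter_t));
        if (atomic_compare_exchange_strong(&hg->sub[exponent], &sub, fresh)) {
            return fresh;
        }
        free(fresh); /* another thread installed its array first; use that */
    }
    return sub;
}
```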
## operations on histograms
* given a value, find its rank (or percentile)
* find the value at a given rank (or percentile)
* get the mean and standard deviation of the data recorded in the histogram
* merge two histograms (which may differ in precision)
* dump and load a histogram in text (e.g. csv, xml, json) and/or binary (for efficiency)
* export a histogram to a user-selected collection of buckets (e.g. for prometheus)
I have implementations of the first four.
The rank and percentile queries work on a snapshot of the working histogram, to avoid multithreading races and to make the calculations more efficient.
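For instance, the value-at-rank query is a straightforward scan over the snapshot's cumulative counts (again my own illustration, not the hg64 implementation):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative value-at-rank query over a snapshot: walk the cumulative
 * counts until the requested rank falls inside a bucket, then return that
 * bucket's lower-bound value. */
static uint64_t value_at_rank(const uint64_t *counts,
                              const uint64_t *lower_bounds,
                              size_t nbuckets, uint64_t rank) {
    uint64_t seen = 0;
    for (size_t i = 0; i < nbuckets; i++) {
        seen += counts[i];
        if (rank < seen) {
            return lower_bounds[i];
        }
    }
    return lower_bounds[nbuckets - 1]; /* rank beyond the recorded total */
}
```

A percentile query is then just a rank query with `rank = total * p / 100`.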
## exporting data
An important consumer for data recorded in histograms is Prometheus.
The docs <https://prometheus.io/docs/practices/histograms/> say it supports
* a "histogram" type (actually a cumulative frequency digest) where quantiles are calculated on the server
* a "summary" type, where quantiles are calculated on the client and the server aggregates them over a sliding window
Prometheus has its own textual format for exposing / ingesting data,
<https://prometheus.io/docs/instrumenting/exposition_formats/>.
It looks like it would be fairly easy for `hg64` and BIND to support it,
though it isn't clear whether the server is able to re-bucket data that
is exposed with a different bucketing than configured on the server.
## elsewhere on gitlab
Related issues: #598 #2101 #3455

https://gitlab.isc.org/isc-projects/bind9/-/issues/3878
Missing and undocumented statistics from the HTTP stats channel in the ARM
Pieter Lexis | 2023-11-02 | Not planned

Hi folks,
I'm implementing a new monitoring solution and adding a bunch of stats from the HTTP-based stats channel to it. I'm basing the descriptions on what I can find in ARM section 8.5 (I'm using the one from 9.18 and 9.19), but some stats are missing there. Some of them are self-evident from the name, but some are not. I've compiled a list here; the naming is based on the JSON from the webserver:
* nsstats.TCPConnHighWater
* nsstats.SynthNXDOMAIN
* nsstats.CookieNew
* nsstats.CookieIn
* in view.NAME.resolver.stats
* BucketSize
* ClientCookieOut
* ServerCookieOut
* CookieIn
* CookieClientOk
* BadCookieRcode
* NextItem
* Priming
* traffic
The following stats are mostly unclear:
* view.NAME.resolver.cachestats - none of these statistics are described
* view.NAME.resolver.adb - none of these statistics are described
* opcodes.CODE - Are these the counts of received opcodes from clients?
* rcodes.CODE - Are these the counts of rcodes sent to clients?
* qtypes.CODE - Are these the counts of qtypes that were queried by clients?
* view.NAME.resolver.qtypes - what do these numbers mean?
* view.NAME.resolver.cache - Are the current cache entries for these types?
Cheers,
Pieter

https://gitlab.isc.org/isc-projects/bind9/-/issues/4205
OpenAPI Description Document for the JSON statistics channel
Markus Kötter | 2023-07-12 | Not planned

### Description
Using the JSON statistics channel is easier when parsing of the data is accomplished using a description document.
### Request
I prepared the following description document - it validates and works.
Possible improvements include:
- not using additionalProperties: { type: integer } for the statistics,
- document the statistic values (https://gitlab.isc.org/isc-projects/bind9/-/issues/3878)
- require properties
It would be great if you could include this next to https://gitlab.isc.org/isc-projects/bind9/-/blob/main/bin/named/bind9.xsl and have the webserver serve it as bind9.yaml, similar to https://gitlab.isc.org/isc-projects/bind9/-/blob/main/bin/named/statschannel.c#L3138, for ease of use.
### Links / references
As I'm unable to fork the repo and create a MR (project limit reached), it is embedded inline.
```yaml
openapi: "3.0.0"
info:
version: "0.1"
title: OpenAPI Description Document for Bind9 JSON statistics channel
servers:
- url: /
paths:
/json:
get:
operationId: summary
responses:
'200':
description: The Summary
content:
application/json:
schema:
$ref: "#/components/schemas/Summary"
/json/v1/status:
# /json/v1:
get:
operationId: status
responses:
'200':
description: The Status
content:
application/json:
schema:
$ref: "#/components/schemas/Status"
/json/v1/server:
get:
operationId: server
responses:
'200':
description: The Status
content:
application/json:
schema:
$ref: "#/components/schemas/Server"
/json/v1/zones:
get:
operationId: zones
responses:
'200':
description: The Zones
content:
application/json:
schema:
$ref: "#/components/schemas/Views"
/json/v1/net:
get:
operationId: net
responses:
'200':
description: The Sockstats
content:
application/json:
schema:
$ref: "#/components/schemas/Sockstats"
/json/v1/mem:
get:
operationId: mem
responses:
'200':
description: The Memory Statistics
content:
application/json:
schema:
$ref: "#/components/schemas/Memory"
/json/v1/traffic:
get:
operationId: traffic
responses:
'200':
description: The Traffic Statistics
content:
application/json:
schema:
$ref: "#/components/schemas/Traffic"
components:
schemas:
Status:
type: object
additionalProperties: False
properties:
json-stats-version:
type: string
boot-time:
type: string
format: date-time
config-time:
type: string
format: date-time
current-time:
type: string
format: date-time
version:
type: string
Server:
type: object
additionalProperties: False
allOf:
- $ref: "#/components/schemas/Status"
- $ref: "#/components/schemas/Views"
properties:
opcodes:
type: object
additionalProperties:
type: integer
rcodes:
type: object
additionalProperties:
type: integer
qtypes:
type: object
additionalProperties:
type: integer
nsstats:
type: object
additionalProperties:
type: integer
zonestats:
type: object
additionalProperties:
type: integer
Views:
type: object
additionalProperties: False
allOf:
- $ref: "#/components/schemas/Status"
properties:
views:
type: object
additionalProperties:
$ref: "#/components/schemas/View"
View:
type: object
additionalProperties: False
properties:
zones:
type: array
items:
$ref: "#/components/schemas/Zone"
resolver:
type: object
additionalProperties: False
properties:
stats:
type: object
additionalProperties:
type: integer
qtypes:
type: object
additionalProperties:
type: integer
cache:
type: object
additionalProperties:
type: integer
cachestats:
type: object
additionalProperties:
type: integer
adb:
type: object
additionalProperties:
type: integer
Zone:
type: object
additionalProperties: False
properties:
name:
type: string
class:
type: string
serial:
type: integer
type:
type: string
loaded:
type: string
Sockstats:
type: object
additionalProperties: False
allOf:
- $ref: "#/components/schemas/Status"
properties:
sockstats:
type: object
additionalProperties:
type: integer
Memory:
type: object
additionalProperties: False
allOf:
- $ref: "#/components/schemas/Status"
properties:
memory:
type: object
additionalProperties: False
properties:
TotalUse:
type: integer
InUse:
type: integer
Malloced:
type: integer
ContextSize:
type: integer
Lost:
type: integer
contexts:
type: array
items:
$ref: "#/components/schemas/MemoryContext"
MemoryContext:
type: object
additionalProperties: False
properties:
id:
type: string
name:
type: string
references:
type: integer
total:
type: integer
inuse:
type: integer
maxinuse:
type: integer
malloced:
type: integer
maxmalloced:
type: integer
pools:
type: integer
hiwater:
type: integer
lowater:
type: integer
Traffic:
type: object
additionalProperties: False
allOf:
- $ref: "#/components/schemas/Status"
properties:
traffic:
type: object
additionalProperties:
$ref: "#/components/schemas/TrafficData"
TrafficData:
type: object
additionalProperties:
type: integer
TaskMgr:
type: object
additionalProperties: False
properties:
taskmgr:
type: object
additionalProperties: False
properties:
thread-model:
type: string
default-quantum:
type: integer
tasks:
type: array
items:
$ref: "#/components/schemas/Task"
Task:
type: object
additionalProperties: False
properties:
id:
type: string
name:
type: string
references:
type: integer
state:
type: string
quantum:
type: integer
events:
type: integer
Summary:
type: object
additionalProperties: False
allOf:
- $ref: "#/components/schemas/Status"
- $ref: "#/components/schemas/Server"
- $ref: "#/components/schemas/Views"
- $ref: "#/components/schemas/Memory"
- $ref: "#/components/schemas/Sockstats"
- $ref: "#/components/schemas/Traffic"
- $ref: "#/components/schemas/TaskMgr"
```

https://gitlab.isc.org/isc-projects/bind9/-/issues/38
Statistics System Overhaul
Ray Bellis | 2023-07-12 | Not planned

The statistics layer in BIND suffers from poor separation of concerns, with several parts of BIND and its libraries each containing code for rendering statistics in XML, JSON, or in plain text. This leads to inconsistencies in the output, as well as making it harder to extend. The individual modules that generate statistics should be agnostic to the output format.
With some core parts of BIND likely to be moved into individual hooks modules (#15) we need a way to record and access statistics without them having to be allocated space in static built-in structures.
The proposal then is to replace the existing static statistics structures with an _abstract, extensible data store_ represented as a nested tree of key-value objects. BIND components would _register_ counter variables within the tree of objects, and retain fast access to those variables by retaining copies of an _opaque pointer_ for each variable.
The existing XML and JSON renderers would be replaced with generic code that can serialize the tree (or portions thereof) by enumeration without specific knowledge of the keys and values stored therein.
The supported data types would be:
* gauge
* counter
* timestamp
* text label
with additional structural types to create the tree of:
* object
* array
Wherever possible values should be accessed using atomic operations. Each variable should have a name that would be used in the XML renderer (or an implicit index for array elements). Each variable should have its associated description attached, although in cases where many variables share the same meaning (i.e. multiple entries in an array) we should be careful to avoid memory bloat.
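A hypothetical sketch of that registration pattern (all names are invented for illustration, and a flat array stands in for the proposed key-value tree): a component registers a counter once, retains the returned opaque pointer, and updates it with a relaxed atomic add, so the hot path does not depend on where the variable sits in the tree:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical registration sketch (not a proposed API): the registry is a
 * flat array here, but the component only ever touches its opaque handle,
 * so the real store could be a nested tree without changing the hot path. */
typedef struct {
    const char *name; /* used by the generic XML/JSON renderer */
    atomic_uint_fast64_t value;
} stat_counter_t;

static stat_counter_t registry[64];
static int nregistered;

static stat_counter_t *stats_register(const char *name) {
    stat_counter_t *c = &registry[nregistered++];
    c->name = name;
    atomic_init(&c->value, 0);
    return c; /* opaque handle retained by the registering component */
}

static void stats_increment(stat_counter_t *c) {
    atomic_fetch_add_explicit(&c->value, 1, memory_order_relaxed);
}
```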
Access to individual variables needs to be an O(1) operation for performance. Enumeration of the tree should be O(total size of tree). If variables are to be dynamically inserted or deleted, this should be O(log n) or better.

https://gitlab.isc.org/isc-projects/bind9/-/issues/598
Wishlist: statistics for DNS-over-TCP and TLS
Tony Finch | 2024-02-29 | Milestone: BIND 9.19.x | Assignee: Aydın Mercan

A couple of suggestions:
1. For DNS-over-TLS using a proxy, it would be nice to have separate statistics counters for queries that came from the proxy. When the TLS proxy is running on the same server, it would be enough to have separate counters when the client address is in the interface list that BIND keeps track of. Is this generally useful enough to be worthwhile?
2. For DNS-over-TCP (and by implication, DNS-over-TLS) it would be helpful to have some guide to setting TCP idle timeouts. Two things would help:
* include the connection age in the query log - useful for later analysis, but no good if query logging needs to be left off
* keep an overall histogram of connection age - I don't know of any smaller summary statistics that would be useful, because the distribution of queries is very skewed

https://gitlab.isc.org/isc-projects/bind9/-/issues/3455
Zone statistics, esp. for tracking transfers, update status
Greg Choules | 2023-03-27 | Not planned

It is important to know how your server is performing! Complex and large authoritative server setups - many secondaries, zones, and data records - have a lot of plates spinning, or balls in the air (tasks running), and understanding how each of these tasks is behaving will help operators to know what is 'normal' and what is not.
This issue is to request additional counters in BIND, to provide metrics on various objects and activities. Here are some initial thoughts on what those might be:
1. Total current no. of zones. This is the sum of numbers of zones of all types (see below).
2. Current no. of primary zones, added by zone statements or dynamically using 'addzone'.
3. Current no. of secondary zones.
4. Current no. of forward zones.
5. Current no. of stub zones.
6. Current no. of static-stub zones.
7. Current no. of catalog zones (catz).
8. Current no. of RPZ zones.
9. No. of SOA queries currently in progress - similar in concept to the current no. of recursive clients (this might exist already?).
10. Histogram of times to resolve SOA zone refresh queries - similar to the query RTT buckets in `named.stats`.
- 0-0.999ms
- 1-4.999ms
- 5-9.999ms
- 10-19.999ms
- 20-49.999ms
- 50-99.999ms
- 100-499.999ms
- 500ms-infinity
11. No. of AXFRs currently in progress
12. No. of IXFRs currently in progress
13. Histogram of times XFRs have been in progress (the bucket sizes are a guess, not based on empirical data)
- 0-0.999s
- 1-4.999s
- 5-9.999s
- 10-19.999s
- 20-49.999s
- 50-99.999s
- 100-499.999s
- 500s-infinity

https://gitlab.isc.org/isc-projects/bind9/-/issues/4104
ZoneQuota stats counter is not counting everything
Ondřej Surý | 2024-02-24 | Milestone: May 2024 (9.18.27, 9.18.27-S1, 9.19.24) | Assignee: Ondřej Surý

The `ZoneQuota` counter should log all the hits to `fcount_incr()` returning `ISC_R_QUOTA`, but it does so only in a single place. The counting should be moved to `fctx_incr()`.