BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues
2020-08-26T21:24:34Z

https://gitlab.isc.org/isc-projects/bind9/-/issues/1074
underflow in stats channel stale cached RRSIG gauge [ISC-support #14769]
2020-08-26T21:24:34Z
Brian Conry

A customer running a Supported Preview build has started seeing:
```xml
<rrset><name>#NS</name>
<counter>18446744073709551615</counter></rrset>
<rrset><name>#RRSIG</name>
<counter>18446744073709551615</counter></rrset>
</cache>
```
The ```<rrset>``` elements are part of the block that reports the current number of records in the cache by RRTYPE, and the ```#``` prefix on the RRTYPE name indicates that these are "stale" records. In the reported version, these counters are supposed to track the number of records that could be used by the ```serve-stale``` feature (if enabled).
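The reported value, 18446744073709551615, is exactly 2^64 - 1, which is what an unsigned 64-bit counter holds after being decremented once from zero. The following is a minimal sketch of that wraparound. The names `gauge_t`, `gauge_increment` and `gauge_decrement` are hypothetical and are not BIND's statistics API.

```c
#include <stdint.h>

/* Hypothetical stand-in for a statistics gauge; not BIND's counter type. */
typedef struct {
	uint64_t value;
} gauge_t;

/* Add one record to the gauge. */
void
gauge_increment(gauge_t *g) {
	g->value++;
}

/*
 * Remove one record from the gauge. Unsigned arithmetic is modular, so
 * decrementing a gauge that is already 0 wraps around to UINT64_MAX
 * (18446744073709551615) instead of going negative.
 */
void
gauge_decrement(gauge_t *g) {
	g->value--;
}
```

A gauge that is decremented without a matching earlier increment therefore shows up in the statistics output as the largest possible 64-bit value rather than as a negative number.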
An underflow on the stat seems the most likely explanation.
BIND 9.15.3
Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/602
Serve-stale implementation broke cache database RRset statistics
2020-08-26T21:24:35Z
Michał Kępień

Serve-stale implementation (see commit df50751585b64f72d93ad665abf0f485c8941a3b, [RT #44790][1]) changed the definition of a "stale" rdataset header:
* in BIND 9.11 and earlier, a "stale" rdataset header is one that is more than `RBTDB_VIRTUAL` (a hard-coded value of 300) seconds past its TTL-based expiry time and thus is eligible for opportunistic cleanup,
* in BIND 9.12 and later, rdataset headers of the above type are known as "ancient"; meanwhile, a "stale" rdataset header is one which is past its TTL-based expiry time but *not* yet eligible for cleanup; a "stale" header becomes an "ancient" header once `max-stale-ttl + RBTDB_VIRTUAL` seconds pass since its TTL-based expiry time.
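The BIND 9.12 definitions above can be sketched as a small classifier. This is an illustration only, not code from BIND: `header_state_t` and `classify_header` are hypothetical names, timestamps are assumed to be plain seconds, and the exact boundary handling (`<=` versus `<`) is an assumption, since the description does not pin it down.

```c
#include <stdint.h>

/* Hard-coded value of RBTDB_VIRTUAL as cited in the issue text. */
#define RBTDB_VIRTUAL 300

/* Hypothetical state names following the BIND 9.12 nomenclature. */
typedef enum {
	HEADER_ACTIVE,	/* not yet past its TTL-based expiry time */
	HEADER_STALE,	/* expired, but not yet eligible for cleanup */
	HEADER_ANCIENT	/* expired and eligible for cleanup */
} header_state_t;

/*
 * Classify an rdataset header given the current time, its TTL-based
 * expiry time, and the configured max-stale-ttl (all in seconds).
 */
header_state_t
classify_header(uint64_t now, uint64_t expiry, uint64_t max_stale_ttl) {
	if (now <= expiry) {
		return HEADER_ACTIVE;
	}
	/* Assumed boundary: "ancient" once the full window has elapsed. */
	if (now - expiry >= max_stale_ttl + RBTDB_VIRTUAL) {
		return HEADER_ANCIENT;
	}
	return HEADER_STALE;
}
```

Under the BIND 9.11 definition, the same header would have been called "stale" only in the state this sketch labels `HEADER_ANCIENT`, computed without the `max-stale-ttl` term.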
The BIND statistics system has been able to track "stale" rdataset headers (according to the BIND 9.11 definition) since commit 80fa3ef8517ff046a72c4cb1e785f30c9ef9ee75 (see [RT #29514][2]). In the outputs of various statistics channels, such RRsets were prefixed with a `#` character.
Serve-stale implementation in BIND 9.12 introduced the following problems:
* As a corollary of the definition change described above, RRsets previously presented using the `#` prefix are no longer the same type of RRsets as before.
* The arrays holding statistics counters were not adjusted to account for the definition change, which means BIND stopped keeping track of "ancient" rdataset header counts altogether (such headers are processed and freed correctly but they do not affect statistics).
* Functions responsible for dumping statistics to various channels were not extended as no new `DNS_RDATASTATSTYPE_ATTR_*` macro was added, etc.
* Certain code comments were not updated to match the new nomenclature, which is guaranteed to cause quite a bit of confusion for new BIND developers.
* The rule for updating counters when an rdataset header is marked as "stale" (according to the BIND 9.11 definition; the rule is: first decrement the number of "non-stale" records of given type, then increment the number of "stale" records of given type) no longer holds up: depending on query patterns, an rdataset header may either go through the entire "active" → "stale" → "ancient" cycle or go straight from "active" to "ancient". In other words, current rdataset header state needs to be consulted in order to determine which counters should be touched.
* No statistics counters are updated when an rdataset header is marked as "stale" (according to the BIND 9.12 definition).
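One way to make counter updates depend on the current header state, as the list above calls for, is sketched below. This is not the fix that was applied in BIND. The types `rrset_stats_t` and `header_t` and the function `header_transition` are hypothetical, and a single RRTYPE is assumed for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical header states following the BIND 9.12 nomenclature. */
typedef enum {
	HEADER_ACTIVE = 0,
	HEADER_STALE,
	HEADER_ANCIENT,
	HEADER_NSTATES
} header_state_t;

/* Hypothetical per-RRTYPE gauges, one per header state. */
typedef struct {
	uint64_t count[HEADER_NSTATES];
} rrset_stats_t;

/* Hypothetical header that remembers which gauge currently counts it. */
typedef struct {
	header_state_t state;
} header_t;

/*
 * Move a header to a new state, decrementing the gauge for the state it
 * is actually in and incrementing the gauge for the new state. Returns
 * false, changing nothing, if the decrement would underflow.
 */
bool
header_transition(rrset_stats_t *stats, header_t *h, header_state_t next) {
	if (h->state == next) {
		return true; /* nothing to update */
	}
	if (stats->count[h->state] == 0) {
		return false; /* gauge out of sync; refuse to wrap */
	}
	stats->count[h->state]--;
	stats->count[next]++;
	h->state = next;
	return true;
}
```

With this shape, a header that goes straight from "active" to "ancient" touches only those two gauges, while a header that passes through "stale" first is removed from the "stale" gauge when it becomes "ancient".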
Given all of the above issues and the fact that the default value of `max-stale-ttl` is non-zero, I would dare to say that cache database RRset statistics are more or less useless in BIND 9.12 and later.
[1]: https://bugs.isc.org/Ticket/Display.html?id=44790
[2]: https://bugs.isc.org/Ticket/Display.html?id=29514

BIND 9.15.3