Mitigation for cache bloat due to negative RRsets (NXDOMAIN and NXRRSET) when preserving expired RRsets using max-stale-ttl
I'm tagging this as 'Customer' because it is affecting/has affected several customer caches (unanticipated cache bloat) with 'stale-cache-enable yes;'
By design, negative cache content is not used in the same way as positive cache content. 'stale-answer-client-timeout' does not apply to negative content, thus the clients will timeout and likely retry the same query several times before giving up, long before the resolver-query-timeout. Only when named has failed to get a response from the authoritative server(s) does the 'stale-refresh-time' window commence.
There is a good reason for this - when querying the authoritative servers for a name that is already NXDOMAIN or NXRRSET in cache, we don't know if this is going to be replaced with positive content or not. Therefore, to ensure that positive content is preferred, we have to wait to be sure that it's sensible to serve the stale negative content as a last resort.
Most negative content is single-use - so adding max-stale-ttl to its retention period can significantly increase cache bloat - for no useful reason whatsoever.
Most negative content has (also by design of sane zone administrators, or overridden by max-ncache-ttl) a much shorter TTL in cache than positive content.
So when we're not preserving stale content, the negative content is fairly quickly removed from cache, in comparison with the positive RRsets. But adding max-stale-ttl to the retention period can quite significantly tilt the balance in favour of all of this one-use-only cached negative content.
We don't have any statistics that measure cache hits for different RTYPEs, so we don't know for sure, what percentage of negative content is used again (cache hits) versus just sitting there for no good reason (cache misses) - perhaps we should? (Perhaps we should also have stats on stale hits and misses by RTYPE?)
But in any case, having seen too many caches overloaded with stale negative content, I should like to propose an option that can be used either to shorten the max-stale-ttl for negative content, but that also takes a value of zero, to disable its retention entirely.