
[v9_16] Fix servestale fetchlimits crash

Michal Nowak requested to merge 2565-servestale-fetchlimits-crash-v9_16 into v9_16

When we query the resolver for a domain name in a zone for which one or more fetches are already outstanding, we could potentially hit the fetch limits. If so, recursion fails immediately for the incoming query and, if serve-stale is enabled, we may try to return a stale answer.

If the resolver is also authoritative for the parent zone (for example, the root zone), a delegation is found first, but we still check the cache for a better response.

Nothing is found in the cache, so we try to recurse to find the answer to the query.

Because of the fetch limits, 'dns_resolver_createfetch()' returns an error, which 'ns_query_recurse()' propagates to the caller, 'query_delegation_recurse()'.

Because serve-stale is enabled, 'query_usestale()' is called, setting 'qctx->db' to the cache db, but leaving 'qctx->version' untouched. Now 'query_lookup()' is called to search for stale data in the cache database with a non-NULL 'qctx->version' (which is set to a zone db version), and thus we hit an assertion in rbtdb.
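
The mismatch described above can be modelled in a few lines of C. This is only a sketch with hypothetical stand-in types and helper names ('db_t', 'dbversion_t', 'query_ctx_t', 'lookup_is_valid()', 'usestale_buggy()', 'usestale_fixed()', 'stale_lookup_ok()'); none of them are the real BIND definitions. Clearing the version when switching to the cache db is an assumed remedy shown for illustration, not a quotation of this merge request's diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real BIND types; not the actual definitions. */
typedef struct db {
	bool is_cache;
} db_t;

typedef struct dbversion {
	const db_t *owner; /* the zone db this version belongs to */
} dbversion_t;

typedef struct query_ctx {
	db_t *db;
	dbversion_t *version;
} query_ctx_t;

/*
 * Models the invariant the database layer asserts on: a cache lookup
 * must not carry a version, and a zone lookup must use a version that
 * belongs to the zone db being searched.
 */
static bool
lookup_is_valid(const query_ctx_t *qctx) {
	if (qctx->db->is_cache) {
		return (qctx->version == NULL);
	}
	return (qctx->version != NULL && qctx->version->owner == qctx->db);
}

/* Behaviour described above: the db is switched, the version is left alone. */
static void
usestale_buggy(query_ctx_t *qctx, db_t *cachedb) {
	qctx->db = cachedb;
}

/* Assumed remedy: drop the zone version together with the zone db. */
static void
usestale_fixed(query_ctx_t *qctx, db_t *cachedb) {
	qctx->db = cachedb;
	qctx->version = NULL;
}

/*
 * Starts from a zone lookup (delegation found in an authoritative zone),
 * falls back to the cache for stale data, and reports whether the
 * resulting lookup would satisfy the invariant.
 */
static bool
stale_lookup_ok(bool fixed) {
	db_t zonedb = { .is_cache = false };
	db_t cachedb = { .is_cache = true };
	dbversion_t zonever = { .owner = &zonedb };
	query_ctx_t qctx = { .db = &zonedb, .version = &zonever };

	if (fixed) {
		usestale_fixed(&qctx, &cachedb);
	} else {
		usestale_buggy(&qctx, &cachedb);
	}
	return (lookup_is_valid(&qctx));
}
```

In this model, 'stale_lookup_ok(false)' reports an invalid lookup (the case that trips the assertion), while 'stale_lookup_ok(true)' reports a valid one.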

This crash was introduced in 'v9_16' by commit 2afaff75.

(cherry picked from commit 87591de6)

Closes #2565 (closed)

Edited by Michal Nowak
