QNAME minimization may trigger "premature" priming queries
I do not believe this is a serious issue; maybe it can even be closed as
-EWORKSASDESIGNED, but perhaps there is room for improvement here. I mostly
wanted to document what is happening because figuring it out required me to
run rr for 10 days straight :-)
With QNAME minimization enabled, a named resolver might send priming
queries before the last cached ./NS response expires. The "root cause"
here is that the glue TTL for the root-servers.net. zone differs between
the root servers (3600000) and the servers authoritative for net. (172800).
This can be demonstrated using the following commands:
    $ dig @a.root-servers.net root-servers.net. NS +norec
    $ dig @a.gtld-servers.net root-servers.net. NS +norec +multi
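To compare the two responses side by side, the A-record TTLs can be pulled out of the dig output. This is only a minimal sketch: the helper name glue_ttls is made up for illustration, and it assumes dig's default presentation format (owner, TTL, class, type, rdata on one line).

```shell
#!/bin/sh
# Hypothetical helper: reads dig output on stdin and prints "owner TTL"
# for every A record. Assumes dig's default one-record-per-line format.
# Comment lines (";; ...") and other record types are skipped because
# their 3rd and 4th fields are not "IN" and "A".
glue_ttls() {
    awk '$3 == "IN" && $4 == "A" { print $1, $2 }'
}

# Usage (needs network access, so it is left commented out):
#   dig @a.root-servers.net root-servers.net. NS +norec | glue_ttls
#   dig @a.gtld-servers.net root-servers.net. NS +norec | glue_ttls
```

Going by the TTLs quoted above, the first command should report 3600000 for each address and the second 172800.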
Since the root-servers.net. zone is unsigned, the RFC 2181 trust level
of the RRsets found in the ADDITIONAL section of the referrals returned
by the net. authoritative servers is higher than that of the RRsets
returned in the ADDITIONAL section of unsigned authoritative responses
generated by the root servers (which are authoritative for
root-servers.net.). This results in the 172800 TTL for these RRsets
replacing the higher one received from the root servers.
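The replacement rule described above can be modelled roughly as follows. This is a deliberately simplified sketch, not named's actual cache logic: the function names and the numeric ranks are invented for illustration, and only the relative order of the two ranks (referral glue above ADDITIONAL data from an authoritative answer) is taken from the description above.

```shell
#!/bin/sh
# Hypothetical ranks: only their relative order matters in this sketch.
rank_of() {
    case "$1" in
        additional) echo 1 ;;  # ADDITIONAL section of an authoritative answer
        glue)       echo 2 ;;  # ADDITIONAL section of a referral
        *)          echo 0 ;;
    esac
}

# cached_ttl_after CACHED_SRC CACHED_TTL NEW_SRC NEW_TTL
# Prints the TTL left in the (toy) cache: new data replaces cached data
# only when its rank is not lower than the cached rank.
cached_ttl_after() {
    if [ "$(rank_of "$3")" -ge "$(rank_of "$1")" ]; then
        echo "$4"
    else
        echo "$2"
    fi
}

# Root-server data is cached first, then the net. referral arrives:
cached_ttl_after additional 3600000 glue 172800
```

In this toy model the lower TTL wins regardless of arrival order, because the referral glue always outranks the ADDITIONAL data from the root servers.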
To reproduce the issue:
1. Start a named resolver with QNAME minimization enabled and preferably
   max-stale-ttl 0; set, so that the DB dump from the next step is easier
   to interpret.
2. Run rndc dumpdb -cache, then run:

    grep -F root-servers.net named_dump.db
The result should contain lines like:
    a.root-servers.net. 518399 A 18.104.22.168
    b.root-servers.net. 518399 A 22.214.171.124
    c.root-servers.net. 518399 A 126.96.36.199
    d.root-servers.net. 518399 A 188.8.131.52
    e.root-servers.net. 518399 A 184.108.40.206
    f.root-servers.net. 518399 A 220.127.116.11
    g.root-servers.net. 518399 A 18.104.22.168
    h.root-servers.net. 518399 A 22.214.171.124
    i.root-servers.net. 518399 A 126.96.36.199
    j.root-servers.net. 518399 A 188.8.131.52
    k.root-servers.net. 518399 A 184.108.40.206
    l.root-servers.net. 518399 A 220.127.116.11
    m.root-servers.net. 518399 A 18.104.22.168
These are cache entries created by the resolver priming query sent during startup. The TTL is about 1 week, which is consistent with
3. Query the resolver for root-servers.net/NS. This will cause the
   resolver to first ask the root servers about _.net., which will allow
   it to learn about the servers authoritative for the net. zone. These
   will subsequently be queried for _.root-servers.net., triggering a
   referral with TTL=172800.
4. Repeat step 2. The relevant lines should now turn into something like:

    a.root-servers.net. 172799 A 22.214.171.124
    b.root-servers.net. 172799 A 126.96.36.199
    c.root-servers.net. 172799 A 188.8.131.52
    d.root-servers.net. 172799 A 184.108.40.206
    e.root-servers.net. 172799 A 220.127.116.11
    f.root-servers.net. 172799 A 18.104.22.168
    g.root-servers.net. 172799 A 22.214.171.124
    h.root-servers.net. 172799 A 126.96.36.199
    i.root-servers.net. 172799 A 188.8.131.52
    j.root-servers.net. 172799 A 184.108.40.206
    k.root-servers.net. 172799 A 220.127.116.11
    l.root-servers.net. 172799 A 18.104.22.168
    m.root-servers.net. 172799 A 22.214.171.124
With the TTL lowered, this resolver will now send the next priming query in about 2 days instead of 6 days.
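To check how soon the cached root server addresses expire, the grep output from step 2 can be reduced to its lowest TTL expressed in days. This is a rough sketch: min_ttl_days is a made-up helper name, and it assumes the four-field line layout shown in the dump excerpts above (owner, TTL, type, address), which may differ between BIND versions.

```shell
#!/bin/sh
# Hypothetical helper: reads "owner TTL A address" lines on stdin and
# prints the smallest TTL among the A records, converted to days
# (86400 seconds per day), with one decimal place.
min_ttl_days() {
    awk '$3 == "A" {
             if (!seen || $2 + 0 < min) { min = $2 + 0; seen = 1 }
         }
         END { if (seen) printf "%.1f\n", min / 86400 }'
}

# Usage, after running "rndc dumpdb -cache":
#   grep -F root-servers.net named_dump.db | min_ttl_days
```

A TTL of 518399 seconds comes out as 6.0 days and 172799 seconds as 2.0 days, matching the intervals mentioned above.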
This scenario does not happen without QNAME minimization, in which case
the resolver would start processing the root-servers.net/NS query by
immediately querying the root servers - and since the root servers are
authoritative for root-servers.net., the servers authoritative for
net. would not get a chance to send any TTL=172800 referrals.
I found this purely because I was curious why a
named resolver whose
cache size limit never gets exceeded would sometimes send a priming
query earlier than 6 days after the last priming query. This "problem"
with the interaction of QNAME minimization with root servers is not
nearly as grave as the other one described in #1896.
I think triggering this requires a client query for something in the
root-servers.net. zone. The issue manifests itself two days later :-)