Opened Jul 06, 2020 by Michał Kępień (@michal, Maintainer)

QNAME minimization may trigger "premature" priming queries

I do not believe this is a serious issue, maybe it can even be closed with -EWORKSASDESIGNED, but perhaps there is room for improvement here. I mostly wanted to document what is happening because figuring it out required me to run rr for 10 days straight :-)

With QNAME minimization enabled, a named resolver might send priming queries before the last cached ./NS response expires. The "root cause" here is that the glue TTL for the root-servers.net. zone differs between the root servers (3600000) and the servers authoritative for net. (172800). This can be demonstrated using the following dig invocations:

$ dig @a.root-servers.net root-servers.net. NS +norec
$ dig @a.gtld-servers.net root-servers.net. NS +norec +multi
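
To compare the two responses side by side, it can help to reduce each one to owner name, TTL and type. The helper below is only a sketch: `glue_ttls` is a hypothetical name, and it assumes dig's default presentation format (owner, TTL, class, type, rdata) on standard input.

```shell
# Hypothetical helper: reduce dig output read from stdin to "owner TTL type"
# for A/AAAA records only. Assumes dig's default presentation format:
#   owner TTL class type rdata
# Comment lines (starting with ";") and other record types are skipped.
glue_ttls() {
    awk '$3 == "IN" && ($4 == "A" || $4 == "AAAA") { print $1, $2, $4 }'
}
```

It could be used as follows (`+noall +additional` limits dig's output to the ADDITIONAL section):

    $ dig @a.root-servers.net root-servers.net. NS +norec +noall +additional | glue_ttls
    $ dig @a.gtld-servers.net root-servers.net. NS +norec +noall +additional | glue_ttls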

Since the root-servers.net. zone is unsigned, the RFC 2181 trust level of the RRsets found in the ADDITIONAL section of the referrals returned by net. authoritative servers is higher than that of the RRsets returned in the ADDITIONAL section of unsigned authoritative responses generated by the root servers (which are authoritative for root-servers.net.). As a result, the 172800 TTL for the [a-m].root-servers.net A and AAAA RRsets overrides the higher one received from the root servers.

To reproduce the issue:

  1. Start a named resolver with QNAME minimization enabled and preferably max-stale-ttl 0; set, so that the DB dump from the next step is easier to interpret.

  2. Do an rndc dumpdb -cache, then run:

    grep -F root-servers.net named_dump.db

    The result should contain lines like:

    a.root-servers.net.	518399	A	198.41.0.4
    b.root-servers.net.	518399	A	199.9.14.201
    c.root-servers.net.	518399	A	192.33.4.12
    d.root-servers.net.	518399	A	199.7.91.13
    e.root-servers.net.	518399	A	192.203.230.10
    f.root-servers.net.	518399	A	192.5.5.241
    g.root-servers.net.	518399	A	192.112.36.4
    h.root-servers.net.	518399	A	198.97.190.53
    i.root-servers.net.	518399	A	192.36.148.17
    j.root-servers.net.	518399	A	192.58.128.30
    k.root-servers.net.	518399	A	193.0.14.129
    l.root-servers.net.	518399	A	199.7.83.42
    m.root-servers.net.	518399	A	202.12.27.33

    These are cache entries created by the resolver priming query sent during startup. The TTL is just under 6 days (518400 seconds minus the time elapsed since priming), which is below the default max-cache-ttl of 1 week, so it is not being capped.

  3. Query the resolver for root-servers.net/NS. This will cause the resolver to first ask the root servers about _.net., which will allow it to learn about the servers authoritative for net.. These will subsequently be queried for _.root-servers.net., triggering a referral whose glue records carry TTL=172800.

  4. Repeat step 2. The relevant lines should now turn into something like:

    a.root-servers.net.	172799	A	198.41.0.4
    b.root-servers.net.	172799	A	199.9.14.201
    c.root-servers.net.	172799	A	192.33.4.12
    d.root-servers.net.	172799	A	199.7.91.13
    e.root-servers.net.	172799	A	192.203.230.10
    f.root-servers.net.	172799	A	192.5.5.241
    g.root-servers.net.	172799	A	192.112.36.4
    h.root-servers.net.	172799	A	198.97.190.53
    i.root-servers.net.	172799	A	192.36.148.17
    j.root-servers.net.	172799	A	192.58.128.30
    k.root-servers.net.	172799	A	193.0.14.129
    l.root-servers.net.	172799	A	199.7.83.42
    m.root-servers.net.	172799	A	202.12.27.33

With the TTL lowered, this resolver will now send the next priming query in about 2 days instead of 6 days.
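
The before/after comparison from steps 2 and 4 can be reduced to a single number. The sketch below assumes cache dump lines shaped like the excerpts above (owner, TTL, type, rdata) and, following the reasoning in this report, treats the lowest root server address TTL as the time left until the next priming query. `root_glue_min_ttl` is a hypothetical helper name.

```shell
# Hypothetical helper: read cache dump lines ("owner TTL type rdata", as in
# the excerpts above) from stdin and print the lowest TTL found among the
# [a-m].root-servers.net address records, in seconds and in days.
root_glue_min_ttl() {
    awk '
        $1 ~ /^[a-m]\.root-servers\.net\.$/ && ($3 == "A" || $3 == "AAAA") {
            if (min == "" || $2 + 0 < min) min = $2 + 0
        }
        END {
            if (min == "") exit 1          # no matching records found
            printf "%d seconds (%.1f days)\n", min, min / 86400
        }
    '
}
```

After `rndc dumpdb -cache`, running `root_glue_min_ttl < named_dump.db` (adjust the path to wherever the dump file is written) prints one summary line per dump.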

This scenario does not happen without QNAME minimization, in which case the resolver would start processing the root-servers.net/NS query by immediately querying the root servers - and since the root servers are authoritative for root-servers.net., the servers authoritative for net. would not get a chance to send any TTL=172800 referrals.
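
To make the difference concrete, the intermediate query names mentioned in step 3 can be listed with a small helper. This is a simplified illustration of the label-by-label walk, not BIND's actual algorithm; `qmin_names` is a hypothetical name, and the `_.` prefix is taken from the query names quoted in this report.

```shell
# Simplified illustration, not BIND's actual algorithm: for a fully
# qualified name, print the "_."-prefixed names queried while walking
# down from the TLD to the name itself, one per line.
qmin_names() {
    printf '%s\n' "${1%.}" | awk -F. '{
        suffix = ""
        for (i = NF; i >= 1; i--) {
            suffix = $i "." suffix
            print "_." suffix
        }
    }'
}
```

For `qmin_names root-servers.net.` this lists `_.net.` (sent to the root servers) followed by `_.root-servers.net.` (sent to the net. servers, which answer with the TTL=172800 referral). Without QNAME minimization neither intermediate query is sent.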

I found this purely because I was curious why a named resolver whose cache size limit never gets exceeded would sometimes send a priming query earlier than 6 days after the last priming query. This "problem" with the interaction of QNAME minimization with root servers is not nearly as grave as the other one described in #1896.

I think triggering this requires a client query for something in the root-servers.net. zone. The issue manifests itself two days later :-)

Edited Jul 06, 2020 by Michał Kępień
Milestone: BIND 9.17 Backburner
Reference: isc-projects/bind9#2007