Closed
Open
Opened Nov 07, 2018 by Cathy Almond (@cathya, Developer)

fetches-per-server quota is lower-bounded to 1 instead of to 2% of quota

On a server with "fetches-per-server 4000;" I was surprised to see a cache dump with the ADB values for a server showing me a quota set to 1.

; problem-server.example.com [v4 TTL 2658] [v4 not_found] [v6 unexpected]
;	192.0.2.25 [srtt 948570] [flags 00004000] [ttl -342230] [atr 0.62] [quota 1]

Although we didn't document the lower bound in the ARM (this also needs to be addressed), the KB article (https://kb.isc.org/docs/aa-01304) explaining how fetchlimits work, based on information from Engineering, describes the adjustment algorithm thus:

The fetches-per-server option sets a hard upper limit to the number of outstanding fetches allowed for a single server. The lower limit is 2% of fetches-per-server, but never below 1. It also allows you to select what to do with the queries that are being limited - either drop them, or send back a SERVFAIL response.

Clearly, however, this is not what the code does. In adb.c, the last thing maybe_adjust_quota() does is:

	/* Ensure we don't drop to zero */
	if (addr->entry->quota == 0)
		addr->entry->quota = 1;
}
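For comparison, here is a minimal sketch of the lower bound that the KB article describes (2% of fetches-per-server, but never below 1). The helpers quota_floor() and clamp_quota() are hypothetical names used for illustration only; they are not taken from adb.c, and rounding the 2% down is an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not BIND code: the lower bound described in the
 * KB article, i.e. 2% of the configured fetches-per-server value, but
 * never below 1. Integer division rounds down (an assumption).
 */
uint32_t
quota_floor(uint32_t fetches_per_server) {
	uint32_t lower = fetches_per_server / 50; /* 2% of the limit */
	return (lower == 0 ? 1 : lower);
}

/* Clamp an adjusted quota to the documented lower bound instead of to 1. */
uint32_t
clamp_quota(uint32_t quota, uint32_t fetches_per_server) {
	uint32_t lower = quota_floor(fetches_per_server);
	return (quota < lower ? lower : quota);
}
```

With "fetches-per-server 4000;", quota_floor(4000) is 80, so clamp_quota(1, 4000) would give 80 rather than the quota of 1 seen in the cache dump.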

The background to this, although very much a corner case, is a misconfigured server that responds to A queries but sends back nothing for AAAA queries for the same name (so those fetches time out). This interacts particularly badly with fetches-per-server because the 'good' queries all get answers that are cached, whereas the 'bad' ones all time out, return SERVFAIL to the client, and are not cached.

Turning on servfail cache would mitigate that to some extent.

Nevertheless, the quota going all the way down to 1 (instead of to 80, which is 2% of 4000) is making matters much worse.

Please fix. Although this corner case is not our problem as such (the misbehaving server is being fixed), it is bad that the quota goes so low that too few queries are processed to recalculate the atr often enough for it to be reasonably representative of the query rate to this server.

There was also no evidence of the low quota in the logging, presumably because the quota had been at rock bottom for longer than the logfile sample I looked at, which covered several hours. It therefore needed a cache dump to identify the problem.

(P.S. I'm assuming it's inefficient to calculate "2% of quota" every time we pass this way, so the lower limit probably wants calculating once, up front, for use here. It might well be something we want to store in the adb on a per-server basis too, in anticipation of future work on fetches-per-server to allow for server-specific quota overrides.)
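One way the P.S. could look, as a rough sketch only: compute the lower bound once when the limit is set, and keep it alongside the current quota for each server. The struct server_quota and both functions are hypothetical and do not correspond to existing adb structures; the sketch also assumes the limit is greater than zero.

```c
#include <stdint.h>

/* Hypothetical per-server quota state; field names are illustrative only. */
struct server_quota {
	uint32_t limit; /* upper bound: fetches-per-server or an override */
	uint32_t lower; /* cached lower bound, computed once */
	uint32_t quota; /* current, dynamically adjusted quota */
};

/* Compute the lower bound once, when the limit is set or changed. */
void
server_quota_init(struct server_quota *sq, uint32_t limit) {
	sq->limit = limit;
	sq->lower = limit / 50; /* 2% of the limit, rounded down */
	if (sq->lower == 0) {
		sq->lower = 1; /* never below 1 */
	}
	sq->quota = limit;
}

/* Apply a newly computed quota, clamped to the range [lower, limit]. */
void
server_quota_set(struct server_quota *sq, uint32_t newquota) {
	if (newquota < sq->lower) {
		newquota = sq->lower;
	} else if (newquota > sq->limit) {
		newquota = sq->limit;
	}
	sq->quota = newquota;
}
```

Keeping the bound per server means a later server-specific override would only need to call server_quota_init() again with the new limit; the adjustment path itself would not change.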

Reference: https://support.isc.org/Ticket/Display.html?id=13720

Milestone: November 2019 (9.11.13, 9.14.8, 9.15.6)
Reference: isc-projects/bind9#664