'forward first' configuration can forward much more often than just 'first'
Summary
With 'forward first' configured, named sometimes sends the same question to the forwarder(s) again during resolution, rather than asking them only 'first'.
Steps to reproduce
This requires a non-responding forwarder (e.g. silent failure / timeout). Any other failure mode that triggers fallback to iterative recursion may behave the same way, but this is untested.
Configure the server for 'forward first'.
With an empty cache, query for a name that requires named to follow several steps to resolve, e.g. a chain of delegations or of CNAMEs.
www.amazon.co.uk/A is a good example.
What is the current bug behavior?
named (correctly) first sends the original question to the forwarder(s), then asks one of the root servers (using hints to find the address).
After receiving the referral (e.g. to the servers for 'uk'), and before sending the original query to any of the referred-to servers, named will send another copy of the original question to all of the forwarders. This is believed to be an error.
At each new referral (e.g. from 'uk' to 'co.uk' and from 'co.uk' to 'amazon.co.uk') named will send yet another copy of the original question to all of the forwarders before asking any of the new referrals. This is again believed to be an error.
What is the expected correct behavior?
The expected behaviour is that, in the process of resolving a query, in a 'forward first' configuration named will give each forwarder one, and only one, chance to provide an answer to each distinct query before falling back to recursion.
Learning the list of servers that the root zone delegates 'uk' to should not be sufficient cause to expect that the forwarders may have a better answer to the same question they failed to answer previously.
It does make sense for the forwarders to be given a chance to answer new questions (such as resolving the addresses of the delegated-to servers), but we don't need to be sending them the same questions repeatedly. The option is 'forward first', not 'forward first-and-alternating-thereafter'.
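To make the difference concrete, here is a toy model of the two query sequences for the example name. It is only a sketch of the behaviour described in this report, not named's resolver code: the function and variable names are hypothetical, and it deliberately ignores CNAME chasing, nameserver address lookups, retries and timeouts.

```python
# Toy model of the outgoing query order described in this report.
# Not named's implementation; all names here are hypothetical.

FORWARDERS = ["8.8.8.8", "8.8.4.4"]
# Simplified delegation chain for the example name (CNAMEs ignored).
ZONES = [".", "uk", "co.uk", "amazon.co.uk"]
QUESTION = ("www.amazon.co.uk", "A")


def observed_sequence(question):
    """Forwarders are asked the same question again before each new zone."""
    sent = []
    for zone in ZONES:
        sent += [("forwarder", fwd, question) for fwd in FORWARDERS]
        sent.append(("authoritative", zone, question))
    return sent


def expected_sequence(question):
    """Each forwarder gets one chance, then plain iterative resolution."""
    sent = [("forwarder", fwd, question) for fwd in FORWARDERS]
    sent += [("authoritative", zone, question) for zone in ZONES]
    return sent


def forwarder_queries(sequence):
    """Count how many queries in the sequence went to a forwarder."""
    return sum(1 for kind, _, _ in sequence if kind == "forwarder")


if __name__ == "__main__":
    print("observed:", forwarder_queries(observed_sequence(QUESTION)))
    print("expected:", forwarder_queries(expected_sequence(QUESTION)))
```

Under these simplifying assumptions the model sends the original question to the forwarders 8 times in the observed case and 2 times in the expected case. The real counts will depend on the actual delegation and CNAME chain.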
Relevant logs and/or screenshots
options {
    forward first;
    forwarders {
        8.8.8.8;
        8.8.4.4;
    };
};
This configuration is sufficient to demonstrate the problem, as long as both 8.8.8.8 and 8.8.4.4 are null routed (e.g. static routes exist that forward the packets to a host that will drop them silently).
I have an rr recording of this that can be examined. Will be attached once the issue number is known.
Notes
This issue was observed as part of a customer-reported issue, but only contributed indirectly to the customer issue.