BIND 9.18 retries same authoritative nameserver before trying another one
Summary
Given an authoritative zone with multiple nameservers, some of which may currently be unavailable, how should a BIND recursive resolver react when its first query attempt times out?
Empirically, BIND 9.16 will try a different authoritative nameserver next, but BIND 9.18 will try the same one 3 times in a row before moving on to a different one. The BIND 9.18 behavior is significantly less robust.
BIND version used
[root@dmrz-test-rdns ~]# named -V
BIND 9.18.8 (Stable Release) <id:35f5d35>
running on Linux x86_64 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020
built by make with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/opt/isc/isc-bind/root/usr' '--exec-prefix=/opt/isc/isc-bind/root/usr' '--bindir=/opt/isc/isc-bind/root/usr/bin' '--sbindir=/opt/isc/isc-bind/root/usr/sbin' '--sysconfdir=/etc/opt/isc/isc-bind' '--datadir=/opt/isc/isc-bind/root/usr/share' '--includedir=/opt/isc/isc-bind/root/usr/include' '--libdir=/opt/isc/isc-bind/root/usr/lib64' '--libexecdir=/opt/isc/isc-bind/root/usr/libexec' '--localstatedir=/var/opt/isc/isc-bind' '--sharedstatedir=/var/opt/isc/isc-bind/lib' '--mandir=/opt/isc/isc-bind/root/usr/share/man' '--infodir=/opt/isc/isc-bind/root/usr/share/info' '--disable-static' '--enable-dnstap' '--with-pic' '--with-gssapi' '--with-json-c' '--with-libxml2' '--without-lmdb' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' 'LDFLAGS=-Wl,-z,relro -L/opt/isc/isc-bind/root/usr/lib64' 'CPPFLAGS= -I/opt/isc/isc-bind/root/usr/include' 'LT_SYS_LIBRARY_PATH=/usr/lib64' 'PKG_CONFIG_PATH=:/opt/isc/isc-bind/root/usr/lib64/pkgconfig:/opt/isc/isc-bind/root/usr/share/pkgconfig' 'SPHINX_BUILD=/builddir/build/BUILD/bind-9.18.8/sphinx/bin/sphinx-build'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-44)
compiled with OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
linked to OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
compiled with libuv version: 1.44.2
linked to libuv version: 1.44.2
compiled with libnghttp2 version: 1.33.0
linked to libnghttp2 version: 1.33.0
compiled with libxml2 version: 2.9.1
linked to libxml2 version: 20901
compiled with json-c version: 0.11
linked to json-c version: 0.11
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
compiled with protobuf-c version: 1.4.1
linked to protobuf-c version: 1.4.1
threads support is enabled
DNSSEC algorithms: RSASHA1 NSEC3RSASHA1 RSASHA256 RSASHA512 ECDSAP256SHA256 ECDSAP384SHA384
DS algorithms: SHA-1 SHA-256 SHA-384
HMAC algorithms: HMAC-MD5 HMAC-SHA1 HMAC-SHA224 HMAC-SHA256 HMAC-SHA384 HMAC-SHA512
TKEY mode 2 support (Diffie-Hellman): yes
TKEY mode 3 support (GSS-API): yes
default paths:
named configuration: /etc/opt/isc/isc-bind/named.conf
rndc configuration: /etc/opt/isc/isc-bind/rndc.conf
DNSSEC root key: /etc/opt/isc/isc-bind/bind.keys
nsupdate session key: /var/opt/isc/isc-bind/run/named/session.key
named PID file: /var/opt/isc/isc-bind/run/named/named.pid
named lock file: /var/opt/isc/isc-bind/run/named/named.lock
Steps to reproduce
-
Configure an authoritative server with the following authoritative zone, which lists one working nameserver IP (the server itself) plus two others that are "down":
example.com.      30 IN SOA ns1.example.com. dns-admin.illinois.edu. 1 3600 900 1209600 30
example.com.      30 IN NS  ns1.example.com.
example.com.      30 IN NS  ns2.example.com.
example.com.      30 IN NS  ns3.example.com.
foo.example.com.  30 IN TXT "test"
ns1.example.com.  30 IN A   18.191.96.154
ns2.example.com.  30 IN A   192.168.0.20
ns3.example.com.  30 IN A   192.168.0.30
-
Configure a CentOS 7 instance "dmrz-test-rdns" running isc-bind-bind 9.18.8-1.1.el7 from https://copr.fedorainfracloud.org/coprs/isc/bind/ with a stub zone for example.com:
options {
    directory "/var/opt/isc/isc-bind/named/data";
    allow-query { localhost; 10.0.0.0/8; };
    recursion yes;
    dnssec-validation no;
    # enable query logging at startup
    querylog yes;
};

logging {
    channel default_debug {
        file "named.run";
        print-time yes;
        severity dynamic;
    };
    # local file for query logs
    channel queries_file {
        file "queries.log" versions 5 size 200m;
        severity info;
        print-time yes;
        print-category yes;
        print-severity yes;
    };
    category queries { queries_file; };
};

zone "example.com" IN {
    file "example.com";
    type stub;
    masters { 18.191.96.154; };
};
-
As root on dmrz-test-rdns, start a packet capture:
[root@dmrz-test-rdns data]# tcpdump -i eth0 -Uw /var/tmp/${HOSTNAME}.pcap
-
From another workstation in the allow-query ACL, send a test query:
$ dig @10.225.64.30 foo.example.com txt

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.10 <<>> @10.225.64.30 foo.example.com txt
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33425
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;foo.example.com.		IN	TXT

;; ANSWER SECTION:
foo.example.com.	30	IN	TXT	"test"

;; Query time: 3784 msec
;; SERVER: 10.225.64.30#53(10.225.64.30)
;; WHEN: Tue Nov 01 20:08:08 UTC 2022
;; MSG SIZE  rcvd: 61
Note the very long query time (over 3 seconds). If you don't see a long query time at first, wait at least 30s (for cache to expire) and try again.
-
Observe named complaining about a timeout:
[root@dmrz-test-rdns data]# tail /var/opt/isc/isc-bind/named/data/named.run
01-Nov-2022 20:04:54.508 all zones loaded
01-Nov-2022 20:04:54.508 running
01-Nov-2022 20:08:08.556 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.20#53
01-Nov-2022 20:08:51.441 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.30#53
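The "timed out resolving" lines make it easy to tally which servers are timing out. A minimal Python sketch, operating on the two log lines copied from the tail above:

```python
import re

# Sample lines copied verbatim from the named.run excerpt above.
log = """\
01-Nov-2022 20:08:08.556 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.20#53
01-Nov-2022 20:08:51.441 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.30#53
"""

# Extract (qname/qtype/class, server IP) pairs from "timed out resolving" messages.
pattern = re.compile(r"timed out resolving '([^']+)': ([0-9.]+)#\d+")
timeouts = pattern.findall(log)
for query, server in timeouts:
    print(query, server)
```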
-
Download the packet capture file and examine it in Wireshark, filtering by the dns protocol. Observe that after receiving a query from the client (10.224.255.16) at 20:08:04, we try 3 times in a row to query the unresponsive 192.168.0.20 before moving on to the working nameserver 18.191.96.154.
On another occasion, we try 192.168.0.30 3 times in a row.
At 20:09:32 we happen to try the working nameserver first and are able to respond to our client quickly.
No. Time                       Source         Destination    Protocol Length Info
  3 2022-11-01 20:08:04.776005 10.224.255.16  10.225.64.30   DNS       86 Standard query 0x8291 TXT foo.example.com OPT
  4 2022-11-01 20:08:04.776851 10.225.64.30   192.168.0.20   DNS       98 Standard query 0x90f6 TXT foo.example.com OPT
  5 2022-11-01 20:08:05.576960 10.225.64.30   192.168.0.20   DNS       98 Standard query 0x7acc TXT foo.example.com OPT
  7 2022-11-01 20:08:06.378009 10.225.64.30   192.168.0.20   DNS       98 Standard query 0x05fe TXT foo.example.com OPT
 14 2022-11-01 20:08:08.557405 10.225.64.30   18.191.96.154  DNS       98 Standard query 0x7979 TXT foo.example.com OPT
 15 2022-11-01 20:08:08.558188 18.191.96.154  10.225.64.30   DNS      233 Standard query response 0x7979 TXT foo.example.com TXT NS ns3.example.com NS ns2.example.com NS ns1.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 16 2022-11-01 20:08:08.558432 10.225.64.30   10.224.255.16  DNS      103 Standard query response 0x8291 TXT foo.example.com TXT OPT
 31 2022-11-01 20:08:47.747803 10.224.255.16  10.225.64.30   DNS       86 Standard query 0xbf02 TXT foo.example.com OPT
 32 2022-11-01 20:08:47.748163 10.225.64.30   192.168.0.30   DNS       98 Standard query 0x474b TXT foo.example.com OPT
 33 2022-11-01 20:08:48.549188 10.225.64.30   192.168.0.30   DNS       98 Standard query 0x2130 TXT foo.example.com OPT
 34 2022-11-01 20:08:49.350213 10.225.64.30   192.168.0.30   DNS       98 Standard query 0x7d5f TXT foo.example.com OPT
 37 2022-11-01 20:08:51.441890 10.225.64.30   18.191.96.154  DNS      114 Standard query 0x1ef3 TXT foo.example.com OPT
 38 2022-11-01 20:08:51.442705 18.191.96.154  10.225.64.30   DNS      233 Standard query response 0x1ef3 TXT foo.example.com TXT NS ns2.example.com NS ns1.example.com NS ns3.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 39 2022-11-01 20:08:51.442872 10.225.64.30   10.224.255.16  DNS      103 Standard query response 0xbf02 TXT foo.example.com TXT OPT
 55 2022-11-01 20:09:32.834068 10.224.255.16  10.225.64.30   DNS       86 Standard query 0x0297 TXT foo.example.com OPT
 56 2022-11-01 20:09:32.834680 10.225.64.30   18.191.96.154  DNS      114 Standard query 0x958a TXT foo.example.com OPT
 57 2022-11-01 20:09:32.835082 18.191.96.154  10.225.64.30   DNS      233 Standard query response 0x958a TXT foo.example.com TXT NS ns2.example.com NS ns3.example.com NS ns1.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 58 2022-11-01 20:09:32.835211 10.225.64.30   10.224.255.16  DNS      103 Standard query response 0x0297 TXT foo.example.com TXT OPT
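The retry spacing is visible in the frame timestamps. A short Python check, using times copied from the capture above, shows the 9.18 resolver waits roughly 800 ms between its back-to-back retries of 192.168.0.20 (the interval is an observation from this capture, not a documented constant):

```python
from datetime import datetime

# Frame timestamps copied from the capture above: three queries to the
# unresponsive 192.168.0.20, then the query to the working 18.191.96.154.
times = [
    "20:08:04.776851",  # frame 4: first try, 192.168.0.20
    "20:08:05.576960",  # frame 5: retry, same server
    "20:08:06.378009",  # frame 7: retry, same server
    "20:08:08.557405",  # frame 14: finally 18.191.96.154
]
ts = [datetime.strptime(t, "%H:%M:%S.%f") for t in times]
gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
# Roughly 0.8 s between same-server retries, then ~2.2 s before switching.
print(gaps)
```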
-
Now retry the experiment with a different recursive resolver, "dmrz-test-rdns-916" (10.225.64.119), running isc-bind-bind 9.16.34-1.1.el7 from https://copr.fedorainfracloud.org/coprs/isc/bind-esv/
BIND 9.16 handles the situation much more gracefully; at 20:42:51 we fail to get a response from 192.168.0.30, but instead of retrying the same server we try another one, which happens to be the working one, so we are able to respond successfully to the client in under 1 second.
No. Time                       Source         Destination    Protocol Length Info
 12 2022-11-01 20:42:51.913893 10.224.255.16  10.225.64.119  DNS       86 Standard query 0x29ac TXT foo.example.com OPT
 13 2022-11-01 20:42:51.915009 10.225.64.119  192.168.0.30   DNS       98 Standard query 0x19c1 TXT foo.example.com OPT
 14 2022-11-01 20:42:52.713887 10.225.64.119  18.191.96.154  DNS       98 Standard query 0x2280 TXT foo.example.com OPT
 15 2022-11-01 20:42:52.714303 18.191.96.154  10.225.64.119  DNS      131 Standard query response 0x2280 TXT foo.example.com TXT OPT
 16 2022-11-01 20:42:52.714709 10.225.64.119  10.224.255.16  DNS      103 Standard query response 0x29ac TXT foo.example.com TXT OPT
 53 2022-11-01 20:44:15.623212 10.224.255.16  10.225.64.119  DNS       86 Standard query 0xcf02 TXT foo.example.com OPT
 54 2022-11-01 20:44:15.623584 10.225.64.119  18.191.96.154  DNS      114 Standard query 0xc333 TXT foo.example.com OPT
 55 2022-11-01 20:44:15.624183 18.191.96.154  10.225.64.119  DNS      233 Standard query response 0xc333 TXT foo.example.com TXT NS ns1.example.com NS ns3.example.com NS ns2.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 56 2022-11-01 20:44:15.624371 10.225.64.119  10.224.255.16  DNS      103 Standard query response 0xcf02 TXT foo.example.com TXT OPT
 68 2022-11-01 20:44:54.321211 10.224.255.16  10.225.64.119  DNS       86 Standard query 0x6f5f TXT foo.example.com OPT
 69 2022-11-01 20:44:54.321578 10.225.64.119  192.168.0.20   DNS       98 Standard query 0x1fa5 TXT foo.example.com OPT
 70 2022-11-01 20:44:55.120996 10.225.64.119  18.191.96.154  DNS      114 Standard query 0xddae TXT foo.example.com OPT
 71 2022-11-01 20:44:55.121400 18.191.96.154  10.225.64.119  DNS      233 Standard query response 0xddae TXT foo.example.com TXT NS ns3.example.com NS ns1.example.com NS ns2.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 72 2022-11-01 20:44:55.121577 10.225.64.119  10.224.255.16  DNS      103 Standard query response 0x6f5f TXT foo.example.com TXT OPT
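Again using timestamps copied from the capture, the client-observed latency for the first (partially failed) 9.16 lookup works out to roughly 0.8 s, i.e. a single retry interval:

```python
from datetime import datetime

fmt = "%H:%M:%S.%f"
# Timestamps copied from the 9.16 capture above.
client_query = datetime.strptime("20:42:51.913893", fmt)   # frame 12: query from client
client_answer = datetime.strptime("20:42:52.714709", fmt)  # frame 16: answer to client
latency = (client_answer - client_query).total_seconds()
print(round(latency, 3))  # 0.801 s: one failed attempt, then a different server
```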
In this case, we also do not see any "timed out" log messages in named.run.
What is the current bug behavior?
As shown above, BIND 9.18 consistently tries multiple times in a row to reach the same unresponsive authoritative nameserver IP, and is therefore unable to answer the client in a timely fashion even though the zone has another authoritative nameserver which is working just fine.
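To make the cost concrete, here is a toy model (not BIND's actual code) comparing the two strategies, under the assumptions of a fixed 0.8 s retry interval, three servers of which only one is responsive, and the resolver unluckily starting with unresponsive ones:

```python
# Toy model of the two retry strategies (not BIND's actual algorithm).
# Assumptions: 3 servers, only one responsive, fixed 0.8 s retry interval,
# and the resolver happens to try the unresponsive servers first.
RETRY = 0.8

def time_to_working(order, retries_per_server):
    """Seconds of dead time before the first query reaches a working server."""
    elapsed = 0.0
    for server in order:
        for _ in range(retries_per_server):
            if server == "working":
                return elapsed
            elapsed += RETRY
    return elapsed

order = ["down1", "down2", "working"]
# BIND 9.18-style: 3 tries per server before moving on.
print(round(time_to_working(order, 3), 1))  # 4.8
# BIND 9.16-style: one try, then rotate to the next server.
print(round(time_to_working(order, 1), 1))  # 1.6
```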
What is the expected correct behavior?
Be like BIND 9.16 and retry against a different authoritative nameserver IP when the first one doesn't respond.
aside: this may be related to #3629