BIND 9.18 retries same authoritative nameserver before trying another one
Summary
Given an authoritative zone with multiple nameservers, some of which may currently be unavailable, how should a BIND recursive resolver react when its first query attempt times out?
Empirically, BIND 9.16 will try a different authoritative nameserver next, but BIND 9.18 will try the same one 3 times in a row before moving on to a different one. The BIND 9.18 behavior is significantly less robust.
BIND version used
[root@dmrz-test-rdns ~]# named -V
BIND 9.18.8 (Stable Release) <id:35f5d35>
running on Linux x86_64 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020
built by make with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/opt/isc/isc-bind/root/usr' '--exec-prefix=/opt/isc/isc-bind/root/usr' '--bindir=/opt/isc/isc-bind/root/usr/bin' '--sbindir=/opt/isc/isc-bind/root/usr/sbin' '--sysconfdir=/etc/opt/isc/isc-bind' '--datadir=/opt/isc/isc-bind/root/usr/share' '--includedir=/opt/isc/isc-bind/root/usr/include' '--libdir=/opt/isc/isc-bind/root/usr/lib64' '--libexecdir=/opt/isc/isc-bind/root/usr/libexec' '--localstatedir=/var/opt/isc/isc-bind' '--sharedstatedir=/var/opt/isc/isc-bind/lib' '--mandir=/opt/isc/isc-bind/root/usr/share/man' '--infodir=/opt/isc/isc-bind/root/usr/share/info' '--disable-static' '--enable-dnstap' '--with-pic' '--with-gssapi' '--with-json-c' '--with-libxml2' '--without-lmdb' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' 'LDFLAGS=-Wl,-z,relro -L/opt/isc/isc-bind/root/usr/lib64' 'CPPFLAGS= -I/opt/isc/isc-bind/root/usr/include' 'LT_SYS_LIBRARY_PATH=/usr/lib64' 'PKG_CONFIG_PATH=:/opt/isc/isc-bind/root/usr/lib64/pkgconfig:/opt/isc/isc-bind/root/usr/share/pkgconfig' 'SPHINX_BUILD=/builddir/build/BUILD/bind-9.18.8/sphinx/bin/sphinx-build'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-44)
compiled with OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
linked to OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
compiled with libuv version: 1.44.2
linked to libuv version: 1.44.2
compiled with libnghttp2 version: 1.33.0
linked to libnghttp2 version: 1.33.0
compiled with libxml2 version: 2.9.1
linked to libxml2 version: 20901
compiled with json-c version: 0.11
linked to json-c version: 0.11
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
compiled with protobuf-c version: 1.4.1
linked to protobuf-c version: 1.4.1
threads support is enabled
DNSSEC algorithms: RSASHA1 NSEC3RSASHA1 RSASHA256 RSASHA512 ECDSAP256SHA256 ECDSAP384SHA384
DS algorithms: SHA-1 SHA-256 SHA-384
HMAC algorithms: HMAC-MD5 HMAC-SHA1 HMAC-SHA224 HMAC-SHA256 HMAC-SHA384 HMAC-SHA512
TKEY mode 2 support (Diffie-Hellman): yes
TKEY mode 3 support (GSS-API): yes
default paths:
named configuration: /etc/opt/isc/isc-bind/named.conf
rndc configuration: /etc/opt/isc/isc-bind/rndc.conf
DNSSEC root key: /etc/opt/isc/isc-bind/bind.keys
nsupdate session key: /var/opt/isc/isc-bind/run/named/session.key
named PID file: /var/opt/isc/isc-bind/run/named/named.pid
named lock file: /var/opt/isc/isc-bind/run/named/named.lock
Steps to reproduce
-
Configure an authoritative server with the following authoritative zone, which lists one working nameserver IP (the server itself) plus two others that are "down":
example.com.      30 IN SOA ns1.example.com. dns-admin.illinois.edu. 1 3600 900 1209600 30
example.com.      30 IN NS  ns1.example.com.
example.com.      30 IN NS  ns2.example.com.
example.com.      30 IN NS  ns3.example.com.
foo.example.com.  30 IN TXT "test"
ns1.example.com.  30 IN A   18.191.96.154
ns2.example.com.  30 IN A   192.168.0.20
ns3.example.com.  30 IN A   192.168.0.30
-
Configure a CentOS 7 instance "dmrz-test-rdns" running isc-bind-bind 9.18.8-1.1.el7 from https://copr.fedorainfracloud.org/coprs/isc/bind/ with a stub zone for example.com:
options {
    directory "/var/opt/isc/isc-bind/named/data";
    allow-query { localhost; 10.0.0.0/8; };
    recursion yes;
    dnssec-validation no;
    # enable query logging at startup
    querylog yes;
};

logging {
    channel default_debug {
        file "named.run";
        print-time yes;
        severity dynamic;
    };
    # local file for query logs
    channel queries_file {
        file "queries.log" versions 5 size 200m;
        severity info;
        print-time yes;
        print-category yes;
        print-severity yes;
    };
    category queries { queries_file; };
};

zone "example.com" IN {
    file "example.com";
    type stub;
    masters { 18.191.96.154; };
};
-
As root on dmrz-test-rdns, start a packet capture:
[root@dmrz-test-rdns data]# tcpdump -i eth0 -Uw /var/tmp/${HOSTNAME}.pcap
-
From another workstation in the allow-query ACL, send a test query:
$ dig @10.225.64.30 foo.example.com txt

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.10 <<>> @10.225.64.30 foo.example.com txt
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33425
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;foo.example.com.		IN	TXT

;; ANSWER SECTION:
foo.example.com.	30	IN	TXT	"test"

;; Query time: 3784 msec
;; SERVER: 10.225.64.30#53(10.225.64.30)
;; WHEN: Tue Nov 01 20:08:08 UTC 2022
;; MSG SIZE  rcvd: 61
Note the very long query time (over 3 seconds). If you don't see a long query time at first, wait at least 30s (for cache to expire) and try again.
-
Observe named complaining about a timeout:
[root@dmrz-test-rdns data]# tail /var/opt/isc/isc-bind/named/data/named.run
01-Nov-2022 20:04:54.508 all zones loaded
01-Nov-2022 20:04:54.508 running
01-Nov-2022 20:08:08.556 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.20#53
01-Nov-2022 20:08:51.441 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.30#53
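The "timed out resolving" lines make it easy to tally which servers are timing out. A minimal Python sketch, operating on the two log lines copied from the tail above:

```python
import re

# Sample lines copied verbatim from the named.run excerpt above.
log = """\
01-Nov-2022 20:08:08.556 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.20#53
01-Nov-2022 20:08:51.441 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.30#53
"""

# Extract (qname/qtype/class, server IP) pairs from "timed out resolving" messages.
pattern = re.compile(r"timed out resolving '([^']+)': ([0-9.]+)#\d+")
timeouts = pattern.findall(log)
for query, server in timeouts:
    print(query, server)
```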
-
Download the packet capture file and examine it in Wireshark, filtering by the dns protocol. Observe that after receiving a query from the client (10.224.255.16) at 20:08:04, we try 3 times in a row to query the unresponsive 192.168.0.20 before moving on to the working nameserver 18.191.96.154.
On another occasion, we try 192.168.0.30 3 times in a row.
At 20:09:32 we happen to try the working nameserver first and are able to respond to our client quickly.
No. Time                       Source         Destination    Protocol Length Info
  3 2022-11-01 20:08:04.776005 10.224.255.16  10.225.64.30   DNS       86 Standard query 0x8291 TXT foo.example.com OPT
  4 2022-11-01 20:08:04.776851 10.225.64.30   192.168.0.20   DNS       98 Standard query 0x90f6 TXT foo.example.com OPT
  5 2022-11-01 20:08:05.576960 10.225.64.30   192.168.0.20   DNS       98 Standard query 0x7acc TXT foo.example.com OPT
  7 2022-11-01 20:08:06.378009 10.225.64.30   192.168.0.20   DNS       98 Standard query 0x05fe TXT foo.example.com OPT
 14 2022-11-01 20:08:08.557405 10.225.64.30   18.191.96.154  DNS       98 Standard query 0x7979 TXT foo.example.com OPT
 15 2022-11-01 20:08:08.558188 18.191.96.154  10.225.64.30   DNS      233 Standard query response 0x7979 TXT foo.example.com TXT NS ns3.example.com NS ns2.example.com NS ns1.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 16 2022-11-01 20:08:08.558432 10.225.64.30   10.224.255.16  DNS      103 Standard query response 0x8291 TXT foo.example.com TXT OPT
 31 2022-11-01 20:08:47.747803 10.224.255.16  10.225.64.30   DNS       86 Standard query 0xbf02 TXT foo.example.com OPT
 32 2022-11-01 20:08:47.748163 10.225.64.30   192.168.0.30   DNS       98 Standard query 0x474b TXT foo.example.com OPT
 33 2022-11-01 20:08:48.549188 10.225.64.30   192.168.0.30   DNS       98 Standard query 0x2130 TXT foo.example.com OPT
 34 2022-11-01 20:08:49.350213 10.225.64.30   192.168.0.30   DNS       98 Standard query 0x7d5f TXT foo.example.com OPT
 37 2022-11-01 20:08:51.441890 10.225.64.30   18.191.96.154  DNS      114 Standard query 0x1ef3 TXT foo.example.com OPT
 38 2022-11-01 20:08:51.442705 18.191.96.154  10.225.64.30   DNS      233 Standard query response 0x1ef3 TXT foo.example.com TXT NS ns2.example.com NS ns1.example.com NS ns3.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 39 2022-11-01 20:08:51.442872 10.225.64.30   10.224.255.16  DNS      103 Standard query response 0xbf02 TXT foo.example.com TXT OPT
 55 2022-11-01 20:09:32.834068 10.224.255.16  10.225.64.30   DNS       86 Standard query 0x0297 TXT foo.example.com OPT
 56 2022-11-01 20:09:32.834680 10.225.64.30   18.191.96.154  DNS      114 Standard query 0x958a TXT foo.example.com OPT
 57 2022-11-01 20:09:32.835082 18.191.96.154  10.225.64.30   DNS      233 Standard query response 0x958a TXT foo.example.com TXT NS ns2.example.com NS ns3.example.com NS ns1.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 58 2022-11-01 20:09:32.835211 10.225.64.30   10.224.255.16  DNS      103 Standard query response 0x0297 TXT foo.example.com TXT OPT
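The retry spacing is visible in the frame timestamps. A short Python check, using times copied from the capture above, shows the 9.18 resolver waits roughly 800 ms between its back-to-back retries of 192.168.0.20 (the interval is an observation from this capture, not a documented constant):

```python
from datetime import datetime

# Frame timestamps copied from the capture above: three queries to the
# unresponsive 192.168.0.20, then the query to the working 18.191.96.154.
times = [
    "20:08:04.776851",  # frame 4: first try, 192.168.0.20
    "20:08:05.576960",  # frame 5: retry, same server
    "20:08:06.378009",  # frame 7: retry, same server
    "20:08:08.557405",  # frame 14: finally 18.191.96.154
]
ts = [datetime.strptime(t, "%H:%M:%S.%f") for t in times]
gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
# Roughly 0.8 s between same-server retries, then ~2.2 s before switching.
print(gaps)
```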
-
Now retry the experiment with a different recursive resolver, "dmrz-test-rdns-916" (10.225.64.119), running isc-bind-bind 9.16.34-1.1.el7 from https://copr.fedorainfracloud.org/coprs/isc/bind-esv/
BIND 9.16 handles the situation much more gracefully; at 20:42:51 we fail to get a response from 192.168.0.30, but instead of retrying the same server we try another one, which happens to be the working one, so we are able to respond successfully to the client in under 1 second.
No. Time                       Source         Destination    Protocol Length Info
 12 2022-11-01 20:42:51.913893 10.224.255.16  10.225.64.119  DNS       86 Standard query 0x29ac TXT foo.example.com OPT
 13 2022-11-01 20:42:51.915009 10.225.64.119  192.168.0.30   DNS       98 Standard query 0x19c1 TXT foo.example.com OPT
 14 2022-11-01 20:42:52.713887 10.225.64.119  18.191.96.154  DNS       98 Standard query 0x2280 TXT foo.example.com OPT
 15 2022-11-01 20:42:52.714303 18.191.96.154  10.225.64.119  DNS      131 Standard query response 0x2280 TXT foo.example.com TXT OPT
 16 2022-11-01 20:42:52.714709 10.225.64.119  10.224.255.16  DNS      103 Standard query response 0x29ac TXT foo.example.com TXT OPT
 53 2022-11-01 20:44:15.623212 10.224.255.16  10.225.64.119  DNS       86 Standard query 0xcf02 TXT foo.example.com OPT
 54 2022-11-01 20:44:15.623584 10.225.64.119  18.191.96.154  DNS      114 Standard query 0xc333 TXT foo.example.com OPT
 55 2022-11-01 20:44:15.624183 18.191.96.154  10.225.64.119  DNS      233 Standard query response 0xc333 TXT foo.example.com TXT NS ns1.example.com NS ns3.example.com NS ns2.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 56 2022-11-01 20:44:15.624371 10.225.64.119  10.224.255.16  DNS      103 Standard query response 0xcf02 TXT foo.example.com TXT OPT
 68 2022-11-01 20:44:54.321211 10.224.255.16  10.225.64.119  DNS       86 Standard query 0x6f5f TXT foo.example.com OPT
 69 2022-11-01 20:44:54.321578 10.225.64.119  192.168.0.20   DNS       98 Standard query 0x1fa5 TXT foo.example.com OPT
 70 2022-11-01 20:44:55.120996 10.225.64.119  18.191.96.154  DNS      114 Standard query 0xddae TXT foo.example.com OPT
 71 2022-11-01 20:44:55.121400 18.191.96.154  10.225.64.119  DNS      233 Standard query response 0xddae TXT foo.example.com TXT NS ns3.example.com NS ns1.example.com NS ns2.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
 72 2022-11-01 20:44:55.121577 10.225.64.119  10.224.255.16  DNS      103 Standard query response 0x6f5f TXT foo.example.com TXT OPT
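Again using timestamps copied from the capture, the client-observed latency for the first (partially failed) 9.16 lookup works out to roughly 0.8 s, i.e. a single retry interval:

```python
from datetime import datetime

fmt = "%H:%M:%S.%f"
# Timestamps copied from the 9.16 capture above.
client_query = datetime.strptime("20:42:51.913893", fmt)   # frame 12: query from client
client_answer = datetime.strptime("20:42:52.714709", fmt)  # frame 16: answer to client
latency = (client_answer - client_query).total_seconds()
print(round(latency, 3))  # 0.801 s: one failed attempt, then a different server
```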
In this case, we also do not see any "timed out" log messages in named.run.
What is the current bug behavior?
As shown above, BIND 9.18 consistently tries multiple times in a row to reach the same unresponsive authoritative nameserver IP, and is therefore unable to answer the client in a timely fashion even though the zone has another authoritative nameserver which is working just fine.
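To make the cost concrete, here is a toy model (not BIND's actual code) comparing the two strategies, under the assumptions of a fixed 0.8 s retry interval, three servers of which only one is responsive, and the resolver unluckily starting with unresponsive ones:

```python
# Toy model of the two retry strategies (not BIND's actual algorithm).
# Assumptions: 3 servers, only one responsive, fixed 0.8 s retry interval,
# and the resolver happens to try the unresponsive servers first.
RETRY = 0.8

def time_to_working(order, retries_per_server):
    """Seconds of dead time before the first query reaches a working server."""
    elapsed = 0.0
    for server in order:
        for _ in range(retries_per_server):
            if server == "working":
                return elapsed
            elapsed += RETRY
    return elapsed

order = ["down1", "down2", "working"]
# BIND 9.18-style: 3 tries per server before moving on.
print(round(time_to_working(order, 3), 1))  # 4.8
# BIND 9.16-style: one try, then rotate to the next server.
print(round(time_to_working(order, 1), 1))  # 1.6
```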
What is the expected correct behavior?
Be like BIND 9.16 and retry against a different authoritative nameserver IP when the first one doesn't respond.
aside: this may be related to #3629