BIND sometimes fixates on one server address for a zone
A customer has reported:
I noticed I focused on the wrong nameservers before (I sent the nameservers for akamaiedge.net, instead of g.akamaiedge.net), but the issue is the same. The authoritative nameservers to consider are:
n0g.akamaiedge.net. 152 IN A 88.221.81.192
n0g.akamaiedge.net. 152 IN AAAA 2600:1480:e800::c0
n1g.akamaiedge.net. 152 IN A 2.16.65.53
n2g.akamaiedge.net. 152 IN A 2.16.65.86
n3g.akamaiedge.net. 152 IN A 2.16.65.44
n4g.akamaiedge.net. 152 IN A 2.16.65.68
n5g.akamaiedge.net. 162 IN A 2.16.65.77
n6g.akamaiedge.net. 162 IN A 2.21.25.118
n7g.akamaiedge.net. 181 IN A 2.17.41.132
response times are:
88.221.81.192: 147 msec
2600:1480:e800::c0: 146 msec
2.16.65.53: 1 msec
2.16.65.86: 1 msec
2.16.65.44: 1 msec
2.16.65.68: 1 msec
2.16.65.77: 1 msec
2.21.25.118: 15 msec
2.17.41.132: 13 msec
They have provided data from rndc dumpdb -all.
selected cache data:
; glue
g.akamaiedge.net. 865 NS n0g.akamaiedge.net.
865 NS n7g.akamaiedge.net.
865 NS n5g.akamaiedge.net.
865 NS n4g.akamaiedge.net.
865 NS n3g.akamaiedge.net.
865 NS n1g.akamaiedge.net.
865 NS n2g.akamaiedge.net.
865 NS n6g.akamaiedge.net.
; answer
e11550.g.akamaiedge.net. 433 \-TYPE65 ;-$NXRRSET
; g.akamaiedge.net. SOA n0g.akamaiedge.net. hostmaster.akamai.com. 1599033648 1000 1000 1000 1800
; authanswer
n0g.akamaiedge.net. 2246 A 88.221.81.192
; authanswer
2246 AAAA 2600:1480:e800::c0
; authanswer
n1g.akamaiedge.net. 2246 A 2.16.65.53
; authanswer
n2g.akamaiedge.net. 2246 A 2.16.65.86
; authanswer
n3g.akamaiedge.net. 2246 A 2.16.65.44
; authanswer
n4g.akamaiedge.net. 2246 A 2.16.65.68
; authanswer
n5g.akamaiedge.net. 2256 A 2.16.65.77
; authanswer
n6g.akamaiedge.net. 2256 A 2.21.25.118
; authanswer
n7g.akamaiedge.net. 2275 A 2.17.41.132
selected ADB entries:
; selected ADB data
; n0g.akamaiedge.net [v4 TTL 46] [v6 TTL 46] [v4 success] [v6 success]
; 88.221.81.192 [srtt 121879] [flags 00004000] [edns 63/0/0/0/0] [plain 0/0] [udpsize 512] [ttl -991]
; 2600:1480:e800::c0 [srtt 146019] [flags 00004000] [edns 135/0/0/0/0] [plain 0/0] [udpsize 512] [ttl -991]
; n1g.akamaiedge.net [v4 TTL 46] [v4 success] [v6 unexpected]
; 2.16.65.53 [srtt 6] [flags 00000000] [edns 0/0/0/0/0] [plain 0/0] [ttl 810]
; n2g.akamaiedge.net [v4 TTL 46] [v4 success] [v6 unexpected]
; 2.16.65.86 [srtt 21] [flags 00000000] [edns 0/0/0/0/0] [plain 0/0] [ttl 810]
; n3g.akamaiedge.net [v4 TTL 46] [v4 success] [v6 unexpected]
; 2.16.65.44 [srtt 20] [flags 00000000] [edns 0/0/0/0/0] [plain 0/0] [ttl 810]
; n4g.akamaiedge.net [v4 TTL 46] [v4 success] [v6 unexpected]
; 2.16.65.68 [srtt 29] [flags 00000000] [edns 0/0/0/0/0] [plain 0/0] [ttl 810]
; n5g.akamaiedge.net [v4 TTL 56] [v4 success] [v6 unexpected]
; 2.16.65.77 [srtt 30] [flags 00000000] [edns 0/0/0/0/0] [plain 0/0] [ttl 810]
; n6g.akamaiedge.net [v4 TTL 56] [v4 success] [v6 unexpected]
; 2.21.25.118 [srtt 27] [flags 00000000] [edns 0/0/0/0/0] [plain 0/0] [ttl 810]
; n7g.akamaiedge.net [v4 TTL 75] [v4 success] [v6 unexpected]
; 2.17.41.132 [srtt 9] [flags 00000000] [edns 0/0/0/0/0] [plain 0/0] [ttl 810]
One thing to note about the ADB entries is that those for n1g through n7g appear to have been added, but not yet used, prior to the dumpdb: their SRTT values are all still within the range that new entries are initialized to (between 1 and 32 microseconds).
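The "added but unused" reading of the dump can be checked mechanically. The following is a minimal sketch, not part of BIND: the function name `summarize_adb` and the heuristic are assumptions made for illustration. It treats an address as apparently unused when its SRTT is still at most 32 microseconds and all of its EDNS counters are zero, and as expired when the dumped ttl is negative. Two address lines from the dump above serve as sample input.

```python
import re

# Matches ADB address lines as they appear in the dump above, e.g.
# "; 2.16.65.53 [srtt 6] [flags 00000000] [edns 0/0/0/0/0] [plain 0/0] [ttl 810]"
ADDR_LINE = re.compile(
    r"^;\s+(?P<addr>\S+)\s+\[srtt (?P<srtt>\d+)\].*?"
    r"\[edns (?P<edns>[\d/]+)\].*?\[ttl (?P<ttl>-?\d+)\]"
)

# Per the issue text, new entries start with an SRTT between 1 and 32 microseconds.
INITIAL_SRTT_MAX_US = 32


def summarize_adb(dump_text):
    """Return one dict per ADB address line found in dump_text (hypothetical helper)."""
    rows = []
    for line in dump_text.splitlines():
        m = ADDR_LINE.match(line.strip())
        if not m:
            continue  # name lines and comments carry no [srtt ...] field
        srtt = int(m["srtt"])
        edns = [int(n) for n in m["edns"].split("/")]
        rows.append({
            "addr": m["addr"],
            "srtt_us": srtt,
            "expired": int(m["ttl"]) < 0,
            # Heuristic (assumption): initial-range SRTT and all-zero counters.
            "looks_unused": srtt <= INITIAL_SRTT_MAX_US and not any(edns),
        })
    return rows


SAMPLE = """\
; 88.221.81.192 [srtt 121879] [flags 00004000] [edns 63/0/0/0/0] [plain 0/0] [udpsize 512] [ttl -991]
; 2.16.65.53 [srtt 6] [flags 00000000] [edns 0/0/0/0/0] [plain 0/0] [ttl 810]
"""

rows = summarize_adb(SAMPLE)
for row in rows:
    print(row)
```

Run against the full dump, this would flag both n0g addresses as expired but in use, and every n1g through n7g address as unexpired but apparently unused.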
The core of the cycle appears to be:
- As long as at least one address is found in the ADB for at least one of the names in the NS rrset, no new data is fetched or moved into the ADB
- As long as named is waiting for a response from an address, that ADB entry is preserved
- named sets how long to wait for a response based on the current SRTT
- An ADB entry can be used even if it is expired
While theoretically any address could be the one fixated on, by virtue of point 3 above the ones with the higher SRTT are more likely to be selected than the ones with the lower SRTT.
This is also more likely to happen for a frequently-queried zone with many records with low TTLs, such as the zone of a CDN.
This is not the first time I've seen behavior that I believe is linked to this, but it is the first time a customer has noticed it, and it is also the clearest documentation of it yet.
I expect that there are multiple possible solutions to this, with the hard part being choosing the one that we believe will be the easiest to implement and have the lowest chances of unintended consequences.