spurious root queries on timeout
Reported by Ke Li email@example.com against 9.11.18 and 9.16.1.
Dear BIND authors,

We have documented specific cases where BIND9 (9.11.18 and 9.16.1) generates requests to root servers which we think are not very useful. We would like to know whether this is a known behavior, or whether there is an underlying design choice behind these queries that we do not understand. Below is a brief overview of what we found.

The behavior is as follows: when BIND9 has the addresses of the TLD servers in its cache (servers that are authoritative for domains like "com") and receives an A or AAAA type request such as "some.example.com" from a user, it still sends requests such as "ns1.example.com" to root, and the root server replies with the addresses of the TLD servers again. The pattern looks like this:

User asks BIND9
  Query: bidder.criteo.com, Type A

BIND9 asks TLD servers
  To: 220.127.116.11 (g.gtld)
  Query: bidder.criteo.com, Type A

BIND9 gets a response from TLD servers
  From: 18.104.22.168 (g.gtld)
  Query: bidder.criteo.com
  Response:
    NS ns23.criteo.com
    NS ns22.criteo.com
    NS ns25.criteo.com
    NS ns26.criteo.com
    NS ns27.criteo.com
    NS ns28.criteo.com
    All with A-type records in "Additional Records".

BIND9 asks one of the nameservers. No reply.
  To: 22.214.171.124 (ns25.criteo.com)
  Query: bidder.criteo.com, Type A

BIND9 asks another nameserver.
  To: 126.96.36.199 (ns28.criteo.com)
  Query: bidder.criteo.com, Type A

And at the same time, BIND9 sends requests to root:
  To: 188.8.131.52 (j.root)   Query: ns22.criteo.com, Type AAAA
  To: 184.108.40.206 (j.root)   Query: ns23.criteo.com, Type AAAA
  To: 220.127.116.11 (j.root)   Query: ns27.criteo.com, Type AAAA
  To: 18.104.22.168 (j.root)   Query: ns25.criteo.com, Type AAAA
  To: 22.214.171.124 (j.root)   Query: ns26.criteo.com, Type AAAA
  To: 126.96.36.199 (j.root)   Query: ns28.criteo.com, Type AAAA

We deployed a BIND9 v9.11.18 instance and a BIND9 v9.16.1 instance locally, loaded web pages, and captured the traffic on port 53 with Wireshark. We then analyzed the data and found several things about these interesting requests to root:

1. They request the authoritative nameservers of a subdomain or a hostname. For example, "ns23.criteo.com" and "ns22.criteo.com" are authoritative nameservers for the domain being resolved ("bidder.criteo.com" in the trace above).

2. They request records that are not in the last-level nameserver's response. For example, in the response from the TLD server to BIND9's request for "bidder.criteo.com", there is no AAAA type record (in "Additional Records") for nameserver "ns23.criteo.com", so BIND9 later sends an AAAA type request for "ns23.criteo.com" to root.

3. If BIND9 times out when it queries one of these nameservers, it generates these requests to root. For example, after getting the response from the TLD server for "bidder.criteo.com", BIND9 goes ahead and sends a request for "bidder.criteo.com" to "ns25.criteo.com", but there is no reply. BIND9 then sends the request to another (randomly chosen) nameserver, "ns28.criteo.com", and also generates requests to root.

Therefore, we guess that this kind of request is triggered by timeouts when BIND9 queries nameservers.

We then tried to validate our hypothesis. We manually created timeouts by using iptables to block the IPs of some nameservers, and the same behavior occurred. A simple test pcap file is attached as an example, with an explanation. The configuration file of our deployment is also attached.

We also validated our hypothesis on a recursive resolver at an academic institution running BIND9 v9.11.14, and found that around 80% of the A and AAAA requests to root servers followed this pattern.

We'd appreciate it if you could help us understand this behavior. We are mainly curious about the reason behind it. Is it a necessary part of the design, or is it avoidable? We think some load on the DNS root servers might be saved if BIND9 could avoid this kind of behavior.

Thank you very much!
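For anyone who wants to check their own captures for the same pattern, the first two observations can be expressed as a small filter. This is only a rough sketch, not the analysis script we actually used: the `Packet` record, its field names, and the server labels such as "j.root" are hypothetical, and it assumes the capture has already been decoded into such records. It flags A/AAAA queries sent to a root server for a name that was listed as an NS in an earlier referral but for which no record of that type was supplied in "Additional Records". It does not try to detect the timeout itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical, already-decoded DNS message seen at the resolver."""
    direction: str        # "out" = sent by the resolver, "in" = received by it
    peer: str             # label of the remote server, e.g. "g.gtld" or "j.root"
    qname: str
    qtype: str
    ns_names: tuple = ()  # NS targets listed in a referral (responses only)
    glue: tuple = ()      # (name, type) pairs from "Additional Records"


def is_root(peer):
    # Assumption: root servers are labelled "<letter>.root" in the decoded data.
    return peer.endswith(".root")


def find_suspect_root_queries(packets):
    """Return A/AAAA queries to root for NS names whose record was not in the glue."""
    referred = set()  # NS names seen in any earlier referral
    glued = set()     # (name, type) pairs already supplied as glue
    suspects = []
    for p in packets:
        if p.direction == "in":
            referred.update(p.ns_names)
            glued.update(p.glue)
        elif (p.direction == "out" and is_root(p.peer)
              and p.qtype in ("A", "AAAA")
              and p.qname in referred
              and (p.qname, p.qtype) not in glued):
            suspects.append(p)
    return suspects


def suspect_fraction(packets):
    """Share of A/AAAA queries to root that match the pattern."""
    root_queries = [p for p in packets
                    if p.direction == "out" and is_root(p.peer)
                    and p.qtype in ("A", "AAAA")]
    if not root_queries:
        return 0.0
    return len(find_suspect_root_queries(packets)) / len(root_queries)


# The trace from this report, reduced to the fields used above.
NS = tuple(f"ns{n}.criteo.com" for n in (22, 23, 25, 26, 27, 28))
sample = [
    Packet("out", "g.gtld", "bidder.criteo.com", "A"),
    Packet("in", "g.gtld", "bidder.criteo.com", "A",
           ns_names=NS, glue=tuple((name, "A") for name in NS)),
    Packet("out", "ns25.criteo.com", "bidder.criteo.com", "A"),  # no reply
    Packet("out", "ns28.criteo.com", "bidder.criteo.com", "A"),
] + [Packet("out", "j.root", name, "AAAA") for name in NS]

print(len(find_suspect_root_queries(sample)), suspect_fraction(sample))
```

On the sample above, all six AAAA queries to j.root are flagged, because the referral carried only A-type glue for those nameservers.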