"resolver" test fails intermittently
The recently-added (only on main) check for a SERVFAIL response to a
TCP query with an empty question section fails pretty frequently. The
check was added in commit 2f3ded76 (part of !5616 (merged)). Despite a
code comment suggesting otherwise, the test fails due to dig hitting a
timeout instead of receiving a SERVFAIL response:
$ cat dig.ns5.out.70
; <<>> DiG 9.17.21 <<>> -p 29349 @10.53.0.5 -b 10.53.0.5 +tcp tcpalso.no-questions. a +tries=3 +time=4
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
Example occurrences:
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/2197750
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/2197749
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/2197279
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/2196581
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/2185707
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/2185704
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/2185288
Most of these happened on FreeBSD, but two happened on Linux (under TSAN), so the common denominator seems to be "machine under load" rather than "only happens on FreeBSD".
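When triaging further occurrences, the two outcomes described above (dig timing out versus dig actually receiving a SERVFAIL) can be told apart mechanically from the saved dig output. The following is only a rough sketch and is not part of the system test: the function name classify_dig_output is made up for illustration, and the matching relies on the timeout message shown above and on the usual "status:" field of dig's ->>HEADER<<- line.

```python
import re


def classify_dig_output(text: str) -> str:
    """Classify a saved dig output as 'timeout', 'servfail', or 'other'.

    Hypothetical helper for log triage; not part of the BIND 9 test suite.
    """
    # dig reports this when no reply arrived within the +time/+tries budget.
    if "no servers could be reached" in text or "connection timed out" in text:
        return "timeout"
    # A received reply has a header line such as:
    # ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 12345"
    match = re.search(r"status:\s*([A-Z]+)", text)
    if match and match.group(1) == "SERVFAIL":
        return "servfail"
    return "other"


if __name__ == "__main__":
    import sys

    # Usage: python classify_dig.py dig.ns5.out.70
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(classify_dig_output(handle.read()))
```

Run against the dig.ns5.out.70 file quoted above, this reports a timeout rather than a SERVFAIL, which matches the observation that the check fails for a different reason than the code comment suggests.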
Edited by Michał Kępień