misleading debug log: stop on unrecognized qresult in rpz_rewrite()
Summary
BIND produces the following misleading/confusing debug 1 message:
08-Jul-2024 23:51:03.167 query-errors: debug 1: client @0x7f57aaa49d68 10.225.64.23#40517 (aro.army.mil.edgekey.dmz.akamai.csd.disa.mil): rpz QNAME rewrite aro.army.mil.edgekey.dmz.akamai.csd.disa.mil stop on unrecognized qresult in rpz_rewrite() failed: operation canceled
when the actual cause of resolution failure is a lame delegation and has nothing to do with RPZ.
BIND version affected
BIND 9.18.27 (Extended Support Version) <id:663e6d9>
running on Linux x86_64 5.14.0-427.13.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Apr 30 18:22:29 EDT 2024
built by make with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/opt/isc/isc-bind/root/usr' '--exec-prefix=/opt/isc/isc-bind/root/usr' '--bindir=/opt/isc/isc-bind/root/usr/bin' '--sbindir=/opt/isc/isc-bind/root/usr/sbin' '--sysconfdir=/etc/opt/isc/scls/isc-bind' '--datadir=/opt/isc/isc-bind/root/usr/share' '--includedir=/opt/isc/isc-bind/root/usr/include' '--libdir=/opt/isc/isc-bind/root/usr/lib64' '--libexecdir=/opt/isc/isc-bind/root/usr/libexec' '--localstatedir=/var/opt/isc/scls/isc-bind' '--sharedstatedir=/var/opt/isc/scls/isc-bind/lib' '--mandir=/opt/isc/isc-bind/root/usr/share/man' '--infodir=/opt/isc/isc-bind/root/usr/share/info' '--enable-warn-error' '--disable-static' '--enable-dnstap' '--with-pic' '--with-gssapi' '--with-json-c' '--with-libxml2' '--without-lmdb' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CC=gcc' 'CFLAGS=-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer' 'LDFLAGS=-Wl,-z,relro -Wl,--as-needed -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -L/opt/isc/isc-bind/root/usr/lib64' 'CPPFLAGS= -I/opt/isc/isc-bind/root/usr/include' 'LT_SYS_LIBRARY_PATH=/usr/lib64' 'PKG_CONFIG_PATH=:/opt/isc/isc-bind/root/usr/lib64/pkgconfig:/opt/isc/isc-bind/root/usr/share/pkgconfig' 'SPHINX_BUILD=/builddir/build/BUILD/bind-9.18.27/sphinx/bin/sphinx-build'
compiled by GCC 11.4.1 20231218 (Red Hat 11.4.1-3)
compiled with OpenSSL version: OpenSSL 3.0.7 1 Nov 2022
linked to OpenSSL version: OpenSSL 3.0.7 1 Nov 2022
compiled with libuv version: 1.44.2
linked to libuv version: 1.44.2
compiled with libnghttp2 version: 1.43.0
linked to libnghttp2 version: 1.43.0
compiled with libxml2 version: 2.9.13
linked to libxml2 version: 20913
compiled with json-c version: 0.14
linked to json-c version: 0.14
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
compiled with protobuf-c version: 1.4.1
linked to protobuf-c version: 1.4.1
threads support is enabled
DNSSEC algorithms: RSASHA256 RSASHA512 ECDSAP256SHA256 ECDSAP384SHA384 ED25519 ED448
DS algorithms: SHA-1 SHA-256 SHA-384
HMAC algorithms: HMAC-MD5 HMAC-SHA1 HMAC-SHA224 HMAC-SHA256 HMAC-SHA384 HMAC-SHA512
TKEY mode 2 support (Diffie-Hellman): yes
TKEY mode 3 support (GSS-API): yes
default paths:
named configuration: /etc/opt/isc/scls/isc-bind/named.conf
rndc configuration: /etc/opt/isc/scls/isc-bind/rndc.conf
DNSSEC root key: /etc/opt/isc/scls/isc-bind/bind.keys
nsupdate session key: /var/opt/isc/scls/isc-bind/run/named/session.key
named PID file: /var/opt/isc/scls/isc-bind/run/named/named.pid
named lock file: /var/opt/isc/scls/isc-bind/run/named/named.lock
Steps to reproduce
- Using the configuration files below, restart named and enable debug level 1:
[root@dmrz-test-rdns ~]# systemctl restart isc-bind-named
[root@dmrz-test-rdns ~]# rndc trace 1
- From a separate client system, issue this query (which currently fails):
[dmrz@dmrz-nettools1-public1 ~]$ date; dig +tries=1 @10.225.64.109 aro.army.mil.edgekey.dmz.akamai.csd.disa.mil. a
Mon Jul 8 11:50:51 PM UTC 2024
; <<>> DiG 9.16.23-RH <<>> +tries @10.225.64.109 aro.army.mil.edgekey.dmz.akamai.csd.disa.mil. a
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
- Examine /var/opt/isc/scls/isc-bind/named/data/named.run to look for clues about why it failed:
08-Jul-2024 23:50:36.650 general: info: debug level is now 1
08-Jul-2024 23:50:51.167 resolver: debug 1: fetch: aro.army.mil.edgekey.dmz.akamai.csd.disa.mil/A
08-Jul-2024 23:50:51.167 resolver: debug 1: fetch: mil/NS
08-Jul-2024 23:50:51.177 resolver: debug 1: fetch: disa.mil/NS
08-Jul-2024 23:50:51.241 resolver: debug 1: fetch: csd.disa.mil/NS
08-Jul-2024 23:50:53.677 resolver: debug 1: fetch: akamai.csd.disa.mil/NS
08-Jul-2024 23:50:54.583 resolver: debug 1: fetch: dmz.akamai.csd.disa.mil/NS
08-Jul-2024 23:50:54.583 resolver: debug 1: fetch: dns1.akamai.csd.disa.mil/A
08-Jul-2024 23:50:54.583 resolver: debug 1: fetch: dns1.akamai.csd.disa.mil/AAAA
08-Jul-2024 23:50:54.583 resolver: debug 1: fetch: dns2.akamai.csd.disa.mil/A
08-Jul-2024 23:50:54.583 resolver: debug 1: fetch: dns2.akamai.csd.disa.mil/AAAA
08-Jul-2024 23:50:54.583 resolver: debug 1: fetch: dns3.akamai.csd.disa.mil/A
08-Jul-2024 23:50:54.583 resolver: debug 1: fetch: dns3.akamai.csd.disa.mil/AAAA
08-Jul-2024 23:50:54.583 resolver: debug 1: fetch: dns4.akamai.csd.disa.mil/A
08-Jul-2024 23:50:54.583 resolver: debug 1: fetch: dns4.akamai.csd.disa.mil/AAAA
08-Jul-2024 23:51:03.167 resolver: info: shut down hung fetch while resolving 'aro.army.mil.edgekey.dmz.akamai.csd.disa.mil/A'
08-Jul-2024 23:51:03.167 query-errors: debug 1: client @0x7f57aaa49d68 10.225.64.23#40517 (aro.army.mil.edgekey.dmz.akamai.csd.disa.mil): rpz QNAME rewrite aro.army.mil.edgekey.dmz.akamai.csd.disa.mil stop on unrecognized qresult in rpz_rewrite() failed: operation canceled
08-Jul-2024 23:51:03.167 query-errors: info: client @0x7f57aaa49d68 10.225.64.23#40517 (aro.army.mil.edgekey.dmz.akamai.csd.disa.mil): query failed (operation canceled) for aro.army.mil.edgekey.dmz.akamai.csd.disa.mil/IN/A at query.c:7843
What is the current bug behavior?
As of this writing (though hopefully not forever), this query fails because our default qname-minimization relaxed
behavior querying for NS records runs afoul of a lame delegation:
$ dig +noall +ans csd.disa.mil. ns
csd.disa.mil. 337 IN NS ns.cybercom.mil.
csd.disa.mil. 337 IN NS ns.jtfgno.mil.
csd.disa.mil. 337 IN NS ns1.csd.disa.mil.
$ dig +norec +noall +ans +add @ns1.csd.disa.mil. akamai.csd.disa.mil. ns
akamai.csd.disa.mil. 704 IN NS dns1.akamai.csd.disa.mil.
akamai.csd.disa.mil. 704 IN NS dns4.akamai.csd.disa.mil.
akamai.csd.disa.mil. 704 IN NS dns3.akamai.csd.disa.mil.
akamai.csd.disa.mil. 704 IN NS dns2.akamai.csd.disa.mil.
dns1.akamai.csd.disa.mil. 54713 IN A 214.48.5.33
dns2.akamai.csd.disa.mil. 54713 IN A 214.48.2.35
dns3.akamai.csd.disa.mil. 54711 IN A 214.48.1.73
dns4.akamai.csd.disa.mil. 54713 IN A 214.48.3.40
$ dig +norec @214.48.5.33 dmz.akamai.csd.disa.mil. ns
;; communications error to 214.48.5.33#53: timed out
;; communications error to 214.48.5.33#53: timed out
;; communications error to 214.48.5.33#53: timed out
(same for the other three IPs)
Of course, the failure itself isn't BIND's problem.
But before I understood what was going on, I was using rndc trace to try to figure it out, and the log message stop on unrecognized qresult in rpz_rewrite() failed: operation canceled
was unhelpful because it misled me to believe that something was wrong with RPZ... some sort of mysterious qresult that it wasn't prepared to handle? In hindsight I suspect that the "unrecognized" qresult was in actual fact referring to a total lack of any qresult at all, but that wasn't clear at the time.
What is the expected correct behavior?
It would have helped me a lot today if this log message had said something more along the lines of "skipping because we don't have any response to rewrite" (making it clear that the problem is somewhere else and not in the RPZ code).
It's not a big deal, just a small suggestion.
Relevant configuration files
/etc/opt/isc/scls/isc-bind/named.conf:
options {
directory "/var/opt/isc/scls/isc-bind/named/data";
allow-query { localhost; 10.0.0.0/8; };
recursion yes;
dnssec-validation no;
# enable querylogging at startup
querylog yes;
response-policy {
zone "rpz-hardcoded.local";
};
};
logging {
channel default_debug {
file "named.run";
print-time yes; print-category yes; print-severity yes;
severity dynamic;
};
# local file for query logs
channel queries_file {
file "queries.log" versions 5 size 200m;
severity info;
print-time yes; print-category yes; print-severity yes;
};
category queries { queries_file; };
};
zone "rpz-hardcoded.local" {
file "rpz-hardcoded.local.zone";
type primary;
allow-query { localhost; };
allow-transfer { localhost; };
};
/var/opt/isc/scls/isc-bind/named/data/rpz-hardcoded.local.zone
$ORIGIN rpz-hardcoded.local.
$TTL 3600
@ IN SOA example.com. nobody.example.com. (
1 ; serial
3600 ; refresh (1 hour)
900 ; retry (15 minutes)
1209600 ; expire (2 weeks)
60 ; minimum (1 minute)
)
NS LOCALHOST.
; return NXDOMAIN response for DNS-over-HTTPS canary domain (see
; https://support.mozilla.org/en-US/kb/canary-domain-use-application-dnsnet)
use-application-dns.net CNAME .
Relevant logs
(see above)