Regression: Forward queries with forward only is no longer working
Summary
With the latest BIND version (9.13.3), forwarding queries with the forward only option no longer works.
I run a split-horizon DNS setup with multiple BIND instances on multiple hosts. I was able to narrow the culprit down to the resolver, which is configured to forward queries for specific zones to an internal authoritative server while answering all other queries the usual way.
This used to work fine with 9.13.1, but after upgrading to 9.13.2 it no longer works. It is still broken in 9.13.3.
BIND version used
Broken version (9.13.2):
BIND 9.13.2 (Development Release) <id:4f6ef2f>
running on Linux x86_64 4.14.72-1-lts #1 SMP Wed Sep 26 12:31:03 CEST 2018
built by make with '--prefix=/usr' '--sysconfdir=/etc' '--sbindir=/usr/bin' '--localstatedir=/var' '--disable-static' '--enable-ipv6' '--enable-filter-aaaa' '--enable-fixed-rrset' '--enable-seccomp' '--enable-full-report' '--with-python=/usr/bin/python' '--with-geoip' '--with-idn' '--with-openssl' '--with-libjson' '--with-libxml2' '--with-libtool' 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -DDIG_SIGCHASE' 'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now' 'CPPFLAGS=-D_FORTIFY_SOURCE=2'
compiled by GCC 8.2.1 20180831
compiled with OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
linked to OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
compiled with libxml2 version: 2.9.8
linked to libxml2 version: 20908
compiled with libjson-c version: 0.13.1
linked to libjson-c version: 0.13.1
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
threads support is enabled
Last version that used to work (9.13.1):
BIND 9.13.1 (Development Release) <id:5b71025>
running on Linux x86_64 4.14.72-1-lts #1 SMP Wed Sep 26 12:31:03 CEST 2018
built by make with '--prefix=/usr' '--sysconfdir=/etc' '--sbindir=/usr/bin' '--localstatedir=/var' '--disable-static' '--enable-ipv6' '--enable-filter-aaaa' '--enable-fixed-rrset' '--enable-seccomp' '--enable-full-report' '--with-python=/usr/bin/python' '--with-geoip' '--with-idn' '--with-openssl' '--with-libjson' '--with-libxml2' '--with-libtool' 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -DDIG_SIGCHASE' 'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now' 'CPPFLAGS=-D_FORTIFY_SOURCE=2'
compiled by GCC 8.2.1 20180831
compiled with OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
linked to OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
compiled with libxml2 version: 2.9.8
linked to libxml2 version: 20908
compiled with libjson-c version: 0.13.1
linked to libjson-c version: 0.13.1
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
threads support is enabled
Steps to reproduce
- Configure BIND so that queries for a particular zone are forwarded to a dedicated authoritative nameserver. Use the forward only option for this. My named.conf is attached further down in this bug report.
- Issue a query for a name in the forwarded zone.
- BIND returns SERVFAIL.
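To make the check in the last step scriptable, the response status can be pulled out of the dig output. The following is only a sketch: the helper name dig_status is made up for illustration, and it assumes the header line format shown in the dig output in this report.

```shell
# Hypothetical helper: print the response status (e.g. NOERROR, SERVFAIL)
# found in the ->>HEADER<<- line of dig output read from stdin.
dig_status() {
    sed -n 's/.*->>HEADER<<-.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Usage against the resolver from this report (needs access to that network):
#   dig @10.24.0.1 mail.babioch.de | dig_status
```

With the affected versions this should print SERVFAIL for names in the forwarded zone, and NOERROR with 9.13.1.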
What is the current bug behavior?
Any queries for the babioch.de zone are answered with SERVFAIL, for instance:
dig @10.24.0.1 mail.babioch.de
; <<>> DiG 9.13.2 <<>> @10.24.0.1 mail.babioch.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 10792
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: d4edf146ddd092808c7e71535bb8bbfc46eaa07e7d222dfe (good)
;; QUESTION SECTION:
;mail.babioch.de. IN A
;; Query time: 18 msec
;; SERVER: 10.24.0.1#53(10.24.0.1)
;; WHEN: Sa Okt 06 15:43:24 CEST 2018
;; MSG SIZE rcvd: 72
What is the expected correct behavior?
Queries for the babioch.de zone are correctly answered:
dig @10.24.0.1 mail.babioch.de
; <<>> DiG 9.13.1 <<>> @10.24.0.1 mail.babioch.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59672
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: ce6833cdc8a7f26af62c368d5bb8bc443c63c9ba62ea9a7d (good)
;; QUESTION SECTION:
;mail.babioch.de. IN A
;; ANSWER SECTION:
mail.babioch.de. 300 IN A 10.24.0.20
;; Query time: 1 msec
;; SERVER: 10.24.0.1#53(10.24.0.1)
;; WHEN: Sa Okt 06 15:44:36 CEST 2018
;; MSG SIZE rcvd: 88
Relevant configuration files
The stripped-down version of my configuration, which still triggers this regression, looks like this:
options {
listen-on port 53 { 127.0.0.1; 10.24.0.1; };
listen-on-v6 port 53 { ::1; };
directory "/var/named";
pid-file "/run/named/named.pid";
recursion yes;
allow-query { localhost; 10.24.0.0/16; };
};
include "/etc/named.rfc1912.zones";
include "/etc/named.root.zone";
zone "babioch.de" IN {
type forward;
forward only;
forwarders { 10.24.0.10; };
};
Relevant logs and/or screenshots
The query log contains this:
Sep 30 01:33:31 kvm1.babioch.de named[16298]: client @0x7ff0c40ad670 127.0.0.1#51022 (mail.babioch.de): query: mail.babioch.de IN A +E(0)K (127.0.0.1)
Sep 30 01:33:31 kvm1.babioch.de named[16298]: client @0x7ff0c40ad670 127.0.0.1#51022 (mail.babioch.de): query failed (SERVFAIL) for mail.babioch.de/IN/A at query.c:10672
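To see whether all failures come from the same place in the source, the failing query and the reported source location can be extracted from such log lines. This is a rough sketch: the function name servfail_locations is invented here, and it assumes the "query failed (SERVFAIL) for ... at ..." wording shown in the log above.

```shell
# Hypothetical helper: read named log lines on stdin and print
# "<name/class/type> <file:line>" for every SERVFAIL "query failed" entry.
servfail_locations() {
    sed -n 's/.*query failed (SERVFAIL) for \([^ ]*\) at \([^ ]*\)$/\1 \2/p'
}

# Usage, assuming the log has been saved to a file called named.log:
#   servfail_locations < named.log | sort | uniq -c
```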
Possible fixes
The line in question (query.c:10672) handles stale answers. I'm not entirely sure how this applies to my use case, since nothing should be stale here.
Interestingly enough, I can get it working by commenting out the forward only option in my configuration, which effectively falls back to forward first. This works in my case, but there might be setups where this is not an option.
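For reference, a sketch of the workaround, assuming the zone block from the configuration above with only the forward only line commented out (indentation and comments added here for readability):

```
zone "babioch.de" IN {
    type forward;
    // forward only;   // commented out, so the default "forward first" applies
    forwarders { 10.24.0.10; };
};
```

With forward first, the resolver tries the forwarder and falls back to normal resolution if that fails, so this changes behavior for setups that must never query outside the forwarder.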