Problems reported in BIND 9.16.0 after hitting tcp-clients limit
Received by security-officer@isc.org from external party:
ISC BIND folk:
BIND 9.16.0 seems to get TCP queues stuck after the number of client TCP
connections hits the maximum configured with tcp-clients. The symptoms for the
affected addresses are:
o DNS over TCP (such as "dig +tcp @address") times out
o the output of "netstat -ln | egrep '^tcp.*:53'" shows a nonzero Recv-Q
value (always 11 so far)
For example:
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address            Foreign Address  State
tcp        0      0 XXX.XXX.3.138:53         0.0.0.0:*        LISTEN
tcp       11      0 XXX.XXX.1.27:53          0.0.0.0:*        LISTEN
tcp       11      0 XXX.XXX.64.26:53         0.0.0.0:*        LISTEN
tcp       11      0 XXX.XXX.1.26:53          0.0.0.0:*        LISTEN
tcp        0      0 127.0.0.1:53             0.0.0.0:*        LISTEN
tcp6       0      0 fe80::d267:26ff:fed2:53  :::*             LISTEN
tcp6       0      0 XXXX:0:XXXX:XXXX::2:53   :::*             LISTEN
tcp6       0      0 ::1:53                   :::*             LISTEN
tcp6      11      0 XXXX:0:XXXX:XXXX::10:53  :::*             LISTEN
tcp6      11      0 XXXX:0:XXXX:XXXX::11:53  :::*             LISTEN
tcp6      11      0 XXXX:0:XXXX:XXXX::12:53  :::*             LISTEN
All addresses with Recv-Q of 11 fail to respond to DNS over TCP.
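The check above could be automated roughly as follows. This is only a sketch: it assumes the six-column "netstat -ln" layout shown in this report, and the function name stuck_listeners and the SAMPLE text are hypothetical, not part of BIND or net-tools.

```python
def stuck_listeners(netstat_output, port=53):
    """Return (local_address, recv_q) for TCP listeners on `port`
    whose Recv-Q is nonzero, in the order they appear."""
    suffix = ":%d" % port
    stuck = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State.
        # Header lines do not start with "tcp" and are skipped here.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "LISTEN" or not fields[3].endswith(suffix):
            continue
        try:
            recv_q = int(fields[1])
        except ValueError:
            continue
        if recv_q > 0:
            stuck.append((fields[3], recv_q))
    return stuck


# A few lines copied from the report, used as illustrative input.
SAMPLE = """\
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 XXX.XXX.3.138:53 0.0.0.0:* LISTEN
tcp 11 0 XXX.XXX.1.27:53 0.0.0.0:* LISTEN
tcp6 0 0 ::1:53 :::* LISTEN
tcp6 11 0 XXXX:0:XXXX:XXXX::10:53 :::* LISTEN
"""

if __name__ == "__main__":
    for address, recv_q in stuck_listeners(SAMPLE):
        print(address, recv_q)
```

In practice the input would come from running netstat (for example via subprocess) rather than from a pasted string.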
The only fix I've found is to stop & start named. I suspect that disabling
listening on the affected addresses, doing "rndc reconfig", then enabling
listening would work, but that's a bit of a pain.
This is BIND 9.16.0 on 2 Linux systems:
____________________Debian____________________
: uname -a
Linux xxx 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5+deb10u1 (2019-07-19) x86_64 GNU/Linux
: named -V
BIND 9.16.0 (Stable Release) <id:6270e60>
running on Linux x86_64 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5+deb10u1 (2019-07-19)
built by make with 'STD_CDEFINES=-DISC_FACILITY=LOG_LOCAL5' '--libdir=/usr/lib/x86_64-linux-gnu' '--with-openssl' '--enable-dnstap' '--enable-fixed-rrset' '--with-libtool' '--sysconfdir=/local/nsdata/etc' '--localstatedir=/local/nsdata/var'
compiled by GCC 8.3.0
compiled with OpenSSL version: OpenSSL 1.1.1d 10 Sep 2019
linked to OpenSSL version: OpenSSL 1.1.1d 10 Sep 2019
compiled with libxml2 version: 2.9.4
linked to libxml2 version: 20904
compiled with protobuf-c version: 1.3.1
linked to protobuf-c version: 1.3.1
threads support is enabled
default paths:
named configuration: /local/nsdata/etc/named.conf
rndc configuration: /local/nsdata/etc/rndc.conf
DNSSEC root key: /local/nsdata/etc/bind.keys
nsupdate session key: /local/nsdata/var/run/named/session.key
named PID file: /local/nsdata/var/run/named/named.pid
named lock file: /local/nsdata/var/run/named/named.lock
____________________CentOS____________________
: uname -a
Linux xxx 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
: named -V
BIND 9.16.0 (Stable Release) <id:6270e60>
running on Linux x86_64 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020
built by make with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/local/nsdata/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-libtool' '--with-pic' '--disable-static' '--with-docbook-xsl=/usr/share/sgml/docbook/xsl-stylesheets' '--enable-fixed-rrset' '--with-gssapi=yes' '--disable-isc-spnego' '--localstatedir=/var' '--with-geoip=no' '--with-python' '--enable-dnstap' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' 'LDFLAGS=-Wl,-z,relro ' 'CPPFLAGS= -DDIG_SIGCHASE -DISC_FACILITY=LOG_LOCAL5' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-39)
compiled with OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
linked to OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
compiled with libxml2 version: 2.9.1
linked to libxml2 version: 20901
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
compiled with protobuf-c version: 1.0.2
linked to protobuf-c version: 1.0.2
threads support is enabled
default paths:
named configuration: /local/nsdata/etc/named.conf
rndc configuration: /local/nsdata/etc/rndc.conf
DNSSEC root key: /local/nsdata/etc/bind.keys
nsupdate session key: /var/run/named/session.key
named PID file: /var/run/named/named.pid
named lock file: /var/run/named/named.lock
________________________________________
In both cases "rndc status" shows that the number of TCP connections has hit
the max allowed:
version: BIND 9.16.0 (Stable Release) <id:6270e60>
running on xxx: Linux x86_64 3.10.0-1062.12.1.el7.x86_64 #1
SMP Tue Feb 4 23:02:59 UTC 2020
boot time: Wed, 26 Feb 2020 20:29:33 GMT
last configured: Wed, 26 Feb 2020 20:29:33 GMT
configuration file: /local/nsdata/etc/named.conf
(/local/nsdata/local/nsdata/etc/named.conf)
CPUs found: 16
worker threads: 16
UDP listeners per interface: 16
number of zones: 1016 (189 automatic)
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/900/1000
tcp clients: 4/400 <======
TCP high-water: 400 <======
server is up and running
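For monitoring, the two marked lines can be pulled out of the "rndc status" text. The following is a rough sketch that assumes the "tcp clients: N/MAX" and "TCP high-water: N" line formats shown above; the function name tcp_client_usage is hypothetical.

```python
import re


def tcp_client_usage(status_text):
    """Return (current, limit, high_water) parsed from `rndc status` text.
    Any value whose line is missing is returned as None."""
    current = limit = high_water = None
    for line in status_text.splitlines():
        m = re.match(r"\s*tcp clients:\s*(\d+)/(\d+)", line)
        if m:
            current, limit = int(m.group(1)), int(m.group(2))
            continue
        m = re.match(r"\s*TCP high-water:\s*(\d+)", line)
        if m:
            high_water = int(m.group(1))
    return current, limit, high_water


def limit_was_reached(status_text):
    """True if the high-water mark has reached the configured limit."""
    _, limit, high_water = tcp_client_usage(status_text)
    return limit is not None and high_water is not None and high_water >= limit


# Lines copied from the status output in this report.
STATUS_SAMPLE = """\
recursive clients: 0/900/1000
tcp clients: 4/400 <======
TCP high-water: 400 <======
server is up and running
"""

if __name__ == "__main__":
    print(tcp_client_usage(STATUS_SAMPLE), limit_was_reached(STATUS_SAMPLE))
```

Checking the high-water mark rather than the current count matters here, since the current count in the output above has already dropped back to 4.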
DNS over UDP is not affected & DNS over TCP to other addresses is not
affected.
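To tell affected addresses from unaffected ones without dig, a small TCP probe along these lines might help. It is a sketch only: it sends one A query using the two-byte length prefix that DNS over TCP requires (RFC 1035, section 4.2.2) and reports whether any reply starts to arrive before the timeout. The names build_query, tcp_frame and probe_tcp, the query name "example.com" and the 3-second timeout are all assumptions for illustration.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query message (class IN, RD bit set)."""
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.rstrip(".").split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def tcp_frame(message):
    """Prefix a DNS message with its two-byte length for TCP transport."""
    return struct.pack("!H", len(message)) + message


def probe_tcp(address, name="example.com", port=53, timeout=3.0):
    """Return True if the server starts a reply within `timeout` seconds,
    False on timeout, refusal, or an early close."""
    try:
        with socket.create_connection((address, port), timeout=timeout) as sock:
            sock.sendall(tcp_frame(build_query(name)))
            prefix = b""
            while len(prefix) < 2:  # read the reply's length prefix
                chunk = sock.recv(2 - len(prefix))
                if not chunk:
                    return False
                prefix += chunk
            return True
    except OSError:  # includes socket.timeout
        return False
```

Calling probe_tcp on each listening address in turn would be expected to return False for the addresses showing Recv-Q 11 and True for the others, if the behaviour matches what is described here.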
Let me know what else you need to know about this.