ISC Open Source Projects / BIND / Issues / #2591
Issue created Mar 19, 2021 by Greg Rabil (@agrabil)

BIND 9.11.28 server hangs under load with 'dnssec-validation auto' configured

$ uname -a
Linux agr-centos7-ipc-test1 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ named -V 
BIND 9.11.28 (Extended Support Version) <id:60f9417>
running on Linux x86_64 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020
built by make with '--enable-ipv6' '--enable-filter-aaaa' '--enable-largefile' '--enable-fixed-rrset' '--enable-threads' '--enable-dnstap' '--disable-shared' '--with-dlopen=no' '--with-openssl=/opt/incontrol/dns/openssl' '--with-geoip2=/opt/incontrol/dns/maxminddb' '--with-protobuf-c=/opt/incontrol/dns/protobuf-c' '--with-libfstrm=/opt/incontrol/dns/fstrm' '--without-gssapi' '--without-pkcs11' '--with-libxml2=yes' '--with-libjson=yes' '--with-tuning=large' '--prefix=/opt/incontrol/dns' 'LDFLAGS=-ldl' 'PKG_CONFIG_PATH=/opt/incontrol/dns/openssl/lib/pkgconfig'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-44)
compiled with OpenSSL version: OpenSSL 1.1.1i  8 Dec 2020
linked to OpenSSL version: OpenSSL 1.1.1i  8 Dec 2020
compiled with libxml2 version: 2.9.1
linked to libxml2 version: 20901
compiled with libjson-c version: 0.11
linked to libjson-c version: 0.11
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
linked to maxminddb version: 1.4.3
compiled with protobuf-c version: 1.3.3
linked to protobuf-c version: 1.3.3
threads support is enabled

default paths:
  named configuration:  /opt/incontrol/dns/etc/named.conf
  rndc configuration:   /opt/incontrol/dns/etc/rndc.conf
  DNSSEC root key:      /opt/incontrol/dns/etc/bind.keys
  nsupdate session key: /opt/incontrol/dns/var/run/named/session.key
  named PID file:       /opt/incontrol/dns/var/run/named/named.pid
  named lock file:      /opt/incontrol/dns/var/run/named/named.lock
  geoip-directory:      opt/incontrol/dns/maxminddb/share/GeoIP

How to reproduce

named.conf

options {
    directory "/opt/incontrol/dns/db";
    pid-file "/opt/incagent-12.0.24/etc/named.pid";
    allow-transfer { none; };
    allow-query { any; };
    dnssec-validation   auto   ;
};

zone "." IN {
        type hint;
        file "db.cache";
};

zone "localhost" IN { 
        type master;
        file "db.localhost";
        allow-update { none; };
};

zone "0.0.127.in-addr.arpa" IN {
        type master;
        file "db.127.0.0";
        allow-update { none; };
};
  • Get the latest bind.keys from https://downloads.isc.org/isc/bind9/keys/9.11/ (optional; this step is not necessary to reproduce the problem)

  • rm -f db/managed-keys*

  • Start named

  • Run resperf

$ resperf -s <server_ip> -d queryfile-example-current

The queryfile-example-current file contains 10,000,000 records.

  • In another window, run 'rndc status' in a loop:

#!/bin/bash

while true
do
   rndc status
done
  • Sometimes 'rndc status' hangs, and when that happens named does not respond to any queries.
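
Because the plain loop simply blocks when the hang occurs, it can help to have the loop stop and report the moment it happens. The following is only a sketch, assuming GNU coreutils `timeout` is available; the function name `watch_for_hang`, the 5-second limit, and the iteration cap are illustrative choices, not part of the original reproduction.

```shell
#!/bin/bash
# Hypothetical helper: run a command repeatedly until it fails or exceeds a
# time limit.
# Usage: watch_for_hang <timeout_seconds> <max_iterations> <command> [args...]
watch_for_hang() {
    local limit="$1" max="$2"
    shift 2
    local i rc
    for ((i = 1; i <= max; i++)); do
        # GNU timeout exits with status 124 when the command is killed
        # for running past the limit.
        timeout "$limit" "$@" > /dev/null 2>&1
        rc=$?
        if [ "$rc" -eq 124 ]; then
            echo "hang: '$*' exceeded ${limit}s on iteration $i at $(date -u '+%F %T') UTC"
            return 1
        elif [ "$rc" -ne 0 ]; then
            echo "error: '$*' exited with status $rc on iteration $i"
            return 2
        fi
    done
    echo "ok: $max iterations completed"
    return 0
}
```

For this report it would be invoked as `watch_for_hang 5 100000 rndc status` while resperf is running; the timestamp of the first timeout can then be matched against the named logs.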

Last 'rndc status' output before the hang:

version: BIND 9.11.28 (Extended Support Version) <id:60f9417> (not available) 
running on latest: Linux x86_64 4.15.12 #1 SMP Wed Mar 21 12:30:16 EDT 2018
boot time: Fri, 19 Mar 2021 17:05:02 GMT
last configured: Fri, 19 Mar 2021 17:05:02 GMT
configuration file: /opt/incontrol/dns/etc/named.conf
CPUs found: 4
worker threads: 4
UDP listeners per interface: 3
number of zones: 102 (99 automatic)
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 422/900/1000
tcp clients: 4/150
TCP high-water: 4
server is up and running

Sometimes the last 'recursive clients:' value before a hang was 2/900/1000, but it was never 0/900/1000.
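
To track that value across many runs, the first number of the 'recursive clients:' line can be pulled out of the status output. This is a minimal sketch; the function name `recursive_clients` is made up for illustration, and it assumes the line format shown in the output above.

```shell
#!/bin/bash
# Hypothetical helper: read 'rndc status' output on stdin and print the first
# number of the "recursive clients:" line (422 in the output above).
recursive_clients() {
    awk -F': ' '/^recursive clients:/ { split($2, f, "/"); print f[1] }'
}
```

Usage would be `rndc status | recursive_clients`, for example inside the status loop, so that the last value logged before a hang is easy to find.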

At that point, strace shows the following:

# strace -f -p <pid>
[pid  6826] epoll_wait(8,  <unfinished ...>
[pid  6825] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid  6824] select(10, [9], [], NULL, NULL <unfinished ...>
[pid  6822] rt_sigsuspend([], 8 <unfinished ...>
[pid  6823] futex(0x7f8e8449e0d4, FUTEX_WAIT_PRIVATE, 5, NULL <unfinished ...>
[pid  6824] <... select resumed>)       = 1 (in [9])
[pid  6824] read(9, ";\275\10vt\376", 36) = 6
[pid  6824] read(9, 0x7f8e82007d30, 30) = -1 EAGAIN (Resource temporarily unavailable)
[pid  6824] select(10, [9], [], NULL, NULL
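
The strace output only shows which system calls the threads are sitting in. As a complement, the per-thread scheduler state can be read from /proc while named is hung. This is a Linux-specific sketch; the function name `thread_states` is hypothetical, and it relies only on the `Name:` and `State:` fields of `/proc/<pid>/task/<tid>/status`.

```shell
#!/bin/bash
# Hypothetical helper: print "<tid> <name> <state>" for every thread of a
# process, using the Linux /proc filesystem.
thread_states() {
    local pid="$1" task
    for task in /proc/"$pid"/task/*; do
        # Skip threads that exited between the glob and the read.
        [ -r "$task/status" ] || continue
        printf '%s %s %s\n' "${task##*/}" \
            "$(awk '/^Name:/ {print $2}' "$task/status")" \
            "$(awk '/^State:/ {print $2}' "$task/status")"
    done
}
```

Running `thread_states <pid>` against the hung named process lists one line per thread. Worker threads all reported in state `S` (sleeping) would be consistent with the futex waits seen in the trace, though that alone does not identify which lock is involved.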

The problem does not happen in 9.11.22 with the same configuration.

Edited Mar 19, 2021 by Greg Rabil