Memory demands of BIND are MUCH higher on systems with many CPU cores
Summary
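Compared to BIND 9.11, recent builds of 9.16 and 9.17 use far more memory when the CPU count is unusually high. On a 256-CPU machine, named from 9.16.10 and 9.17.8 reaches an RSS of more than 17 GiB, while 9.11.26 stays under 200 MiB.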
BIND version used
BIND 9.17.8 (Development Release) <id:6455657>
running on Linux aarch64 5.10.8-200.fc33.aarch64 #1 SMP Sun Jan 17 19:21:29 UTC 2021
built by make with '--enable-developer'
compiled by GCC 10.2.1 20201125 (Red Hat 10.2.1-9)
compiled with OpenSSL version: OpenSSL 1.1.1i FIPS 8 Dec 2020
linked to OpenSSL version: OpenSSL 1.1.1i FIPS 8 Dec 2020
compiled with libuv version: 1.40.0
linked to libuv version: 1.40.0
compiled with libxml2 version: 2.9.10
linked to libxml2 version: 20910
compiled with json-c version: 0.14
linked to json-c version: 0.14
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
linked to maxminddb version: 1.4.3
threads support is enabled
default paths:
named configuration: /usr/local/etc/named.conf
rndc configuration: /usr/local/etc/rndc.conf
DNSSEC root key: /usr/local/etc/bind.keys
nsupdate session key: /usr/local/var/run/named/session.key
named PID file: /usr/local/var/run/named/named.pid
named lock file: /usr/local/var/run/named/named.lock
geoip-directory: /usr/share/GeoIP
Steps to reproduce
- run named on a machine where an unusually high number of CPUs is detected
- do not limit the number of CPUs used with the -n parameter
# lscpu
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 256
On-line CPU(s) list: 0-255
Thread(s) per core: 4
Core(s) per socket: 32
Socket(s): 2
NUMA node(s): 2
Vendor ID: Cavium
Model: 1
Model name: ThunderX2 99xx
Stepping: 0x1
Frequency boost: disabled
CPU max MHz: 2200.0000
CPU min MHz: 1000.0000
BogoMIPS: 400.00
L1d cache: 2 MiB
L1i cache: 2 MiB
L2 cache: 16 MiB
L3 cache: 64 MiB
NUMA node0 CPU(s): 0-127
NUMA node1 CPU(s): 128-255
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Branch predictor hardening
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm
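For comparison runs, the worker count can be capped with the -n parameter mentioned in the steps above. The following is a minimal sketch, not a tested recipe: `build_named_argv` is a hypothetical helper and the cap of 8 is an arbitrary value chosen for illustration. The remaining arguments mirror the command lines shown in this report.

```python
import os


def build_named_argv(detected_cpus, cap=None):
    """Build a named command line, optionally capping worker threads with -n.

    build_named_argv is a hypothetical helper for this report, not part of BIND.
    """
    argv = ["named"]
    if cap is not None:
        # Never ask for more workers than CPUs were detected.
        argv += ["-n", str(min(cap, detected_cpus))]
    argv += ["-u", "named", "-c", "/etc/named.conf", "-g"]
    return argv


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    # Without a cap: this is the setup that reproduces the issue on a many-CPU host.
    print(" ".join(build_named_argv(cpus)))
    # With a cap (arbitrary value): a run to compare memory use against.
    print(" ".join(build_named_argv(cpus, cap=8)))
```

Comparing the RSS of both runs on the same host should show whether memory use follows the number of worker threads.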
What is the current bug behavior?
# ps u -C lt-named
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
named 368968 26.0 27.5 42908208 18101652 pts/0 Sl 05:35 1:06 /root/bind9/bin/named/.libs/lt-named -u named -c /etc/named.conf -g
What is the expected correct behavior?
- BIND 9.11 has tiny memory demands compared to recent versions.
# bind 9.11.26 package of Fedora
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
named 288640 0.3 0.2 3679852 188712 ? Ssl 04:07 0:00 /usr/sbin/named -u named -c /etc/named.conf
# devel build BIND 9.17.8
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
named 289095 27.3 27.5 42581812 18102972 pts/0 Sl+ 04:14 0:16 /root/bind9/bin/named/.libs/lt-named -g -u named -c /etc/name
# devel build BIND 9.16.10
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
named 307237 77.5 28.3 42578444 18634528 pts/0 Sl+ 04:29 1:36 ./named -u named -c /etc/named.conf -g
# devel build BIND 9.11.26 (Extended Support Version) <id:0ea4f682f3>
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
named 327832 6.2 0.2 2760764 146140 pts/0 Sl+ 04:59 0:00 ./named -u named -c /etc/named.conf -g
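To put the ps figures side by side, here is a rough back-of-the-envelope calculation. The RSS values are copied from the output above (ps reports RSS in KiB) and the worker count of 256 comes from the startup log. Dividing RSS evenly across workers is an assumption made only for illustration. Nothing in this report proves that memory scales linearly with the worker count, and the worker count of the 9.11 runs is not shown.

```python
# RSS values in KiB, copied from the ps output in this report.
RSS_KIB = {
    "9.11.26 (Fedora package)": 188712,
    "9.11.26 (devel build)": 146140,
    "9.16.10 (devel build)": 18634528,
    "9.17.8 (devel build)": 18102972,
}
WORKERS = 256  # from the log line "found 256 CPUs, using 256 worker threads"


def mib(kib):
    """Convert KiB to MiB."""
    return kib / 1024


def per_worker_mib(rss_kib, workers=WORKERS):
    """Average RSS per worker thread; assumes an even split, which is unverified."""
    return mib(rss_kib) / workers


baseline = RSS_KIB["9.11.26 (devel build)"]
for version, rss in RSS_KIB.items():
    print(
        f"{version:26s} RSS {mib(rss):9.1f} MiB  "
        f"x{rss / baseline:6.1f} vs 9.11 devel  "
        f"{per_worker_mib(rss):6.2f} MiB/worker"
    )
```

Under that assumption, the 9.16 and 9.17 builds use more than 100 times the RSS of the 9.11 devel build, which works out to tens of MiB per worker thread.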
Relevant configuration files
Relevant logs and/or screenshots
21-Jan-2021 05:34:39.847 starting BIND 9.17.8 (Development Release) <id:6455657>
21-Jan-2021 05:34:39.847 running on Linux aarch64 5.10.8-200.fc33.aarch64 #1 SMP Sun Jan 17 19:21:29 UTC 2021
21-Jan-2021 05:34:39.847 built with '--enable-developer'
21-Jan-2021 05:34:39.847 running as: lt-named -u named -c /dev/null -g
21-Jan-2021 05:34:39.847 compiled by GCC 10.2.1 20201125 (Red Hat 10.2.1-9)
21-Jan-2021 05:34:39.847 compiled with OpenSSL version: OpenSSL 1.1.1i FIPS 8 Dec 2020
21-Jan-2021 05:34:39.847 linked to OpenSSL version: OpenSSL 1.1.1i FIPS 8 Dec 2020
21-Jan-2021 05:34:39.847 compiled with libxml2 version: 2.9.10
21-Jan-2021 05:34:39.847 linked to libxml2 version: 20910
21-Jan-2021 05:34:39.847 compiled with json-c version: 0.14
21-Jan-2021 05:34:39.847 linked to json-c version: 0.14
21-Jan-2021 05:34:39.847 compiled with zlib version: 1.2.11
21-Jan-2021 05:34:39.847 linked to zlib version: 1.2.11
21-Jan-2021 05:34:39.847 ----------------------------------------------------
21-Jan-2021 05:34:39.847 BIND 9 is maintained by Internet Systems Consortium,
21-Jan-2021 05:34:39.847 Inc. (ISC), a non-profit 501(c)(3) public-benefit
21-Jan-2021 05:34:39.847 corporation. Support and training for BIND 9 are
21-Jan-2021 05:34:39.847 available at https://www.isc.org/support
21-Jan-2021 05:34:39.847 ----------------------------------------------------
21-Jan-2021 05:34:39.847 adjusted limit on open files from 4096 to 1048576
21-Jan-2021 05:34:39.847 found 256 CPUs, using 256 worker threads
21-Jan-2021 05:34:39.847 using 256 UDP listeners per interface
21-Jan-2021 05:34:49.947 using up to 21000 sockets
21-Jan-2021 05:34:49.947 config.c: option 'trust-anchor-telemetry' is experimental and subject to change in the future
21-Jan-2021 05:34:49.947 loading configuration from '/dev/null'
21-Jan-2021 05:34:49.947 unable to open '/usr/local/etc/bind.keys'; using built-in keys instead
21-Jan-2021 05:34:50.027 looking for GeoIP2 databases in '/usr/share/GeoIP'
21-Jan-2021 05:34:50.027 opened GeoIP2 database '/usr/share/GeoIP/GeoLite2-Country.mmdb'
21-Jan-2021 05:34:50.027 opened GeoIP2 database '/usr/share/GeoIP/GeoLite2-City.mmdb'
21-Jan-2021 05:34:50.037 using default UDP/IPv4 port range: [32768, 60999]
21-Jan-2021 05:34:50.037 using default UDP/IPv6 port range: [32768, 60999]
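When comparing hosts, it helps to pull the detected CPU and worker counts straight from the startup log. The sketch below parses the "found 256 CPUs, using 256 worker threads" line shown above. The function name `worker_counts` is hypothetical, and accepting singular forms ("CPU", "thread") is an assumption about how the message might read on a single-CPU machine.

```python
import re

# Matches the startup line "found N CPUs, using M worker threads".
# The optional "s" for singular forms is an assumption, not taken from the log.
WORKER_LINE = re.compile(r"found (\d+) CPUs?, using (\d+) worker threads?")


def worker_counts(log_text):
    """Return (cpus, workers) from a named startup log, or None if the line is absent."""
    match = WORKER_LINE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "21-Jan-2021 05:34:39.847 found 256 CPUs, using 256 worker threads"
print(worker_counts(sample))  # (256, 256)
```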
Possible fixes
Lazily initialize some of the worker thread queues, so that they are allocated only when first used. The problem was noticed as a slowdown of the unit tests while checking MR !4582 (closed).
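The general idea of lazy initialization can be sketched as follows. This only illustrates the technique and is not BIND's code: the class `LazyWorkerQueues` and its methods are hypothetical, and the real queues live in BIND's C sources.

```python
import threading
from collections import deque


class LazyWorkerQueues:
    """Per-worker queues that are created on first use instead of at startup.

    Illustrative only: the class and its names are hypothetical and do not
    mirror BIND's internal data structures.
    """

    def __init__(self, workers):
        self._queues = [None] * workers  # placeholders only, no queue storage yet
        self._lock = threading.Lock()

    def get(self, worker_id):
        """Return the queue for worker_id, allocating it on first access."""
        queue = self._queues[worker_id]
        if queue is None:
            with self._lock:
                # Re-check under the lock so two threads cannot both allocate.
                queue = self._queues[worker_id]
                if queue is None:
                    queue = deque()
                    self._queues[worker_id] = queue
        return queue

    def allocated(self):
        """Number of queues that have actually been created."""
        return sum(q is not None for q in self._queues)


queues = LazyWorkerQueues(256)
queues.get(3).append("task")
print(queues.allocated())  # 1
```

With this pattern, a host with 256 detected CPUs pays only for the queues that workers actually touch, rather than for all of them at startup.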
Edited by Petr Menšík