BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues
2021-03-08T11:15:01Z

https://gitlab.isc.org/isc-projects/bind9/-/issues/2464
Possible double *free() in ns_interface_listentls()
2021-03-08T11:15:01Z
Artem Boldariev

`ns_interface_listentls()` calls `isc_tlsctx_free(&sslctx);` in its error path. I think this could lead to a double free of the memory allocated by OpenSSL, because `sslctx` is also destroyed during the `listenlist` destruction.
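The ownership rule at stake can be illustrated in isolation. The following is a minimal sketch, not BIND code: `tls_ctx_t`, `tls_ctx_new()`, `tls_ctx_free()`, `listen_tls()` and `setup_listener()` are hypothetical stand-ins. It shows one possible way to rule out a double free, assuming the caller keeps ownership of the context: the callee only borrows the pointer and never frees it, and the free function clears the owner's pointer so that a repeated call is harmless.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a TLS context; not the BIND or OpenSSL type. */
typedef struct tls_ctx {
	int placeholder;
} tls_ctx_t;

/* Number of contexts currently allocated; lets a test observe leaks. */
int live_contexts = 0;

tls_ctx_t *
tls_ctx_new(void) {
	tls_ctx_t *ctx = calloc(1, sizeof(*ctx));
	if (ctx != NULL) {
		live_contexts++;
	}
	return (ctx);
}

/* Free the context and clear the owner's pointer; a second call is a no-op. */
void
tls_ctx_free(tls_ctx_t **ctxp) {
	assert(ctxp != NULL);
	if (*ctxp == NULL) {
		return;
	}
	free(*ctxp);
	*ctxp = NULL;
	live_contexts--;
}

/* Borrowing callee: reports failure but leaves the context to its owner. */
int
listen_tls(tls_ctx_t *borrowed, int simulate_failure) {
	(void)borrowed;
	return (simulate_failure ? -1 : 0);
}

/* Owner: the only place where the context is released. */
int
setup_listener(int simulate_failure) {
	tls_ctx_t *ctx = tls_ctx_new();
	int result;

	if (ctx == NULL) {
		return (-1);
	}
	result = listen_tls(ctx, simulate_failure);
	tls_ctx_free(&ctx); /* single release, on the owner's pointer */
	tls_ctx_free(&ctx); /* harmless: the pointer is already NULL */
	return (result);
}
```

Note that freeing through the address of a by-value pointer parameter clears only the callee's local copy, so the caller's pointer would still refer to freed memory afterwards.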
```
static isc_result_t
ns_interface_listentls(ns_interface_t *ifp, isc_tlsctx_t *sslctx) {
	isc_result_t result;

	result = isc_nm_listentlsdns(
		ifp->mgr->nm, (isc_nmiface_t *)&ifp->addr, ns__client_request,
		ifp, ns__client_tcpconn, ifp, sizeof(ns_client_t),
		ifp->mgr->backlog, &ifp->mgr->sctx->tcpquota, sslctx,
		&ifp->tcplistensocket);
	if (result != ISC_R_SUCCESS) {
		isc_log_write(IFMGR_COMMON_LOGARGS, ISC_LOG_ERROR,
			      "creating TLS socket: %s",
			      isc_result_totext(result));
		isc_tlsctx_free(&sslctx);
		return (result);
	}

	/*
	 * We call this now to update the tcp-highwater statistic:
	 * this is necessary because we are adding to the TCP quota just
	 * by listening.
	 */
	result = ns__client_tcpconn(NULL, ISC_R_SUCCESS, ifp);
	if (result != ISC_R_SUCCESS) {
		isc_log_write(IFMGR_COMMON_LOGARGS, ISC_LOG_ERROR,
			      "updating TCP stats: %s",
			      isc_result_totext(result));
	}

	return (result);
}
```

March 2021 (9.11.29, 9.11.29-S1, 9.16.13, 9.16.13-S1, 9.17.11)
Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/2445
NSEC3 iterations considered harmful
2023-12-05T14:58:58Z
Matthijs Mekking <matthijs@isc.org>

### Summary
BIND does not limit the number of NSEC3 iterations; it accepts any value that fits in a 16-bit unsigned integer (up to 65535). Heavy query traffic for a zone with a high iteration count can effectively DDoS the resolver.
CVSS Score: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L - 5.3
### BIND version used
Affected versions: 9.11, 9.16
### Steps to reproduce
1. Set up an authoritative server with a DNSSEC-signed zone that uses a high iteration count:
`dnssec-signzone -3 - -H 65535 example.`, then configure `named` and start it.
2. Set up a validating resolver.
3. Run an NXDOMAIN-style attack.
### What is the current *bug* behavior?
Resolver has very low QPS.
### What is the expected *correct* behavior?
Resolver doesn't starve.
### Relevant configuration files
To do.
### Relevant logs and/or screenshots
N/A
### Possible fixes
From the resolver perspective, this situation is actually protocol compliant.
RFC 5155 says:
```
A zone owner MUST NOT use a value higher than shown in the table
below for iterations for the given key size. A resolver MAY treat a
response with a higher value as insecure, after the validator has
verified that the signature over the NSEC3 RR is correct.
+----------+------------+
| Key Size | Iterations |
+----------+------------+
| 1024 | 150 |
| 2048 | 500 |
| 4096 | 2,500 |
+----------+------------+
```
But it also acknowledges this is susceptible to attacks:
```
12.1.4. Using High Iteration Values
Since validators should treat responses containing NSEC3 RRs with
high iteration values as insecure, presence of just one signed NSEC3
RR with a high iteration value in a zone provides attackers with a
possible downgrade attack.
[...]
Using a high number of iterations also introduces an additional
denial-of-service opportunity against servers, since servers must
calculate several hashes per negative or wildcard response.
```
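The limits in the RFC 5155 table quoted above can be expressed as a small lookup. This is a minimal sketch under stated assumptions, not BIND's implementation: the names `nsec3_max_iterations()` and `nsec3_check_iterations()` are hypothetical, and mapping key sizes that fall between table rows to the next larger row is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on NSEC3 iterations for a key of key_bits bits, following
 * the RFC 5155 table (1024 -> 150, 2048 -> 500, 4096 -> 2500).
 * Assumption: sizes between rows use the next larger row.
 */
unsigned int
nsec3_max_iterations(unsigned int key_bits) {
	assert(key_bits > 0);
	if (key_bits <= 1024) {
		return (150);
	}
	if (key_bits <= 2048) {
		return (500);
	}
	return (2500);
}

typedef enum {
	NSEC3_ITERATIONS_OK,	 /* within the limit for this key size */
	NSEC3_TREAT_AS_INSECURE	 /* above the limit */
} nsec3_verdict_t;

/*
 * Decide how a validator might treat an NSEC3 RR. The caller is assumed
 * to have verified the signature over the NSEC3 RR already, as the RFC
 * text requires before a response may be treated as insecure.
 */
nsec3_verdict_t
nsec3_check_iterations(uint16_t iterations, unsigned int key_bits) {
	if (iterations > nsec3_max_iterations(key_bits)) {
		return (NSEC3_TREAT_AS_INSECURE);
	}
	return (NSEC3_ITERATIONS_OK);
}
```

With the value from the reproduction steps, `nsec3_check_iterations(65535, 4096)` yields `NSEC3_TREAT_AS_INSECURE`, since 65535 exceeds even the largest limit in the table.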
Proposed fixes:
When loading a zone (primary server):
- We could error if we try to load a zone with NSEC3 records with too high iteration count.
- Or we could treat it as garbage in/garbage out.
When transferring a zone (secondary server):
- Nothing much we can do about it.
When validating a response from this zone (resolver):
- Treat such NSEC3 records as insecure after validating (as suggested in the RFC).

October 2021 (9.11.36, 9.11.36-S1, 9.16.22, 9.16.22-S1, 9.17.19)
Michał Kępień

https://gitlab.isc.org/isc-projects/bind9/-/issues/2431
assertion failure in free_senddata (sock=<optimized out>) at netmgr/tlsdns.c:1294
2021-04-06T12:07:56Z
Ondřej Surý

From https://gitlab.isc.org/isc-projects/bind9/-/jobs/1432208:
```
D:dot:Core was generated by `/builds/isc-projects/bind9/bin/named/.libs/named -D dot-ns1 -X named.lock -m rec'.
D:dot:Program terminated with signal SIGABRT, Aborted.
D:dot:#0 0x0000000800fc2c2a in thr_kill () from /lib/libc.so.7
D:dot:[Current thread is 1 (LWP 100088)]
D:dot:#0 0x0000000800fc2c2a in thr_kill () from /lib/libc.so.7
D:dot:#1 0x0000000800fc1084 in raise () from /lib/libc.so.7
D:dot:#2 0x0000000800f37279 in abort () from /lib/libc.so.7
D:dot:#3 0x000000000023a5d2 in assertion_failed (file=<optimized out>, line=<optimized out>, type=isc_assertiontype_insist, cond=<optimized out>) at main.c:254
D:dot:#4 0x000000080031c50a in isc_assertion_failed (file=0x186f8 <error: Cannot access memory at address 0x186f8>, line=6, type=isc_assertiontype_require, cond=0x800fc2c4a <thr_self+10> "\017\202\264G") at assertions.c:46
D:dot:#5 0x000000080032dbc8 in isc___mem_put (ctx0=0x80133e000, ptr=0x805c1b010, size=6453, file=<optimized out>, line=0) at mem.c:1096
D:dot:#6 0x000000080032af39 in isc__mem_put (mctx=0x186f8, ptr=0x6, size=0, file=0x800fc2c4a <thr_self+10> "\017\202\264G", line=0) at mem.c:2439
D:dot:#7 0x0000000800308336 in free_senddata (sock=<optimized out>) at netmgr/tlsdns.c:1294
D:dot:#8 0x00000008003083c0 in tls_write_cb (req=<optimized out>, status=0) at netmgr/tlsdns.c:1306
D:dot:#9 0x0000000800c00dcc in ?? () from /usr/local/lib/libuv.so.1
D:dot:#10 0x0000000800c00717 in ?? () from /usr/local/lib/libuv.so.1
D:dot:#11 0x0000000800c06f09 in uv__io_poll () from /usr/local/lib/libuv.so.1
D:dot:#12 0x0000000800bf6241 in uv_run () from /usr/local/lib/libuv.so.1
D:dot:#13 0x00000008002f86ab in nm_thread (worker0=0x8013ed010) at netmgr/netmgr.c:557
D:dot:#14 0x0000000800dedfac in ?? () from /lib/libthr.so.3
D:dot:#15 0x0000000000000000 in ?? ()
D:dot:Backtrace stopped: Cannot access memory at address 0x7fffdfffe000
```

April 2021 (9.11.30/9.11.31, 9.11.30-S1/9.11.31-S1, 9.16.14/9.16.15, 9.16.14-S1/9.16.15-S1, 9.17.12)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/2401
use netmgr for dispatch
2022-01-26T11:33:41Z
Evan Hunt

Convert the dns/dispatch module (and its callers, dns/request and dns/resolver) to use the network manager.
This involves, first, refactoring dispatch to remove code that's no longer being used; moving all the calls to the isc_socket API into dispatch.c and adding functions such as `dns_dispatch_connect()` and `dns_dispatch_send()` to call them instead; replacing the isc_socket calls with isc_nm calls and then revising the dispatch API as needed to work with the new architecture.
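The intermediate step, in which callers go through dispatch functions instead of calling the socket layer directly, can be sketched generically. This is only an illustration under stated assumptions: every name below (`transport_ops_t`, `sketch_dispatch_t`, `sketch_dispatch_connect()`, `sketch_dispatch_send()`, the counting backend) is hypothetical and does not reflect the real signatures of `dns_dispatch_connect()` or `dns_dispatch_send()`. The point is that once all network calls sit behind one interface, the backend can be swapped without touching the callers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend interface: the only place that knows the socket API. */
typedef struct transport_ops {
	int (*connect)(void *state);
	int (*send)(void *state, const unsigned char *buf, size_t len);
} transport_ops_t;

/* Hypothetical dispatch object: callers see this, never the backend. */
typedef struct sketch_dispatch {
	const transport_ops_t *ops;
	void *state;
} sketch_dispatch_t;

int
sketch_dispatch_connect(sketch_dispatch_t *disp) {
	assert(disp != NULL && disp->ops != NULL);
	return (disp->ops->connect(disp->state));
}

int
sketch_dispatch_send(sketch_dispatch_t *disp, const unsigned char *buf,
		     size_t len) {
	assert(disp != NULL && disp->ops != NULL);
	return (disp->ops->send(disp->state, buf, len));
}

/* Stand-in backend that only records what it was asked to do. */
typedef struct counting_state {
	int connected;
	size_t bytes_sent;
} counting_state_t;

int
counting_connect(void *state) {
	counting_state_t *s = state;
	s->connected = 1;
	return (0);
}

int
counting_send(void *state, const unsigned char *buf, size_t len) {
	counting_state_t *s = state;
	(void)buf;
	if (!s->connected) {
		return (-1); /* sending before connecting is an error */
	}
	s->bytes_sent += len;
	return (0);
}

const transport_ops_t counting_ops = { counting_connect, counting_send };
```

Replacing `counting_ops` with a table backed by a different network layer would leave code that uses `sketch_dispatch_connect()` and `sketch_dispatch_send()` unchanged.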
When this is done, there will be only one remaining use of isc_socket in `named` - the route socket that's used for interface scanning. This will also complete the netmgr conversion of `delv`, `mdig`, and `nsupdate`.

October 2021 (9.11.36, 9.11.36-S1, 9.16.22, 9.16.22-S1, 9.17.19)
Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/2389
BIND 9.16.10: critical: xfrout.c:1643: INSIST(xfr->sends == 0) failed
2022-01-12T16:13:08Z
Ondřej Surý

Reported to security-officer@isc.org
The core can't be attached to the issue

March 2021 (9.11.29, 9.11.29-S1, 9.16.13, 9.16.13-S1, 9.17.11)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2354
[CVE-2020-8625] ZDI-CAN-12302: ISC BIND TKEY Query Heap-based Buffer Overflow Remote Code Execution Vulnerability
2023-06-19T09:07:00Z
Cathy Almond

### CVE-specific actions
- [x] Assign a CVE identifier
- [x] Determine CVSS score
- [x] Determine the range of BIND versions affected (including the Subscription Edition)
- [x] Determine whether workarounds for the problem exist
- [x] Prepare a detailed description of the problem which should include the following by default:
- instructions for reproducing the problem (a system test is good enough)
- explanation of code flow which triggers the problem (a system test is *not* good enough)
- [x] Prepare a private merge request containing the following items in separate commits:
- a test for the issue (may be moved to a separate merge request for deferred merging)
- a fix for the issue
- documentation updates (`CHANGES`, release notes, anything else applicable)
- [x] Ensure the merge request from the previous step is reviewed by SWENG staff and has no outstanding discussions
- [x] Ensure the documentation changes introduced by the merge request addressing the problem are reviewed by Support and Marketing staff
- [x] Prepare backports of the merge request addressing the problem for all affected (and still maintained) BIND branches (backporting might affect the issue's scope and/or description)
- [x] Prepare a standalone patch for the last stable release of each affected (and still maintained) BIND branch
### Release-specific actions
- [x] Create/update the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle: isc-private/bind9#34
- [x] Reserve a block of `CHANGES` placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined
- [x] Ensure the merge requests containing CVE fixes are merged into `security-*` branches in CVE identifier order
---
As reported to ISC Security Officer:
ZDI-CAN-12302: ISC BIND TKEY Query Heap-based Buffer Overflow Remote Code Execution Vulnerability
-- CVSS -----------------------------------------
8.1: AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H
-- ABSTRACT -------------------------------------
Trend Micro's Zero Day Initiative has identified a vulnerability affecting the following products:
ISC - BIND
-- VULNERABILITY DETAILS ------------------------
* Version tested: 9.16.9
* Installer file: bind-9.16.9.tar.xz
* Platform tested: ubuntu 20.04.1 desktop edition
---
### Analysis
```
the bug is CVE-2006-5989; ISC did not merge the patch
https://bugzilla.redhat.com/show_bug.cgi?id=206736
it leads to an off-by-4 heap overflow
it affects the latest Current-Stable release, 9.16.9
it requires the tkey-gssapi-keytab option in named.conf
```
~~~C++
static int
der_get_oid(const unsigned char *p, size_t len, oid *data, size_t *size) {
	...
	data->components = malloc(len * sizeof(*data->components));
	if (data->components == NULL) {
		return (ENOMEM);
	}
	data->components[0] = (*p) / 40;
	data->components[1] = (*p) % 40; /* (1) two elements are written... */
	--len; /* (2) ...but len is reduced by only one */
	++p;
	for (n = 2; len > 0U; ++n) {
		unsigned u = 0;
		do {
			--len;
			u = u * 128 + (*p++ % 128);
		} while (len > 0U && p[-1] & 0x80);
		data->components[n] = u; /* (3) off-by-4: can write one element past the end */
	}
	...
	return (0);
}
~~~
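The arithmetic behind the overflow is that the first input byte yields two components and every later byte yields at most one, so `len` input bytes can decode to as many as `len + 1` components. The following standalone sketch shows one possible way to size the buffer accordingly; it is an illustration under that assumption, not necessarily the patch that was merged. The `oid` struct layout and the name `decode_oid_sketch()` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical OID container; the real struct may differ. */
typedef struct oid {
	size_t length;
	unsigned *components;
} oid;

/*
 * Decode the content bytes of a DER-encoded OID. The buffer holds
 * len + 1 elements: two for the first byte, at most one per later byte.
 * On success the caller owns data->components and must free() it.
 */
int
decode_oid_sketch(const unsigned char *p, size_t len, oid *data) {
	size_t n;

	assert(p != NULL && data != NULL);
	if (len < 1) {
		return (EINVAL);
	}

	data->components = malloc((len + 1) * sizeof(*data->components));
	if (data->components == NULL) {
		return (ENOMEM);
	}

	data->components[0] = (*p) / 40;
	data->components[1] = (*p) % 40;
	--len;
	++p;
	for (n = 2; len > 0U; ++n) {
		unsigned u = 0;
		do {
			--len;
			u = u * 128 + (*p++ % 128); /* base-128 digits */
		} while (len > 0U && (p[-1] & 0x80));
		data->components[n] = u;
	}
	data->length = n;
	return (0);
}
```

The worst case is an input in which no byte has the continuation bit set: three such bytes decode to four components, which would not fit in a buffer of `len` elements.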
debug log
```
(gdb) b *0x18E27E+0x7fb83fd83000
Breakpoint 1 at 0x7fb83ff1127e
(gdb) b *0x18E309+0x7fb83fd83000
Breakpoint 2 at 0x7fb83ff11309
(gdb) c
Continuing.
[Switching to Thread 0x7fb83d1f8700 (LWP 77138)]
Thread 2 "isc-net-0000" hit Breakpoint 1, 0x00007fb83ff1127e in ?? () from /lib/x86_64-linux-gnu/libdns.so.1601
(gdb) x/i $pc
=> 0x7fb83ff1127e: call 0x7fb83fdab5d0 <malloc@plt>
(gdb) i r $rdi
rdi 0x28 40
(gdb) ni
0x00007fb83ff11283 in ?? () from /lib/x86_64-linux-gnu/libdns.so.1601
(gdb) x/30xg $rax-0x10
0x7fb82c0164b0: 0x0000000000000000 0x0000000000000035
0x7fb82c0164c0: 0x00007fb82c016650 0x0000000000000000
0x7fb82c0164d0: 0x0000000000000000 0x0000000000000000
0x7fb82c0164e0: 0x0000000000000000 0x0000000000000025
0x7fb82c0164f0: 0x00007fb82c016230 0x00007fb83f55fc20
0x7fb82c016500: 0x0000000000000000 0x0000000000000025
0x7fb82c016510: 0x00007fb82c016530 0x00007fb82c0008d0
0x7fb82c016520: 0x0000000000000000 0x0000000000000025
0x7fb82c016530: 0x0000000000000000 0x00007fb82c0008d0
0x7fb82c016540: 0x0000000000000000 0x00000000000000b5
0x7fb82c016550: 0x00007fb82c015120 0x00007fb82c0008d0
0x7fb82c016560: 0x0000000000000000 0x00000000ffffffff
0x7fb82c016570: 0x0000000000000000 0x0000000000000000
0x7fb82c016580: 0x0000000000000000 0x0000000000000000
0x7fb82c016590: 0x0000000000000000 0x0000000000000000
(gdb) c
Continuing.
Thread 2 "isc-net-0000" hit Breakpoint 2, 0x00007fb83ff11309 in ?? () from /lib/x86_64-linux-gnu/libdns.so.1601
(gdb) x/i $pc
=> 0x7fb83ff11309: mov DWORD PTR [rax+rcx*4],edi // overwrite next chunk header
(gdb) x/xg $rax+$rcx*4
0x7fb82c0164e8: 0x0000000000000025
(gdb) bt
#0 0x00007fb83ff11309 in ?? () from /lib/x86_64-linux-gnu/libdns.so.1601
#1 0x00007fb83ff1144a in ?? () from /lib/x86_64-linux-gnu/libdns.so.1601
#2 0x00007fb83ff11a2d in gss_accept_sec_context_spnego () from /lib/x86_64-linux-gnu/libdns.so.1601
#3 0x00007fb83ff1d083 in dst_gssapi_acceptctx () from /lib/x86_64-linux-gnu/libdns.so.1601
#4 0x00007fb83feb65cd in dns_tkey_processquery () from /lib/x86_64-linux-gnu/libdns.so.1601
#5 0x00007fb83ffcf27f in ns_query_start () from /lib/x86_64-linux-gnu/libns.so.1601
#6 0x00007fb83ffb2131 in ns__client_request () from /lib/x86_64-linux-gnu/libns.so.1601
#7 0x00007fb83fce2b26 in ?? () from /lib/x86_64-linux-gnu/libisc.so.1601
#8 0x00007fb83fce33cd in ?? () from /lib/x86_64-linux-gnu/libisc.so.1601
#9 0x00007fb83fcdf74c in ?? () from /lib/x86_64-linux-gnu/libisc.so.1601
#10 0x00007fb83f470b01 in ?? () from /lib/x86_64-linux-gnu/libuv.so.1
#11 0x00007fb83f471638 in ?? () from /lib/x86_64-linux-gnu/libuv.so.1
#12 0x00007fb83f476ae0 in uv__io_poll () from /lib/x86_64-linux-gnu/libuv.so.1
#13 0x00007fb83f4667ac in uv_run () from /lib/x86_64-linux-gnu/libuv.so.1
#14 0x00007fb83fcdec2d in ?? () from /lib/x86_64-linux-gnu/libisc.so.1601
#15 0x00007fb83f7b4609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#16 0x00007fb83f6d5293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) c
Continuing.
Thread 3 "isc-net-0001" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fb83c8b6700 (LWP 77139)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb)
```
-- CREDIT ---------------------------------------
This vulnerability was discovered by:
Anonymous working with Trend Micro Zero Day Initiative

February 2021 (9.11.28, 9.11.28-S1, 9.16.12, 9.16.12-S1, 9.17.10)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2340
Enable logging of rpz re-writes to dnstap.
2024-03-27T13:54:38Z
Peter Davies

### Description
Enable logging of rpz re-writes to dnstap.
This would add the ability to send the rpz rewrite information that is logged under the `rpz` category to the dnstap output stream.
[RT #17273](https://support.isc.org/Ticket/Display.html?id=17273)

Not planned
Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/2320
Making netmgr callbacks asynchronous-only crippled performance
2020-12-02T22:41:43Z
Michał Kępień

!4386 caused a 40% performance drop in all tested Perflab scenarios:
- https://perflab.isc.org/#/config/run/5bf1959c83ba91a870b2976b/
- https://perflab.isc.org/#/config/run/5bf195a883ba91a870b2976c/
- https://perflab.isc.org/#/config/run/5bf195c083ba91a870b2976e/
- https://perflab.isc.org/#/config/run/5bf195dd83ba91a870b2976f/
Fortunately, that MR was only merged into `main` so far, but this
problem must be addressed before [recent netmgr changes](#2246) are
backported to `v9_16`.

December 2020 (9.11.26, 9.11.26-S1, 9.16.10, 9.16.10-S1, 9.17.8)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/2227
BIND 9.16.8 assertion failure
2021-01-25T09:03:27Z
Anand Buddhdev

### Summary
BIND crashed with an assertion failure in netmgr.c
### BIND version used
```
BIND 9.16.8 (Stable Release) <id:539f9f0>
running on Linux x86_64 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019
built by make with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc/named' '--disable-static' '--with-pic' '--without-python' '--with-libtool' '--without-lmdb' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DISC_MEM_USE_INTERNAL_MALLOC=0 -fno-omit-frame-pointer' 'LDFLAGS=-Wl,-z,relro ' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-39)
compiled with OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
linked to OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
compiled with libuv version: 1.40.0
linked to libuv version: 1.40.0
compiled with libxml2 version: 2.9.1
linked to libxml2 version: 20901
compiled with json-c version: 0.11
linked to json-c version: 0.11
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
threads support is enabled
default paths:
named configuration: /etc/named/named.conf
rndc configuration: /etc/named/rndc.conf
DNSSEC root key: /etc/named/bind.keys
nsupdate session key: /var/run/named/session.key
named PID file: /var/run/named/named.pid
named lock file: /var/run/named/named.lock
```
### Steps to reproduce
I don't know.
### What is the current *bug* behavior?
BIND crashed and stopped serving queries.
### What is the expected *correct* behavior?
BIND should not crash.
### Relevant configuration files
```
acl "internal" {
key "main.ripe.net";
};
logging {
channel "default" {
file "/var/log/named/named.log";
severity info;
print-time yes;
print-category yes;
};
channel "ratelimit" {
file "/var/log/named/ratelimit.ringlog" versions 10 size 10485760;
print-time yes;
};
category "default" {
"default";
};
category "rate-limit" {
"ratelimit";
};
category "update" {
"null";
};
category "update-security" {
"null";
};
};
masters "hidden-main" {
IPv6-1 key "main.ripe.net";
IPv6-2 key "main.ripe.net";
IPv4-1 key "main.ripe.net";
IPv4-2 key "main.ripe.net";
};
options {
answer-cookie no;
directory "/var/named";
keep-response-order {
"any";
};
listen-on {
127.0.0.1/32;
IPv4/32;
193.0.14.129/32;
193.0.15.129/32;
};
listen-on-v6 {
::1/128;
IPv6/128;
2001:7fd::1/128;
2001:7fd:15::1/128;
};
server-id hostname;
tcp-clients 1000;
version "9.16";
dnssec-validation no;
minimal-responses yes;
recursion no;
allow-transfer {
"internal";
};
max-journal-size 10485760;
notify explicit;
zero-no-soa-ttl no;
zone-statistics none;
};
key "main.ripe.net" {
algorithm "hmac-sha256";
secret "????????????????????????????????????????????";
};
zone "." {
type slave;
file ".zone";
masters {
"hidden-main";
};
allow-transfer {
"any";
};
};
zone "arpa." {
type slave;
file "arpa.zone";
masters {
"hidden-main";
};
allow-transfer {
"any";
};
};
zone "root-servers.net." {
type slave;
file "root-servers.net.zone";
masters {
"hidden-main";
};
allow-transfer {
"any";
};
};
```
### Relevant logs and/or screenshots
```
22-Oct-2020 12:45:30.655 general: netmgr.c:1176: REQUIRE(((__builtin_expect(!!((handle) != ((void *)0)), 1) && __builtin_expect(!!(((const isc__magic_t *)(handle))->magic == ((('N') << 24 | ('M') << 16 | ('H') << 8 | ('D')))), 1)) && __atomic_load_n(&(handle)->references, memory_order_seq_cst) > 0)) failed, back trace
22-Oct-2020 12:45:30.655 general: #0 0x42b597 in ??
22-Oct-2020 12:45:30.655 general: #1 0x7f251d7ea2da in ??
22-Oct-2020 12:45:30.655 general: #2 0x7f251d801750 in ??
22-Oct-2020 12:45:30.655 general: #3 0x7f251d8088bc in ??
22-Oct-2020 12:45:30.655 general: #4 0x7f251ee402a7 in ??
22-Oct-2020 12:45:30.655 general: #5 0x7f251ee41c6a in ??
22-Oct-2020 12:45:30.655 general: #6 0x7f251ee50d2c in ??
22-Oct-2020 12:45:30.655 general: #7 0x7f251ee586ee in ??
22-Oct-2020 12:45:30.655 general: #8 0x7f251ee59305 in ??
22-Oct-2020 12:45:30.655 general: #9 0x7f251ee60a6a in ??
22-Oct-2020 12:45:30.655 general: #10 0x7f251ee5d0d1 in ??
22-Oct-2020 12:45:30.655 general: #11 0x7f251ee5f6aa in ??
22-Oct-2020 12:45:30.655 general: #12 0x7f251ee5fd36 in ??
22-Oct-2020 12:45:30.655 general: #13 0x7f251ee608ea in ??
22-Oct-2020 12:45:30.655 general: #14 0x7f251ee44fb0 in ??
22-Oct-2020 12:45:30.655 general: #15 0x7f251d807bac in ??
22-Oct-2020 12:45:30.655 general: #16 0x7f251d807feb in ??
22-Oct-2020 12:45:30.655 general: #17 0x7f251d8042a4 in ??
22-Oct-2020 12:45:30.655 general: #18 0x7f251cc2e164 in ??
22-Oct-2020 12:45:30.655 general: #19 0x7f251cc2ee7c in ??
22-Oct-2020 12:45:30.655 general: #20 0x7f251cc348c3 in ??
22-Oct-2020 12:45:30.655 general: #21 0x7f251cc240d0 in ??
22-Oct-2020 12:45:30.655 general: #22 0x7f251d8037a9 in ??
22-Oct-2020 12:45:30.655 general: #23 0x7f251c396ea5 in ??
22-Oct-2020 12:45:30.655 general: #24 0x7f251c0bf8dd in ??
22-Oct-2020 12:45:30.655 general: exiting (due to assertion failure)
```
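The failed `REQUIRE` in the log above expands to three conditions: the handle pointer is not NULL, its magic field equals the four characters `N`, `M`, `H`, `D` packed into 32 bits, and its reference count is greater than zero. A minimal standalone sketch of that kind of validity check follows; the type and function names (`handle_t`, `handle_init()`, `handle_valid()`, `handle_detach()`) are hypothetical and this is not the netmgr code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a 32-bit magic number. */
#define SKETCH_MAGIC(a, b, c, d)                                  \
	((uint32_t)(a) << 24 | (uint32_t)(b) << 16 |              \
	 (uint32_t)(c) << 8 | (uint32_t)(d))
#define HANDLE_MAGIC SKETCH_MAGIC('N', 'M', 'H', 'D')

/* Hypothetical reference-counted handle. */
typedef struct handle {
	uint32_t magic;
	atomic_int references;
} handle_t;

void
handle_init(handle_t *h) {
	h->magic = HANDLE_MAGIC;
	atomic_init(&h->references, 1);
}

/* True only for a live handle: non-NULL, right magic, refcount above 0. */
bool
handle_valid(handle_t *h) {
	return (h != NULL && h->magic == HANDLE_MAGIC &&
		atomic_load(&h->references) > 0);
}

/* Drop one reference; the last detach invalidates the magic. */
void
handle_detach(handle_t *h) {
	assert(handle_valid(h));
	if (atomic_fetch_sub(&h->references, 1) == 1) {
		h->magic = 0;
	}
}
```

A check of this shape fails when a handle is used after its last reference was dropped, which is one situation in which such an assertion can fire; the log alone does not say which condition failed here.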
### Possible fixes
I don't know.

November 2020 (9.11.25, 9.11.25-S1, 9.16.9, 9.16.9-S1, 9.17.7)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2221
Create unit tests for netmgr
2020-12-03T10:57:40Z
Ondřej Surý

December 2020 (9.11.26, 9.11.26-S1, 9.16.10, 9.16.10-S1, 9.17.8)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/2166
bind 9.16.7 trap divide error
2020-09-24T11:36:16Z
Klaus Hackenberg

Today I downloaded and compiled bind 9.16.7 on RHEL7. I cannot start the compiled named because of the following error:
```
Sep 17 09:44:00 vmrz0264 named[21450]: starting BIND 9.16.7 (Stable Release) <id:6fd3eb7>
Sep 17 09:44:00 vmrz0264 named[21450]: running on Linux x86_64 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 11 19:12:04 EDT 2020
Sep 17 09:44:00 vmrz0264 named[21450]: built with '--prefix=/usr/local/adm' '--sysconfdir=/etc' '--with-libxml2' '--with-openssl' '--with-tuning=large' '--without-lmdb'
Sep 17 09:44:00 vmrz0264 named[21450]: running as: named -u named -c /etc/named.conf -t /var/named/chroot
Sep 17 09:44:00 vmrz0264 named[21450]: compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-39)
Sep 17 09:44:00 vmrz0264 named[21450]: compiled with OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
Sep 17 09:44:00 vmrz0264 named[21450]: linked to OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
Sep 17 09:44:00 vmrz0264 named[21450]: compiled with libxml2 version: 2.9.1
Sep 17 09:44:00 vmrz0264 named[21450]: linked to libxml2 version: 20901
Sep 17 09:44:00 vmrz0264 named[21450]: compiled with zlib version: 1.2.7
Sep 17 09:44:00 vmrz0264 named[21450]: linked to zlib version: 1.2.7
Sep 17 09:44:00 vmrz0264 named[21450]: ----------------------------------------------------
Sep 17 09:44:00 vmrz0264 named[21450]: BIND 9 is maintained by Internet Systems Consortium,
Sep 17 09:44:00 vmrz0264 named[21450]: Inc. (ISC), a non-profit 501(c)(3) public-benefit
Sep 17 09:44:00 vmrz0264 named[21450]: corporation. Support and training for BIND 9 are
Sep 17 09:44:00 vmrz0264 named[21450]: available at https://www.isc.org/support
Sep 17 09:44:00 vmrz0264 named[21450]: ----------------------------------------------------
Sep 17 09:44:00 vmrz0264 named[21450]: adjusted limit on open files from 4096 to 1048576
Sep 17 09:44:00 vmrz0264 named[21450]: found 4 CPUs, using 4 worker threads
Sep 17 09:44:00 vmrz0264 named[21450]: using 4 UDP listeners per interface
Sep 17 09:44:00 vmrz0264 named[21450]: using up to 21000 sockets
Sep 17 09:44:00 vmrz0264 named[21450]: loading configuration from '/etc/named.conf'
Sep 17 09:44:00 vmrz0264 named[21450]: unable to open '/etc/bind.keys'; using built-in keys instead
Sep 17 09:44:00 vmrz0264 named[21450]: statistics-channels: JSON library missing, only XML stats will be available
Sep 17 09:44:00 vmrz0264 named[21450]: statistics channel listening on 127.0.0.1#8053
Sep 17 09:44:00 vmrz0264 named[21450]: using default UDP/IPv4 port range: [1024, 65535]
Sep 17 09:44:00 vmrz0264 named[21450]: using default UDP/IPv6 port range: [1024, 65535]
Sep 17 09:44:00 vmrz0264 named[21450]: listening on IPv4 interface lo, 127.0.0.1#53
Sep 17 09:44:00 vmrz0264 named[21450]: listening on IPv4 interface eth0, 10.147.32.12#53
Sep 17 09:44:00 vmrz0264 named[21450]: listening on IPv4 interface eth0, 10.147.32.11#53
Sep 17 09:44:00 vmrz0264 named[21450]: listening on IPv4 interface eth1, 134.147.30.198#53
Sep 17 09:44:00 vmrz0264 named[21450]: IPv6 socket API is incomplete; explicitly binding to each IPv6 address separately
Sep 17 09:44:00 vmrz0264 named[21450]: listening on IPv6 interface lo, ::1#53
Sep 17 09:44:00 vmrz0264 named[21450]: listening on IPv6 interface eth0, fe80::250:56ff:fe8d:ac2%2#53
Sep 17 09:44:00 vmrz0264 named[21450]: listening on IPv6 interface eth1, fe80::250:56ff:fe8d:4ab8%3#53
Sep 17 09:44:00 vmrz0264 named[21450]: generating session key for dynamic DNS
Sep 17 09:44:00 vmrz0264 named[21450]: sizing zone task pool based on 655 zones
Sep 17 09:44:00 vmrz0264 named[21450]: none:98: 'max-cache-size 90%' - setting to 175921860444MB (out of 17592186044415MB)
Sep 17 09:44:00 vmrz0264 kernel: traps: named[21455] trap divide error ip:6345fd sp:7f49610ea190 error:0 in named[400000+313000]
Sep 17 09:44:00 vmrz0264 abrt-hook-ccpp: Process 21450 (named) of user 25 killed by SIGFPE - dumping core
Sep 17 09:44:01 vmrz0264 systemd: named-chroot.service: control process exited, code=exited status=1
Sep 17 09:44:01 vmrz0264 systemd: Failed to start Berkeley Internet Name Domain (DNS).
```

October 2020 (9.11.24, 9.11.24-S1, 9.16.8, 9.16.8-S1, 9.17.6)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2165
data race with stats channel on main
2020-12-03T09:53:53Z
Mark Andrews

Job [#1162539](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1162539) failed for d5aac19e281f65cacf954ff0c83fcf95b0c40676:
core available
```
==14330==ERROR: AddressSanitizer: unknown-crash on address 0x7f8a499cd610 at pc 0x7f8a5955b314 bp 0x7f8a4ab102a0 sp 0x7f8a4ab10298
READ of size 2 at 0x7f8a499cd610 thread T11
#0 0x7f8a5955b313 in isc__nmsocket_init netmgr/netmgr.c:959
#1 0x7f8a59569eb1 in isc__nm_async_tcpchildaccept netmgr/tcp.c:444
#2 0x7f8a595608d4 in process_queue netmgr/netmgr.c:628
#3 0x7f8a5956187f in async_cb netmgr/netmgr.c:596
#4 0x7f8a56bcd667 (/usr/lib/x86_64-linux-gnu/libuv.so.1+0x10667)
#5 0x7f8a56bdc4af in uv__io_poll (/usr/lib/x86_64-linux-gnu/libuv.so.1+0x1f4af)
#6 0x7f8a56bcdf84 in uv_run (/usr/lib/x86_64-linux-gnu/libuv.so.1+0x10f84)
#7 0x7f8a59560a59 in nm_thread netmgr/netmgr.c:500
#8 0x7f8a56b99fa2 in start_thread /build/glibc-vjB4T1/glibc-2.28/nptl/pthread_create.c:486
#9 0x7f8a55bb34ce in clone (/lib/x86_64-linux-gnu/libc.so.6+0xf94ce)
Address 0x7f8a499cd610 is located in stack of thread T13
SUMMARY: AddressSanitizer: unknown-crash netmgr/netmgr.c:959 in isc__nmsocket_init
Shadow bytes around the buggy address:
0x0ff1c9331a70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff1c9331a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff1c9331a90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff1c9331aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff1c9331ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ff1c9331ac0: 00 00[00]00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff1c9331ad0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff1c9331ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff1c9331af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff1c9331b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ff1c9331b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Thread T11 created by T0 here:
#0 0x7f8a599a1db0 in __interceptor_pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x50db0)
#1 0x7f8a59696c5a in isc_thread_create pthreads/thread.c:73
#2 0x7f8a59552176 in isc_nm_start netmgr/netmgr.c:223
#3 0x557769fe5333 in create_managers /builds/isc-projects/bind9/bin/named/main.c:909
#4 0x557769fe5333 in setup /builds/isc-projects/bind9/bin/named/main.c:1223
#5 0x557769fe5333 in main /builds/isc-projects/bind9/bin/named/main.c:1523
#6 0x7f8a55ade09a in __libc_start_main ../csu/libc-start.c:308
==14330==ABORTING
```

December 2020 (9.11.26, 9.11.26-S1, 9.16.10, 9.16.10-S1, 9.17.8)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2091
9.16.6 insist failure
2021-01-21T09:54:38Z
Brian Conry

Received by security-officer:
Hi,
I just upgraded the four nodes in our anycast resolver cluster to
9.16.6. However, shortly after starting, one of them decided to
exit, and in the log I find:
```
Aug 21 14:00:26 res named[20987]: resolver.c:5125: INSIST(dns_name_issubdomain(&fctx->name, &fctx->domain)) failed, back trace
Aug 21 14:00:26 res named[20987]: #0 0x41f368 in ??
Aug 21 14:00:26 res named[20987]: #1 0x7a49a5e168dd in ??
Aug 21 14:00:26 res named[20987]: #2 0x7a49a72fa0c7 in ??
Aug 21 14:00:26 res named[20987]: #3 0x7a49a72fbdd1 in ??
Aug 21 14:00:26 res named[20987]: #4 0x7a49a7300818 in ??
Aug 21 14:00:26 res named[20987]: #5 0x7a49a73048a8 in ??
Aug 21 14:00:26 res named[20987]: #6 0x7a49a7305395 in ??
Aug 21 14:00:26 res named[20987]: #7 0x7a49a7306831 in ??
Aug 21 14:00:26 res named[20987]: #8 0x7a49a5e3a317 in ??
Aug 21 14:00:26 res named[20987]: #9 0x7a49a340c1d8 in ??
Aug 21 14:00:26 res named[20987]: #10 0x7a49a2e87af0 in ??
Aug 21 14:00:26 res named[20987]: exiting (due to assertion failure)
```
this instance was started some minutes earlier:
Aug 21 13:48:31 res named[20987]: starting BIND 9.16.6 (Stable Release) <id:25846cf>
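The failed assertion in the log above requires the fetch's name to be equal to, or underneath, the fetch's domain. As a rough illustration of that relation only, here is a simplified sketch on plain text names. It is not BIND's implementation: the helper names `labels` and `is_subdomain` are hypothetical, and escaped dots inside labels are ignored.

```python
def labels(name: str) -> list:
    """Split a DNS name into lowercase labels, ignoring the trailing root dot."""
    stripped = name.rstrip(".").lower()
    return stripped.split(".") if stripped else []


def is_subdomain(name: str, domain: str) -> bool:
    """Return True if `name` equals `domain` or lies underneath it."""
    name_labels = labels(name)
    domain_labels = labels(domain)
    if len(domain_labels) > len(name_labels):
        return False
    # Compare the rightmost labels of `name` against all labels of `domain`.
    return name_labels[len(name_labels) - len(domain_labels):] == domain_labels


# Hypothetical names, chosen only to exercise both outcomes.
print(is_subdomain("www.example.com.", "example.com."))
print(is_subdomain("example.org.", "example.com."))
```

The comparison is label by label rather than a string suffix match, so `badexample.com.` is not treated as being under `example.com.`.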
I wonder if this is related to an incomplete fix (?) of
CVE-2020-8621; this name server is doing forwarding via
```
options {
forwarders {
[redacted];
[redacted];
};
forward first;
};
```
It ran with "qname-minimization relaxed;" at the time (explicitly
configured), I have for now changed it to "off".January 2021 (9.11.27, 9.11.27-S1, 9.16.11, 9.16.11-S1, 9.17.9)https://gitlab.isc.org/isc-projects/bind9/-/issues/20849.11 data race in dispatch_test2021-04-02T09:09:16ZMark Andrews9.11 data race in dispatch_test```
WARNING: ThreadSanitizer: data race
Read of size 4 at 0x000000000001 by main thread:
#0 dispatch_getnext lib/dns/tests/dispatch_test.c:327
#1 <null> <null>
#2 __libc_start_main ../csu/libc-start.c:308
Previous write of size 4 at 0x000000000001 by thread T1 (mutexes: write M1):
#0 response lib/dns/tests/dispatch_test.c:234
#1 dispatch lib/isc/task.c:1157
#2 run lib/isc/task.c:1331
#3 <null> <null>
Location is global 'responses' of size 4 at 0x000000000001
Mutex M1 (0x000000000009) created at:
#0 pthread_mutex_init <null>
#1 isc__mutex_init lib/isc/pthreads/mutex.c:287
#2 dispatch_getnext lib/dns/tests/dispatch_test.c:273
#3 <null> <null>
#4 __libc_start_main ../csu/libc-start.c:308
Thread T1 (running) created by main thread at:
#0 pthread_create <null>
#1 isc_thread_create lib/isc/pthreads/thread.c:60
#2 isc__taskmgr_create lib/isc/task.c:1468
#3 isc_taskmgr_create lib/isc/task.c:2109
#4 create_managers lib/dns/tests/dnstest.c:118
#5 dns_test_begin lib/dns/tests/dnstest.c:192
#6 _setup lib/dns/tests/dispatch_test.c:53
#7 <null> <null>
#8 __libc_start_main ../csu/libc-start.c:308
SUMMARY: ThreadSanitizer: data race lib/dns/tests/dispatch_test.c:327 in dispatch_getnext
```September 2020 (9.11.23, 9.11.23-S1, 9.16.7, 9.17.5)https://gitlab.isc.org/isc-projects/bind9/-/issues/2075Enable the implicit "max-cache-size 90%;" default to be overridden2020-11-13T18:31:42ZMichał KępieńEnable the implicit "max-cache-size 90%;" default to be overridden!3865 caused RBT hash tables to be pre-allocated. This makes `named`
use more memory immediately from startup; how much more memory it uses
depends on the size of memory available on the host as the default value
of `max-cache-size` is `90%` (and the hash table size is derived from
that value).
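To make the scale of the problem concrete, the arithmetic can be sketched as follows. This is only an illustration under stated assumptions: the host size and instance count are hypothetical, the function names are invented for the example, and the figures are configured upper limits, not the amount actually pre-allocated for the hash tables.

```python
def default_cache_limit_mb(host_memory_mb: int, percent: int = 90) -> int:
    """Cache limit implied by a percentage-based max-cache-size (integer MB)."""
    return host_memory_mb * percent // 100


def combined_limit_mb(host_memory_mb: int, instances: int, percent: int = 90) -> int:
    """Sum of the limits when every instance sizes itself from total host memory."""
    return instances * default_cache_limit_mb(host_memory_mb, percent)


# Hypothetical CI host: 16 GiB of RAM running 8 named instances in parallel.
host_mb = 16 * 1024
print(default_cache_limit_mb(host_mb))  # per-instance limit in MB
print(combined_limit_mb(host_mb, 8))    # combined limits across all instances in MB
```

Because each instance derives its limit from the whole host, the combined limits exceed physical memory as soon as more than one instance runs with the default.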
This is not expected to cause problems on systems which run a single
instance of BIND, but it may trigger memory use issues e.g. on CI hosts,
which run numerous instances of BIND in parallel and each of these
instances assumes it is okay for it to use all of the memory available
on the host. The most prominent display of that problem was addressed
by !3919, but certain issues are still manifesting themselves
intermittently - most notably, the FreeBSD QEMU VMs are often being
killed off, which causes the FreeBSD GitLab CI jobs to appear to be
hung, e.g.:
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/1082066
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/1082411
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/1082632
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/1082794
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/1083286
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/1083287
Since `named` instances used in BIND system tests do not really need
large caches, what would address these memory use issues is a way of
overriding the default `max-cache-size 90%;` setting present in
`bin/named/config.c`. Any mechanism implemented for that purpose would
still need to honor explicit `max-cache-size` settings present in
`named.conf`.September 2020 (9.11.23, 9.11.23-S1, 9.16.7, 9.17.5)Michał KępieńMichał Kępieńhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2071BIND stuck after error "unable to obtain neither an IPv4 nor an IPv6 dispatch"2021-10-08T14:37:43ZAnand BuddhdevBIND stuck after error "unable to obtain neither an IPv4 nor an IPv6 dispatch"### Summary
Server appeared to be stuck in some strange state after "rndc reconfig".
### BIND version used
```
BIND 9.16.5 (Stable Release) <id:c00b458>
running on Linux x86_64 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020
built by make with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc/named' '--disable-static' '--with-pic' '--without-python' '--with-libtool' '--without-lmdb' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' 'LDFLAGS=-Wl,-z,relro ' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-39)
compiled with OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
linked to OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
compiled with libxml2 version: 2.9.1
linked to libxml2 version: 20901
compiled with json-c version: 0.11
linked to json-c version: 0.11
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
threads support is enabled
default paths:
named configuration: /etc/named/named.conf
rndc configuration: /etc/named/rndc.conf
DNSSEC root key: /etc/named/bind.keys
nsupdate session key: /var/run/named/session.key
named PID file: /var/run/named/named.pid
named lock file: /var/run/named/named.lock
```
### Steps to reproduce
I haven't been able to reproduce this. We run "rndc reconfig" frequently on many of our BIND servers, and this is the first time I've seen such behaviour. On this specific server, after log rotation, logrotate ran "rndc reconfig" and BIND logged the "unexpected error" shown under "Relevant logs and/or screenshots" below.
### What is the current *bug* behavior?
After logging that error, BIND appeared to be stuck in a strange state. It was answering queries over UDP (I did not check TCP), but it was not refreshing any of the secondary zones. I don't know what was going on internally: logrotate had compressed the rotated log file and deleted the original, and although the named process still held the deleted file open, it could not write any logs that we could read. "rndc zonestatus" for various zones showed them loaded and stuck on older serials.
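The general POSIX behaviour behind the missing logs can be demonstrated in isolation. The sketch below is an assumption-laden illustration, not a reconstruction of what named did: it assumes a POSIX system, uses a temporary file in place of the real log, and the function name `write_after_unlink` is made up for the example.

```python
import os
import tempfile


def write_after_unlink(message: bytes) -> dict:
    """Open a file, delete its path, then keep writing through the open descriptor."""
    fd, path = tempfile.mkstemp(prefix="named-log-demo-")
    try:
        os.unlink(path)                  # the path disappears, as after rotate-and-delete
        written = os.write(fd, message)  # the open descriptor still accepts writes
        info = os.fstat(fd)
        return {
            "path_exists": os.path.exists(path),
            "bytes_written": written,
            "link_count": info.st_nlink,
            "size": info.st_size,
        }
    finally:
        os.close(fd)


print(write_after_unlink(b"still logging\n"))
```

The write succeeds, but the data lands in a file that no longer has a name, so nothing shows up under the original path until the process reopens its log file.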
### What is the expected *correct* behavior?
BIND should have reloaded its configuration and created a new log file in /var/log/named/named.log
### Relevant configuration files
I'm not including the entire config file here, but here are the relevant snippets:
```
logging {
channel "default" {
file "/var/log/named/named.log";
severity info;
print-time yes;
print-category yes;
};
channel "ratelimit" {
file "/var/log/named/ratelimit.ringlog" versions 10 size 10485760;
print-time yes;
};
category "default" {
"default";
};
category "rate-limit" {
"ratelimit";
};
category "update" {
"null";
};
category "update-security" {
"null";
};
};
options {
answer-cookie no;
directory "/var/named";
keep-response-order {
"any";
};
listen-on {
127.0.0.1/32;
IPv4 address/32;
};
listen-on-v6 {
::1/128;
IPv6 address/128;
};
server-id hostname;
tcp-clients 1000;
transfers-in 100;
transfers-out 100;
version "9.16";
dnssec-validation no;
ixfr-from-differences yes;
minimal-responses yes;
recursion no;
allow-transfer {
"internal";
};
max-journal-size 10485760;
notify explicit;
zero-no-soa-ttl no;
zone-statistics none;
};
```
### Relevant logs and/or screenshots
This was the last thing logged in the rotated file:
```
10-Aug-2020 09:21:29.548 general: received control channel command 'reconfig'
10-Aug-2020 09:21:29.548 general: loading configuration from '/etc/named/named.conf'
10-Aug-2020 09:21:29.848 general: unable to open '/etc/named/bind.keys'; using built-in keys instead
10-Aug-2020 09:21:29.851 general: using default UDP/IPv4 port range: [32768, 60999]
10-Aug-2020 09:21:29.851 general: using default UDP/IPv6 port range: [32768, 60999]
10-Aug-2020 09:21:29.853 general: sizing zone task pool based on 4615 zones
10-Aug-2020 09:21:30.253 config: none:98: 'max-cache-size 90%' - setting to 57795MB (out of 64216MB)
10-Aug-2020 09:21:30.253 general: ./server.c:4530: unexpected error:
10-Aug-2020 09:21:30.253 general: unable to obtain neither an IPv4 nor an IPv6 dispatch
10-Aug-2020 09:21:30.276 general: reloading configuration failed: unexpected error
```
We were debugging the issue the next day, on Aug 11. When we couldn't figure anything out, we tried to restart BIND. It runs under systemd on our server, and this is what appeared in the systemd journal:
```
Aug 11 09:08:00 hostname systemd[1]: Stopping BIND...
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:08:01 hostname named[3686]: named: src/unix/udp.c:119: uv__udp_finish_close: Assertion `handle->send_queue_size == 0' failed.
Aug 11 09:09:30 hostname systemd[1]: named.service stop-sigterm timed out. Killing.
```
BIND logged those errors, but failed to exit, so systemd sent it a KILL signal after 90s.
### Possible fixes
I don't have any suggestion for a fix.February 2021 (9.11.28, 9.11.28-S1, 9.16.12, 9.16.12-S1, 9.17.10)https://gitlab.isc.org/isc-projects/bind9/-/issues/2065"geoip2" system test fails intermittently2020-08-05T09:09:56ZMichał Kępień"geoip2" system test fails intermittentlyThe problem affects ~"v9.17" and ~"v9.16" on various runners:
- https://gitlab.isc.org/isc-projects/bind9/-/jobs/1064670
- https://gitlab.isc.org/isc-private/bind9/-/jobs/1064060
- https://gitlab.isc.org/isc-private/bind9/-/jobs/1064109
- https://gitlab.isc.org/isc-private/bind9/-/jobs/1064111
This was not happening for July releases.August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)Michał KępieńMichał Kępieńhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2047netmgr: REQUIRE(VALID_NMHANDLE(handle)); assertion failures in isc_nm_pausere...2020-09-30T06:26:32ZMark Andrewsnetmgr: REQUIRE(VALID_NMHANDLE(handle)); assertion failures in isc_nm_pauseread()Job [#1039928](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1039928) failed for 45798d1e4e56f0a1ed69c708c8b518e4730ebae6:
```
D:dnstap:Core was generated by `/builds/isc-projects/bind9/bin/rndc/.libs/rndc -p 28359 -c ../common/rndc.conf'.
D:dnstap:Program terminated with signal SIGABRT, Aborted.
D:dnstap:#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
D:dnstap:[Current thread is 1 (Thread 0x7f913f5fe700 (LWP 17908))]
D:dnstap:#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
D:dnstap:#1 0x00007f9142373537 in __GI_abort () at abort.c:79
D:dnstap:#2 0x00007f91425a313f in isc_assertion_failed (file=file@entry=0x7f91425d2000 "netmgr/netmgr.c", line=line@entry=1383, type=type@entry=isc_assertiontype_require, cond=cond@entry=0x7f91425d2760 "((__builtin_expect(!!((handle) != ((void *)0)), 1) && __builtin_expect(!!(((const isc__magic_t *)(handle))->magic == ((('N') << 24 | ('M') << 16 | ('H') << 8 | ('D')))), 1)) && __extension__ ({ __auto"...) at assertions.c:47
D:dnstap:#3 0x00007f914258735a in isc_nm_pauseread (handle=handle@entry=0x7f913f768010) at netmgr/netmgr.c:1392
D:dnstap:#4 0x00007f914256a3bb in recv_data (handle=0x7f913f768010, eresult=<optimized out>, region=<optimized out>, arg=0x55e92eb9d9a0 <rndc_ccmsg>) at ccmsg.c:109
D:dnstap:#5 0x00007f914258b8ba in isc__nm_tcp_shutdown (sock=0x7f913f764010) at netmgr/tcp.c:1094
D:dnstap:#6 0x00007f9142585681 in shutdown_walk_cb (arg=<optimized out>, handle=<optimized out>) at netmgr/netmgr.c:1451
D:dnstap:#7 shutdown_walk_cb (handle=<optimized out>, arg=<optimized out>) at netmgr/netmgr.c:1446
D:dnstap:#8 0x00007f914232aa24 in uv_walk () from /usr/lib/x86_64-linux-gnu/libuv.so.1
D:dnstap:#9 0x00007f9142587496 in isc__nm_async_shutdown (worker=worker@entry=0x55e92fbd5840, ev0=ev0@entry=0x7f913f767430) at netmgr/netmgr.c:1461
D:dnstap:#10 0x00007f9142588d3b in process_queue (worker=worker@entry=0x55e92fbd5840, queue=0x7f913f742080) at netmgr/netmgr.c:640
D:dnstap:#11 0x00007f91425891a1 in async_cb (handle=<optimized out>) at netmgr/netmgr.c:580
D:dnstap:#12 0x00007f914232b7d1 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
D:dnstap:#13 0x00007f914233c860 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
D:dnstap:#14 0x00007f914232bf44 in uv_run () from /usr/lib/x86_64-linux-gnu/libuv.so.1
D:dnstap:#15 0x00007f9142588ed3 in nm_thread (worker0=0x55e92fbd5840) at netmgr/netmgr.c:484
D:dnstap:#16 0x00007f9142303ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
D:dnstap:#17 0x00007f914244bdcf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
D:dnstap:--------------------------------------------------------------------------------
```October 2020 (9.11.24, 9.11.24-S1, 9.16.8, 9.16.8-S1, 9.17.6)Witold KrecickiWitold Krecickihttps://gitlab.isc.org/isc-projects/bind9/-/issues/2036The "main" memory context may not be clean upon exit, causing crashes2020-09-30T06:24:50ZMichal NowakThe "main" memory context may not be clean upon exit, causing crashesI saw the `shutdown` test fail three times (50 % of times) on FreeBSD [11](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1034239) & [12](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1034321) on `main` (9dcf229634968dc7d808c1d23f4bab5d3ba7f47f):
```
S:shutdown:2020-07-21T08:24:29+0000
T:shutdown:1:A
A:shutdown:System test shutdown
I:shutdown:PORTS:25125,25126,25127,25128,25129,25130,25131,25132,25133,25134
I:shutdown:starting servers
D:shutdown:============================= test session starts ==============================
D:shutdown:platform freebsd11 -- Python 3.7.7, pytest-4.5.0, py-1.8.1, pluggy-0.12.0 -- /usr/local/bin/python3.7
D:shutdown:cachedir: .pytest_cache
D:shutdown:rootdir: /builds/isc-projects/bind9/bin/tests/system/shutdown
D:shutdown:collecting ... collected 1 item
D:shutdown:
D:shutdown:tests-shutdown.py::test_named_shutdown FAILED [100%]
D:shutdown:
D:shutdown:=================================== FAILURES ===================================
D:shutdown:_____________________________ test_named_shutdown ______________________________
D:shutdown:
D:shutdown:named_port = 25125, control_port = 25134
D:shutdown:
D:shutdown:@pytest.mark.dnspython
D:shutdown:def test_named_shutdown(named_port, control_port):
D:shutdown:# pylint: disable-msg=too-many-locals
D:shutdown:cfg_dir = os.path.join(os.getcwd(), "resolver")
D:shutdown:assert os.path.isdir(cfg_dir)
D:shutdown:
D:shutdown:cfg_file = os.path.join(cfg_dir, "named.conf")
D:shutdown:assert os.path.isfile(cfg_file)
D:shutdown:
D:shutdown:named = os.getenv("NAMED")
D:shutdown:assert named is not None
D:shutdown:
D:shutdown:rndc = os.getenv("RNDC")
D:shutdown:assert rndc is not None
D:shutdown:
D:shutdown:systest_dir = os.getenv("SYSTEMTESTTOP")
D:shutdown:assert systest_dir is not None
D:shutdown:
D:shutdown:# rndc configuration resides in $SYSTEMTESTTOP/common/rndc.conf
D:shutdown:rndc_cfg = os.path.join(systest_dir, "common", "rndc.conf")
D:shutdown:assert os.path.isfile(rndc_cfg)
D:shutdown:
D:shutdown:# rndc command with default arguments.
D:shutdown:rndc_cmd = [rndc, "-c", rndc_cfg, "-p", str(control_port),
D:shutdown:"-s", "10.53.0.3"]
D:shutdown:
D:shutdown:# Helper function, launch named without blocking.
D:shutdown:def launch_named():
D:shutdown:proc = subprocess.Popen([named, "-c", cfg_file, "-f"], cwd=cfg_dir)
D:shutdown:# Ensure named is running
D:shutdown:assert proc.poll() is None
D:shutdown:
D:shutdown:return proc
D:shutdown:
D:shutdown:# We create a resolver instance that will be used to send queries.
D:shutdown:resolver = dns.resolver.Resolver()
D:shutdown:resolver.nameservers = ['10.53.0.3']
D:shutdown:resolver.port = named_port
D:shutdown:
D:shutdown:# We test named shutting down using two methods:
D:shutdown:# Method 1: using rndc ctop
D:shutdown:# Method 2: killing with SIGTERM
D:shutdown:# In both methods named should exit gracefully.
D:shutdown:for kill_method in ("rndc", "sigterm"):
D:shutdown:named_proc = launch_named()
D:shutdown:time.sleep(2)
D:shutdown:
D:shutdown:do_work(named_proc, resolver, rndc_cmd,
D:shutdown:kill_method, n_workers=12, n_queries=16)
D:shutdown:
D:shutdown:# Wait named to exit for a maximum of MAX_TIMEOUT seconds.
D:shutdown:MAX_TIMEOUT = 10
D:shutdown:is_dead = False
D:shutdown:for _ in range(MAX_TIMEOUT):
D:shutdown:if named_proc.poll() is not None:
D:shutdown:is_dead = True
D:shutdown:break
D:shutdown:time.sleep(1)
D:shutdown:
D:shutdown:if not is_dead:
D:shutdown:named_proc.kill()
D:shutdown:
D:shutdown:assert is_dead
D:shutdown:# Ensures that named exited gracefully.
D:shutdown:# If it crashed (abort()) exitcode will be non zero.
D:shutdown:> assert named_proc.returncode == 0
D:shutdown:E assert -6 == 0
D:shutdown:E --6
D:shutdown:E +0
D:shutdown:
D:shutdown:tests-shutdown.py:193: AssertionError
D:shutdown:----------------------------- Captured stdout call -----------------------------
D:shutdown:version: BIND 9.17.3 (Development Release) <id:b59e691>
D:shutdown:running on freebsd: FreeBSD amd64 11.4-RELEASE FreeBSD 11.4-RELEASE #0 r362094: Fri Jun 12 18:27:15 UTC 2020 root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC
D:shutdown:boot time: Tue, 21 Jul 2020 08:24:34 GMT
D:shutdown:last configured: Tue, 21 Jul 2020 08:24:34 GMT
D:shutdown:configuration file: /builds/isc-projects/bind9/bin/tests/system/shutdown/resolver/named.conf
D:shutdown:CPUs found: 4
D:shutdown:worker threads: 4
D:shutdown:UDP listeners per interface: 4
D:shutdown:number of zones: 100 (99 automatic)
D:shutdown:debug level: 0
D:shutdown:xfers running: 0
D:shutdown:xfers deferred: 0
D:shutdown:soa queries in progress: 0
D:shutdown:query logging is OFF
D:shutdown:recursive clients: 0/900/1000
D:shutdown:tcp clients: 0/150
D:shutdown:TCP high-water: 0
D:shutdown:server is up and running
D:shutdown:version: BIND 9.17.3 (Development Release) <id:b59e691>
D:shutdown:running on freebsd: FreeBSD amd64 11.4-RELEASE FreeBSD 11.4-RELEASE #0 r362094: Fri Jun 12 18:27:15 UTC 2020 root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC
D:shutdown:boot time: Tue, 21 Jul 2020 08:24:34 GMT
D:shutdown:last configured: Tue, 21 Jul 2020 08:24:34 GMT
D:shutdown:configuration file: /builds/isc-projects/bind9/bin/tests/system/shutdown/resolver/named.conf
D:shutdown:CPUs found: 4
D:shutdown:worker threads: 4
D:shutdown:UDP listeners per interface: 4
D:shutdown:number of zones: 100 (99 automatic)
D:shutdown:debug level: 0
D:shutdown:xfers running: 0
D:shutdown:xfers deferred: 0
D:shutdown:soa queries in progress: 0
D:shutdown:query logging is OFF
D:shutdown:recursive clients: 0/900/1000
D:shutdown:tcp clients: 0/150
D:shutdown:TCP high-water: 0
D:shutdown:server is up and running
D:shutdown:version: BIND 9.17.3 (Development Release) <id:b59e691>
D:shutdown:running on freebsd: FreeBSD amd64 11.4-RELEASE FreeBSD 11.4-RELEASE #0 r362094: Fri Jun 12 18:27:15 UTC 2020 root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC
D:shutdown:boot time: Tue, 21 Jul 2020 08:24:34 GMT
D:shutdown:last configured: Tue, 21 Jul 2020 08:24:34 GMT
D:shutdown:configuration file: /builds/isc-projects/bind9/bin/tests/system/shutdown/resolver/named.conf
D:shutdown:CPUs found: 4
D:shutdown:worker threads: 4
D:shutdown:UDP listeners per interface: 4
D:shutdown:number of zones: 100 (99 automatic)
D:shutdown:debug level: 0
D:shutdown:xfers running: 0
D:shutdown:xfers deferred: 0
D:shutdown:soa queries in progress: 0
D:shutdown:query logging is OFF
D:shutdown:recursive clients: 0/900/1000
D:shutdown:tcp clients: 0/150
D:shutdown:TCP high-water: 0
D:shutdown:server is up and running
D:shutdown:----------------------------- Captured stderr call -----------------------------
D:shutdown:rndc: connection to remote host closed.
D:shutdown:* This may indicate that the
D:shutdown:* remote server is using an older
D:shutdown:* version of the command protocol,
D:shutdown:* this host is not authorized to connect,
D:shutdown:* the clocks are not synchronized,
D:shutdown:* the key signing algorithm is incorrect,
D:shutdown:* or the key is invalid.
D:shutdown:rndc: connection to remote host closed.
D:shutdown:* This may indicate that the
D:shutdown:* remote server is using an older
D:shutdown:* version of the command protocol,
D:shutdown:* this host is not authorized to connect,
D:shutdown:* the clocks are not synchronized,
D:shutdown:* the key signing algorithm is incorrect,
D:shutdown:* or the key is invalid.
D:shutdown:rndc: connection to remote host closed.
D:shutdown:* This may indicate that the
D:shutdown:* remote server is using an older
D:shutdown:* version of the command protocol,
D:shutdown:* this host is not authorized to connect,
D:shutdown:* the clocks are not synchronized,
D:shutdown:* the key signing algorithm is incorrect,
D:shutdown:* or the key is invalid.
D:shutdown:rndc: connection to remote host closed.
D:shutdown:* This may indicate that the
D:shutdown:* remote server is using an older
D:shutdown:* version of the command protocol,
D:shutdown:* this host is not authorized to connect,
D:shutdown:* the clocks are not synchronized,
D:shutdown:* the key signing algorithm is incorrect,
D:shutdown:* or the key is invalid.
D:shutdown:Failing assertion due to probable leaked memory in context 0x805c23000 ("main") (stats[9].gets == 3).
D:shutdown:mem.c:893: INSIST(ctx->stats[i].gets == 0U) failed
D:shutdown:=========================== 1 failed in 6.57 seconds ===========================
I:system:FAILED
I:shutdown:stopping servers
I:shutdown:Core dump(s) found: shutdown/resolver/core.43405
D:shutdown:backtrace from shutdown/resolver/core.43405:
D:shutdown:--------------------------------------------------------------------------------
D:shutdown:Core was generated by `/builds/isc-projects/bind9/bin/named/.libs/named -c /builds/isc-projects/bind9/b'.
D:shutdown:Program terminated with signal SIGABRT, Aborted.
D:shutdown:#0 0x0000000804b1b0ba in thr_kill () from /lib/libc.so.7
D:shutdown:#0 0x0000000804b1b0ba in thr_kill () from /lib/libc.so.7
D:shutdown:#1 0x0000000804b1b084 in raise () from /lib/libc.so.7
D:shutdown:#2 0x0000000804b1aff9 in abort () from /lib/libc.so.7
D:shutdown:#3 0x000000000041c612 in assertion_failed (file=<optimized out>, line=<optimized out>, type=isc_assertiontype_insist, cond=<optimized out>) at main.c:253
D:shutdown:#4 0x00000008008c144a in isc_assertion_failed (file=0x18b57 <error: Cannot access memory at address 0x18b57>, line=6, type=isc_assertiontype_require, cond=0x804b1b0da <thr_self+10> "\017\202\204\350\b") at assertions.c:46
D:shutdown:#5 0x00000008008ce783 in destroy (ctx=0x805c23000) at mem.c:893
D:shutdown:#6 0x00000008008ceb76 in isc_mem_destroy (ctxp=0x674fc0 <named_g_mctx>) at mem.c:1021
D:shutdown:#7 0x000000000041c55f in main (argc=<optimized out>, argv=<optimized out>) at main.c:1573
D:shutdown:--------------------------------------------------------------------------------
D:shutdown:full backtrace from shutdown/resolver/core.43405 saved in core.43405-backtrace.txt
D:shutdown:core dump shutdown/resolver/core.43405 archived as shutdown/resolver/core.43405.gz
R:shutdown:FAIL
E:shutdown:2020-07-21T08:24:42+0000
FAIL shutdown (exit status: 1)
```October 2020 (9.11.24, 9.11.24-S1, 9.16.8, 9.16.8-S1, 9.17.6)Witold KrecickiWitold Krecickihttps://gitlab.isc.org/isc-projects/bind9/-/issues/2018could not get query source dispatcher error after reconfig2021-03-03T10:06:19ZLaurent Frigaultcould not get query source dispatcher error after reconfig### Summary
We are running a hidden-master-only BIND server with many zones, most of them signed with auto-dnssec maintain; inline-signing yes;
Every few days (7-8 days) we get the following error in the log:
```
Jul 9 11:32:37 nsmaster named[68180]: general: info: received control channel command 'reconfig'
Jul 9 11:32:48 nsmaster named[68180]: general: error: could not get query source dispatcher (213.36.252.194#0)
Jul 9 11:32:48 nsmaster named[68180]: general: error: reloading configuration failed: out of memory
```
and the server must be restarted.
### BIND version used
```
# /usr/local/sbin/named -V
BIND 9.16.4 (Stable Release) <id:0849b42>
running on FreeBSD amd64 12.1-RELEASE-p1 FreeBSD 12.1-RELEASE-p1 GENERIC
built by make with '--disable-linux-caps' '--localstatedir=/var' '--sysconfdir=/usr/local/etc/namedb' '--with-dlopen=yes' '--with-libxml2' '--with-openssl=/usr' '--with-readline=-L/usr/local/lib -ledit' '--with-dlz-filesystem=yes' '--disable-dnstap' '--disable-fixed-rrset' '--disable-geoip' '--without-maxminddb' '--without-gssapi' '--with-libidn2=/usr/local' '--with-json-c' '--disable-largefile' '--with-lmdb=/usr/local' '--disable-native-pkcs11' '--without-python' '--disable-querytrace' 'STD_CDEFINES=-DDIG_SIGCHASE=1' '--enable-tcp-fastopen' '--with-tuning=large' '--disable-symtable' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd12.1' 'build_alias=amd64-portbld-freebsd12.1' 'CC=cc' 'CFLAGS=-O2 -pipe -DLIBICONV_PLUG -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing ' 'LDFLAGS= -L/usr/local/lib -ljson-c -fstack-protector-strong ' 'LIBS=-L/usr/local/lib' 'CPPFLAGS=-DLIBICONV_PLUG -isystem /usr/local/include' 'CPP=cpp' 'PKG_CONFIG=pkgconf'
compiled by CLANG 4.2.1 Compatible FreeBSD Clang 8.0.1 (tags/RELEASE_801/final 366581)
compiled with OpenSSL version: OpenSSL 1.1.1d-freebsd 10 Sep 2019
linked to OpenSSL version: OpenSSL 1.1.1d-freebsd 10 Sep 2019
compiled with libxml2 version: 2.9.10
linked to libxml2 version: 20910
compiled with json-c version: 0.13.1
linked to json-c version: 0.13.1
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
threads support is enabled
default paths:
named configuration: /usr/local/etc/namedb/named.conf
rndc configuration: /usr/local/etc/namedb/rndc.conf
DNSSEC root key: /usr/local/etc/namedb/bind.keys
nsupdate session key: /var/run/named/session.key
named PID file: /var/run/named/pid
named lock file: /var/run/named/named.lock
```
It is the FreeBSD port compiled with --with-tuning=large.
We had the same issue before with --with-tuning=default.
We also had the same issue before with BIND 9.11.20.
I tried starting named with -U 20, but this did not change anything.
We have been running the same configuration WITHOUT DNSSEC-signed zones for years on smaller servers without this issue.
### Steps to reproduce
We periodically regenerate our configuration to add/update/remove zones.
When needed, we run "rndc reconfig".
### What is the current *bug* behavior?
After some "rndc reconfig" runs, the named server stops working with these errors:
```
Jul 9 11:32:48 nsmaster named[68180]: general: error: could not get query source dispatcher (213.36.252.194#0)
Jul 9 11:32:48 nsmaster named[68180]: general: error: reloading configuration failed: out of memory
```
and top reports:
```
last pid: 62441; load averages: 0.29, 0.57, 0.70 up 169+19:01:18 11:47:47
15 processes: 1 running, 14 sleeping
CPU: 1.4% user, 0.0% nice, 4.8% system, 0.0% interrupt, 93.8% idle
Mem: 7100M Active, 5153M Inact, 18G Laundry, 14G Wired, 582M Buf, 18G Free
ARC: 7506M Total, 4261M MFU, 1187M MRU, 30M Anon, 287M Header, 1742M Other
3994M Compressed, 17G Uncompressed, 4.45:1 Ratio
Swap: 64G Total, 422M Used, 64G Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
68180 bind 170 52 0 30G 26G sigwai 3 42.3H 0.00% named
```
When named is started, top reports:
```
last pid: 64008; load averages: 0.49, 0.68, 1.00 up 169+21:04:34 13:51:03
21 processes: 1 running, 20 sleeping
CPU: 1.4% user, 0.0% nice, 4.8% system, 0.0% interrupt, 93.8% idle
Mem: 8509M Active, 2467M Inact, 22M Laundry, 16G Wired, 582M Buf, 36G Free
ARC: 7491M Total, 4043M MFU, 1410M MRU, 11M Anon, 286M Header, 1740M Other
3990M Compressed, 18G Uncompressed, 4.52:1 Ratio
Swap: 64G Total, 98M Used, 64G Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
63924 bind 170 52 0 9180M 7033M sigwai 25 13:36 0.00% named
```
There must be some memory or resource leak.
The server has 64G of RAM and 56 cores, so this should not be a matter of insufficient memory.
When it happens, we need to stop and start the named server.
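
To put a number on the growth, the `SIZE` and `RES` columns from the two `top` snapshots above can be compared. This is a rough sketch: it assumes `top` uses 1024-based suffixes and ignores the rounding `top` applies, and the helper names are hypothetical.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_top_size(text: str) -> int:
    """Convert a top(1) size such as '26G' or '7033M' to bytes."""
    text = text.strip()
    if not text or text[-1].upper() not in UNITS:
        raise ValueError(f"expected a size with a K/M/G/T suffix: {text!r}")
    return int(float(text[:-1]) * UNITS[text[-1].upper()])


def growth_factor(before: str, after: str) -> float:
    """Ratio of two top(1) sizes, e.g. fresh start versus failure time."""
    return parse_top_size(after) / parse_top_size(before)


# Values copied from the two snapshots: freshly started vs. at failure.
res_growth = growth_factor("7033M", "26G")
size_growth = growth_factor("9180M", "30G")
print(f"RES grew {res_growth:.1f}x, SIZE grew {size_growth:.1f}x")
# prints: RES grew 3.8x, SIZE grew 3.3x
```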
### What is the expected *correct* behavior?
There should not be any errors after the reconfig and the server should not stop working.
Its memory usage should not grow that much.
### Relevant configuration files
There are too many zones (about 73000) to paste them all here.
```
logging {
channel stdlog {
syslog local1;
print-category yes;
print-severity yes;
print-time no;
};
category default { stdlog; };
category queries { "null"; };
category query-errors { "null"; };
category update { "null"; };
category update-security { "null"; };
category security { "null"; };
};
options {
// All file and path names are relative to the chroot directory,
// if any, and should be fully qualified.
directory "/usr/local/etc/namedb/working";
pid-file "/var/run/named/pid";
dump-file "/var/dump/named_dump.db";
statistics-file "/var/stats/named.stats";
listen-on { 127.0.1.4; 213.36.252.194; };
listen-on-v6 { 2a01:e0d:1:2:58bf:f9c2:0:1; };
disable-empty-zone "255.255.255.255.IN-ADDR.ARPA";
disable-empty-zone "0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.IP6.ARPA";
disable-empty-zone "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.IP6.ARPA";
query-source address 213.36.252.194 port *;
query-source-v6 address 2a01:e0d:1:2:58bf:f9c2:0:1 port *;
allow-transfer {
127.0.1.4;
213.36.252.128/25;
2a01:e0b:1:e:0:0:0:0/64;
213.36.252.32/27;
62.210.98.15;
213.36.253.14;
};
startup-notify-rate 100;
notify-source 213.36.252.194;
recursion no;
notify no;
check-integrity no;
minimal-responses yes;
max-transfer-idle-out 5;
max-transfer-time-out 10;
tcp-clients 1000;
tcp-listen-queue 100;
transfers-out 1000;
dnssec-enable yes;
sig-validity-interval 60 30;
masterfile-format text;
request-ixfr no;
provide-ixfr no;
};
zone "." { type hint; file "/usr/local/etc/namedb/named.root"; };
key "rndc-key" {
algorithm hmac-sha256;
secret "xxx";
};
controls {
inet 127.0.1.4
port 953
allow { any; } keys { "rndc-key"; };
};
// the zones
include "/usr/local/etc/namedb/named.conf.custom.inc";
include "/usr/local/etc/namedb/named.conf.custom-old.inc";
```
Most zones are signed and configured like this:
```
zone "bookmyname.be" {
type master;
file "custom/b/o/bookmyname.be/bookmyname.be";
notify explicit;
also-notify { 213.36.252.135; 62.210.98.15; 213.36.253.14; };
auto-dnssec maintain;
inline-signing yes;
key-directory "custom/b/o/bookmyname.be";
};
```
A few are not signed and have the following config:
```
zone "bookmyname.lu" {
type master;
file "custom/b/o/bookmyname.lu/bookmyname.lu";
notify explicit;
also-notify { 213.36.252.135; 62.210.98.15; 213.36.253.14; };
};
```
### Relevant logs and/or screenshots
```
# rndc status
version: BIND 9.16.4 (Stable Release) <id:0849b42>
running on nsmaster.free.org: FreeBSD amd64 12.1-RELEASE-p1 FreeBSD 12.1-RELEASE-p1 GENERIC
boot time: Thu, 09 Jul 2020 09:52:25 GMT
last configured: Thu, 09 Jul 2020 10:42:49 GMT
configuration file: /usr/local/etc/namedb/named-custom.conf
CPUs found: 56
worker threads: 56
UDP listeners per interface: 56
number of zones: 144025 (0 automatic)
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is ON
recursive clients: 0/900/1000
tcp clients: 0/1000
TCP high-water: 17
server is up and running
```
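
The status output reports 144025 zones, while the configuration holds about 73000. A small sketch for extracting that figure from `rndc status` text is shown here; the function names are hypothetical. The ratio of roughly 2 would be consistent with each inline-signed zone being counted twice (raw and signed version), but that interpretation is an assumption, not something the report states.

```python
def parse_rndc_status(text: str) -> dict:
    """Split `rndc status` output into a {field: value} dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # lines without "key: value" form are skipped
            fields[key.strip()] = value.strip()
    return fields


def zone_count(status: dict) -> int:
    """Extract the integer from e.g. '144025 (0 automatic)'."""
    return int(status["number of zones"].split()[0])


# A few lines copied from the status output above.
sample = """\
version: BIND 9.16.4 (Stable Release) <id:0849b42>
number of zones: 144025 (0 automatic)
tcp clients: 0/1000
server is up and running
"""
status = parse_rndc_status(sample)
configured = 73000  # approximate figure given in the report
print(f"{zone_count(status)} zones, {zone_count(status) / configured:.2f}x configured")
# prints: 144025 zones, 1.97x configured
```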
### Possible fixes
No idea

March 2021 (9.11.29, 9.11.29-S1, 9.16.13, 9.16.13-S1, 9.17.11)
Diego dos Santos Fronza