BIND issues - https://gitlab.isc.org/isc-projects/bind9/-/issues

---

https://gitlab.isc.org/isc-projects/bind9/-/issues/2037
[CVE-2020-8623] A flaw in native PKCS#11 code can lead to a remotely triggerable assertion failure in pk11.c (Ondřej Surý, 2020-11-27)

> Came from ...@yandex.ru:
>
> BIND should be compiled with --enable-native-pkcs11 and --with-pkcs11 options.
>
> The exploit triggers abort() in pk11_numbits function.
>
> Bug details:
> from lib/isc/pk11.c
>
> ```c
> unsigned int
> pk11_numbits(CK_BYTE_PTR data, unsigned int bytecnt) {
> 	unsigned int bitcnt, i;
> 	CK_BYTE top;
>
> 	if (bytecnt == 0) {
> 		return (0);
> 	}
> 	bitcnt = bytecnt * 8;
> 	for (i = 0; i < bytecnt; i++) {
> 		top = data[i];
> 		if (top == 0) {
> 			bitcnt -= 8;
> 			continue;
> 		}
> 		...
> 	}
> 	INSIST(0);
> 	ISC_UNREACHABLE();
> }
> ```
> This means that if every byte of the input is zero, the loop finishes without returning, `INSIST(0)` is reached, and abort() is triggered.
>
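For illustration, here is a standalone sketch (not the upstream patch) of a `pk11_numbits`-style helper that returns 0 for an all-zero buffer instead of reaching `INSIST(0)`; the name `numbits` and the plain `unsigned char` types are simplifications of the quoted excerpt:

```c
#include <assert.h>

/* Count significant bits in a big-endian byte string.
 * Mirrors the structure of the quoted pk11_numbits(), but an all-zero
 * input falls through the loop and returns 0 rather than aborting. */
static unsigned int
numbits(const unsigned char *data, unsigned int bytecnt) {
	unsigned int bitcnt, i;

	if (bytecnt == 0) {
		return 0;
	}
	bitcnt = bytecnt * 8;
	for (i = 0; i < bytecnt; i++) {
		unsigned char top = data[i];
		if (top == 0) {
			bitcnt -= 8;
			continue;
		}
		/* First nonzero byte: trim leading zero bits. */
		while ((top & 0x80) == 0) {
			bitcnt--;
			top <<= 1;
		}
		return bitcnt;
	}
	/* All bytes were zero: the quoted code hits INSIST(0) here;
	 * returning 0 avoids the remotely triggerable abort(). */
	return 0;
}
```

With this shape, `numbits((unsigned char[]){0, 0, 0, 0}, 4)` yields 0, and `numbits((unsigned char[]){0x00, 0x05}, 2)` yields 3.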
> How to reproduce:
> 1) configure and build softhsm 2.6.1:
> ```
> $ ./configure --prefix=/var/softhsm --with-openssl=/var/openssl --with-crypto-backend=openssl
> $ make && sudo make install
> ```
>
> 2) compile BIND with PKCS11 support
> ```
> $ ./configure --prefix=/opt/bind --disable-chroot --enable-native-pkcs11 --with-pkcs11=/var/softhsm/lib/softhsm/libsofthsm2.so
> $ make && sudo make install
> ```
>
> 3) Configure BIND
> ```
> # init softhsm (PIN 1234)
> # /var/softhsm/bin/softhsm2-util --init-token --free --label softhsm
> Slot 1 has a free/uninitialized token.
> === SO PIN (4-255 characters) ===
> Please enter SO PIN: ****
> Please reenter SO PIN: ****
> === User PIN (4-255 characters) ===
> Please enter user PIN: ****
> Please reenter user PIN: ****
> The token has been initialized and is reassigned to slot 1294545520
>
> # export SLOT=1294545520
>
> # cd bin/tests/system/pkcs11
> ```
>
> Edit ns1/example.db.in and ns1/named.conf.in, change IP from 10.53.0.1 to your server IP
> After that run included setports.sh:
> ```
> # bash setports.sh
> ```
>
> Now you can generate the keys:
> ```
> # bash setup.sh
> ```
>
> ```
> # cp ns1/* /opt/bind/etc
> ```
>
> Fix permissions:
> ```
> # chown -R bind:bind /opt/bind/var/run
> ```
>
> Edit `/opt/bind/etc/named.conf` and change every `*.example.db.signed` file path to an absolute path, so each zone looks like this:
> ```
> zone "ecdsap384sha384.example." {
> type master;
> file "/opt/bind/etc/ecdsap384sha384.example.db.signed";
> allow-update { any; };
> };
> ```
>
>
> 4) Run BIND
> ```
> # cd /opt/bind/var/run
> # /opt/bind/sbin/named -g -d0 -u bind -c /opt/bind/etc/named.conf
> ```
>
> 5) run t1.py
> ```
> $ ./t1.py <your_server_ip> 53
> ```
>
> Example bind log:
>
> ```
> 25-Jun-2020 01:23:14.297 pk11.c:698: INSIST(0) failed, back trace
> 25-Jun-2020 01:23:14.297 #0 0x5583e7797e9b in __do_global_dtors_aux_fini_array_entry()+0x5583e6971623
> 25-Jun-2020 01:23:14.297 #1 0x5583e705686d in __do_global_dtors_aux_fini_array_entry()+0x5583e622fff5
> 25-Jun-2020 01:23:14.301 #2 0x5583e779783d in __do_global_dtors_aux_fini_array_entry()+0x5583e6970fc5
> 25-Jun-2020 01:23:14.301 #3 0x5583e77909d1 in __do_global_dtors_aux_fini_array_entry()+0x5583e696a159
> 25-Jun-2020 01:23:14.301 #4 0x5583e76de425 in __do_global_dtors_aux_fini_array_entry()+0x5583e68b7bad
> 25-Jun-2020 01:23:14.301 #5 0x5583e76c2080 in __do_global_dtors_aux_fini_array_entry()+0x5583e689b808
> 25-Jun-2020 01:23:14.301 #6 0x5583e76b4686 in __do_global_dtors_aux_fini_array_entry()+0x5583e688de0e
> 25-Jun-2020 01:23:14.305 #7 0x5583e734667e in __do_global_dtors_aux_fini_array_entry()+0x5583e651fe06
> 25-Jun-2020 01:23:14.305 #8 0x5583e7193db9 in __do_global_dtors_aux_fini_array_entry()+0x5583e636d541
> 25-Jun-2020 01:23:14.305 #9 0x5583e71a2181 in __do_global_dtors_aux_fini_array_entry()+0x5583e637b909
> 25-Jun-2020 01:23:14.305 #10 0x5583e7826f71 in __do_global_dtors_aux_fini_array_entry()+0x5583e6a006f9
> 25-Jun-2020 01:23:14.305 #11 0x5583e782800c in __do_global_dtors_aux_fini_array_entry()+0x5583e6a01794
> 25-Jun-2020 01:23:14.305 #12 0x7fd791aba6db in __do_global_dtors_aux_fini_array_entry()+0x7fd790c93e63
> 25-Jun-2020 01:23:14.305 #13 0x7fd7913d988f in __do_global_dtors_aux_fini_array_entry()+0x7fd7905b3017
> 25-Jun-2020 01:23:14.309 exiting (due to assertion failure)
> Aborted (core dumped)
> ```

Fixed in: August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4) (assignee: Ondřej Surý)

---

https://gitlab.isc.org/isc-projects/bind9/-/issues/1996
[CVE-2020-8620]: A specially crafted large TCP payload can trigger an assertion failure in tcpdns.c (Ondřej Surý)

None (published patch date)
TALOS-2020-1100
CVE-2020-8620
Internet Systems Consortium's BIND TCP Receive Buffer Length Assertion Check Denial of Service Vulnerability
### Summary
An assertion failure exists within the Internet Systems Consortium's BIND server versions 9.16.1 through 9.17.1 when processing TCP traffic via the libuv library. Due to a length specified within a callback for the library, flooding the server's TCP port used for larger DNS requests (AXFR) can cause the libuv library to pass a length to the server which will violate an assertion check in the server's verifications. This assertion check will terminate the service resulting in a denial of service condition. An attacker can flood the port with unauthenticated packets in order to trigger this vulnerability.
### Tested Versions
Internet Systems Consortium BIND 9.16.1
Internet Systems Consortium BIND 9.16.2
Internet Systems Consortium BIND 9.16.3
Internet Systems Consortium BIND 9.17.0
Internet Systems Consortium BIND 9.17.1
### Product URLs
[https://www.isc.org/bind](https://www.isc.org/bind)
### CVSSv3 Score
7.5 - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
### CWE
CWE-617 - Reachable Assertion
### Details
The BIND nameserver is considered the reference implementation of the Domain Name System of the Internet. It is capable of being an authoritative name server as well as a recursive cache for domain name queries on a network.
The BIND nameserver is based on a custom event queueing system that wraps the `libuv` library (http://libuv.org) to perform asynchronous I/O as needed by the server. The `libuv` library was introduced as a new network manager in the 9.16 release in order to let the server run on both POSIX and Windows environments and to simplify the management of network I/O distributed amongst a configurable number of threads within the server.
The BIND nameserver combines its own queue for scheduling with the `libuv` library in order to process requests and queries asynchronously for both the UDP and TCP protocols. In order to accomplish this, the server must first initialize the `libuv` library by allocating and initializing the `uv_loop_t` which is a handle directly used by the library to manage the event loop used by the server. After the server has initialized its core memory handling functions in the setup phase of the daemon, the `isc_nm_start` function will be used to construct a `uv_loop_t` for a number of workers using the `uv_loop_init` function at [1]. After initializing the loop that is to be used for each worker, the server will then allocate space for a receive buffer at [2] and then assign it into the context of the worker. This loop is then used to bind to the configured UDP and TCP ports as used by the server.
```c
lib/isc/netmgr/netmgr.c:142
isc_nm_t *
isc_nm_start(isc_mem_t *mctx, uint32_t workers) {
	isc_nm_t *mgr = NULL;
	char name[32];

	mgr = isc_mem_get(mctx, sizeof(*mgr));
	*mgr = (isc_nm_t){ .nworkers = workers };

	isc_mem_attach(mctx, &mgr->mctx);
	...
	r = uv_loop_init(&worker->loop); // [1] Initialize a uv_loop_t for each worker
	RUNTIME_CHECK(r == 0);
	...
	r = uv_async_init(&worker->loop, &worker->async, async_cb);
	RUNTIME_CHECK(r == 0);
	...
	worker->ievents = isc_queue_new(mgr->mctx, 128);
	worker->ievents_prio = isc_queue_new(mgr->mctx, 128);

	worker->recvbuf = isc_mem_get(mctx, ISC_NETMGR_RECVBUF_SIZE); // [2] Allocate a receive buffer of the size ISC_NETMGR_RECVBUF_SIZE
	...
```
After the server has initialized each worker and bound to the configured ports, the server must use `libuv` to assign a callback to dispatch to when receiving a connection on a port. The callback that is used for processing TCP is the following `dnslisten_acceptcb` function. After performing a few validations, at [3] the function will call the `isc_nm_read` function with a callback, `dnslisten_readcb`, as its second parameter. This callback will be stored into a structure and then later passed to `libuv` in order to inform the library what to call when the server needs to read data from a connected TCP client.
```c
lib/isc/netmgr/tcpdns.c:98
/*
 * Accept callback for TCP-DNS connection.
 */
static void
dnslisten_acceptcb(isc_nmhandle_t *handle, isc_result_t result, void *cbarg) {
	isc_nmsocket_t *dnslistensock = (isc_nmsocket_t *)cbarg;
	isc_nmsocket_t *dnssock = NULL;

	REQUIRE(VALID_NMSOCK(dnslistensock));
	REQUIRE(dnslistensock->type == isc_nm_tcpdnslistener);

	/* If accept() was unsuccessful we can't do anything */
	if (result != ISC_R_SUCCESS) {
		return;
	}
	...
	isc_nm_read(handle, dnslisten_readcb, dnssock); // [3] Pass the dnslisten_readcb callback for reading as a parameter
```
Inside the `isc_nm_read` function, the server will then assign the `dnslisten_readcb` callback that was passed as its second parameter into a `isc_nmsocket_t` structure at [4] as `sock->rcb.recv`. After preparing the `isc_nmsocket_t` structure, the server will fetch a new event and then assign the `isc_nmsocket_t` into it. Eventually at [5], the function will pass the event as the second parameter to the `isc__nm_async_startread` function. The `isc__nm_async_startread` function is directly responsible for calling into the `libuv` library with the necessary callbacks in order for the server to process any received DNS packets.
```c
lib/isc/netmgr/tcp.c:521
isc_result_t
isc_nm_read(isc_nmhandle_t *handle, isc_nm_recv_cb_t cb, void *cbarg) {
	isc_nmsocket_t *sock = NULL;
	isc__netievent_startread_t *ievent = NULL;
	...
	sock = handle->sock;
	sock->rcb.recv = cb; // [4] Store callback into sock
	sock->rcbarg = cbarg;
	...
	ievent = isc__nm_get_ievent(sock->mgr, netievent_tcpstartread);
	ievent->sock = sock; // Assign sock into the event

	if (sock->tid == isc_nm_tid()) {
		isc__nm_async_startread(&sock->mgr->workers[sock->tid], // [5] Pass the event containing the callback as the second parameter
					(isc__netievent_t *)ievent);
		isc__nm_put_ievent(sock->mgr, ievent);
	} else {
	...
	return (ISC_R_SUCCESS);
}
```
Once inside the `isc__nm_async_startread` function, the `isc_nmsocket_t` will then be extracted from a field belonging to the event. After starting a timer with `libuv` in order to determine when to timeout the connection, the server will execute the `uv_read_start` function at [6]. The `uv_read_start` function belongs to `libuv` and is used in order to inform the library which callbacks to use when allocating space for the receive buffer during processing of a TCP stream and which callback to actually use for processing the data from the buffer that was received. The vulnerability referred to by this document is specifically due to the way these two callbacks are implemented by the server.
```c
lib/isc/netmgr/tcp.c:548
void
isc__nm_async_startread(isc__networker_t *worker, isc__netievent_t *ev0) {
	isc__netievent_startread_t *ievent = (isc__netievent_startread_t *)ev0;
	isc_nmsocket_t *sock = ievent->sock;
	int r;
	...
	r = uv_read_start(&sock->uv_handle.stream, isc__nm_alloc_cb, read_cb); // [6] Use libuv to assign callbacks for reading
	...
}
```
When the `libuv` library needs the server to allocate a buffer to receive packet data into, it will call the following function. This function's responsibility is to allocate a buffer to receive packet data into, and then write the buffer along with its length into one of the function's parameters. The `libuv` library provides the `uv_buf_t` object to modify and a suggested size in its arguments. The implementation chosen by the server was to preallocate the read buffer for each worker during the setup process of the worker. Therefore in this function, the server will only need to assign the preallocated buffer and its size at [7] which prevents needing to allocate during the receiving of a packet.
```c
lib/isc/netmgr/netmgr.c:972
void
isc__nm_alloc_cb(uv_handle_t *handle, size_t size, uv_buf_t *buf) {
	isc_nmsocket_t *sock = uv_handle_get_data(handle);
	isc__networker_t *worker = NULL;

	REQUIRE(VALID_NMSOCK(sock));
	REQUIRE(isc__nm_in_netthread());
	REQUIRE(size <= ISC_NETMGR_RECVBUF_SIZE);

	worker = &sock->mgr->workers[sock->tid];
	INSIST(!worker->recvbuf_inuse);

	buf->base = worker->recvbuf; // [7] Assign worker's receive buffer into buf->base
	worker->recvbuf_inuse = true;
	buf->len = ISC_NETMGR_RECVBUF_SIZE; // [7] Assign the length for the worker's receive buffer.
}
```
```
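The preallocated-buffer pattern above can be sketched without any `libuv` dependency; the simplified `worker_t` and `buf_t` types below are illustrative stand-ins for `isc__networker_t` and `uv_buf_t`, not BIND's actual definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RECVBUF_SIZE (20 * 65536)

typedef struct {
	unsigned char *recvbuf;  /* allocated once, at worker startup */
	bool recvbuf_inuse;
} worker_t;

typedef struct {            /* stands in for uv_buf_t */
	char *base;
	size_t len;
} buf_t;

/* Allocation callback in the style of isc__nm_alloc_cb: hand out the
 * worker's single preallocated buffer and mark it busy, so no malloc()
 * happens on the hot receive path. Note that the full RECVBUF_SIZE is
 * reported, regardless of the suggested size. */
static void
alloc_cb(worker_t *worker, size_t suggested, buf_t *buf) {
	assert(suggested <= RECVBUF_SIZE);
	assert(!worker->recvbuf_inuse); /* mirrors INSIST(!worker->recvbuf_inuse) */
	buf->base = (char *)worker->recvbuf;
	buf->len = RECVBUF_SIZE;        /* full 20 * 64k, not `suggested` */
	worker->recvbuf_inuse = true;
}
```

Because the reported length is the full 20 * 64k buffer, the read callback can later be handed up to that many bytes in a single invocation, which is the crux of the bug described below.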
The size used when allocating the receive buffer, which is also assigned as the buffer length for `libuv` to use, is defined in the following file. As the comment describes, this length is taken from the `libuv` source: the maximum message size multiplied by the `recvmmsg` batch width on POSIX platforms. Because a smaller buffer size is used on Windows, the vulnerability described by this document does not affect that platform.
```c
lib/isc/netmgr/netmgr-int.h:38
#if !defined(WIN32)
/*
* New versions of libuv support recvmmsg on unices.
* Since recvbuf is only allocated per worker allocating a bigger one is not
* that wasteful.
* 20 here is UV__MMSG_MAXWIDTH taken from the current libuv source, nothing
* will break if the original value changes.
*/
#define ISC_NETMGR_RECVBUF_SIZE (20 * 65536)
#else
#define ISC_NETMGR_RECVBUF_SIZE (65536)
#endif
```
After the `libuv` library has dispatched to the allocation callback in order to allocate a buffer to read packet data into, the library can now execute the callback that is responsible for processing the actual data from the packet. The server implements this with the following `read_cb` function. This function will simply take the `uv_buf_t` and length that is passed as parameters to assign them into an `isc_region_t` at [8]. After initializing the `isc_region_t`, the server will then dispatch to the `dnslisten_readcb` callback at [9] that was previously assigned in the `isc_nm_read` function.
```c
lib/isc/netmgr/tcp.c:639
static void
read_cb(uv_stream_t *stream, ssize_t nread, const uv_buf_t *buf) {
	isc_nmsocket_t *sock = uv_handle_get_data((uv_handle_t *)stream);

	REQUIRE(VALID_NMSOCK(sock));
	REQUIRE(buf != NULL);

	if (nread >= 0) {
		isc_region_t region = { .base = (unsigned char *)buf->base, // [8] Initialize the isc_region_t with the buffer and size from libuv
					.length = nread };

		if (sock->rcb.recv != NULL) {
			sock->rcb.recv(sock->tcphandle, &region, sock->rcbarg); // [9] Pass the region to the worker's callback
		}
	...
```
The `dnslisten_readcb` function is responsible for taking the packet data that was dispatched as a parameter by `libuv` and aggregating it into a buffer containing a full DNS packet conforming to RFC 1035. This is done by taking the packet data and its length from the `region` parameter of type `isc_region_t`, which was initialized by the calling function, and using them to grow a buffer that will later be processed. At [10], the packet data and its length are extracted from the `isc_region_t` and assigned into local variables. The length is then used to check whether the current packet buffer is large enough to fit the newly read data from the TCP socket. If the sum of the current buffer length and the number of bytes read is larger than the buffer size, then at [11] the server uses the `alloc_dnsbuf` function to reallocate the buffer to the calculated size. After the resize, at [12] the server copies the new packet data from the `isc_region_t` directly into the current packet buffer and then processes it at [13].
```c
lib/isc/netmgr/tcpdns.c:198
static void
dnslisten_readcb(isc_nmhandle_t *handle, isc_region_t *region, void *arg) {
	isc_nmsocket_t *dnssock = (isc_nmsocket_t *)arg;
	unsigned char *base = NULL;
	bool done = false;
	size_t len;
	...
	base = region->base; // [10] Extract the libuv buffer and its length from the region
	len = region->length;

	if (dnssock->buf_len + len > dnssock->buf_size) {
		alloc_dnsbuf(dnssock, dnssock->buf_len + len); // [11] Allocate the DNS buffer if it is too small
	}
	memmove(dnssock->buf + dnssock->buf_len, base, len); // [12] Copy new packet data to the end of the current packet buffer
	dnssock->buf_len += len;
	...
	do {
		isc_result_t result;
		isc_nmhandle_t *dnshandle = NULL;

		result = processbuffer(dnssock, &dnshandle); // [13] Process the contents of the packet data
	...
```
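The framing that `processbuffer` has to peel off can be sketched as follows. Per RFC 1035, each DNS message on TCP is preceded by a two-byte big-endian length, so a single large `libuv` read may carry many messages at once; this is a simplified illustration, not the BIND implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the size of the first complete frame in buf (two-byte length
 * prefix plus message body), or 0 if more bytes are still needed. */
static size_t
first_frame(const uint8_t *buf, size_t buf_len) {
	if (buf_len < 2) {
		return 0; /* not even the length prefix yet */
	}
	size_t msglen = ((size_t)buf[0] << 8) | buf[1];
	if (buf_len < 2 + msglen) {
		return 0; /* partial message: keep accumulating */
	}
	return 2 + msglen;
}
```

A framed message can be at most 2 + 65535 bytes, which is why the accumulation buffer was sized for two full messages; the problem arises when a single read delivers far more than that.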
When reallocating the packet buffer, the following `alloc_dnsbuf` function is used. The comment above the function indicates that the `NM_BIG_BUF` size of two full framed DNS packets should be enough. However, the length that `libuv` passes to the read callback is bounded by the worker receive buffer assigned in `isc__nm_alloc_cb`, which is `20 * 64k` and is handed to the `read` system call by the `libuv` library. As a result, the `len` value passed to this function can be up to 0x140000 bytes. At [14], an assertion validates that the length does not exceed `NM_BIG_BUF`. If it does, the assertion logs itself and calls the `abort` library function, directly terminating the server and resulting in a denial of service condition.
```c
lib/isc/netmgr/tcpdns.c:58
/*
 * Two full DNS packets with lengths.
 * netmgr receives 64k at most so there's no risk
 * of overrun.
 */
#define NM_BIG_BUF (65535 + 2) * 2
static inline void
alloc_dnsbuf(isc_nmsocket_t *sock, size_t len) {
	REQUIRE(len <= NM_BIG_BUF); // [14] Assertion
	if (sock->buf == NULL) {
		/* We don't have the buffer at all */
		size_t alloc_len = len < NM_REG_BUF ? NM_REG_BUF : NM_BIG_BUF;
		sock->buf = isc_mem_allocate(sock->mgr->mctx, alloc_len);
		sock->buf_size = alloc_len;
	} else {
		/* We have the buffer but it's too small */
		sock->buf = isc_mem_reallocate(sock->mgr->mctx, sock->buf,
					       NM_BIG_BUF);
		sock->buf_size = NM_BIG_BUF;
	}
}
```
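The mismatch between the two limits can be checked with plain arithmetic; the figures below match the backtrace later in this report (`len=1310749`, `nread=1310720`):

```c
#include <assert.h>
#include <stddef.h>

#define ISC_NETMGR_RECVBUF_SIZE (20 * 65536)  /* 0x140000 = 1310720 */
#define NM_BIG_BUF ((65535 + 2) * 2)          /* 131074 */

/* One maximal libuv read plus the 29 leftover bytes seen in the crash
 * gives the len value that reaches alloc_dnsbuf(). */
static const size_t crash_len = ISC_NETMGR_RECVBUF_SIZE + 29; /* 1310749 */

/* REQUIRE(len <= NM_BIG_BUF) fails: 1310749 > 131074, so named aborts. */
```

In other words, a single TCP burst only needs to exceed 131074 accumulated bytes for the `REQUIRE` to fire; the receive buffer allows ten times that in one callback.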
### Crash Information
First we attach to the process, and then resume its execution.
```
$ gdb -p `pgrep named`
(gdb) c
Continuing.
```
After running the provided proof-of-concept, gdb will break due to the `SIGABRT` signal that was raised by the assertion.
```
(gdb)
Thread 5 "isc-net-0003" received signal SIGABRT, Aborted.
[Switching to Thread 0x7f95a443a700 (LWP 7)]
0x00007f95a822a18b in raise () from target:/lib/x86_64-linux-gnu/libc.so.6
```
The following backtrace is at the time of the signal being dispatched to the process.
```
(gdb) bt
#0 0x00007f95a822a18b in raise () from target:/lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f95a8209859 in abort () from target:/lib/x86_64-linux-gnu/libc.so.6
#2 0x0000563409bc75c6 in assertion_failed (file=0x563409f56eff "tcpdns.c", line=66, type=isc_assertiontype_require,
cond=0x563409f56ee8 "len <= (65535 + 2) * 2") at ./main.c:260
#3 0x0000563409e83070 in isc_assertion_failed (file=0x563409f56eff "tcpdns.c", line=66, type=isc_assertiontype_require,
cond=0x563409f56ee8 "len <= (65535 + 2) * 2") at assertions.c:46
#4 0x0000563409ea9453 in alloc_dnsbuf (sock=0x7f9598ce6be0, len=1310749) at tcpdns.c:66
#5 0x0000563409ea9d2a in dnslisten_readcb (handle=0x7f957c003180, region=0x7f95a44369b0, arg=0x7f9598ce6be0) at tcpdns.c:223
#6 0x0000563409ea696a in read_cb (stream=0x7f9598ce6920, nread=1310720, buf=0x7f95a4436a20) at tcp.c:651
#7 0x00007f95a841bad1 in ?? () from target:/lib/x86_64-linux-gnu/libuv.so.1
#8 0x00007f95a841c608 in ?? () from target:/lib/x86_64-linux-gnu/libuv.so.1
#9 0x00007f95a8421ab0 in uv__io_poll () from target:/lib/x86_64-linux-gnu/libuv.so.1
#10 0x00007f95a84117ac in uv_run () from target:/lib/x86_64-linux-gnu/libuv.so.1
#11 0x0000563409ea0bc2 in nm_thread (worker0=0x56340b6dd048) at netmgr.c:481
#12 0x00007f95a83e5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#13 0x00007f95a8306103 in clone () from target:/lib/x86_64-linux-gnu/libc.so.6
```
In frame 5 belonging to the `dnslisten_readcb` function, we can see that the region length corresponds directly to the value defined for `ISC_NETMGR_RECVBUF_SIZE`.
```
(gdb) frame 5
#5 0x0000563409ea9d2a in dnslisten_readcb (handle=0x7f957c003180, region=0x7f95a44369b0, arg=0x7f9598ce6be0) at tcpdns.c:223
223 alloc_dnsbuf(dnssock, dnssock->buf_len + len);
(gdb) p *region
$3 = {base = 0x7f95a443b010 "l", length = 1310720}
```
The current buffer size that is to be grown is only 0x20002 bytes.
```
(gdb) p dnssock.buf_size
$7 = 131074
(gdb) p dnssock.buf_len
$8 = 29
```
### Exploit Proof of Concept
To use the provided proof-of-concept, it must first be modified. Change both the DST_IP and DST_PORT variables to point to the host the BIND daemon is listening on, and then run it with Python 2.x.
### Mitigation
Flood protection could mitigate this denial of service if configured properly.
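For instance, BIND's `tcp-clients` option caps the number of simultaneous TCP client connections the server will service; the value below is only an example, and this limits concurrent TCP load rather than fixing the underlying assertion:

```
options {
	// Cap simultaneous TCP client connections (example value).
	tcp-clients 100;
};
```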
### Credit
Discovered by Emanuel Almeida of Cisco Systems, Inc.
https://talosintelligence.com/vulnerability_reports/
### Timeline
None - Vendor Disclosure
None - Public Release

Fixed in: August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4) (assignee: Ondřej Surý)

---

https://gitlab.isc.org/isc-projects/bind9/-/issues/1775
Resizing (growing) of cache hash tables causes delays in processing of client queries (Cathy Almond)

From [Support ticket #16212](https://support.isc.org/Ticket/Display.html?id=16212)

During investigations of intermittent 'brownouts' - periods in which named seemingly stops actioning client queries for a short period and then resumes processing a second or two later (yes, delays of seconds, not milliseconds) - we 'caught' one culprit red-handed in a pstack run that was automatically triggered by an 'alarm' monitoring inbound and outbound server traffic rates.
The thread in question was holding the cache tree lock, while growing the hash table:
```
Thread 21 (Thread 0x7f54d8b2f700 (LWP 19115)):
#0 0x000000000052bc7b in rehash (rbt=0x7f54b8c04058, newcount=<optimized out>) at rbt.c:2376
#1 0x000000000052da99 in hash_node (name=0x7f53d9562bb0, node=0x7f541cf79538, rbt=0x7f54b8c04058) at rbt.c:2389
#2 dns_rbt_addnode (rbt=0x7f54b8c04058, name=0x7f53d9562bb0, nodep=0x7f54d8b2dd28) at rbt.c:1451
#3 0x00000000005367ef in rbt_addnode_withdata (rbtdb=0x7f54b8c03010, rbt=0x7f54b8c04058, name=<optimized out>, nodep=0x7f54d8b2dd28) at rbtdb.c:2016
#4 0x000000000053ba42 in findnodeintree (rbtdb=0x7f54b8c03010, tree=0x7f54b8c04058, name=0x7f53d9562bb0, create=true, nodep=0x7f54d8b2ed30) at rbtdb.c:3339
#5 0x00000000005babb5 in cache_name (now=1587326409, zerottl=false, name=0x7f53d9562bb0, section=1, query=0x7f54600100d0, fctx=0x7f5449e172d0) at resolver.c:5876
#6 cache_message (now=1587326409, zerottl=false, query=0x7f54600100d0, fctx=0x7f5449e172d0) at resolver.c:6336
#7 resquery_response (task=0x7f5387cbb628, event=<optimized out>) at resolver.c:9166
#8 0x000000000068a8b1 in dispatch (manager=0x7f54dedc7010) at task.c:1157
#9 run (uap=0x7f54dedc7010) at task.c:1331
#10 0x00007f54dd90cdd5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f54dd635ead in clone () from /lib64/libc.so.6
```
The other cause of similar problems is growing the ADB tables - that one, however, is logged, whereas 'rehash' (and anything that calls it) does not own up via logging to what it is doing.
Our immediate quick-fix wish is for a solution to the delays caused by growing hash tables that is along the lines of being able to specify the starting size as named is launched. This needs to be either run-time or configurable in named.conf. (It is *not* helpful to make it build-time only because in many environments there will be a single build that is distributed to many servers whose needs/sizing can vary.)
It would also be really helpful if any hash table growing could be logged - to include what the size is expanding to (this will help admins to tune their servers accordingly).
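To make the cost concrete, here is a generic sketch (not BIND's `rbt.c`) of why a grow operation blocks: every existing node must be re-bucketed into the new array while the table lock is held, so the stall is proportional to the number of cached nodes - and pre-sizing the table at startup avoids these stalls entirely:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
	unsigned hash;
	struct node *next;
} node_t;

typedef struct {
	node_t **buckets;
	size_t nbuckets; /* power of two */
} table_t;

/* Double the bucket array. This is O(number of nodes) work performed
 * in one go; in a cache server it runs with the tree lock held, which
 * is the 'brownout' observed in the pstack trace above. */
static void
rehash(table_t *t) {
	size_t newcount = t->nbuckets * 2;
	node_t **nb = calloc(newcount, sizeof(node_t *));

	for (size_t i = 0; i < t->nbuckets; i++) {
		node_t *n = t->buckets[i];
		while (n != NULL) {
			node_t *next = n->next;
			size_t b = n->hash & (newcount - 1);
			n->next = nb[b];
			nb[b] = n;
			n = next;
		}
	}
	free(t->buckets);
	t->buckets = nb;
	t->nbuckets = newcount;
}
```

A configurable starting size, as requested above, simply means `nbuckets` begins large enough that this function is rarely (or never) reached in production.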
====
Longer term, I understand that the wish is to replace the current and now fairly ancient hashing solution with something more modern, faster, and in particular, that doesn't need to block access when resizing - I'll leave engineering to open a new and independent ticket for that. For the here and now, we need a quicker fix, not a new development feature that can't be back-ported or easily applied.

Fixed in: August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4) (assignee: Ondřej Surý)