'named' can trigger assertion failure in resolver.c due to uninitialized link data
Summary
Introduced by commit ac889684, "Unlink the query under cleanup_query".
At the four places where "goto cleanup_query" is used, "ISC_LINK_INIT(query.link)" has not been called yet, leaving query->link.next and query->link.prev uninitialized. ISC_LINK_LINKED checks if link.prev != ISC_LINK_TOMBSTONE which it is (its 0, not -1), thus unlinking is tried though no links are there. ISC_LIST_UNLINK then triggers the assertion.
BIND version used
9.18.16-1~deb12u1 (debian packet on debian bookworm)
Also affected: versions 9.18.9-9.18.19
Steps to reproduce
Run i.e. "query_perf" with a data file triggering one of the gotos. These are all code paths in which a dns_dispatch(something) does NOT return ISC_R_SUCCESS, thus the INSIST will only be hit in (IMHO) non-trivial "abnormal" situations.
I hit this testing a bind version 9.18.16 (debian bookworm packet) in a setup where some zones can not be loaded from primaries because of the test setup missing transfer access, and entries in my data file hitting these "resolver black holes". (i am not REALLY sure that that is the cause hitting the assert, but looking at the code I think this is a clear bug.
Because this could potentially used to crash named I am marking this confidential.
What is the current bug behavior?
if one of the "gotos" is used, named hits the assertion and exits.
What is the expected correct behavior?
named should cope with all kinds of unreachable things.
Relevant logs and/or screenshots
(Paste any relevant logs - please use code blocks (```) to format console output, logs, and code, as it's very hard to read otherwise.)
18-Sep-2023 17:04:51.575 resolver.c:2308: INSIST((fctx->queries).tail == (query)) failed, back trace
18-Sep-2023 17:04:51.575 /usr/sbin/named(+0x23774) [0x55e0c5262774]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libisc-9.18.16-1~deb12u1-Debian.so(isc_assertion_failed+0xa) [0x7f2151839a4a]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libdns-9.18.16-1~deb12u1-Debian.so(+0x11604d) [0x7f215151604d]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libdns-9.18.16-1~deb12u1-Debian.so(+0x118ba6) [0x7f2151518ba6]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libdns-9.18.16-1~deb12u1-Debian.so(+0x11b9d2) [0x7f215151b9d2]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libdns-9.18.16-1~deb12u1-Debian.so(+0x11c4b8) [0x7f215151c4b8]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libdns-9.18.16-1~deb12u1-Debian.so(+0x538e5) [0x7f21514538e5]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libisc-9.18.16-1~deb12u1-Debian.so(isc__nm_async_readcb+0xac) [0x7f21518249dc]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libisc-9.18.16-1~deb12u1-Debian.so(isc__nm_readcb+0xa8) [0x7f2151824b18]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libisc-9.18.16-1~deb12u1-Debian.so(+0x34ef0) [0x7f2151834ef0]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libisc-9.18.16-1~deb12u1-Debian.so(isc__nm_udp_read_cb+0x46) [0x7f2151836636]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libuv.so.1(+0x1f14c) [0x7f215176a14c]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libuv.so.1(+0x22e3c) [0x7f215176de3c]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libuv.so.1(uv_run+0xc4) [0x7f215175a9e4]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libisc-9.18.16-1~deb12u1-Debian.so(+0x27634) [0x7f2151827634]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libisc-9.18.16-1~deb12u1-Debian.so(isc__trampoline_run+0x15) [0x7f2151861415]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libc.so.6(+0x89044) [0x7f2150afc044]
18-Sep-2023 17:04:51.575 /lib/x86_64-linux-gnu/libc.so.6(+0x1095fc) [0x7f2150b7c5fc]
18-Sep-2023 17:04:51.575 exiting (due to assertion failure)
Possible fixes
I checked this patch and it seams to aliviate the probem: Unlink the query ABOVE "cleanup_query:".
--- bind9-9.18.19.orig/lib/dns/resolver.c
+++ bind9-9.18.19/lib/dns/resolver.c
@@ -2298,7 +2298,6 @@ cleanup_dispatch:
dns_dispatch_detach(&query->dispatch);
}
-cleanup_query:
LOCK(&res->buckets[fctx->bucketnum].lock);
if (ISC_LINK_LINKED(query, link)) {
atomic_fetch_sub_release(&fctx->nqueries, 1);
@@ -2306,6 +2305,7 @@ cleanup_query:
}
UNLOCK(&res->buckets[fctx->bucketnum].lock);
+cleanup_query:
query->magic = 0;
dns_message_detach(&query->rmessage);
isc_mem_put(fctx->mctx, query, sizeof(*query));