ISC Open Source Projects issues
https://gitlab.isc.org/groups/isc-projects/-/issues
2020-07-16T07:31:53Z

https://gitlab.isc.org/isc-projects/bind9/-/issues/1995
gssapictx.c:681:10: error: implicit declaration of function 'gsskrb5_register_acceptor_identity' on illumos
2020-07-16T07:31:53Z
Michal Nowak

On OpenIndiana 2020.04 (an illumos distribution) compilation of BIND `main` commit 78a4ed31322271ff324994ab058b8448ae4a2252 fails in `lib/dns/gssapictx.c` with:
```
gssapictx.c: In function 'dst_gssapi_acceptctx':
gssapictx.c:681:10: error: implicit declaration of function 'gsskrb5_register_acceptor_identity' [-Werror=implicit-function-declaration]
gret = gsskrb5_register_acceptor_identity(gssapi_keytab);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Perhaps `#if defined(ISC_PLATFORM_GSSAPI_KRB5_HEADER)` around `gsskrb5_register_acceptor_identity()` is missing (like `v9_16` has)?
```patch
--- a/lib/dns/gssapictx.c
+++ b/lib/dns/gssapictx.c
@@ -678,6 +678,7 @@ dst_gssapi_acceptctx(gss_cred_id_t cred, const char *gssapi_keytab,
 	}
 	if (gssapi_keytab != NULL) {
+#if defined(ISC_PLATFORM_GSSAPI_KRB5_HEADER) || defined(WIN32)
 		gret = gsskrb5_register_acceptor_identity(gssapi_keytab);
 		if (gret != GSS_S_COMPLETE) {
 			gss_log(3,
@@ -687,6 +688,27 @@ dst_gssapi_acceptctx(gss_cred_id_t cred, const char *gssapi_keytab,
 				gss_error_tostring(gret, 0, buf, sizeof(buf)));
 			return (DNS_R_INVALIDTKEY);
 		}
+#else /* if defined(ISC_PLATFORM_GSSAPI_KRB5_HEADER) || defined(WIN32) */
+		/*
+		 * Minimize memory leakage by only setting KRB5_KTNAME
+		 * if it needs to change.
+		 */
+		const char *old = getenv("KRB5_KTNAME");
+		if (old == NULL || strcmp(old, gssapi_keytab) != 0) {
+			size_t size;
+			char *kt;
+
+			size = strlen(gssapi_keytab) + 13;
+			kt = malloc(size);
+			if (kt == NULL) {
+				return (ISC_R_NOMEMORY);
+			}
+			snprintf(kt, size, "KRB5_KTNAME=%s", gssapi_keytab);
+			if (putenv(kt) != 0) {
+				return (ISC_R_NOMEMORY);
+			}
+		}
+#endif /* if defined(ISC_PLATFORM_GSSAPI_KRB5_HEADER) || defined(WIN32) */
 	}
 	log_cred(cred);
```

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1994
netscope.c:23:50: error: unused parameter 'addr' when HAVE_IF_NAMETOINDEX undefined on illumos
2020-08-05T13:09:26Z
Michal Nowak

On OpenIndiana 2020.04 (an illumos distribution) compilation of BIND `main` commit 78a4ed31322271ff324994ab058b8448ae4a2252 fails in `lib/isc/netscope.c` with:
```
netscope.c: In function 'isc_netscope_pton':
netscope.c:23:50: error: unused parameter 'addr' [-Werror=unused-parameter]
isc_netscope_pton(int af, char *scopename, void *addr, uint32_t *zoneid) {
^~~~
cc1: all warnings being treated as errors
```
It seems that `addr` is used only when `HAVE_IF_NAMETOINDEX` is defined.

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1993
check.c:1576:37: error: expected identifier before numeric constant on illumos
2020-07-13T22:52:29Z
Michal Nowak

On OpenIndiana 2020.04 (an illumos distribution) compilation of BIND `v9_16` commit 38ca3fbcdc5c02e2f985d5ff8937d473b50d6aef fails in `lib/bind9/check.c` with:
```
libtool: compile: /usr/gcc/7/bin/gcc -include /export/home/newman/bind9/config.h -I/export/home/newman/bind9 -I../.. -I. -I/export/home/newman/bind9/lib/bind9/include -I../../lib/bind9/include -I/export/home/newman/bind9/lib/dns/include -I../../lib/dns/include -I/export/home/newman/bind9/lib/isc/include -I../../lib/isc -I../../lib/isc/include -I../../lib/isc/unix/include -I../../lib/isc/pthreads/include -I/export/home/newman/bind9/lib/isccfg/include -I../../lib/isccfg/include -I/export/home/newman/bind9/lib/ns/include -I../../lib/ns/include -DISC_MEM_DEFAULTFILL=1 -DISC_LIST_CHECKINIT=1 -m64 -O3 -D_XOPEN_SOURCE=600 -D__EXTENSIONS__=1 -D_XPG6 -D_POSIX_PTHREAD_SEMANTICS -pthread -fPIC -W -Wall -Wmissing-prototypes -Wcast-qual -Wwrite-strings -Wformat -Wpointer-arith -Wno-missing-field-initializers -fno-strict-aliasing -Wshadow -Werror -c check.c -fPIC -DPIC -o .libs/check.o
In file included from /usr/include/sys/select.h:53:0,
from /usr/include/sys/types.h:640,
from /usr/include/sys/wait.h:37,
from /usr/include/stdlib.h:45,
from check.c:16:
check.c: In function 'check_options':
check.c:1576:37: error: expected identifier before numeric constant
enum { MAS = 1, PRI = 2, SLA = 4, SEC = 8 } values = 0;
^
gmake[2]: *** [Makefile:273: check.lo] Error 1
```
It turns out that `/usr/include/sys/time.h` has `SEC` [defined](https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/sys/time.h#L242).
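The clash can be reproduced without the illumos headers. The sketch below is only an illustration: it defines its own stand-in `SEC` macro (the value is a placeholder), and `mark_seen()` and the enum tag `zone_type_seen` are hypothetical names that do not appear in `check.c`.

```c
#include <assert.h>

/*
 * Stand-in for the system header. On illumos, <sys/time.h> defines a macro
 * named SEC; the value used here is only a placeholder for illustration.
 */
#define SEC 1

/*
 * With SEC defined as a macro, the preprocessor rewrites
 *     enum { MAS = 1, PRI = 2, SLA = 4, SEC = 8 } values = 0;
 * into
 *     enum { MAS = 1, PRI = 2, SLA = 4, 1 = 8 } values = 0;
 * and the compiler reports "expected identifier before numeric constant".
 * An enumerator name that no header claims (SCN here) avoids the clash.
 */
enum zone_type_seen { MAS = 1, PRI = 2, SLA = 4, SCN = 8 };

/* Hypothetical helper: set one flag in a bit set and return the result. */
unsigned int
mark_seen(unsigned int seen, enum zone_type_seen flag) {
    return (seen | (unsigned int)flag);
}
```

An `#undef SEC` before the enum would also compile, but it silently changes what later code in the same file sees, so picking a different enumerator name is the less surprising choice.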
Renaming `SEC` to `SCN` in `lib/bind9/check.c` does the trick. The introduction of `SEC` was in dca3658720cfb7f40b75e418a87d85552fe2a09c.

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1991
Cleanup redundant non-NULL check.
2020-07-06T00:30:57Z
Mark Andrews

```
1407 if (sigrdataset != NULL) {
1408 putrdataset(client->mctx, &sigrdataset);
1409 }
CID 288001 (#1 of 1): Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking rctx suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
1410 if (rctx != NULL) {
1411 isc_mutex_destroy(&rctx->lock);
1412 isc_mem_put(mctx, rctx, sizeof(*rctx));
1413 }
```

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Mark Andrews

https://gitlab.isc.org/isc-projects/bind9/-/issues/1990
Bad isc_mem_put size.
2020-07-16T07:05:26Z
Mark Andrews

```
331 result = dns_rdatatype_fromtext(&types[i++].type, &r);
332 if (result != ISC_R_SUCCESS) {
333 cfg_obj_log(identity, named_g_lctx,
334 ISC_LOG_ERROR,
335 "'%.*s' is not a valid type",
336 (int)r.length, str);
CID 302775 (#1 of 1): Sizeof not portable (SIZEOF_MISMATCH)
suspicious_sizeof: Passing argument types of type dns_ssuruletype_t * and argument n * 8UL /* sizeof (types) */ to function isc__mem_put is suspicious. In this case, sizeof (dns_ssuruletype_t *) is equal to sizeof (dns_ssuruletype_t), but this is not a portable assumption.
Did you intend to use sizeof (*types) instead of sizeof (types)?
337 isc_mem_put(mctx, types, n * sizeof(types));
338 goto cleanup;
339 }
```

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Mark Andrews

https://gitlab.isc.org/isc-projects/bind9/-/issues/1989
'rndc dnstap --roll' with too big a argument (>128) can cause a buffer overflow.
2020-08-04T11:14:51Z
Mark Andrews

```
1158 if (versions > 0) {
1159 /*
1160 * First we fill 'to_keep' structure using insertion sort
1161 */
5. index_parm: Indexing array of size 2048 with versions minus an offset in call to memset. [Note: The source code implementation of the function has been overridden by a builtin model.]
1162 memset(to_keep, 0, versions * sizeof(long long));
1163 while (isc_dir_read(&dir) == ISC_R_SUCCESS) {
```

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1988
bad output rndc dnssec -status on Windows
2020-07-16T07:03:47Z
Matthijs Mekking (matthijs@isc.org)

The following discussion from !3780 should be addressed:
- [ ] @michal started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/3780#note_144605): (+1 comment)
> This looks fine to me, though I started a pipeline including Windows
> system tests that I would like to complete successfully before merging
> this MR:
>
> https://gitlab.isc.org/isc-projects/bind9/pipelines/45755

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Matthijs Mekking (matthijs@isc.org)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1976
LMDB 0.9.26 will break "rndc reconfig" (+ other LMDB issues)
2020-08-20T19:36:01Z
Michał Kępień

The way BIND uses LMDB is at odds with what the authors of that library
expect and intend. ~~While I cannot find any mention of that
recommendation in the [docs][1], the LMDB author himself [says][2] that
"you should only ever open an environment once in any particular
process".~~[^1] BIND calls `mdb_env_open()` and `mdb_env_close()`
[multiple][3] [times][4] within the lifetime of a single process, but
AFAICT, doing that in an "open → close → open → close → ..." type of
sequence works just fine. Unfortunately, BIND does something worse and
it is related to what happens during an `rndc reconfig`: new views are
configured first and only after that happens, the old views get torn
down. For LMDB, this means that we call `mdb_env_open()` for a
previously `mdb_env_open()`ed LMDB environment *from the same process*
and then close the old "instance" of the environment. I am not sure we
ever cared about this a lot because it seemed to Just Work™. It did
[bite us][5] once (see 40a90fbf89738c1aa867a5f09ef7243ef3ae52e4), but we
worked around the problem and moved on.

Still, the [docs][6] clearly state:

> Do not have open an LMDB database twice in the same process at the
> same time.

We are not getting away with it as easily this time around.

In December 2019, the FreeBSD port for LMDB [started using robust
mutexes][7], which FreeBSD started supporting in the 11.0 release. This
broke LMDB-related BIND system tests on FreeBSD. I investigated it and
my conclusion was that the problem was likely caused by some low-level
FreeBSD issue that was over my head. I [reported it][8] and it was
ultimately determined to be an [undefined-behavior-type issue][9] with
what the FreeBSD threading library does when a mutex is unmapped from
memory without being destroyed first. This prompted LMDB maintainers to
merge a [fix][10] which causes LMDB to destroy locktable mutexes when
`mdb_env_close()` is called and the process calling that function is the
only remaining user of the LMDB environment. This change has been
already [applied][11] as a patch to the FreeBSD port of LMDB 0.9.24 and
I fully expect it to be a part of the next LMDB release, i.e. version
0.9.26, as it has also been [merged][12] into the LMDB 0.9 release
branch.

The problem is that the aforementioned fix breaks `rndc reconfig` on
*all* platforms we test on because it breaks the kludge we have been
effectively relying on so far. This is what we do (note that all calls
are for the same LMDB environment on disk, even though the pointers used
below - `envA` and `envB` - are different!):

1. `mdb_env_open(envA)`
2. (`rndc reconfig` is invoked)
3. `mdb_env_open(envB)`
4. `mdb_env_close(envA)`

Since all of this happens within a single process, the `mdb_env_close()`
call from step 4 destroys the locktable mutexes (because it correctly
observes that the current process is the only remaining user of the LMDB
environment at hand), which prevents any subsequent `mdb_txn_begin()`
calls from succeeding. Here is an example system test job which
triggers the problem:

https://gitlab.isc.org/isc-projects/bind9/-/jobs/975003

This problem affects all maintained BIND branches.

The only way I can see to work around the problem yet again without
redesigning the whole thing from the ground up is to employ `MDB_NOLOCK`
and use a mutex for controlling concurrent access to the LMDB database
ourselves. What saves us here is that we already [have][13] a mutex
handy and we can just broaden its scope without bumping the API versions
for our libraries in 9.11. I will submit a merge request implementing
this workaround shortly.

Honestly, though, I am afraid that this will just be another bandaid.
Call me an Eastern European grumbler, but I am not happy with the way
LMDB support has been implemented in BIND. We seem to have [chosen][14]
LMDB because it was apparently performing slightly better than SQLite 3.
The thing is, I do not think our use case needs fast *concurrent* access
to the database; we need something that allows us to add, remove, and
query zone configuration *faster than scanning a flat file sequentially*
(which is what pretty much any sane database should be capable of).
LMDB lives up to its promises about speed, but it comes with a set of
caveats that we need to cater for, which complicates our code given how
BIND works. To make things worse, our implementation of LMDB uses
`#ifdef` guards, which means it shares *some* of the code with the
non-LMDB variant (NZF, a text file), but not *all* of it, which makes
the code harder to follow than it has to be.

Here are some ideas for what we can do in the future to improve the
state of things:

- Rework the LMDB implementation in BIND so that it matches the
intended use of that library. This could be achieved by keeping a
global list of reference-counted LMDB environment objects, each of
which would be associated with a specific view name (not view
instance!). This approach should allow `rndc reconfig` to do
without calling `mdb_env_open()` or `mdb_env_close()`. I think such
a change would be too severe to go into 9.16, though.
- Use a different database that will likely be slower than LMDB, but
might be simpler to use.
- Move LMDB support to a module (easier said than done).
- Drop LMDB support altogether :-)
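
The first idea in the list, a global list of reference-counted environment objects keyed by view name, could look roughly like the sketch below. It is an illustration only and makes no LMDB calls: `env_entry_t`, `env_acquire()`, `env_release()` and `env_open_calls` are hypothetical names, the `handle` field merely stands in for an `MDB_env` pointer, and a real version would also need a lock around the registry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical registry entry: one shared environment per view *name*. */
typedef struct env_entry {
    char *view_name;
    void *handle;      /* placeholder for an MDB_env pointer */
    unsigned int refs; /* number of view instances using this entry */
    struct env_entry *next;
} env_entry_t;

static env_entry_t *registry = NULL;

/* Counts how often an environment would really have been opened. */
unsigned int env_open_calls = 0;

/* Return the shared entry for 'view_name', "opening" it only on first use. */
env_entry_t *
env_acquire(const char *view_name) {
    env_entry_t *e;

    for (e = registry; e != NULL; e = e->next) {
        if (strcmp(e->view_name, view_name) == 0) {
            e->refs++;
            return (e);
        }
    }

    e = calloc(1, sizeof(*e));
    if (e == NULL) {
        return (NULL);
    }
    e->view_name = malloc(strlen(view_name) + 1);
    if (e->view_name == NULL) {
        free(e);
        return (NULL);
    }
    strcpy(e->view_name, view_name);
    e->handle = e; /* real code would create and open the environment here */
    e->refs = 1;
    e->next = registry;
    registry = e;
    env_open_calls++;
    return (e);
}

/* Drop one reference; returns 1 if this was the last one and the entry was closed. */
int
env_release(env_entry_t *entry) {
    env_entry_t **pp;

    assert(entry != NULL && entry->refs > 0);
    if (--entry->refs > 0) {
        return (0);
    }
    for (pp = &registry; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == entry) {
            *pp = entry->next;
            break;
        }
    }
    /* real code would close the environment here */
    free(entry->view_name);
    free(entry);
    return (1);
}
```

Under this scheme a reconfiguration acquires a second reference for the new view instance before releasing the old one, so the environment stays open throughout and is never opened twice.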

[1]: http://www.lmdb.tech/doc/group__mdb.html
[2]: https://bugs.openldap.org/show_bug.cgi?id=9278#c4
[3]: https://gitlab.isc.org/isc-projects/bind9/-/blob/bcbc7e2b10f85451466a3cc098f15cddd019ae0f/bin/named/server.c#L12796
[4]: https://gitlab.isc.org/isc-projects/bind9/-/blob/bcbc7e2b10f85451466a3cc098f15cddd019ae0f/lib/dns/view.c#L2135
[5]: https://bugs.isc.org/Ticket/Display.html?id=46556#txn-508940
[6]: http://www.lmdb.tech/doc/index.html
[7]: https://svnweb.freebsd.org/ports?view=revision&revision=519246
[8]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=244493
[9]: https://bugs.openldap.org/show_bug.cgi?id=9278#c3
[10]: https://git.openldap.org/openldap/openldap/-/commit/2fd44e325195ae81664eb5dc36e7d265927c5ebc
[11]: https://svnweb.freebsd.org/ports?view=revision&revision=539380
[12]: https://git.openldap.org/openldap/openldap/-/commit/f683ffdc81d0edb20437cb7d655cf15a60e31249
[13]: https://gitlab.isc.org/isc-projects/bind9/-/blob/d35101e4338c3254113b8f51d178ac44170d412f/lib/dns/include/dns/view.h#L227
[14]: https://bugs.isc.org/Ticket/Display.html?id=39837#txn-430786
[^1]: The docs *do* in fact state the same thing, see below.

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1829
Don't include RRsets with TTL=0 in when preserving cache content for use by the serve-stale feature
2020-08-05T10:24:31Z
Cathy Almond

Per [Support ticket #16297](https://support.isc.org/Ticket/Display.html?id=16297), even if serve-stale functionality is disabled, versions of BIND that have this feature will still observe max-stale-ttl and maintain a cache of expired content in case of emergency. In an emergency, serve-stale can be toggled 'on' via rndc - in which case access to relatively recently expired cache content would be very desirable ('you don't know you want it until it's gone!').
However, it doesn't seem at all reasonable to apply this same logic to RRsets that have been received with TTL=0.
In BIND, they are added to cache because this is how the BIND architecture works: a server side that communicates with clients and a resolver side that does back-end fetching and puts the answers in cache for retrieval, even if they're for single use only (TTL=0).
I assert that authoritative providers of TTL=0 RRsets are doing so because these are dynamic answers and they do not want resolvers to cache them. It therefore makes no sense at all to preserve them and serve them stale if the authoritative servers become unreachable. Depending on the specific scenario, retaining TTL=0 RRsets could also cause unexpected cache bloat.
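The requested behaviour amounts to one extra condition when deciding how long an expired RRset is kept. The sketch below shows that condition in isolation; `stale_retention()` is a hypothetical helper, not a function in BIND, and `max_stale_ttl` simply mirrors the `max-stale-ttl` option.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how long (in seconds) an RRset may be kept past its
 * expiry for serve-stale. 'original_ttl' is the TTL the RRset arrived with.
 */
uint32_t
stale_retention(uint32_t original_ttl, uint32_t max_stale_ttl) {
    if (original_ttl == 0) {
        /* Single-use answer: never keep it around for serve-stale. */
        return (0);
    }
    return (max_stale_ttl);
}
```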
Please can we address this in all versions of BIND that have serve-stale functionality in their cache management.

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Matthijs Mekking (matthijs@isc.org)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1775
Resizing (growing) of cache hash tables causes delays in processing of client queries
2020-11-13T18:31:42Z
Cathy Almond

From [Support ticket #16212](https://support.isc.org/Ticket/Display.html?id=16212)

During investigations of intermittent 'brownouts' - periods in which named seemingly stops actioning client queries for a short period, and then resumes processing a second or two later (yes, delays of seconds not ms from this) we 'caught' one culprit red-handed in a pstack run that was automatically triggered by an 'alarm' in monitoring inbound and outbound server traffic rates.
The thread in question was holding the cache tree lock, while growing the hash table:
```
Thread 21 (Thread 0x7f54d8b2f700 (LWP 19115)):
#0 0x000000000052bc7b in rehash (rbt=0x7f54b8c04058, newcount=<optimized out>) at rbt.c:2376
#1 0x000000000052da99 in hash_node (name=0x7f53d9562bb0, node=0x7f541cf79538, rbt=0x7f54b8c04058) at rbt.c:2389
#2 dns_rbt_addnode (rbt=0x7f54b8c04058, name=0x7f53d9562bb0, nodep=0x7f54d8b2dd28) at rbt.c:1451
#3 0x00000000005367ef in rbt_addnode_withdata (rbtdb=0x7f54b8c03010, rbt=0x7f54b8c04058, name=<optimized out>, nodep=0x7f54d8b2dd28) at rbtdb.c:2016
#4 0x000000000053ba42 in findnodeintree (rbtdb=0x7f54b8c03010, tree=0x7f54b8c04058, name=0x7f53d9562bb0, create=true, nodep=0x7f54d8b2ed30) at rbtdb.c:3339
#5 0x00000000005babb5 in cache_name (now=1587326409, zerottl=false, name=0x7f53d9562bb0, section=1, query=0x7f54600100d0, fctx=0x7f5449e172d0) at resolver.c:5876
#6 cache_message (now=1587326409, zerottl=false, query=0x7f54600100d0, fctx=0x7f5449e172d0) at resolver.c:6336
#7 resquery_response (task=0x7f5387cbb628, event=<optimized out>) at resolver.c:9166
#8 0x000000000068a8b1 in dispatch (manager=0x7f54dedc7010) at task.c:1157
#9 run (uap=0x7f54dedc7010) at task.c:1331
#10 0x00007f54dd90cdd5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f54dd635ead in clone () from /lib64/libc.so.6
```
The other cause of similar problems is when growing the ADB tables - that one however is logged, whereas it doesn't look like 'rehash' or anything that calls it owns up (via logging) to what it is doing.
Our immediate quick-fix wish is for a solution to the delays caused by growing hash tables that is along the lines of being able to specify the starting size as named is launched. This needs to be either run-time or configurable in named.conf. (It is *not* helpful to make it build-time only because in many environments there will be a single build that is distributed to many servers whose needs/sizing can vary.)
It would also be really helpful if any hash table growing could be logged - to include what the size is expanding to (this will help admins to tune their servers accordingly).
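Both wishes, a starting size chosen at launch and a log line on every growth, can be illustrated with a toy chained hash table. This is a sketch under stated assumptions, not BIND's `rbt.c`: all names (`table_t`, `table_init()`, `table_add()`, `table_destroy()`) are hypothetical, the growth threshold of three entries per bucket is arbitrary, and duplicate keys are simply stored twice.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct node {
    unsigned long key;
    struct node *next;
} node_t;

typedef struct table {
    node_t **buckets;
    size_t nbuckets;    /* current bucket count */
    size_t count;       /* stored entries */
    unsigned int grows; /* number of resize events so far */
} table_t;

/* Create a table whose starting size comes from configuration. */
int
table_init(table_t *t, size_t initial_buckets) {
    assert(t != NULL && initial_buckets > 0);
    t->buckets = calloc(initial_buckets, sizeof(*t->buckets));
    if (t->buckets == NULL) {
        return (-1);
    }
    t->nbuckets = initial_buckets;
    t->count = 0;
    t->grows = 0;
    return (0);
}

/* Double the bucket array, relink every node, and log the new size. */
static int
table_grow(table_t *t) {
    size_t newn = t->nbuckets * 2;
    node_t **newb = calloc(newn, sizeof(*newb));
    if (newb == NULL) {
        return (-1);
    }
    for (size_t i = 0; i < t->nbuckets; i++) {
        node_t *n = t->buckets[i];
        while (n != NULL) {
            node_t *next = n->next;
            size_t h = (size_t)(n->key % newn);
            n->next = newb[h];
            newb[h] = n;
            n = next;
        }
    }
    fprintf(stderr, "hash table growing from %zu to %zu buckets (%zu entries)\n",
            t->nbuckets, newn, t->count);
    free(t->buckets);
    t->buckets = newb;
    t->nbuckets = newn;
    t->grows++;
    return (0);
}

/* Insert a key, growing first once there are three entries per bucket. */
int
table_add(table_t *t, unsigned long key) {
    node_t *n = malloc(sizeof(*n));
    if (n == NULL) {
        return (-1);
    }
    if (t->count >= t->nbuckets * 3) {
        (void)table_grow(t); /* on failure, carry on with the old size */
    }
    size_t h = (size_t)(key % t->nbuckets);
    n->key = key;
    n->next = t->buckets[h];
    t->buckets[h] = n;
    t->count++;
    return (0);
}

/* Free every node and the bucket array. */
void
table_destroy(table_t *t) {
    for (size_t i = 0; i < t->nbuckets; i++) {
        while (t->buckets[i] != NULL) {
            node_t *n = t->buckets[i];
            t->buckets[i] = n->next;
            free(n);
        }
    }
    free(t->buckets);
    t->buckets = NULL;
    t->nbuckets = 0;
    t->count = 0;
}
```

A table started with enough buckets for the expected load never enters `table_grow()`, which is the point of making the starting size an operator setting; the log line lets an operator see which size a smaller table eventually settled on.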
====
Longer term, I understand that the wish is to replace the current and now fairly ancient hashing solution with something more modern, faster, and in particular, that doesn't need to block access when resizing - I'll leave engineering to open a new and independent ticket for that. For the here and now, we need a quicker fix, not a new development feature that can't be back-ported or easily applied.

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/1759
convert rndc and the command channel to use netmgr
2022-01-21T13:28:33Z
Evan Hunt

The netmgr is currently only used for incoming DNS queries and responses, and will need to be generalized to support other network functions (rndc/command channel, dispatch, statschannel, dig/delv, nsupdate, etc).

The first step is to convert rndc. This entails:
- add a working 'tcpconnect' function to netmgr so it can initiate TCP connections (currently it can only just accept them as a server).
- modify rndc to use this.
- modify isccc to accept netmgr connections.
- address any netmgr bugs or design deficiencies that turn up during this process.
- modify rndc command handler functions that depend on running under the taskmgr, so they are able to run in the netmgr event loop instead.

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/1727
Drop use of "$FEATURETEST --have-dlopen"
2020-07-30T12:38:06Z
Michał Kępień

The following discussion from !985 should be addressed:
- [ ] @michal started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/985#note_121125):
> `dlopen()` support seems to be a hard build-time requirement now, so I
> would drop all uses of `$FEATURETEST --have-dlopen` (a follow-up MR is
> fine).

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Michal Nowak

https://gitlab.isc.org/isc-projects/bind9/-/issues/1719
Observed stats underflow in multiple stats
2021-12-03T13:02:15Z
Brian Conry

While looking at a customer-provided named.stats file it was observed that several counters had underflowed in BIND version 9.11.16-S1.
A selection of the underflowing stats:
```
+++ Statistics Dump +++ (1584405000)
++ Cache DB RRsets ++
[View: default]
18446744073709551614 ~A
18446744073709551615 ~NS
++ Socket I/O Statistics ++
18446744073709551556 UDP/IPv4 sockets active
18446744073709551584 UDP/IPv6 sockets active
18446744073709551600 TCP/IPv4 sockets active
+++ Statistics Dump +++ (1584405300)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 ~CNAME
18446744073709551614 ~!AAAA
+++ Statistics Dump +++ (1584405900)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 ~AAAA
+++ Statistics Dump +++ (1584408000)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 ~RRSIG
+++ Statistics Dump +++ (1584418500)
++ Resolver Statistics ++
[View: default]
18446744073709551615 active fetches
+++ Statistics Dump +++ (1584447000)
++ Cache DB RRsets ++
[View: default]
18446744073709551608 ~NXDOMAIN
+++ Statistics Dump +++ (1584485100)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 ~NULL
+++ Statistics Dump +++ (1584504300)
++ Cache DB RRsets ++
[View: default]
18446744073709551521 NULL
+++ Statistics Dump +++ (1584515400)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 ~TXT
+++ Statistics Dump +++ (1584518400)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 ~NSEC
+++ Statistics Dump +++ (1584676500)
++ Cache DB RRsets ++
[View: default]
18446744073709551557 !AAAA
+++ Statistics Dump +++ (1584737700)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 !CNAME
+++ Statistics Dump +++ (1584886800)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 ~MX
+++ Statistics Dump +++ (1585119900)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 !MX
+++ Statistics Dump +++ (1585137000)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 ~DS
+++ Statistics Dump +++ (1585282500)
++ Cache DB RRsets ++
[View: default]
18446744073709551610 !A
+++ Statistics Dump +++ (1585457400)
++ Cache DB RRsets ++
[View: default]
18446744073709551506 CNAME
```
I'd like to specifically call out `active fetches` which might get lost in the noise near the middle of those lines.
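The dumped values are what an unsigned 64-bit counter shows after being decremented more often than it was incremented. That reading is an assumption about the cause, not a diagnosis; the sketch below only demonstrates the arithmetic, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Naive decrement of an unsigned gauge: going below zero wraps around. */
uint64_t
counter_decrement(uint64_t value) {
    return (value - 1);
}

/*
 * For a wrapped value, return how many decrements below zero it represents.
 * Unsigned subtraction is defined modulo 2^64, so this is well defined.
 */
uint64_t
excess_decrements(uint64_t value) {
    return ((uint64_t)0 - value);
}
```

Read this way, 18446744073709551615 is one decrement too many, 18446744073709551614 is two, and the `UDP/IPv4 sockets active` value 18446744073709551556 is sixty.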
The same stats file, which has history going back to unknown prior versions (August 2019), also shows underflows on:
```
+++ Statistics Dump +++ (1566255596)
++ Cache DB RRsets ++
[View: default]
18446744073709547273 #A
18446744073709551608 #TXT
18446744073709551247 #AAAA
18446744073709551615 #SRV
18446744073709548950 #!AAAA
+++ Statistics Dump +++ (1566260101)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 #NS
+++ Statistics Dump +++ (1566307500)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 #RRSIG
+++ Statistics Dump +++ (1566355500)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 #!A
+++ Statistics Dump +++ (1567977300)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 !A6
+++ Statistics Dump +++ (1568136600)
++ Name Server Statistics ++
18446744073709551466 recursing clients
+++ Statistics Dump +++ (1570120200)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 #!SRV
+++ Statistics Dump +++ (1571707501)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 #CNAME
+++ Statistics Dump +++ (1571768400)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 #!TXT
+++ Statistics Dump +++ (1574442900)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 #DS
+++ Statistics Dump +++ (1574886601)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 #NXDOMAIN
+++ Statistics Dump +++ (1575558600)
++ Cache DB RRsets ++
[View: default]
18446744073709551615 #PTR
```
The most significant of the last set is probably `recursing clients`.

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Diego dos Santos Fronza

https://gitlab.isc.org/isc-projects/bind9/-/issues/1712
Serve-stale feature is not operationally useful - some suggestions for making it better
2022-01-19T11:20:49Z
Cathy Almond

As described in Support ticket [#16171](https://support.isc.org/Ticket/Display.html?id=16171) :
```
The problem with serve-stale was (and still is after some testing on 9.16.1),
that every client that asks for e.g. "isc.org A" will all have to wait for 10
seconds before they get the stale answer. There seem to be no table of stale
resolvers so each time a request comes in, BIND seems to try the resolver
again to find out if it answers or not.
```
This really is not helpful - most clients will have given up and gone away and will never get a usable answer.
If the name is a popular one, then because of 'clients-per-query' and the fact that we attach any later clients waiting for the same query to the already-existing fetch, the late arrivals stand a fighting chance of getting a response from stale cache before they give up - but the majority won't.
See also #1688 - we haven't documented very thoroughly how this works anyway, and we certainly have not documented how it interacts with fetch-limits and other resolver-protecting features.
Here's a sample config that was being used for testing:
```
stale-answer-enable yes;
stale-answer-ttl 600;
max-stale-ttl 1w;
```
There is nothing there that provides for a configurable period of 'staleness' so that after the first time the failure to refresh has taken place, a server can immediately serve this stale content to any clients who come along later instead of repeating the refresh attempt (and likely failing again).
I think the issue is that although we do have some control over how stale an answer can be before we stop serving it, we haven't thought sufficiently about how long clients will be prepared to wait for a query response if we have to attempt to refresh and then fail for each client (or set of clients) when queried.
Note: I **do not** think we should immediately serve stale answers whenever there's cache content available that has recently expired - this is not what we're trying to achieve. The idea of serve-stale as the converse of pre-fetch ('post-fetch'?) is somehow terribly tempting because it feels like it would be faster and a better experience for the clients, plus there's this nice symmetry with pre-fetch logic. But I think it's wrong - and would absolutely break how we handle TTL=0 answers today. Authoritative server operators **expect** resolvers to come back to them as soon as their cached content expires. We should not skip this step.
But what would be more helpful (to both clients and to servers) when there are non-responding authoritative servers, would be a way to flag a stale answer with the timestamp of when the last failing refresh attempt occurred, and if a client queries the same name again within a suitable time period (configurable? Something like 10s feels like a good default here), then the stale answer gets used right away.
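The proposal boils down to remembering when the last refresh attempt failed and comparing that against a window. The sketch below shows one way to express it; `stale_entry_t`, `note_refresh_failure()` and `serve_stale_immediately()` are hypothetical names, and the 10-second constant is the default floated above, not an existing BIND option.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Suggested default for the window, in seconds. */
#define STALE_REFRESH_WINDOW 10

/* Hypothetical per-entry bookkeeping for a stale cache entry. */
typedef struct stale_entry {
    time_t last_refresh_failure; /* 0 if no refresh has failed yet */
} stale_entry_t;

/* Record that an attempt to refresh this entry has just failed. */
void
note_refresh_failure(stale_entry_t *entry, time_t now) {
    entry->last_refresh_failure = now;
}

/*
 * True if the stale answer may be served right away, i.e. a refresh
 * failed less than 'window' seconds ago. Otherwise the caller should
 * attempt a normal refresh first.
 */
bool
serve_stale_immediately(const stale_entry_t *entry, time_t now, time_t window) {
    if (entry->last_refresh_failure == 0 || now < entry->last_refresh_failure) {
        return (false);
    }
    return (now - entry->last_refresh_failure < window);
}
```

The first client after expiry still waits for the refresh attempt, which preserves the expectation that resolvers go back to the authoritative servers; only clients arriving inside the window skip the wait.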
We're preserving resolver resources by doing this (and anyway, if we couldn't resolve this name 1s ago, why are we trying again immediately if we've got something usable-but-stale in cache we could use instead?)

August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Matthijs Mekking (matthijs@isc.org)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1619
RPZ wildcard passthru ignored
2020-08-20T19:36:01Z
Daniel Stirnimann

### Summary
I have two response-policy zones configured. The first zone is a local whitelist with the policy passthru. The second zone is a local blacklist with the policy given. If the blacklist rpz zone contains `www.example.com CNAME .` (nxdomain) and the whitelist rpz zone contains the wildcard record `*.example.com CNAME rpz-passthru.` to whitelist `*.example.com`, then this wildcard is ignored.
This worked from 9.8 up to 9.14.5 and stopped working in 9.14.6 and later (I tested up to 9.16.1).
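To make the expected behavior concrete, here is a minimal sketch of the matching the report relies on: a wildcard trigger covers any proper subdomain, and the first zone listed in `response-policy` that has a matching trigger decides the outcome. This is an illustration under those assumptions, not BIND's RPZ summary database; `trigger_matches()`, `first_matching_zone()` and `policy_zone_t` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `owner` (a trigger such as "*.example.com" or
 * "www.example.com") covers `qname`. Names are assumed to be lower-case
 * and written without a trailing dot.
 */
static bool
trigger_matches(const char *owner, const char *qname) {
	assert(owner != NULL && qname != NULL);
	if (strncmp(owner, "*.", 2) != 0) {
		return (strcmp(owner, qname) == 0); /* exact name */
	}
	/* Wildcard: qname must be a proper subdomain of the suffix. */
	const char *suffix = owner + 1; /* e.g. ".example.com" */
	size_t qlen = strlen(qname);
	size_t slen = strlen(suffix);
	return (qlen > slen && strcmp(qname + qlen - slen, suffix) == 0);
}

/* Hypothetical policy zone: a name plus a NULL-terminated trigger list. */
typedef struct {
	const char *zone;
	const char *triggers[4];
} policy_zone_t;

/* Return the first zone, in configuration order, with a matching trigger. */
static const char *
first_matching_zone(const policy_zone_t *zones, size_t nzones,
		    const char *qname) {
	for (size_t i = 0; i < nzones; i++) {
		for (size_t j = 0; zones[i].triggers[j] != NULL; j++) {
			if (trigger_matches(zones[i].triggers[j], qname)) {
				return (zones[i].zone);
			}
		}
	}
	return (NULL);
}
```

With the two zones from this report, a lookup for `www.example.com` would select `whitelist.zone` under these assumptions, which is what the 9.14.5 log line shows.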
### BIND version used
```
BIND 9.14.6 (Stable Release) <id:efd3496>
running on Linux x86_64 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020
built by make with '--build=x86_64-koji-linux-gnu' '--host=x86_64-koji-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/opt/named' '--bindir=/opt/named/bin' '--sbindir=/opt/named/sbin' '--sysconfdir=/etc' '--datadir=/opt/named/share' '--includedir=/opt/named/include' '--libdir=/opt/named/lib64' '--libexecdir=/opt/named/libexec' '--localstatedir=/var' '--sharedstatedir=/var/lib' '--mandir=/opt/named/share/man' '--infodir=/opt/named/share/info' '--exec-prefix=/opt/named' '--disable-static' '--enable-threads' '--enable-ipv6' '--enable-dnstap' '--disable-openssl-version-check' '--enable-largefile' '--with-tuning=large' '--with-randomdev=/dev/urandom' '--with-pic' '--with-libjson' '--with-libtool' '--with-libxml2' '--with-python-install-dir=/opt/named/usr/lib/python2.7/site-packages' '--with-docbook-xsl=/opt/named/share/sgml/docbook/xsl-stylesheets' '--includedir=/opt/named/include/bind9' 'build_alias=x86_64-koji-linux-gnu' 'host_alias=x86_64-koji-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' 'LDFLAGS=-Wl,-z,relro ' 'PKG_CONFIG_PATH=:/opt/named/lib64/pkgconfig:/opt/named/share/pkgconfig'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-39)
compiled with OpenSSL version: OpenSSL 1.0.2k 26 Jan 2017
linked to OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
compiled with libxml2 version: 2.9.1
linked to libxml2 version: 20901
compiled with libjson-c version: 0.11
linked to libjson-c version: 0.11
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
threads support is enabled
default paths:
named configuration: /etc/named.conf
rndc configuration: /etc/rndc.conf
DNSSEC root key: /etc/bind.keys
nsupdate session key: /var/run/named/session.key
named PID file: /var/run/named/named.pid
named lock file: /var/run/named/named.lock
```
### Steps to reproduce
1. Install bind release 9.14.6
2. Use the `named.conf` and `whitelist.zone`, `blacklist.zone` listed in the configuration file section.
3. Start BIND, e.g. `systemctl start named`
4. Use dig to check the behavior, then check the logs:
```
dig @::1 www.example.com
```
### What is the current *bug* behavior?
The wildcard passthru entry in the `whitelist.zone` is ignored.
### What is the expected *correct* behavior?
The wildcard passthru entry in the `whitelist.zone` is used.
### Relevant configuration files
Used `named.conf`
```
logging {
channel "default_debug" {
file "named.log";
severity info;
print-time yes;
print-severity yes;
print-category yes;
};
};
options {
directory "/var/named/data";
listen-on port 53 {
127.0.0.1/32;
};
listen-on-v6 port 53 {
::1/128;
};
dnssec-enable yes;
dnssec-validation auto;
empty-zones-enable yes;
recursion yes;
response-policy {
zone "whitelist.zone" policy passthru;
zone "blacklist.zone" policy given;
} break-dnssec yes;
allow-query {
"localhost";
};
allow-transfer {
"localhost";
};
};
zone "whitelist.zone" {
type master;
file "whitelist.zone";
allow-query {
"none";
};
};
zone "blacklist.zone" {
type master;
file "blacklist.zone";
allow-query {
"none";
};
};
```
Used `whitelist.zone`
```
$ORIGIN whitelist.zone.
$TTL 3600
@ IN SOA ns.whitelist.zone. hostmaster.whitelist.zone. 1 600 300 604800 3600
IN NS ns2.switch.ch.
example.com CNAME rpz-passthru.
*.example.com CNAME rpz-passthru.
```
Used `blacklist.zone`
```
$ORIGIN blacklist.zone.
$TTL 3600
@ IN SOA ns.blacklist.zone. hostmaster.blacklist.zone. 1 600 300 604800 3600
IN NS ns2.switch.ch.
www.example.com CNAME .
; test record
test.example.org CNAME .
```
### Relevant logs and/or screenshots
Log entry on 9.14.6 where it breaks wildcards for passthru:
```
12-Feb-2020 15:12:30.481 rpz: info: client @0x7fd7200a2cb0 ::1#58427 (www.example.com): rpz QNAME Local-Data rewrite www.example.com/A/IN via www.example.com.blacklist.zone
```
Log entry on 9.14.5 where wildcard passthru still works:
```
12-Feb-2020 15:14:36.229 rpz: info: client @0x7fd5440a2cb0 ::1#45028 (www.example.com): rpz QNAME PASSTHRU rewrite www.example.com/A/IN via www.example.com.whitelist.zone
```
### Possible fixes
It may have something to do with this change in 9.14.6:
```
5282. [bug] Fixed a bug in searching for possible wildcard matches
for query names in the RPZ summary database. [GL #1146]
```
https://ftp.isc.org/isc/bind9/9.14.6/CHANGES
August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Diego dos Santos Fronza

https://gitlab.isc.org/isc-projects/bind9/-/issues/1606
named assertion failed in interfacemgr.c on FreeBSD 12.1
2020-07-09T06:41:59Z
Michal Nowak
(This might be similar to https://gitlab.isc.org/isc-projects/bind9/issues/1604, but with a different backtrace, and I can't reproduce it.)
I have a FreeBSD 12.1 VM under KVM (6 vCPUs, 8 GB RAM), where I triggered a core dump of `named` by running the system tests in a tight loop (e.g. `while true; do make -j6 -k test V=1; done`) and pressing `Ctrl-C`.
Here's the backtrace:
```
Core was generated by `/usr/home/newman/bind9/bin/named/.libs/named -D addzone-ns2 -X named.lock -m rec'.
Program terminated with signal SIGABRT, Aborted.
#0 0x0000000800e0845a in thr_kill () from /lib/libc.so.7
[Current thread is 1 (LWP 100813)]
(gdb) bt
#0 0x0000000800e0845a in thr_kill () from /lib/libc.so.7
#1 0x0000000800e06844 in raise () from /lib/libc.so.7
#2 0x0000000800d79079 in abort () from /lib/libc.so.7
#3 0x00000000002309e5 in assertion_failed (file=0x8002b45db "interfacemgr.c", line=335, type=<optimized out>,
cond=0x8002b67e8 "(__builtin_expect(!!((mgr) != ((void *)0)), 1) && __builtin_expect(!!(((const isc__magic_t *)(mgr))->magic == ((('I') << 24 | ('F') << 16 | ('M') << 8 | ('G')))), 1))") at ./main.c:261
#4 0x000000080070333a in isc_assertion_failed (file=0x189cd <error: Cannot access memory at address 0x189cd>, line=6, type=isc_assertiontype_require, cond=0x800e0847a <thr_self+10> "\017\202\224\064") at assertions.c:48
#5 0x00000008002ce1e3 in ns_interfacemgr_getaclenv (mgr=<optimized out>) at interfacemgr.c:335
#6 0x000000000022ec3e in address_ok (sockaddr=0x7fffdf1f6e08, acl=0x80181bb30) at controlconf.c:228
#7 0x000000000022eb12 in control_newconn (task=<optimized out>, event=<optimized out>) at controlconf.c:627
#8 0x0000000800727329 in dispatch (manager=0x801901bd0, threadid=<optimized out>) at task.c:1150
#9 0x00000008007254af in run (queuep=<optimized out>) at task.c:1340
#10 0x0000000800c32776 in ?? () from /lib/libthr.so.3
#11 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdf1f7000
```
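The condition printed in frame #3 is a two-part validity check: the pointer must be non-NULL and its leading magic field must equal the four characters `IFMG` packed into 32 bits. The backtrace alone does not say which half failed, since `mgr` is optimized out. A minimal sketch of that kind of check follows; `ifmgr_t`, `ifmgr_valid()` and `ifmgr_invalidate()` are hypothetical stand-ins, not the ns/isc definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a 32-bit tag, as in the failed condition. */
#define MAGIC(a, b, c, d)                                        \
	((uint32_t)(a) << 24 | (uint32_t)(b) << 16 | \
	 (uint32_t)(c) << 8 | (uint32_t)(d))
#define IFMGR_MAGIC MAGIC('I', 'F', 'M', 'G')

/* Hypothetical object; only the leading magic field matters here. */
typedef struct {
	uint32_t magic;
	int generation;
} ifmgr_t;

/* True only for a non-NULL pointer whose magic field is still intact. */
static bool
ifmgr_valid(const ifmgr_t *mgr) {
	return (mgr != NULL && mgr->magic == IFMGR_MAGIC);
}

/* Clearing the magic on teardown makes any later use fail the check. */
static void
ifmgr_invalidate(ifmgr_t *mgr) {
	assert(ifmgr_valid(mgr));
	mgr->magic = 0;
}
```

Under this scheme, a callback that runs after the object has been invalidated, or that is handed a NULL pointer, trips the check instead of reading freed state. That would be consistent with an abort during shutdown, though the report does not confirm the cause.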
A more detailed backtrace follows, but the core file and the `named` binary are gone: [core.gdb](/uploads/6f0d368331b9e8c36320a0026d42bf8b/core.gdb).
August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)

https://gitlab.isc.org/isc-projects/bind9/-/issues/1475
ThreadSanitizer: data race lib/dns/rbtdb.c:1545 in mark_header_stale and check_stale_header
2020-08-26T21:24:35Z
Ondřej Surý
Found in `zero` test:
```
WARNING: ThreadSanitizer: data race (pid=7941)
Read of size 2 at 0x7b3000026adc by thread T2 (mutexes: read M633172806149932752, read M641898633507182144):
#0 mark_header_stale /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:1545 (libdns.so.1505+0x10bd0b)
#1 check_stale_header /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4345 (libdns.so.1505+0x10bd0b)
#2 cache_find /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4797 (libdns.so.1505+0x122c03)
#3 dns_db_findext /home/ondrej/Projects/bind9/lib/dns/db.c:551 (libdns.so.1505+0x6673c)
#4 query_lookup /home/ondrej/Projects/bind9/lib/ns/query.c:5515 (libns.so.1502+0x3f6a0)
#5 ns__query_start /home/ondrej/Projects/bind9/lib/ns/query.c:5441 (libns.so.1502+0x40209)
#6 query_setup /home/ondrej/Projects/bind9/lib/ns/query.c:5162 (libns.so.1502+0x48c13)
#7 ns_query_start /home/ondrej/Projects/bind9/lib/ns/query.c:11239 (libns.so.1502+0x49444)
#8 ns__client_request /home/ondrej/Projects/bind9/lib/ns/client.c:2157 (libns.so.1502+0x15890)
#9 udp_recv_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/udp.c:317 (libisc.so.1504+0x46926)
#10 <null> <null> (libuv.so.1+0x1d6d4)
#11 <null> <null> (libtsan.so.0+0x29b3d)
Previous write of size 2 at 0x7b3000026adc by thread T6 (mutexes: read M633172806149932752, read M641898633507182144):
#0 mark_header_stale /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:1557 (libdns.so.1505+0x10bd67)
#1 check_stale_header /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4345 (libdns.so.1505+0x10bd67)
#2 cache_find /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4797 (libdns.so.1505+0x122c03)
#3 dns_db_findext /home/ondrej/Projects/bind9/lib/dns/db.c:551 (libdns.so.1505+0x6673c)
#4 query_lookup /home/ondrej/Projects/bind9/lib/ns/query.c:5515 (libns.so.1502+0x3f6a0)
#5 ns__query_start /home/ondrej/Projects/bind9/lib/ns/query.c:5441 (libns.so.1502+0x40209)
#6 query_setup /home/ondrej/Projects/bind9/lib/ns/query.c:5162 (libns.so.1502+0x48c13)
#7 ns_query_start /home/ondrej/Projects/bind9/lib/ns/query.c:11239 (libns.so.1502+0x49444)
#8 ns__client_request /home/ondrej/Projects/bind9/lib/ns/client.c:2157 (libns.so.1502+0x15890)
#9 udp_recv_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/udp.c:317 (libisc.so.1504+0x46926)
#10 <null> <null> (libuv.so.1+0x1d6d4)
#11 <null> <null> (libtsan.so.0+0x29b3d)
Location is heap block of size 181 at 0x7b3000026ac0 allocated by thread T11:
#0 malloc <null> (libtsan.so.0+0x2b1a3)
#1 default_memalloc /home/ondrej/Projects/bind9/lib/isc/mem.c:685 (libisc.so.1504+0x33fee)
#2 mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:598 (libisc.so.1504+0x34c7e)
#3 mem_allocateunlocked /home/ondrej/Projects/bind9/lib/isc/mem.c:1222 (libisc.so.1504+0x34c7e)
#4 isc___mem_allocate /home/ondrej/Projects/bind9/lib/isc/mem.c:1242 (libisc.so.1504+0x34c7e)
#5 isc__mem_allocate /home/ondrej/Projects/bind9/lib/isc/mem.c:2387 (libisc.so.1504+0x3be64)
#6 isc___mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:1007 (libisc.so.1504+0x3c6ca)
#7 isc__mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:2365 (libisc.so.1504+0x3aef1)
#8 dns_rdataslab_fromrdataset /home/ondrej/Projects/bind9/lib/dns/rdataslab.c:266 (libdns.so.1505+0x17a212)
#9 addrdataset /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:6461 (libdns.so.1505+0x119b45)
#10 dns_db_addrdataset /home/ondrej/Projects/bind9/lib/dns/db.c:744 (libdns.so.1505+0x673cf)
#11 cache_name /home/ondrej/Projects/bind9/lib/dns/resolver.c:6316 (libdns.so.1505+0x19404b)
#12 cache_message /home/ondrej/Projects/bind9/lib/dns/resolver.c:6413 (libdns.so.1505+0x1ae663)
#13 resquery_response /home/ondrej/Projects/bind9/lib/dns/resolver.c:7631 (libdns.so.1505+0x1ae663)
#14 dispatch /home/ondrej/Projects/bind9/lib/isc/task.c:1134 (libisc.so.1504+0x56fa6)
#15 run /home/ondrej/Projects/bind9/lib/isc/task.c:1319 (libisc.so.1504+0x56fa6)
#16 <null> <null> (libtsan.so.0+0x29b3d)
Mutex M633172806149932752 is already destroyed.
Mutex M641898633507182144 is already destroyed.
Thread T2 'isc-net-0001' (tid=7987, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2be1b)
#1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:75 (libisc.so.1504+0x7bcc4)
#2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:149 (libisc.so.1504+0x3ec7a)
#3 create_managers main.c:895 (named+0x1ae90)
#4 setup main.c:1235 (named+0x1ae90)
#5 main main.c:1515 (named+0x1ae90)
Thread T6 'isc-net-0005' (tid=8016, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2be1b)
#1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:75 (libisc.so.1504+0x7bcc4)
#2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:149 (libisc.so.1504+0x3ec7a)
#3 create_managers main.c:895 (named+0x1ae90)
#4 setup main.c:1235 (named+0x1ae90)
#5 main main.c:1515 (named+0x1ae90)
Thread T11 'isc-worker0002' (tid=8040, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2be1b)
#1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:75 (libisc.so.1504+0x7bcc4)
#2 isc_taskmgr_create /home/ondrej/Projects/bind9/lib/isc/task.c:1410 (libisc.so.1504+0x59d63)
#3 create_managers main.c:902 (named+0x1aeec)
#4 setup main.c:1235 (named+0x1aeec)
#5 main main.c:1515 (named+0x1aeec)
SUMMARY: ThreadSanitizer: data race /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:1545 in mark_header_stale
```
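In the report above, both threads hold only read locks while one writes and the other reads the same 2-byte field in `mark_header_stale`. One common way to make such a flag update safe without taking a write lock is an atomic read-modify-write. The sketch below assumes that approach purely for illustration; it is not necessarily the fix adopted in BIND, and `header_t`, `ATTR_STALE`, `header_is_stale()` and `mark_stale()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ATTR_STALE 0x0001u /* hypothetical flag bit */

/* Hypothetical header with a 2-byte attribute word shared between threads. */
typedef struct {
	_Atomic uint16_t attributes;
} header_t;

/* Read the flag with an atomic load instead of a plain read. */
static bool
header_is_stale(header_t *h) {
	assert(h != NULL);
	uint16_t attrs = atomic_load_explicit(&h->attributes,
					      memory_order_acquire);
	return ((attrs & ATTR_STALE) != 0);
}

/*
 * Set the flag with a single atomic OR. Returns true if this call set
 * the bit, false if another caller had already set it.
 */
static bool
mark_stale(header_t *h) {
	assert(h != NULL);
	uint16_t old = atomic_fetch_or_explicit(&h->attributes, ATTR_STALE,
						memory_order_acq_rel);
	return ((old & ATTR_STALE) == 0);
}
```

The return value of `mark_stale()` lets exactly one caller perform any follow-up work, such as statistics updates, when several threads notice the same expired header at once.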
```
WARNING: ThreadSanitizer: data race (pid=7941)
Read of size 2 at 0x7b340000271c by thread T3 (mutexes: read M633172806149932752, read M641617158530471408):
#0 mark_header_stale /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:1545 (libdns.so.1505+0x10bd0b)
#1 check_stale_header /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4345 (libdns.so.1505+0x10bd0b)
#2 find_deepest_zonecut /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4503 (libdns.so.1505+0x10eed1)
#3 cache_find /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4758 (libdns.so.1505+0x1236b0)
#4 dns_db_findext /home/ondrej/Projects/bind9/lib/dns/db.c:551 (libdns.so.1505+0x6673c)
#5 query_lookup /home/ondrej/Projects/bind9/lib/ns/query.c:5515 (libns.so.1502+0x3f6a0)
#6 ns__query_start /home/ondrej/Projects/bind9/lib/ns/query.c:5441 (libns.so.1502+0x40209)
#7 query_setup /home/ondrej/Projects/bind9/lib/ns/query.c:5162 (libns.so.1502+0x48c13)
#8 ns_query_start /home/ondrej/Projects/bind9/lib/ns/query.c:11239 (libns.so.1502+0x49444)
#9 ns__client_request /home/ondrej/Projects/bind9/lib/ns/client.c:2157 (libns.so.1502+0x15890)
#10 udp_recv_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/udp.c:317 (libisc.so.1504+0x46926)
#11 <null> <null> (libuv.so.1+0x1d6d4)
#12 <null> <null> (libtsan.so.0+0x29b3d)
Previous write of size 2 at 0x7b340000271c by thread T5 (mutexes: read M633172806149932752, read M641617158530471408):
#0 mark_header_stale /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:1557 (libdns.so.1505+0x10bd67)
#1 check_stale_header /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4345 (libdns.so.1505+0x10bd67)
#2 find_deepest_zonecut /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4503 (libdns.so.1505+0x10eed1)
#3 cache_find /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4758 (libdns.so.1505+0x1236b0)
#4 dns_db_findext /home/ondrej/Projects/bind9/lib/dns/db.c:551 (libdns.so.1505+0x6673c)
#5 query_lookup /home/ondrej/Projects/bind9/lib/ns/query.c:5515 (libns.so.1502+0x3f6a0)
#6 ns__query_start /home/ondrej/Projects/bind9/lib/ns/query.c:5441 (libns.so.1502+0x40209)
#7 query_setup /home/ondrej/Projects/bind9/lib/ns/query.c:5162 (libns.so.1502+0x48c13)
#8 ns_query_start /home/ondrej/Projects/bind9/lib/ns/query.c:11239 (libns.so.1502+0x49444)
#9 ns__client_request /home/ondrej/Projects/bind9/lib/ns/client.c:2157 (libns.so.1502+0x15890)
#10 udp_recv_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/udp.c:317 (libisc.so.1504+0x46926)
#11 <null> <null> (libuv.so.1+0x1d6d4)
#12 <null> <null> (libtsan.so.0+0x29b3d)
Location is heap block of size 197 at 0x7b3400002700 allocated by thread T11:
#0 malloc <null> (libtsan.so.0+0x2b1a3)
#1 default_memalloc /home/ondrej/Projects/bind9/lib/isc/mem.c:685 (libisc.so.1504+0x33fee)
#2 mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:598 (libisc.so.1504+0x34c7e)
#3 mem_allocateunlocked /home/ondrej/Projects/bind9/lib/isc/mem.c:1222 (libisc.so.1504+0x34c7e)
#4 isc___mem_allocate /home/ondrej/Projects/bind9/lib/isc/mem.c:1242 (libisc.so.1504+0x34c7e)
#5 isc__mem_allocate /home/ondrej/Projects/bind9/lib/isc/mem.c:2387 (libisc.so.1504+0x3be64)
#6 isc___mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:1007 (libisc.so.1504+0x3c6ca)
#7 isc__mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:2365 (libisc.so.1504+0x3aef1)
#8 dns_rdataslab_fromrdataset /home/ondrej/Projects/bind9/lib/dns/rdataslab.c:266 (libdns.so.1505+0x17a212)
#9 addrdataset /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:6461 (libdns.so.1505+0x119b45)
#10 dns_db_addrdataset /home/ondrej/Projects/bind9/lib/dns/db.c:744 (libdns.so.1505+0x673cf)
#11 cache_name /home/ondrej/Projects/bind9/lib/dns/resolver.c:6316 (libdns.so.1505+0x19404b)
#12 cache_message /home/ondrej/Projects/bind9/lib/dns/resolver.c:6413 (libdns.so.1505+0x1ae663)
#13 resquery_response /home/ondrej/Projects/bind9/lib/dns/resolver.c:7631 (libdns.so.1505+0x1ae663)
#14 dispatch /home/ondrej/Projects/bind9/lib/isc/task.c:1134 (libisc.so.1504+0x56fa6)
#15 run /home/ondrej/Projects/bind9/lib/isc/task.c:1319 (libisc.so.1504+0x56fa6)
#16 <null> <null> (libtsan.so.0+0x29b3d)
Mutex M633172806149932752 is already destroyed.
Mutex M641617158530471408 is already destroyed.
Thread T3 'isc-net-0002' (tid=7993, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2be1b)
#1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:75 (libisc.so.1504+0x7bcc4)
#2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:149 (libisc.so.1504+0x3ec7a)
#3 create_managers main.c:895 (named+0x1ae90)
#4 setup main.c:1235 (named+0x1ae90)
#5 main main.c:1515 (named+0x1ae90)
Thread T5 'isc-net-0004' (tid=8008, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2be1b)
#1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:75 (libisc.so.1504+0x7bcc4)
#2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:149 (libisc.so.1504+0x3ec7a)
#3 create_managers main.c:895 (named+0x1ae90)
#4 setup main.c:1235 (named+0x1ae90)
#5 main main.c:1515 (named+0x1ae90)
Thread T11 'isc-worker0002' (tid=8040, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2be1b)
#1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:75 (libisc.so.1504+0x7bcc4)
#2 isc_taskmgr_create /home/ondrej/Projects/bind9/lib/isc/task.c:1410 (libisc.so.1504+0x59d63)
#3 create_managers main.c:902 (named+0x1aeec)
#4 setup main.c:1235 (named+0x1aeec)
#5 main main.c:1515 (named+0x1aeec)
SUMMARY: ThreadSanitizer: data race /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:1545 in mark_header_stale
```
```
WARNING: ThreadSanitizer: data race (pid=7941)
Read of size 2 at 0x7b34000310fc by thread T5 (mutexes: read M633172806149932752, read M641617158530471408):
#0 check_stale_header /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4336 (libdns.so.1505+0x10bceb)
#1 find_deepest_zonecut /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4503 (libdns.so.1505+0x10eed1)
#2 cache_find /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4758 (libdns.so.1505+0x1236b0)
#3 dns_db_findext /home/ondrej/Projects/bind9/lib/dns/db.c:551 (libdns.so.1505+0x6673c)
#4 query_lookup /home/ondrej/Projects/bind9/lib/ns/query.c:5515 (libns.so.1502+0x3f6a0)
#5 ns__query_start /home/ondrej/Projects/bind9/lib/ns/query.c:5441 (libns.so.1502+0x40209)
#6 query_setup /home/ondrej/Projects/bind9/lib/ns/query.c:5162 (libns.so.1502+0x48c13)
#7 ns_query_start /home/ondrej/Projects/bind9/lib/ns/query.c:11239 (libns.so.1502+0x49444)
#8 ns__client_request /home/ondrej/Projects/bind9/lib/ns/client.c:2157 (libns.so.1502+0x15890)
#9 udp_recv_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/udp.c:317 (libisc.so.1504+0x46926)
#10 <null> <null> (libuv.so.1+0x1d6d4)
#11 <null> <null> (libtsan.so.0+0x29b3d)
Previous write of size 2 at 0x7b34000310fc by thread T11 (mutexes: write M57556908275267488, read M633172806149932752, read M641617158530471408):
#0 mark_header_stale /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:1557 (libdns.so.1505+0x10bd67)
#1 check_stale_header /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4345 (libdns.so.1505+0x10bd67)
#2 find_deepest_zonecut /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4503 (libdns.so.1505+0x10eed1)
#3 cache_find /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4758 (libdns.so.1505+0x1236b0)
#4 dns_db_find /home/ondrej/Projects/bind9/lib/dns/db.c:511 (libdns.so.1505+0x6648d)
#5 dns_view_find /home/ondrej/Projects/bind9/lib/dns/view.c:1019 (libdns.so.1505+0x1f6fdd)
#6 dbfind_name /home/ondrej/Projects/bind9/lib/dns/adb.c:3678 (libdns.so.1505+0x3f65f)
#7 dns_adb_createfind /home/ondrej/Projects/bind9/lib/dns/adb.c:3070 (libdns.so.1505+0x529ad)
#8 findname /home/ondrej/Projects/bind9/lib/dns/resolver.c:3382 (libdns.so.1505+0x186a47)
#9 fctx_getaddresses /home/ondrej/Projects/bind9/lib/dns/resolver.c:3669 (libdns.so.1505+0x19a933)
#10 fctx_try /home/ondrej/Projects/bind9/lib/dns/resolver.c:4029 (libdns.so.1505+0x1a1a94)
#11 fctx_start /home/ondrej/Projects/bind9/lib/dns/resolver.c:4651 (libdns.so.1505+0x1a5a0b)
#12 dispatch /home/ondrej/Projects/bind9/lib/isc/task.c:1134 (libisc.so.1504+0x56fa6)
#13 run /home/ondrej/Projects/bind9/lib/isc/task.c:1319 (libisc.so.1504+0x56fa6)
#14 <null> <null> (libtsan.so.0+0x29b3d)
Location is heap block of size 197 at 0x7b34000310e0 allocated by thread T10:
#0 malloc <null> (libtsan.so.0+0x2b1a3)
#1 default_memalloc /home/ondrej/Projects/bind9/lib/isc/mem.c:685 (libisc.so.1504+0x33fee)
#2 mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:598 (libisc.so.1504+0x34c7e)
#3 mem_allocateunlocked /home/ondrej/Projects/bind9/lib/isc/mem.c:1222 (libisc.so.1504+0x34c7e)
#4 isc___mem_allocate /home/ondrej/Projects/bind9/lib/isc/mem.c:1242 (libisc.so.1504+0x34c7e)
#5 isc__mem_allocate /home/ondrej/Projects/bind9/lib/isc/mem.c:2387 (libisc.so.1504+0x3be64)
#6 isc___mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:1007 (libisc.so.1504+0x3c6ca)
#7 isc__mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:2365 (libisc.so.1504+0x3aef1)
#8 dns_rdataslab_fromrdataset /home/ondrej/Projects/bind9/lib/dns/rdataslab.c:266 (libdns.so.1505+0x17a212)
#9 addrdataset /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:6461 (libdns.so.1505+0x119b45)
#10 dns_db_addrdataset /home/ondrej/Projects/bind9/lib/dns/db.c:744 (libdns.so.1505+0x673cf)
#11 cache_name /home/ondrej/Projects/bind9/lib/dns/resolver.c:6316 (libdns.so.1505+0x19404b)
#12 cache_message /home/ondrej/Projects/bind9/lib/dns/resolver.c:6413 (libdns.so.1505+0x1ae663)
#13 resquery_response /home/ondrej/Projects/bind9/lib/dns/resolver.c:7631 (libdns.so.1505+0x1ae663)
#14 dispatch /home/ondrej/Projects/bind9/lib/isc/task.c:1134 (libisc.so.1504+0x56fa6)
#15 run /home/ondrej/Projects/bind9/lib/isc/task.c:1319 (libisc.so.1504+0x56fa6)
#16 <null> <null> (libtsan.so.0+0x29b3d)
Mutex M633172806149932752 is already destroyed.
Mutex M641617158530471408 is already destroyed.
Mutex M57556908275267488 is already destroyed.
Thread T5 'isc-net-0004' (tid=8008, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2be1b)
#1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:75 (libisc.so.1504+0x7bcc4)
#2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:149 (libisc.so.1504+0x3ec7a)
#3 create_managers main.c:895 (named+0x1ae90)
#4 setup main.c:1235 (named+0x1ae90)
#5 main main.c:1515 (named+0x1ae90)
Thread T11 'isc-worker0002' (tid=8040, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2be1b)
#1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:75 (libisc.so.1504+0x7bcc4)
#2 isc_taskmgr_create /home/ondrej/Projects/bind9/lib/isc/task.c:1410 (libisc.so.1504+0x59d63)
#3 create_managers main.c:902 (named+0x1aeec)
#4 setup main.c:1235 (named+0x1aeec)
#5 main main.c:1515 (named+0x1aeec)
Thread T10 'isc-worker0001' (tid=8038, running) created by main thread at:
#0 pthread_create <null> (libtsan.so.0+0x2be1b)
#1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:75 (libisc.so.1504+0x7bcc4)
#2 isc_taskmgr_create /home/ondrej/Projects/bind9/lib/isc/task.c:1410 (libisc.so.1504+0x59d63)
#3 create_managers main.c:902 (named+0x1aeec)
#4 setup main.c:1235 (named+0x1aeec)
#5 main main.c:1515 (named+0x1aeec)
SUMMARY: ThreadSanitizer: data race /home/ondrej/Projects/bind9/lib/dns/rbtdb.c:4336 in check_stale_header
```
August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Mark Andrews

https://gitlab.isc.org/isc-projects/bind9/-/issues/1456
always check return from isc_refcount_decrement
2020-08-04T09:45:08Z
Mark Andrews
Coverity, correctly, complains that the return value of `isc_refcount_decrement` is not always checked.
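One way to satisfy the checker is to capture the value returned by the decrement in a statement of its own and then test the captured value. A minimal sketch using C11 atomics: `refcount_t`, `refcount_decrement()` and `object_release()` are hypothetical stand-ins rather than the isc API, and the decrement is assumed to return the value held before the subtraction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef atomic_uint_fast32_t refcount_t; /* hypothetical stand-in */

/* Assumed semantics: subtract one and return the value held before. */
static uint_fast32_t
refcount_decrement(refcount_t *ref) {
	return (atomic_fetch_sub_explicit(ref, 1, memory_order_acq_rel));
}

/*
 * Drop one reference. Returns true when the caller released the last
 * reference and must destroy the object.
 */
static bool
object_release(refcount_t *ref) {
	uint_fast32_t prev = refcount_decrement(ref); /* always executed */
	assert(prev > 0); /* pure check: nothing is lost if compiled out */
	return (prev == 1);
}
```

Because the decrement is its own statement, the count still changes when the assertion is disabled.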
Additionally, `isc_refcount_decrement` shouldn't be called inside `INSIST`; `INSIST` should not
have side effects, as it can be compiled out.
August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Mark Andrews

https://gitlab.isc.org/isc-projects/bind9/-/issues/1235
system tests fail with new /etc/bind.keys installed
2020-07-16T07:23:52Z
Mark Andrews
```
20-Sep-2019 08:44:42.396 loading configuration from '/Users/marka/git/bind9/bin/tests/system/qmin2/ns1/named.conf'
20-Sep-2019 08:44:42.399 reading built-in trust anchors from file '/etc/bind.keys'
20-Sep-2019 08:44:42.399 /etc/bind.keys:29: unknown option 'dnssec-keys'
20-Sep-2019 08:44:42.400 load_configuration: failure
20-Sep-2019 08:44:42.400 loading configuration: failure
20-Sep-2019 08:44:42.400 exiting (due to fatal error)
```
One can work around this temporarily by reverting /etc/bind.keys to managed keys, but the correct fix
is to point `named` at a private bind.keys instance.
August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)

https://gitlab.isc.org/isc-projects/bind9/-/issues/48
Drop $SYSTEMTESTTOP from bin/tests/system/
2020-08-04T10:01:40Z
Michał Kępień
This was suggested by @ondrej in !7.
The `$SYSTEMTESTTOP` shell variable is often set to `..` in various shell scripts inside `bin/tests/system/`, but most of the time it is only used one line later, while sourcing `conf.sh`. This hardly improves code readability.
`$SYSTEMTESTTOP` is also used to reference scripts and files living in `bin/tests/system/`, but given that the variable is always set to a short, relative path, we could consider dropping it altogether and replacing all of its occurrences with the relative path, without adversely affecting code readability.
August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Michal Nowak