BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues
2021-11-10T01:51:51Z

https://gitlab.isc.org/isc-projects/bind9/-/issues/3003
Greedy regular expression causes intermittent "nsupdate" system test failures
2021-11-10T01:51:51Z
Michał Kępień

One of the checks in the `nsupdate` system test [prepares][1] an
`nsupdate` script by processing a response to a DNSKEY query.
Specifically, it attempts to change the TTL of the DNSKEY RRset (from 10
to 600). However, a greedy regular expression involved in that process
may cause DNSKEY RDATA to be mangled instead of the TTL:
https://gitlab.isc.org/isc-private/bind9/-/jobs/2088895
```
05-Nov-2021 11:50:17.573 received client packet from 10.53.0.3#60245
;; ->>HEADER<<- opcode: UPDATE, status: NOERROR, id: 38838
;; flags:; ZONE: 1, PREREQ: 0, UPDATE: 2, ADDITIONAL: 0
;; ZONE SECTION:
;dnskey.test. IN SOA
;; UPDATE SECTION:
;dnskey.test. 600 IN DNSKEY 256 3 5 (
; AwEAAdS72SeIDeDR/y7ZxEToyLSQ
; Q/rm7f3dQBo/GK8RjRZTjTxMchRW
; itmi/kCJxSOW0rFV/ueWJTwcJbSq
; upYYo1bgNUGNmLDoYfPEDIsClZrK
; jaLjlSWb2v7nYGVuMpLGJX5D2NCm
; QJz5uOQR+b7r/8uSW1eQzodpsLTm
; XQCnuKvj
; ) ; ZSK; alg = RSASHA1 ; key id = 40375
;dnskey.test. 10 IN DNSKEY 257 3 5 (
; AwEAAa600INEzZ8hHtv3d2j5grzq
; 7gAvaWk2TxHTuFhRUuIVJxUNTpTa
; vHvSbZglx/AXSGIIgfXDKd0VVXTa
; sW0eewfCpjNol5Cgfnb+VlO5kmjW
; 6nr1UnLgd+H/sRdG1Ip8amR+D0Xi
; pYmXnOFuO2VvFRBizPlWCFu1sQFr
; sCRYXhB/
; ) ; KSK; alg = RSASHA1 ; key id = 19267
```
Note that the second DNSKEY RR still has a TTL of 10 seconds and
contains the string `600` in its RDATA. Looking at the contents of
`ns3/dnskey.test.db` confirms that the relevant RDATA originally
contained a string matching the regular expression `10.IN`, breaking the
replacement:
```
$TTL 10
dnskey.test. IN SOA dnskey.test. hostmaster.dnskey.test. 1 3600 900 2419200 3600
dnskey.test. IN NS dnskey.test.
dnskey.test. IN A 10.53.0.3
; This is a key-signing key, keyid 18947, for dnskey.test.
; Created: 20211105114907 (Fri Nov 5 11:49:07 2021)
; Publish: 20211105114907 (Fri Nov 5 11:49:07 2021)
; Activate: 20211105114907 (Fri Nov 5 11:49:07 2021)
dnskey.test. IN DNSKEY 257 3 5 AwEAAa100INEzZ8hHtv3d2j5grzq7gAvaWk2TxHTuFhRUuIVJxUNTpTa vHvSbZglx/AXSGIIgfXDKd0VVXTasW0eewfCpjNol5Cgfnb+VlO5kmjW 6nr1UnLgd+H/sRdG1Ip8amR+D0XipYmXnOFuO2VvFRBizPlWCFu1sQFr sCRYXhB/
```
This cannot end well:
```
05-Nov-2021 11:50:17.573 dns_dnssec_findzonekeys2: error reading Kdnskey.test.+005+19267.private: file not found
```
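The failure mode can be reproduced in isolation. The sketch below assumes the substitution has the general shape `(.*)10.IN` (a hypothetical reconstruction; the exact expression used in `tests.sh` is not quoted in this issue) and applies it to a shortened copy of the KSK record from the zone file above, written with an explicit TTL. The anchored pattern is only one possible way to restrict the match to the TTL field, not necessarily the fix that was merged.

```python
import re

# Shortened copy of the KSK record, with the TTL written out explicitly.
record = "dnskey.test. 10 IN DNSKEY 257 3 5 AwEAAa100INEzZ8hHtv3d2j5grzq"

# Greedy: "(.*)" first swallows the whole line, then backtracks only as far
# as the LAST place where "10.IN" matches, which is "100IN" inside the RDATA.
mangled = re.sub(r"(.*)10.IN", r"\g<1>600 IN", record, count=1)

# Anchored: only the TTL field between the owner name and the class can match.
fixed = re.sub(r"^(\S+\s+)10(\s+IN\s+DNSKEY\s)", r"\g<1>600\g<2>", record)

print(mangled)
print(fixed)
```

In the greedy case the TTL stays at 10 and the key data changes, which is consistent with the log above: the zone file lists key id 18947, while the update carries a key with id 19267, for which no private key file exists.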
[1]: https://gitlab.isc.org/isc-projects/bind9/-/blob/b69dfd6a7503ebb02496e115c3c05cbbf5f5f4bc/bin/tests/system/nsupdate/tests.sh#L751-755

December 2021 (9.16.24, 9.16.24-S1, 9.17.21)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2457
ThreadSanitizer: data race in closedir
2021-11-09T12:19:28Z
Michal Nowak

A custom experimental snapshot of GCC 11.0.0 20210124 on Fedora 33 (as well as GCC 10.2.1, the default Fedora 33 compiler) identified the following data race in `closedir` when the `integrity`, `idna`, and a few other system tests were run:
```
WARNING: ThreadSanitizer: data race
Write of size 8 at 0x000000000001 by thread T1:
#0 closedir <null>
#1 isc_dir_close lib/isc/unix/dir.c:134
#2 dns_dnssec_findmatchingkeys lib/dns/dnssec.c:1526
#3 statefile_exist lib/dns/zone.c:5858
#4 dns_zone_secure_to_insecure lib/dns/zone.c:5905
#5 dns_zone_use_kasp lib/dns/zone.c:5917
#6 add_sigs lib/dns/zone.c:6901
#7 dns__zone_updatesigs lib/dns/zone.c:8225
#8 zone_nsec3chain lib/dns/zone.c:8921
#9 zone_maintenance lib/dns/zone.c:11211
#10 zone_timer lib/dns/zone.c:14674
#11 dispatch lib/isc/task.c:1152
#12 run lib/isc/task.c:1344
#13 <null> <null>
Previous read of size 8 at 0x000000000001 by thread T2:
#0 epoll_ctl <null>
#1 uv__platform_invalidate_fd <null>
#2 uv_run <null>
#3 <null> <null>
Location is file descriptor 143 created by thread T2 at:
#0 accept4 <null>
#1 uv__accept <null>
#2 <null> <null>
Thread T1 (running) created by main thread at:
#0 pthread_create <null>
#1 isc_thread_create lib/isc/pthreads/thread.c:73
#2 isc_taskmgr_create lib/isc/task.c:1434
#3 create_managers bin/named/main.c:940
#4 setup bin/named/main.c:1248
#5 main bin/named/main.c:1548
Thread T2 (running) created by main thread at:
#0 pthread_create <null>
#1 isc_thread_create lib/isc/pthreads/thread.c:73
#2 isc_nm_start lib/isc/netmgr/netmgr.c:290
#3 create_managers bin/named/main.c:934
#4 setup bin/named/main.c:1248
#5 main bin/named/main.c:1548
SUMMARY: ThreadSanitizer: data race in closedir
```
November 2021 (9.16.23, 9.16.23-S1, 9.17.20)

https://gitlab.isc.org/isc-projects/bind9/-/issues/784
bind 9.12.3-P1 fatal error
2021-11-08T22:09:25Z
Hakan Gustafsson

### Summary
I upgraded from 9.12.3 to 9.12.3-P1 two days ago, and today two of our five servers hit a fatal error and crashed about two hours apart. I started them again and everything seems normal. We have no dynamic zones on these two servers.
It has now happened again on another server, with the same version and the same behavior.
### BIND version used
BIND 9.12.3-P1 <id:cfdd35f>
running on Linux x86_64 3.10.0-862.14.4.el7.x86_64 #1 SMP Fri Sep 21 09:07:21 UTC 2018
built by make with '--prefix=/service/dns/bind-9.12.3-P1' '--sysconfdir=/data/dns/named' '--localstatedir=/var' '--with-openssl=/service/dns/openssl' '--enable-threads' '--with-libxml2' '--with-randomdev=/dev/urandom'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-28)
compiled with OpenSSL version: OpenSSL 1.0.2q 20 Nov 2018
linked to OpenSSL version: OpenSSL 1.0.2q 20 Nov 2018
compiled with libxml2 version: 2.9.1
linked to libxml2 version: 20901
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
threads support is enabled
### Steps to reproduce
Don't know.
### What is the current *bug* behavior?
Fatal error and bind crashes.
### What is the expected *correct* behavior?
Not relevant
### Relevant configuration files
options {
recursion yes;
dnssec-enable yes;
dnssec-validation no;
version "hakan 181213";
directory "/data/dns/named";
query-source address 194.165.224.190;
allow-recursion { norrnod; };
listen-on { interfaces; };
allow-transfer {
transferservers;
key dns.umea.se.key;
};
};
### Relevant logs and/or screenshots
named1
15-Dec-2018 12:22:17.925 general: critical: rbtdb.c:1509: fatal error:
15-Dec-2018 12:22:17.925 general: critical: RUNTIME_CHECK(rbtdb->next_serial != 0) failed
15-Dec-2018 12:22:17.925 general: critical: exiting (due to fatal error in library)
named2
15-Dec-2018 13:50:58.260 general: critical: rbtdb.c:1509: fatal error:
15-Dec-2018 13:50:58.260 general: critical: RUNTIME_CHECK(rbtdb->next_serial != 0) failed
15-Dec-2018 13:50:58.260 general: critical: exiting (due to fatal error in library)
### Possible fixes
Not relevant

https://gitlab.isc.org/isc-projects/bind9/-/issues/3008
Bind9 going down with error rbtdb->next_serial
2021-11-08T22:09:24Z
Aleksandr Nikitin

### Summary
I have several bind9 instances, and some of them go down with this error:
```
Nov 04 02:30:46 vm-name named[10257]: 04-Nov-2021 02:30:46.319 general: critical: ../../../lib/dns/rbtdb.c:1497: fatal error:
Nov 04 02:30:46 vm-name named[10257]: 04-Nov-2021 02:30:46.319 general: critical: RUNTIME_CHECK(rbtdb->next_serial != 0) failed
Nov 04 02:30:46 vm-name named[10257]: 04-Nov-2021 02:30:46.319 general: critical: exiting (due to fatal error in library)
Nov 04 02:30:46 vm-name systemd[1]: bind9.service: Main process exited, code=killed, status=6/ABRT
Nov 04 02:30:46 vm-name systemd[1]: bind9.service: Failed with result 'signal'.
```
### BIND version used
```
BIND 9.11.5-P4-5.1+deb10u6-Debian (Extended Support Version) <id:998753c>
running on Linux x86_64 4.19.0-18-cloud-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29)
built by make with '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=/usr/include' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-silent-rules' '--libdir=/usr/lib/x86_64-linux-gnu' '--libexecdir=/usr/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--disable-dependency-tracking' '--libdir=/usr/lib/x86_64-linux-gnu' '--sysconfdir=/etc/bind' '--with-python=python3' '--localstatedir=/' '--enable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-gost=no' '--with-openssl=/usr' '--with-gssapi=/usr' '--disable-isc-spnego' '--with-libidn2' '--with-libjson=/usr' '--with-lmdb=/usr' '--with-gnu-ld' '--with-geoip=/usr' '--with-atf=no' '--enable-ipv6' '--enable-rrl' '--enable-filter-aaaa' '--enable-native-pkcs11' '--with-pkcs11=/usr/lib/softhsm/libsofthsm2.so' '--with-randomdev=/dev/urandom' '--enable-dnstap' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fdebug-prefix-map=/build/bind9-gHNcz0/bind9-9.11.5.P4+dfsg=. -fstack-protector-strong -Wformat -Werror=format-security -fno-strict-aliasing -fno-delete-null-pointer-checks -DNO_VERSION_DATE -DDIG_SIGCHASE' 'LDFLAGS=-Wl,-z,relro -Wl,-z,now' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2'
compiled by GCC 8.3.0
compiled with OpenSSL version: OpenSSL 1.1.1d 10 Sep 2019
linked to OpenSSL version: OpenSSL 1.1.1d 10 Sep 2019
compiled with libxml2 version: 2.9.4
linked to libxml2 version: 20904
compiled with libjson-c version: 0.12.1
linked to libjson-c version: 0.12.1
threads support is enabled
```
### Steps to reproduce
Run bind9 with the configuration files listed below.
### What is the current *bug* behavior?
Bind9 fails and does not restart automatically:
```
Nov 04 02:30:46 vm-name named[10257]: 04-Nov-2021 02:30:46.319 general: critical: ../../../lib/dns/rbtdb.c:1497: fatal error:
Nov 04 02:30:46 vm-name named[10257]: 04-Nov-2021 02:30:46.319 general: critical: RUNTIME_CHECK(rbtdb->next_serial != 0) failed
Nov 04 02:30:46 vm-name named[10257]: 04-Nov-2021 02:30:46.319 general: critical: exiting (due to fatal error in library)
Nov 04 02:30:46 vm-name systemd[1]: bind9.service: Main process exited, code=killed, status=6/ABRT
Nov 04 02:30:46 vm-name systemd[1]: bind9.service: Failed with result 'signal'.
```
### What is the expected *correct* behavior?
Bind9 does not go down.
### Relevant configuration files
My configuration is:
named.conf
```
key key.for.internal.domain {
algorithm HMAC-MD5;
secret "[masked]";
};
include "/etc/bind/named.conf.options";
include "/etc/bind/named.conf.local";
include "/etc/bind/named.conf.default-zones";
```
named.conf.options
```
acl "private" {
127.0.0.1/32; # localhost
172.16.1.0/24; # wg
};
options {
directory "/var/cache/bind";
allow-query { private; };
recursion yes;
forwarders {
127.0.0.1 port 5053;
1.1.1.1;
1.0.0.1;
8.8.8.8;
8.8.4.4;
};
dnssec-enable no;
dnssec-validation no;
listen-on { any; };
listen-on-v6 { none; };
};
```
named.conf.local
```
zone "dc1.internal.domain" {
type master;
file "/etc/bind/dc1.internal.domain.db";
allow-update { key "key.for.internal.domain"; };
};
zone "156.10.in-addr.arpa" {
type master;
file "/etc/bind/156.10.in-addr.arpa.db";
allow-update { key "key.for.internal.domain"; };
};
zone "dc2.internal.domain" {
type slave;
file "/etc/bind/slave_dc2.internal.domain.db";
masters { 10.60.0.1; };
};
zone "60.10.in-addr.arpa" {
type slave;
file "/etc/bind/slave_60.10.in-addr.arpa.db";
masters { 10.60.0.1; };
};
zone "dc3.internal.domain" {
type slave;
file "/etc/bind/slave_dc3.internal.domain.db";
masters { 10.200.0.1; };
};
zone "200.10.in-addr.arpa" {
type slave;
file "/etc/bind/slave_200.10.in-addr.arpa.db";
masters { 10.200.0.1; };
};
zone "dc4.internal.domain" {
type slave;
file "/etc/bind/slave_dc4.internal.domain.db";
masters { 10.90.0.1; };
};
zone "90.10.in-addr.arpa" {
type slave;
file "/etc/bind/slave_90.10.in-addr.arpa.db";
masters { 10.90.0.1; };
};
zone "dc5.internal.domain" {
type slave;
file "/etc/bind/slave_dc5.internal.domain.db";
masters { 10.9.96.1; };
};
zone "9.10.in-addr.arpa" {
type slave;
file "/etc/bind/slave_9.10.in-addr.arpa.db";
masters { 10.9.96.1; };
};
```
named.conf.default-zones
```
zone "." {
type hint;
file "/usr/share/dns/root.hints";
};
zone "internal.domain" {
type master;
file "/etc/bind/internal.domain.db";
};
zone "another.domain" {
type master;
file "/etc/bind/another.domain.db";
};
zone "third.domain" {
type master;
file "/etc/bind/third.domain.db";
};
zone "fourth.domain" {
type master;
file "/etc/bind/fourth.domain.db";
};
```

https://gitlab.isc.org/isc-projects/bind9/-/issues/3007
DNS resolution fails temporarily
2021-11-08T12:02:25Z
K V

We are using two Named Servers in our Production system.
BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.3 (Extended Support Version) on Linux x86_64 4.9.215-36.el7.x86_64
Recently, we started to see DNS resolution fail during a specific time period for random domain names (out of over 100 records). Each record may fail for at most 10 minutes. At all other times, resolution works fine.
AT THE TIME OF ISSUE:
```
dig @MY_DNS_SERVER docker.mycompany.net
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.3 <<>> @MY_DNS_SERVER docker.mycompany.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 33467
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;docker.mycompany.net. IN A
;; AUTHORITY SECTION:
mycompany.net. 58 IN SOA ns-604.awsdns-11.net. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400
;; Query time: 0 msec
;; SERVER: MY_DNS_SERVER#53(MY_DNS_SERVER)
;; WHEN: Thu Nov 04 10:22:01 UTC 2021
;; MSG SIZE rcvd: 128
```
NORMAL TIMES:
```
dig @MY_DNS_SERVER docker.mycompany.net
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.3 <<>> @MY_DNS_SERVER docker.mycompany.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28581
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 4, ADDITIONAL: 7
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;docker.mycompany.net. IN A
;; ANSWER SECTION:
docker.mycompany.net. 54 IN A PROPER IP
docker.mycompany.net. 54 IN A PROPER IP
;; AUTHORITY SECTION:
mycompany.net. 81323 IN NS ns-16.awsdns-02.com.
mycompany.net. 81323 IN NS ns-1158.awsdns-16.org.
mycompany.net. 81323 IN NS ns-604.awsdns-11.net.
mycompany.net. 81323 IN NS ns-1731.awsdns-24.co.uk.
;; ADDITIONAL SECTION:
ns-1158.awsdns-16.org. 47096 IN A 205.251.196.134
ns-16.awsdns-02.com. 25941 IN A 205.251.192.16
ns-604.awsdns-11.net. 81323 IN A 205.251.194.92
ns-1158.awsdns-16.org. 68043 IN AAAA 2600:9000:5304:8600::1
ns-16.awsdns-02.com. 60863 IN AAAA 2600:9000:5300:1000::1
ns-1731.awsdns-24.co.uk. 38132 IN AAAA 2600:9000:5306:c300::1
;; Query time: 0 msec
;; SERVER: MY_DNS_SERVER#53(MY_DNS_SERVER)
;; WHEN: Thu Nov 04 10:29:03 UTC 2021
;; MSG SIZE rcvd: 347
```
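To pin down the failure window more precisely, the status of periodic `dig` runs could be logged. The following is a minimal sketch, assuming `dig` output in the format shown above; `dig_status` is a hypothetical helper name, and the sample strings are the header lines from the two outputs.

```python
import re

# Matches the header line that dig prints for every response.
HEADER_RE = re.compile(
    r"^;; ->>HEADER<<- opcode: (\w+), status: (\w+), id: (\d+)", re.MULTILINE
)

def dig_status(output):
    """Return the response status (e.g. NOERROR, NXDOMAIN) or None."""
    match = HEADER_RE.search(output)
    return match.group(2) if match else None

failing = ";; Got answer:\n;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 33467\n"
normal = ";; Got answer:\n;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28581\n"

print(dig_status(failing), dig_status(normal))
```

Recording a timestamp next to each status would show when the NXDOMAIN responses begin and end for a given name.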
The BIND cache metrics show an upward trend that starts right around the start of the issue (2:30pm IST). Although DNS resolution recovers within an hour, the graph keeps trending upward and only decreases after midnight that day.
![BIND_CACHE_METRICS](/uploads/e26bda06452d8554a835d37e73128beb/BIND_CACHE_METRICS.PNG)
This is badly affecting Production users. Please share your suggestions as soon as possible.

https://gitlab.isc.org/isc-projects/bind9/-/issues/2942
Replace the CHANGES file with a more practical alternative
2021-11-05T08:27:39Z
Michał Kępień

I am creating this issue so that we can discuss whether we can come up
with a usable alternative for maintaining the `CHANGES` file in the BIND
9 source repository.
This would really be a topic for an All Hands meeting, but it was raised
often enough recently that it arguably should not wait until the next
All Hands actually happens. Releasing BIND 9.18.0 sounds like a nice
milestone at which we could change things.
Let's start with the pros:
- `CHANGES` is a quick way for people not necessarily proficient with
Git (support folks, users, etc.) to get a quick summary of what
changed in a given BIND 9 release.[^1]
- `CHANGES` is trivial to search for keywords.
- Each `CHANGES` entry is a unique[^2] identifier of a given set of
changes *across various BIND 9 branches*.
Then, there are the cons:
- These days, `CHANGES` entries are pretty much duplicates of commit
log messages (which have become more verbose in the past months).
However, they cannot be as long or as exhaustive as the latter,
which necessitates edits/rewrites, which in certain cases causes a
significant amount of time to be spent on making them legible and/or
correct and/or precise (enough), both when merge requests are
discussed and when monthly releases are prepared. As `CHANGES` has
a more limited target audience than release notes, it does not
necessarily feel like time well spent.
- There are no strict rules governing what gets into `CHANGES`, in
what form, and under what circumstances. We try to adhere to a rule
of thumb which says that "anything that might be of interest to
users and/or important to developers should be listed", but that
turns out to be a very fuzzy line in practice.
- Since the `CHANGES` file is under version control, all entries need
to be cleaned up and prepared *before* any BIND 9 release is
prepared. If a `CHANGES` tweak/fix/addition turns out to be
necessary after a release is prepared, a respin is in order just to
fix a text file.
- `CHANGES` is generally a superset of release notes, which are
supposed to list all important user-facing changes. However, the
form and/or detail level of a given `CHANGES` entry often differs
from its release notes counterpart. This means more duplicate work
for every release.
Replacing `CHANGES` with some other solution acceptable by other ISC
teams would allow SWENG to save time on discussing repeated/derivative
texts, allowing us to spend that time on preparing accurate, verbose
commit messages which are not limited to a few lines in size. It would
also help avoid burning engineering time on silly stuff like retesting
an entire release because of tweaks which do not affect the code itself
or preparing & backporting missing `CHANGES` entries which were
forgotten about (or consciously omitted) for random MRs.
[^1]: There is a catch, though: adding a `CHANGES` entry for any given
set of changes is not mandatory and therefore arbitrary in
practice. Sometimes we list trivial stuff, sometimes we gloss
over modifications which turn out to have critical consequences
down the line, sometimes we spend time on arguing whether
something needs to be listed or not.
[^2]: Sometimes we assign different texts to the same entry. Example:
entry 5712 in `main` vs. `v9_16`.

November 2021 (9.16.23, 9.16.23-S1, 9.17.20)
Michał Kępień

https://gitlab.isc.org/isc-projects/bind9/-/issues/1426
Intermittent failure in the qmin system test
2021-11-05T07:58:34Z
Ondřej Surý

* https://gitlab.isc.org/isc-projects/bind9/-/jobs/440093

BIND 9.19.x

https://gitlab.isc.org/isc-projects/bind9/-/issues/481
krb5-subdomain update policy fails to correctly enforce restrictions?
2021-11-05T07:09:02Z
Michael McNally

<!--
If the bug you are reporting is potentially security-related - for example,
if it involves an assertion failure or other crash in `named` that can be
triggered repeatedly - then please do *NOT* report it here, but send an
email to [security-officer@isc.org](security-officer@isc.org).
-->
### Summary
Submitter Dominik George (nik@naturalnet.de) reports a problem with the krb5-subdomain update policy. If confirmed it sounds as though it would improperly permit update to unauthorized clients.
### Submitter's message (to security-officer@isc.org)
Hi,
while debugging a malfunction of our BIND9 setup, we discovered what we
consider a security issue in BIND9's GSS-TSIG handling, to be more precise
in the handling of names in the krb5-subdomain update-policy rule.
The following relates to BIND9 9.10.3, but the erroneous code is also
present in 9.11.
As laid out in the documentation:
> This rule takes a Kerberos machine principal (host/machine@REALM) for
> machine in REALM and converts it to machine.realm allowing the machine to
> update subdomains of machine.realm. The REALM to be matched is specified in
> the identity field. The name field should be set to "."
So, the following rule…
> grant EXAMPLE.COM krb5-subdomain .
…together with an incoming signed request from the principal
host/foo.example.com@EXAMPLE.COM should succeed if, and only if…
* the realm of the client matches EXAMPLE.COM - and -
* the service name matches the string "host" - and -
* the requested name is a subdomain of the principal name.
The code in lib/dns/ssu.c, in case DNS_SSUMATCHTYPE_SUBDOMAINKRB, however,
completely lacks the last check. Instead, it checks whether the requested
update name is a subdomain of the name in the rule. This check always
returns true if configured in line with the docs, which propose always using
"." as the rule name.
Thus, I conclude that the krb5-subdomain check, along with the ms-subdomain
check, is completely ineffective.
I verified that, in fact, the above rule allows anyone with a
host/*@EXAMPLE.COM principal to update arbitrary DNS records.
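The three conditions listed above can be written down as a small sketch. This illustrates the rule semantics as the report describes them; it is not the code in lib/dns/ssu.c, and the function names are hypothetical. It also assumes that a name counts as a subdomain of itself.

```python
def is_subdomain(name, parent):
    """True if `name` equals `parent` or lies below it (case-insensitive)."""
    n = name.rstrip(".").lower()
    p = parent.rstrip(".").lower()
    if p == "":
        return True  # every name is below the root "."
    return n == p or n.endswith("." + p)

def krb5_subdomain_allows(principal, rule_realm, update_name):
    """Apply the three checks from the report to a host/machine@REALM principal."""
    service_host, _, realm = principal.rpartition("@")
    service, _, machine = service_host.partition("/")
    if realm != rule_realm:      # realm of the client must match the rule
        return False
    if service != "host":        # service name must be "host"
        return False
    # The check the report says is missing: compare against the machine name.
    return is_subdomain(update_name, machine)

principal = "host/foo.example.com@EXAMPLE.COM"
print(krb5_subdomain_allows(principal, "EXAMPLE.COM", "www.foo.example.com"))
print(krb5_subdomain_allows(principal, "EXAMPLE.COM", "bar.example.com"))
# Comparing against the rule name "." instead accepts any update name:
print(is_subdomain("bar.example.com", "."))
```

The last call shows why a subdomain check against the rule name is ineffective when that name is ".", as the documentation recommends.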
As a bonus, the krb5-self and krb5-subdomain check types are out of sync
(with each other, and with the docs) in that krb5-subdomain does check the
requested name against the rule name, and krb5-self doesn't, and the docs
say both don't. It should probably be removed in or added to both. Also,
according to the docs, the realm from the signer's principal is appended to
the machine name from the client principal - which is not the case either.
If this is correct and quoted, please refer to Dominik George and Thorsten
Glaser, and to the sponsors Teckids e.V. and tarent solutions GmbH.
We are preparing and testing a patch.
Cheers,
Nik
--
PGP-Fingerprint: 3C9D 54A4 7575 C026 FB17 FD26 B79A 3C16 A0C4 F296
Dominik George · Hundeshagenstr. 26 · 53225 Bonn
Phone: +49 228 92934581 · https://www.dominik-george.de/
Teckids e.V. · FrOSCon e.V. · Debian Developer
LPIC-3 Linux Enterprise Professional (Security)

November 2021 (9.16.23, 9.16.23-S1, 9.17.20)
Mark Andrews

https://gitlab.isc.org/isc-projects/bind9/-/issues/2821
TCP timeout allowance too short
2021-11-04T14:48:12Z
Mark Andrews

When testing tcp-on-no-cookie there were lots of fetch timeouts with the existing 1 second allowance. Updating this to 2 seconds removed most of these timeouts, and updating it to 3 seconds removed all of them.

December 2021 (9.16.24, 9.16.24-S1, 9.17.21)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2884
Sometimes dig aborts on an AXFR query over TLS
2021-11-04T13:43:09Z
Cesar Kuroiwa

<!--
If the bug you are reporting is potentially security-related - for example,
if it involves an assertion failure or other crash in `named` that can be
triggered repeatedly - then please do *NOT* report it here, but send an
email to [security-officer@isc.org](security-officer@isc.org).
-->
### Summary
Sometimes (not always) `dig` crashes with a core dump on an AXFR query over TLS.
### BIND version used
DiG 9.17.17
### Steps to reproduce
Using dig to make successive AXFR requests over TLS, using an HMAC-SHA256 TSIG key:
```
% dig @<server> -p 853 +tls bom axfr -y hmac-sha256:<tsig-key>:<secret>
```
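Because the abort is intermittent, the command above can be repeated in a loop until it fails. This is a minimal sketch that only uses the `dig` options shown in the command; the helper names and the attempt limit are hypothetical, and the server, zone, and TSIG values must be supplied by the caller.

```python
import subprocess

def dig_axfr_argv(server, zone, tsig):
    """Build the argument list for the AXFR-over-TLS query shown above."""
    return ["dig", f"@{server}", "-p", "853", "+tls", zone, "axfr", "-y", tsig]

def run_until_abort(argv, attempts=100):
    """Run the command repeatedly; stop when it is killed by a signal."""
    for attempt in range(1, attempts + 1):
        result = subprocess.run(argv, stdout=subprocess.DEVNULL)
        if result.returncode < 0:
            # On POSIX, a negative return code is the signal number, negated
            # (for example -6 for SIGABRT).
            return attempt, result.returncode
    return None
```

A call such as `run_until_abort(dig_axfr_argv(server, "bom", tsig))` returns the attempt number and signal of the first abort, or `None` if every attempt completed.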
### Relevant logs and/or screenshots
```
bom. 172800 IN SOA a.dns.br. hostmaster.registro.br. 2021238532 1800 900 604800 900
bom. 21600 IN DNSKEY 256 3 13 fBrEkmLy0X3ANfdXKdkj8fNPAUbwhpC5VvlBaQMzi+8h63iePUcM/dcT aJVVpWgas+HgkKlA3wCTAAFuJJ1uCA==
bom. 21600 IN DNSKEY 257 3 13 jBWA/GVSitmW8erjfZc6plKFXq2j8OOR5FR+3qgAz8nM8yW4+8egKfNO fV1ynbGKAzOzyiC3xuGm7x3RfPXmNg==
bom. 172800 IN NSEC3PARAM 1 0 10 D5193D4EFD6031FF646F
tsig-gtlds. 0 ANY TSIG hmac-sha256. 1630015887 300 32 xcJBeqP007hmCBgx9MXXGbN1m6MieWJJUJzQOGhPhrE= 9519 NOERROR 0
nic.bom. 3600 IN NS ns.dns.br.
nic.bom. 3600 IN NS ns2.dns.br.
nic.bom. 3600 IN DS 40126 13 2 2B666C945F28C2ADA55279798382CC1BCA19DED7A32E3E0B1680FE84 4B1C154F
nic.bom. 3600 IN RRSIG DS 13 2 3600 20210903143501 20210819133501 16861 bom. az4R7y6/rjAQia9lTlcjcISj182zosM35mIlSlub108MrMXVbOZUsrzZ 3rHNKGch4j23G0ljP8ELgo5+L3AGNw==
bom. 172800 IN NS a.dns.br.
bom. 172800 IN NS b.dns.br.
bom. 172800 IN NS c.dns.br.
bom. 172800 IN NS d.dns.br.
bom. 172800 IN NS e.dns.br.
bom. 172800 IN NS f.dns.br.
bom. 21600 IN RRSIG DNSKEY 13 1 21600 20210915000000 20210611000000 14600 bom. a8Tag5dWGGgKNF0nbZfIO+ODDnaDstPjrE7BG7rRojU+KUw8uewTN/Yd 5PrgjC4wUBVbaDTNkG0evhO9dGmVXA==
bom. 172800 IN RRSIG SOA 13 1 172800 20210910221001 20210826211001 16861 bom. 23yXB1pgr7N+SdI1MtDKELA7Vrjt7/rWT1wx5AXPbNXNBFiEPBRE30Fo vJE4fBT0l5ZLbffEfJKm65XmBGBtjw==
bom. 172800 IN RRSIG NSEC3PARAM 13 1 172800 20210910221001 20210826211001 16861 bom. LzCugkc0fnBsssqz7zkj/gasbnjKy5zkSWRrfgFI2c5HW6KpQ2ImXfSG xFtJgaVWl5+QjDWNzaMVublFHe/A5A==
bom. 172800 IN RRSIG NS 13 1 172800 20210910221001 20210826211001 16861 bom. kMcSAe8somkybKerjoCV8WG0WfnUXLAO88b0sutOrAVFJR1X4655hw9R CQZKqsC3RpqnDPcbN2QSVT91wcU5jg==
6mom8bvsn8og38k5j20nubclk3l258vi.bom. 900 IN RRSIG NSEC3 13 2 900 20210903143501 20210819133501 16861 bom. zSfR9WRP+xHbBflvdOpTQDleukg+sTaTB62FvPNC15pxroPkRdTlIOLW jg54yxwye+bBcnHH+HWRtzF0QbfxOA==
6mom8bvsn8og38k5j20nubclk3l258vi.bom. 900 IN NSEC3 1 1 10 D5193D4EFD6031FF646F SD3KLP44SLP2IB3IRND61I31V6ROG2PE NS DS RRSIG
sd3klp44slp2ib3irnd61i31v6rog2pe.bom. 900 IN RRSIG NSEC3 13 2 900 20210910221001 20210826211001 16861 bom. NL2n5iahzZF59faktxT+x22UA8548NwIoDZc/WHub8wdQ02F134kq0Uo 8NbEzvKJdB0/FQNWf222+l5HWYJzOw==
sd3klp44slp2ib3irnd61i31v6rog2pe.bom. 900 IN NSEC3 1 1 10 D5193D4EFD6031FF646F 6MOM8BVSN8OG38K5J20NUBCLK3L258VI NS SOA RRSIG DNSKEY NSEC3PARAM
bom. 172800 IN SOA a.dns.br. hostmaster.registro.br. 2021238532 1800 900 604800 900
tsig-gtlds. 0 ANY TSIG hmac-sha256. 1630015887 300 32 Y1j1OjfZqqmzPHeTvNvVrEIfiK4UbMLiWntNw/Vo5A4= 9519 NOERROR 0
;; Query time: 1 msec
;; WHEN: Thu Aug 26 19:11:27 -03 2021
;; XFR size: 23 records (messages 2, bytes 1777)
netmgr/netmgr.c:1703: REQUIRE(((__builtin_expect(!!((handle) != ((void *)0)), 1) && __builtin_expect(!!(((const isc__magic_t *)(handle))->magic == ((('N') << 24 | ('M') << 16 | ('H') << 8 | ('D')))), 1)) && __c11_atomic_load(&(handle)->references, memory_order_seq_cst) > 0)) failed, back trace
0x8002ba7a0 <isc_assertion_setcallback+0x50> at /home/cesar/named/lib/libisc-9.17.17.so
0x8002ba74a <isc_assertion_failed+0xa> at /home/cesar/named/lib/libisc-9.17.17.so
0x8002aae6b <isc__nm_get_read_req+0x11b> at /home/cesar/named/lib/libisc-9.17.17.so
0x8002b5295 <isc__nm_tlsdns_processbuffer+0x95> at /home/cesar/named/lib/libisc-9.17.17.so
0x8002ab338 <isc__nm_process_sock_buffer+0x48> at /home/cesar/named/lib/libisc-9.17.17.so
0x8002b4926 <isc__nm_async_tlsdnsshutdown+0x366> at /home/cesar/named/lib/libisc-9.17.17.so
0x8002b5565 <isc__nm_tlsdns_read_cb+0xb5> at /home/cesar/named/lib/libisc-9.17.17.so
0x8007c5590 <uv__stream_init+0x500> at /usr/local/lib/libuv.so.1
0x8007cc12b <uv__io_poll+0x82b> at /usr/local/lib/libuv.so.1
0x8007bb551 <uv_run+0x1b1> at /usr/local/lib/libuv.so.1
0x8002a4deb <isc__netmgr_create+0x71b> at /home/cesar/named/lib/libisc-9.17.17.so
0x8002e9914 <isc__trampoline_run+0x54> at /home/cesar/named/lib/libisc-9.17.17.so
Abort (core dumped)
```
October 2021 (9.11.36, 9.11.36-S1, 9.16.22, 9.16.22-S1, 9.17.19)
Artem Boldariev

https://gitlab.isc.org/isc-projects/bind9/-/issues/2998
CID 340918: Uninitialized variables (UNINIT)
2021-11-03T14:35:18Z
Michal Nowak

It seems to point to 78066157145b6a75f58ff843ac32ffabe62b2143:
```
790 static isc_result_t
791. opensslrsa_tofile(const dst_key_t *key, const char *directory) {
792 isc_result_t ret;
1. var_decl: Declaring variable priv without initializer.
793 dst_private_t priv;
794 unsigned char *bufs[8] = { NULL };
795 unsigned short i = 0;
796 EVP_PKEY *pkey;
797. #if OPENSSL_VERSION_NUMBER < 0x30000000L
798 RSA *rsa = NULL;
799 const BIGNUM *n = NULL, *e = NULL, *d = NULL;
800 const BIGNUM *p = NULL, *q = NULL;
801 const BIGNUM *dmp1 = NULL, *dmq1 = NULL, *iqmp = NULL;
802 #else
803 BIGNUM *n = NULL, *e = NULL, *d = NULL;
804 BIGNUM *p = NULL, *q = NULL;
805 BIGNUM *dmp1 = NULL, *dmq1 = NULL, *iqmp = NULL;
806. #endif /* OPENSSL_VERSION_NUMBER < 0x30000000L */
807
2. Condition key->keydata.pkey == NULL, taking true branch.
808 if (key->keydata.pkey == NULL) {
3. Jumping to label err.
809 DST_RET(DST_R_NULLKEY);
810 }
```
```
*** CID 340918: Uninitialized variables (UNINIT)
/lib/dns/opensslrsa_link.c: 937 in opensslrsa_tofile()
931 priv.nelements = i;
932 ret = dst__privstruct_writefile(key, &priv, directory);
933
934 err:
935 for (i = 0; i < ARRAY_SIZE(bufs); i++) {
936 if (bufs[i] != NULL) {
>>> CID 340918: Uninitialized variables (UNINIT)
>>> Using uninitialized value "priv.elements[i].length" when calling "isc__mem_put".
937 isc_mem_put(key->mctx, bufs[i],
938 priv.elements[i].length);
939 }
940 }
941 #if OPENSSL_VERSION_NUMBER < 0x30000000L
942 RSA_free(rsa);
```
November 2021 (9.16.23, 9.16.23-S1, 9.17.20)
Mark Andrews

https://gitlab.isc.org/isc-projects/bind9/-/issues/2946
Investigate resolver hangs in Perflab
2021-11-03T13:52:54Z
Michał Kępień

After #2401/!4601 got merged, recursive tests in Perflab started
triggering what looks like hangs in the new resolver code:
- https://perflab.isc.org/#/config/run/5bf195dd83ba91a870b2976f/
- https://perflab.isc.org/#/config/run/5cd6a166643076f6c1f6c26f/
- https://perflab.isc.org/#/config/run/5db74b6264458967f762143a/
- https://perflab.isc.org/#/config/run/5db74b7264458967f762143b/
- https://perflab.isc.org/#/config/run/5db74c2764458967f7621440/
- https://perflab.isc.org/#/config/run/5db74c3464458967f7621441/
This was already [pointed out][1] on Mattermost. What seems to be
happening is that during the recursive tests in Perflab, the tested
resolver only responds to queries for a few (dozen) seconds and then
stops responding indefinitely. Such failure modes were *not* observed
in AWS-based resolver benchmarks.
We should definitely get to the bottom of what is happening here.
[1]: https://mattermost.isc.org/isc/pl/oxnmi51mstyt7fs5pbdmhybpdy

November 2021 (9.16.23, 9.16.23-S1, 9.17.20)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2997
nsupdate command return infomation: dns_request_createvia3: address in use
2021-11-03T06:10:24Z
395096713

Hi admin,
I ran into a problem: when I use `nsupdate` to add a record, it returns `dns_request_createvia3: address in use`.
My OS is CentOS 7.
My BIND version is bind-9.11.0.3.
Please help me: what is the reason for this problem?

https://gitlab.isc.org/isc-projects/bind9/-/issues/2767
Run `cleanup_dead_nodes_callback` on the uv threadpool
2021-11-02T18:35:04Z
Ondřej Surý

The `cleanup_dead_nodes` processing would be an excellent candidate to run on the background thread.

December 2021 (9.16.24, 9.16.24-S1, 9.17.21)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2887
respdiff is really slow on Bullseyes
2021-11-02T15:59:53Z
Michal Nowak

[respdiff on Bullseye](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1945178) takes 91 minutes and identifies 1.5 % query disagreements (80 % of those are timeouts). On Buster the [test](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1946373) takes 27 minutes and identifies just 0.34 % query disagreements (8 % of those are timeouts).
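To compare the two runs on the same footing, the percentages above can be combined into the share of all queries that timed out. This is a small sketch of that arithmetic; `timeout_share()` is a hypothetical helper and not part of respdiff.

```c
/*
 * Share of all queries (in percent) that ended in a timeout, given the
 * disagreement rate and the fraction of disagreements that were timeouts
 * (both in percent).
 */
static double
timeout_share(double disagreement_pct, double timeout_pct_of_disagreements) {
	return disagreement_pct * timeout_pct_of_disagreements / 100.0;
}
```

With the figures quoted above, `timeout_share(1.5, 80.0)` gives about 1.2 % for Bullseye and `timeout_share(0.34, 8.0)` gives about 0.0272 % for Buster. Timeouts were therefore roughly 44 times more frequent on Bullseye.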
Locally I am able to reproduce something similar, but instead of timeouts I get 15 % SERVFAILs from the 100,000 query set. I can make it more stable by setting `jobs = 2` instead of the default `jobs = 16` in `respdiff.cfg`. When running `dig` manually for a known working query, I get SERVFAIL in 99 % of attempts, as if the query was over some limit. I don't know what the difference between Buster and Bullseye might be (I checked `./configure` and `sysctl` and there are no significant differences). It might be an unrelated problem on my system, though.
This needs to be fixed before the base image is switched to Bullseye.

November 2021 (9.16.23, 9.16.23-S1, 9.17.20)
Michal Nowak

https://gitlab.isc.org/isc-projects/bind9/-/issues/2664
Data race in fwrite, memcpy, and free in dnstap_test with atomics disabled
2021-11-02T15:49:26Z
Michal Nowak

[GCC](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1672799) and [Clang](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1672800) report TSAN errors on ~"v9.11" with dnstap enabled and atomics disabled in the `dnstap_test`. These errors should not come from `libfstrm.so` as these are suppressed. Before !4618 gets integrated see the `mnowak/configure-with-enable-dnstap-by-default-v9_11` branch and set `CI_REGISTRY_IMAGE=registry.gitlab.isc.org/isc-projects/images/bind9-staging`.
```
WARNING: ThreadSanitizer: data race
Read of size 8 at 0x000000000001 by thread T1:
#0 memcpy <null>
#1 <null> <null>
Previous write of size 8 at 0x000000000001 by main thread:
#0 memcpy <null>
#1 <null> <null>
#2 dns_dt_send lib/dns/dnstap.c:780:3
#3 send_test lib/dns/tests/dnstap_test.c:259:3
#4 <null> <null>
#5 __libc_start_main /build/glibc-vjB4T2/glibc-2.28/csu/../csu/libc-start.c:308:16
Location is heap block of size 16384 at 0x000000000009 allocated by main thread:
#0 calloc <null>
#1 <null> <null>
#2 send_test lib/dns/tests/dnstap_test.c:176:11
#3 <null> <null>
#4 __libc_start_main /build/glibc-vjB4T2/glibc-2.28/csu/../csu/libc-start.c:308:16
Thread T2 (running) created by main thread at:
#0 pthread_create <null>
#1 fstrm_iothr_init <null>
#2 send_test lib/dns/tests/dnstap_test.c:176:11
#3 <null> <null>
#4 __libc_start_main /build/glibc-vjB4T2/glibc-2.28/csu/../csu/libc-start.c:308:16
SUMMARY: ThreadSanitizer: data race in memcpy
```
```
WARNING: ThreadSanitizer: data race
Write of size 8 at 0x000000000001 by thread T1:
#0 free <null>
#1 <null> <null>
Previous write of size 8 at 0x000000000001 by main thread:
#0 malloc <null>
#1 pack_dt lib/dns/dnstap.c:543:14
#2 dns_dt_send lib/dns/dnstap.c:779:6
#3 send_test lib/dns/tests/dnstap_test.c:257:3
#4 <null> <null>
#5 __libc_start_main /build/glibc-vjB4T2/glibc-2.28/csu/../csu/libc-start.c:308:16
Thread T2 (running) created by main thread at:
#0 pthread_create <null>
#1 fstrm_iothr_init <null>
#2 send_test lib/dns/tests/dnstap_test.c:176:11
#3 <null> <null>
#4 __libc_start_main /build/glibc-vjB4T2/glibc-2.28/csu/../csu/libc-start.c:308:16
SUMMARY: ThreadSanitizer: data race in free
```
```
WARNING: ThreadSanitizer: data race
Read of size 8 at 0x000000000001 by thread T1:
#0 fwrite <null>
#1 <null> <null>
Previous write of size 8 at 0x000000000001 by main thread:
#0 malloc <null>
#1 pack_dt lib/dns/dnstap.c:543:14
#2 dns_dt_send lib/dns/dnstap.c:779:6
#3 send_test lib/dns/tests/dnstap_test.c:254:3
#4 <null> <null>
#5 __libc_start_main /build/glibc-vjB4T2/glibc-2.28/csu/../csu/libc-start.c:308:16
Location is heap block of size 256 at 0x000000000001 allocated by main thread:
#0 malloc <null>
#1 pack_dt lib/dns/dnstap.c:543:14
#2 dns_dt_send lib/dns/dnstap.c:779:6
#3 send_test lib/dns/tests/dnstap_test.c:254:3
#4 <null> <null>
#5 __libc_start_main /build/glibc-vjB4T2/glibc-2.28/csu/../csu/libc-start.c:308:16
Thread T2 (running) created by main thread at:
#0 pthread_create <null>
#1 fstrm_iothr_init <null>
#2 send_test lib/dns/tests/dnstap_test.c:176:11
#3 <null> <null>
#4 __libc_start_main /build/glibc-vjB4T2/glibc-2.28/csu/../csu/libc-start.c:308:16
SUMMARY: ThreadSanitizer: data race in fwrite
```
I'll just disable dnstap in `unit:gcc:tsan:noatomics` and `unit:clang:tsan:noatomics` if we decide not to fix these errors given the phase ~"v9.11" is in.

November 2021 (9.16.23, 9.16.23-S1, 9.17.20)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2350
Dead code guarded by PK11_RSA_PKCS_REPLACE in pkcs11rsa_link.c
2021-11-02T15:46:20Z
Michal Nowak

`PK11_RSA_PKCS_REPLACE` is not defined in `configure` or described in documentation and is therefore invisible to the user on `main` and `v9_16`.
In `v9_11` one might set it via `./configure --enable-native-pkcs11 --with-pkcs11=/opt/Keyper/PKCS11Provider/pkcs11.so` (see `configure.ac`, line 2324), which sets `-DPK11_FLAVOR=PK11_AEP_FLAVOR`, which in `lib/isc/include/pk11/site.h` via `#if PK11_FLAVOR == PK11_AEP_FLAVOR` defines `PK11_RSA_PKCS_REPLACE`.
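The macro chain described above can be pictured with the following self-contained sketch. The numeric flavor values, `PK11_OTHER_FLAVOR`, and `rsa_code_path()` are placeholders invented for illustration; only the `PK11_FLAVOR`, `PK11_AEP_FLAVOR` and `PK11_RSA_PKCS_REPLACE` names come from the description.

```c
/* Placeholder flavor IDs; the real values live in lib/isc/include/pk11/site.h. */
#define PK11_OTHER_FLAVOR 0
#define PK11_AEP_FLAVOR	  1

/* In v9_11, configure would pass -DPK11_FLAVOR=PK11_AEP_FLAVOR for the AEP provider. */
#ifndef PK11_FLAVOR
#define PK11_FLAVOR PK11_OTHER_FLAVOR
#endif /* ifndef PK11_FLAVOR */

/* Mirrors the site.h step: the AEP flavor switches the replacement code on. */
#if PK11_FLAVOR == PK11_AEP_FLAVOR
#define PK11_RSA_PKCS_REPLACE
#endif /* if PK11_FLAVOR == PK11_AEP_FLAVOR */

/* Hypothetical function showing which branch the compiler keeps. */
static const char *
rsa_code_path(void) {
#ifndef PK11_RSA_PKCS_REPLACE
	return "default";
#else  /* ifndef PK11_RSA_PKCS_REPLACE */
	return "replace";
#endif /* ifndef PK11_RSA_PKCS_REPLACE */
}
```

Compiled without any `-DPK11_FLAVOR=...` option, only the first branch survives preprocessing. That is the situation in which the `#else` branch can never be built.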
The `#else /* ifndef PK11_RSA_PKCS_REPLACE */` blocks in `lib/dns/pkcs11rsa_link.c` are either dead code, forgotten when OpenSSL was made mandatory with c3b8130fe8267185e786e9c12527df7c53b37589 in `v9_13_3`, and thus should be dropped, or they should be made available by `configure`, or at least documented in e.g. `OPTIONS.md`.

November 2021 (9.16.23, 9.16.23-S1, 9.17.20)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2932
TSAN reports indicating reference counting issues with dispatch@netmgr
2021-11-02T15:39:31Z
Michał Kępień

In the following GitLab CI job, previously unseen TSAN reports have been
generated during the `fetchlimit` system test:
https://gitlab.isc.org/isc-projects/bind9/-/jobs/2020086
If I am reading these reports correctly, it seems that a fetch context
is simultaneously being destroyed and started, which does not look quite
right. It needs a look from @each and/or @ondrej, though, as I may be
misinterpreting these reports.
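For readers unfamiliar with the pattern, one general way to keep an object from being destroyed while a start event for it is still pending is to let the queued event hold its own reference. The sketch below shows only that generic discipline. `fetch_ctx_t` and the `ctx_*` helpers are hypothetical and are not the resolver's actual code or the fix for this issue.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted context. */
typedef struct fetch_ctx {
	atomic_uint references;
	bool started;
} fetch_ctx_t;

/* Number of contexts freed so far; lets a caller observe destruction. */
static atomic_uint destroyed_count;

/* Creates a context owned by the caller (one reference). */
static fetch_ctx_t *
ctx_create(void) {
	fetch_ctx_t *ctx = calloc(1, sizeof(*ctx));
	if (ctx != NULL) {
		atomic_init(&ctx->references, 1);
	}
	return ctx;
}

/* Drops one reference and frees the context when the last one goes away. */
static void
ctx_detach(fetch_ctx_t **ctxp) {
	fetch_ctx_t *ctx = *ctxp;
	*ctxp = NULL;
	if (atomic_fetch_sub(&ctx->references, 1) == 1) {
		free(ctx);
		atomic_fetch_add(&destroyed_count, 1);
	}
}

/* Queues a start event: the event takes its own reference. */
static fetch_ctx_t *
ctx_queue_start(fetch_ctx_t *ctx) {
	atomic_fetch_add(&ctx->references, 1);
	return ctx;
}

/* Runs the start event, then releases the reference the event was holding. */
static void
ctx_run_start(fetch_ctx_t *event_ref) {
	event_ref->started = true;
	ctx_detach(&event_ref);
}
```

With this arrangement, the owner may call `ctx_detach()` before the start event has run, and the memory is released only after `ctx_run_start()` drops the event's reference.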
<details>
<summary>Click here to expand/fold TSAN reports</summary>
### Report 1 (for `fctx->altfinds`)
```
WARNING: ThreadSanitizer: data race
Write of size 8 at 0x000000000001 by thread T1:
#0 sort_finds lib/dns/resolver.c:3178
#1 fctx_getaddresses lib/dns/resolver.c:3633
#2 fctx_try lib/dns/resolver.c:3912
#3 fctx_start lib/dns/resolver.c:4471
#4 task_run lib/isc/task.c:827
#5 isc_task_run lib/isc/task.c:907
#6 isc__nm_async_task netmgr/netmgr.c:827
#7 process_netievent netmgr/netmgr.c:906
#8 process_queue netmgr/netmgr.c:998
#9 process_all_queues netmgr/netmgr.c:746
#10 async_cb netmgr/netmgr.c:775
#11 <null> <null>
#12 isc__trampoline_run lib/isc/trampoline.c:185
#13 <null> <null>
Previous read of size 8 at 0x000000000001 by thread T2 (mutexes: write M1):
#0 fctx_decreference lib/dns/resolver.c:6881
#1 dns_resolver_destroyfetch lib/dns/resolver.c:10604
#2 fetch_callback lib/ns/query.c:6253
#3 task_run lib/isc/task.c:827
#4 isc_task_run lib/isc/task.c:907
#5 isc__nm_async_task netmgr/netmgr.c:827
#6 process_netievent netmgr/netmgr.c:906
#7 process_queue netmgr/netmgr.c:998
#8 process_all_queues netmgr/netmgr.c:746
#9 async_cb netmgr/netmgr.c:775
#10 <null> <null>
#11 isc__trampoline_run lib/isc/trampoline.c:185
#12 <null> <null>
Location is heap block of size 3728 at 0x000000000017 allocated by thread T2:
#0 malloc <null>
#1 mallocx lib/isc/jemalloc_shim.h:30
#2 mem_get lib/isc/mem.c:341
#3 isc__mem_get lib/isc/mem.c:754
#4 fctx_create lib/dns/resolver.c:4574
#5 dns_resolver_createfetch lib/dns/resolver.c:10463
#6 ns_query_recurse lib/ns/query.c:6455
#7 query_delegation_recurse lib/ns/query.c:8924
#8 query_delegation lib/ns/query.c:8870
#9 query_gotanswer lib/ns/query.c:7607
#10 query_lookup lib/ns/query.c:5989
#11 ns__query_start lib/ns/query.c:5631
#12 query_setup lib/ns/query.c:5344
#13 ns_query_start lib/ns/query.c:12183
#14 ns__client_request lib/ns/client.c:2153
#15 isc__nm_async_readcb netmgr/netmgr.c:2748
#16 isc__nm_readcb netmgr/netmgr.c:2721
#17 udp_recv_cb netmgr/udp.c:418
#18 <null> <null>
#19 isc__trampoline_run lib/isc/trampoline.c:185
#20 <null> <null>
Mutex M1 (0x000000000035) created at:
#0 pthread_mutex_init <null>
#1 isc__mutex_init lib/isc/mutex.c:288
#2 dns_resolver_create lib/dns/resolver.c:9915
#3 dns_view_createresolver lib/dns/view.c:819
#4 configure_view bin/named/server.c:4714
#5 load_configuration bin/named/server.c:9199
#6 loadconfig bin/named/server.c:10380
#7 named_server_reconfigcommand bin/named/server.c:10777
#8 named_control_docommand bin/named/control.c:248
#9 control_command bin/named/controlconf.c:392
#10 task_run lib/isc/task.c:827
#11 isc_task_run lib/isc/task.c:907
#12 isc__nm_async_task netmgr/netmgr.c:827
#13 process_netievent netmgr/netmgr.c:906
#14 process_queue netmgr/netmgr.c:998
#15 process_all_queues netmgr/netmgr.c:746
#16 async_cb netmgr/netmgr.c:775
#17 <null> <null>
#18 isc__trampoline_run lib/isc/trampoline.c:185
#19 <null> <null>
Thread T1 (running) created by main thread at:
#0 pthread_create <null>
#1 isc_thread_create lib/isc/thread.c:79
#2 isc__netmgr_create netmgr/netmgr.c:321
#3 isc_managers_create lib/isc/managers.c:39
#4 create_managers bin/named/main.c:927
#5 setup bin/named/main.c:1200
#6 main bin/named/main.c:1472
Thread T2 (running) created by main thread at:
#0 pthread_create <null>
#1 isc_thread_create lib/isc/thread.c:79
#2 isc__netmgr_create netmgr/netmgr.c:321
#3 isc_managers_create lib/isc/managers.c:39
#4 create_managers bin/named/main.c:927
#5 setup bin/named/main.c:1200
#6 main bin/named/main.c:1472
SUMMARY: ThreadSanitizer: data race lib/dns/resolver.c:3178 in sort_finds
```
### Report 2 (for `fctx->finds`)
```
WARNING: ThreadSanitizer: data race
Write of size 8 at 0x000000000001 by thread T1:
#0 findname lib/dns/resolver.c:3257
#1 fctx_getaddresses lib/dns/resolver.c:3522
#2 fctx_try lib/dns/resolver.c:3912
#3 fctx_start lib/dns/resolver.c:4471
#4 task_run lib/isc/task.c:827
#5 isc_task_run lib/isc/task.c:907
#6 isc__nm_async_task netmgr/netmgr.c:827
#7 process_netievent netmgr/netmgr.c:906
#8 process_queue netmgr/netmgr.c:998
#9 process_all_queues netmgr/netmgr.c:746
#10 async_cb netmgr/netmgr.c:775
#11 <null> <null>
#12 isc__trampoline_run lib/isc/trampoline.c:185
#13 <null> <null>
Previous read of size 8 at 0x000000000001 by thread T2 (mutexes: write M1):
#0 fctx_decreference lib/dns/resolver.c:6880
#1 dns_resolver_destroyfetch lib/dns/resolver.c:10604
#2 fetch_callback lib/ns/query.c:6253
#3 task_run lib/isc/task.c:827
#4 isc_task_run lib/isc/task.c:907
#5 isc__nm_async_task netmgr/netmgr.c:827
#6 process_netievent netmgr/netmgr.c:906
#7 process_queue netmgr/netmgr.c:998
#8 process_all_queues netmgr/netmgr.c:746
#9 async_cb netmgr/netmgr.c:775
#10 <null> <null>
#11 isc__trampoline_run lib/isc/trampoline.c:185
#12 <null> <null>
Location is heap block of size 3728 at 0x000000000017 allocated by thread T2:
#0 malloc <null>
#1 mallocx lib/isc/jemalloc_shim.h:30
#2 mem_get lib/isc/mem.c:341
#3 isc__mem_get lib/isc/mem.c:754
#4 fctx_create lib/dns/resolver.c:4574
#5 dns_resolver_createfetch lib/dns/resolver.c:10463
#6 ns_query_recurse lib/ns/query.c:6455
#7 query_delegation_recurse lib/ns/query.c:8924
#8 query_delegation lib/ns/query.c:8870
#9 query_gotanswer lib/ns/query.c:7607
#10 query_lookup lib/ns/query.c:5989
#11 ns__query_start lib/ns/query.c:5631
#12 query_setup lib/ns/query.c:5344
#13 ns_query_start lib/ns/query.c:12183
#14 ns__client_request lib/ns/client.c:2153
#15 isc__nm_async_readcb netmgr/netmgr.c:2748
#16 isc__nm_readcb netmgr/netmgr.c:2721
#17 udp_recv_cb netmgr/udp.c:418
#18 <null> <null>
#19 isc__trampoline_run lib/isc/trampoline.c:185
#20 <null> <null>
Mutex M1 (0x000000000035) created at:
#0 pthread_mutex_init <null>
#1 isc__mutex_init lib/isc/mutex.c:288
#2 dns_resolver_create lib/dns/resolver.c:9915
#3 dns_view_createresolver lib/dns/view.c:819
#4 configure_view bin/named/server.c:4714
#5 load_configuration bin/named/server.c:9199
#6 loadconfig bin/named/server.c:10380
#7 named_server_reconfigcommand bin/named/server.c:10777
#8 named_control_docommand bin/named/control.c:248
#9 control_command bin/named/controlconf.c:392
#10 task_run lib/isc/task.c:827
#11 isc_task_run lib/isc/task.c:907
#12 isc__nm_async_task netmgr/netmgr.c:827
#13 process_netievent netmgr/netmgr.c:906
#14 process_queue netmgr/netmgr.c:998
#15 process_all_queues netmgr/netmgr.c:746
#16 async_cb netmgr/netmgr.c:775
#17 <null> <null>
#18 isc__trampoline_run lib/isc/trampoline.c:185
#19 <null> <null>
Thread T1 (running) created by main thread at:
#0 pthread_create <null>
#1 isc_thread_create lib/isc/thread.c:79
#2 isc__netmgr_create netmgr/netmgr.c:321
#3 isc_managers_create lib/isc/managers.c:39
#4 create_managers bin/named/main.c:927
#5 setup bin/named/main.c:1200
#6 main bin/named/main.c:1472
Thread T2 (running) created by main thread at:
#0 pthread_create <null>
#1 isc_thread_create lib/isc/thread.c:79
#2 isc__netmgr_create netmgr/netmgr.c:321
#3 isc_managers_create lib/isc/managers.c:39
#4 create_managers bin/named/main.c:927
#5 setup bin/named/main.c:1200
#6 main bin/named/main.c:1472
SUMMARY: ThreadSanitizer: data race lib/dns/resolver.c:3257 in findname
```
</details>

November 2021 (9.16.23, 9.16.23-S1, 9.17.20)

https://gitlab.isc.org/isc-projects/bind9/-/issues/2970
bind9.xsl is not properly transmitted over stats channel
2021-11-02T15:04:51Z
Mark Andrews

Looking at the stats channel in a web browser, errors are reported.

November 2021 (9.16.23, 9.16.23-S1, 9.17.20)
Mark Andrews

https://gitlab.isc.org/isc-projects/bind9/-/issues/2993
Replace instances of ARRAYSIZE with ARRAY_SIZE
2021-11-02T15:03:50Z
Mark Andrews

`ARRAY_SIZE` is defined in `isc/util.h` if not already defined by the host system.

November 2021 (9.16.23, 9.16.23-S1, 9.17.20)
Mark Andrews
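
For the `ARRAY_SIZE` issue above, a guarded definition of this kind of macro usually looks like the sketch below. The exact text in `isc/util.h` may differ, and `count_slots()` is a hypothetical caller added only to show usage.

```c
#include <stddef.h>

/* Define ARRAY_SIZE only when the host system has not already provided it. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif /* ifndef ARRAY_SIZE */

/* Hypothetical caller: counts the elements of a fixed-size local array. */
static size_t
count_slots(void) {
	unsigned char *bufs[8] = { NULL };
	return ARRAY_SIZE(bufs);
}
```

The macro works only on true arrays. Applied to a pointer, it silently yields the pointer size divided by the element size.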