BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues
2021-04-06T08:37:29Z
https://gitlab.isc.org/isc-projects/bind9/-/issues/2133
Refactor the code that assigns nmhandle instead of attaching to it
2021-04-06T08:37:29Z
Ondřej Surý
Here's a (non-complete) list:
* [ ] `bin/named/controlconf.c`
* [ ] `lib/isccc/ccmsg.c`
* [ ] `lib/isc/http.c`
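The difference between assigning a handle pointer and attaching to it can be sketched with a toy reference-counted type. The `handle_t` type and the `handle_new()`, `handle_attach()` and `handle_detach()` helpers below are hypothetical stand-ins, not the actual `isc_nmhandle_t` API; the sketch only assumes the usual attach/detach reference-counting pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted handle, for illustration only. */
typedef struct handle {
    int references;
} handle_t;

/* Allocate a handle that starts with one reference, owned by the caller. */
handle_t *
handle_new(void) {
    handle_t *handle = calloc(1, sizeof(*handle));
    assert(handle != NULL);
    handle->references = 1;
    return handle;
}

/*
 * Attach: take an extra reference and store it in *targetp.
 * A plain assignment (*targetp = source) would copy the pointer
 * without taking a reference, so the copy could outlive the object.
 */
void
handle_attach(handle_t *source, handle_t **targetp) {
    assert(source != NULL && source->references > 0);
    assert(targetp != NULL && *targetp == NULL);
    source->references++;
    *targetp = source;
}

/*
 * Detach: drop the reference held in *handlep, clear the pointer and
 * free the object on the last reference. Returns the references left.
 */
int
handle_detach(handle_t **handlep) {
    assert(handlep != NULL && *handlep != NULL);
    handle_t *handle = *handlep;
    int remaining = --handle->references;
    *handlep = NULL;
    if (remaining == 0) {
        free(handle);
    }
    return remaining;
}
```

With this pattern, every stored pointer owns exactly one reference, so the object stays valid until the last holder detaches.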
April 2021 (9.11.30/9.11.31, 9.11.30-S1/9.11.31-S1, 9.16.14/9.16.15, 9.16.14-S1/9.16.15-S1, 9.17.12)
https://gitlab.isc.org/isc-projects/bind9/-/issues/2654
Create isc_managers API
2021-05-11T08:58:27Z
Ondřej Surý
All the BIND 9 binaries have the same sequence of taskmgr/netmgr/timermgr/... ctors and dtors. Move this into a separate API, so we don't duplicate the code at various places.
May 2021 (9.11.32, 9.11.32-S1, 9.16.16, 9.16.16-S1, 9.17.13)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/2638
Run internal tasks on top of network manager worker loops
2021-09-02T09:40:28Z
Ondřej Surý
After the networking manager was introduced, the existing taskmgr kept its own set of worker threads competing with the netmgr threads. This issue is about moving the tasks to run on top of netmgr loops while keeping the existing interface.
---
The primary merge requests implementing this change are:
- !4918
- !4983
The above changes required some follow-up tweaks, which were implemented in:
- !4980
- !4981
- !4982
- !5006
- !5008
- !5009
- !5010
May 2021 (9.11.32, 9.11.32-S1, 9.16.16, 9.16.16-S1, 9.17.13)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/2636
timing race in setnsec3param task
2021-04-19T09:49:09Z
Ondřej Surý
In `zone_postload()`, the events queued in `zone->setnsec3param_queue` get scheduled as regular tasks:
> /*
> * Process any queued NSEC3PARAM change requests. Only for dynamic
> * zones, an inline-signing zone will perform this action when
> * receiving the secure db (receive_secure_db).
> */
However, the reason why the NSEC3PARAM change requests were originally queued depends on the following code:
> /*
> * setnsec3param() will silently return early if the zone does not yet
> * have a database. Prevent that by queueing the event up if zone->db
> * is NULL. All events queued here are subsequently processed by
> * receive_secure_db() if it ever gets called or simply freed by
> * zone_free() otherwise.
> */
So, when all the queued events get scheduled as tasks in `zone_postload()` and the tasks are fired, `zone->db` might still be `NULL`. Due to a difference in scheduling, this doesn't really happen with the old (separate) task manager, but it's very easy to trigger with taskmgr@netmgr.
May 2021 (9.11.32, 9.11.32-S1, 9.16.16, 9.16.16-S1, 9.17.13)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/2732
Zone dumping is blocking the networking IO
2021-06-07T12:43:40Z
Ondřej Surý
June 2021 (9.11.33, 9.11.33-S1, 9.16.17/9.16.18, 9.16.17-S1/9.16.18-S1, 9.17.14/9.17.15)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/2515
Performance drop of 10 % in LTT/main/root zone/dnsgen Perflab scenario
2021-05-24T15:18:45Z
Michal Nowak
Since isc-projects/bind9!4659 was merged, the `LTT/main/root zone/dnsgen` [Perflab scenario](https://perflab.isc.org/#/config/run/5bf195f683ba91a870b29770), and only that one, shows a persistent 10 % performance drop from 460,000 to 420,000 qps.
June 2021 (9.11.33, 9.11.33-S1, 9.16.17/9.16.18, 9.16.17-S1/9.16.18-S1, 9.17.14/9.17.15)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/2433
investigate and improve lock contention around mctx
2021-12-21T05:23:52Z
Petr Špaček
pspacek@isc.org
### Summary
It _seems_ that locks around shared memory contexts (mctx) are contended for in various scenarios. This leads to worse performance.
### BIND version used
a23c5d2921e43aaea06d12b1b0f8de2fa6d07759
### Steps to reproduce
- Compile named using these options:
`OPTIMIZE="-Og" CFLAGS="-g3 -ggdb -Wno-deprecated-declarations -fno-omit-frame-pointer -fno-optimize-sibling-calls -fPIC -rdynamic" LDFLAGS="-fPIE"`
- Run [mutrace](https://github.com/bconry/mutrace) and named with more threads:
`mutrace --hash-size=594327 -d named -n 32 -g -c /dev/null`
Note: Systems with binutils 2.34+ require mutrace patch https://github.com/bconry/mutrace/pull/4
- Send some cache hit traffic to named. E.g. run:
`yes '. NS' | dnsperf -c 100 -l 10`
- SIGINT named when dnsperf finishes.
- While reading output of mutrace ignore misleading line `mutrace.c:750 unlock_hash()` in stack tracebacks if it is present ([depends on mutrace version](https://github.com/bconry/mutrace/issues/5)).
### What is the current *bug* behavior?
Lock contention, leading to bad performance.
### Relevant logs and/or screenshots
[mutrace.log](/uploads/71b0d927b5950a422392d1556c496cee/mutrace.log)
Most contended mutex is:
```
Mutex #60844 (0x0x7f442da070d0) first referenced by:
mutrace.c:750 unlock_hash()
mutex.c:288 isc__mutex_init()
netmgr.c:249 isc_nm_start()
main.c:934 create_managers()
main.c:1248 setup()
main.c:1555 main()
??:0 __libc_start_main()
```
In the source code, the lock is created at `netmgr.c:249`:
```
249 isc_mempool_create(mgr->mctx, sizeof(isc__netievent_storage_t),
250 &mgr->evpool);
```
I.e. it is locking around `mgr->mctx`.
This is the simplest way to show the lock contention with tools. During high-QPS benchmarking using `kxdpgun`, the lock contention around `mgr->mctx` did create a measurable performance problem. On a 16-core system, this shared mctx leads to a 3x performance drop compared with the situation where each thread has its own mctx.
### Possible fixes
Generally having a separate `mctx` per thread might be a good idea, but it needs careful design so objects which get passed between threads can be de/reallocated correctly.
June 2021 (9.11.33, 9.11.33-S1, 9.16.17/9.16.18, 9.16.17-S1/9.16.18-S1, 9.17.14/9.17.15)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/2788
Use tolower()/toupper()/isupper() from ctype.h
2021-07-08T08:37:51Z
Ondřej Surý
This was suggested by Rick, and my initial answer was that it is locale-aware, so a) it will mess up the names when in a non-POSIX locale and b) it must be slower.
Turns out that a) can be easily fixed (any program that doesn't call `setlocale()` is running with the **POSIX** locale) and b) isn't true. In fact, using `tolower()` was the fastest of all the implementations we used or considered.
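As a minimal sketch of the idea, a case-insensitive comparison of two name labels can rely on `tolower()` from `ctype.h`, provided the program never calls `setlocale()` and therefore stays in the default "C" (POSIX) locale. The helper name `label_equal_nocase()` is hypothetical and is not taken from the BIND source.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: compare two labels of the same length without
 * regard to ASCII case. The bytes are passed as unsigned char, which
 * is the value range tolower() is defined for. In the default "C"
 * locale only 'A'..'Z' are mapped; all other byte values are returned
 * unchanged.
 */
bool
label_equal_nocase(const unsigned char *a, const unsigned char *b,
                   size_t len) {
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; i++) {
        if (tolower(a[i]) != tolower(b[i])) {
            return false;
        }
    }
    return true;
}
```

The sketch says nothing about performance; the speed claim above comes from the measurements referred to in the issue, not from this example.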
July 2021 (9.11.34, 9.11.34-S1, 9.16.19, 9.16.19-S1, 9.17.16)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/2690
Remove Windows support for BIND 9.17/9.18+
2022-01-21T13:43:04Z
Ondřej Surý
This is a tracking issue to remove the Windows port from BIND 9 source code for the next stable release.
July 2021 (9.11.34, 9.11.34-S1, 9.16.19, 9.16.19-S1, 9.17.16)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/2860
The DoH endpoint URL in dig might get prepared in wrong format
2021-08-31T13:41:21Z
Artem Boldariev
One of the reasons problem #2858 appeared is that the URL in the DoH code might be generated in the wrong format, at least when IPv6 is used. In particular, in that issue `dig` was trying to connect to `https://::1:444/dns-query` instead of `https://[::1]:444/dns-query`. Obviously, this prevents `dig` from querying servers via their IPv6 addresses (querying hostnames available via IPv6 works fine).
Also, it always generates a URL starting with `https://`, regardless of whether HTTP is used with encryption or not.
So, while the fix for the reported problem is already prepared (!5319), this issue needs to be taken care of as well. The code which prepares the URL needs to be revisited.
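The expected URL shape can be sketched as follows. The function `doh_format_url()` and its parameters are hypothetical and do not reflect the actual `dig` code; the sketch assumes that any host string containing a colon is a literal IPv6 address, which must be enclosed in square brackets in a URL, and it picks the scheme from a flag instead of hard-coding `https://`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "<scheme>://<host>:<port><path>" into buf.
 * Literal IPv6 addresses (detected here by the presence of ':') are
 * wrapped in brackets. Returns the snprintf() result, i.e. the length
 * the full URL needs, so the caller can detect truncation.
 */
int
doh_format_url(char *buf, size_t buflen, const char *host, unsigned int port,
               const char *path, bool use_tls) {
    assert(buf != NULL && host != NULL && path != NULL);

    const char *scheme = use_tls ? "https" : "http";
    bool ipv6 = (strchr(host, ':') != NULL);

    return snprintf(buf, buflen, "%s://%s%s%s:%u%s", scheme,
                    ipv6 ? "[" : "", host, ipv6 ? "]" : "", port, path);
}
```

Calling it with host `::1`, port `444`, path `/dns-query` and TLS enabled yields `https://[::1]:444/dns-query`, the form the issue says `dig` should have used.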
September 2021 (9.16.21, 9.16.21-S1, 9.17.18)
Artem Boldariev
https://gitlab.isc.org/isc-projects/bind9/-/issues/2851
A rare crash in the DoH code caused by an assert
2021-08-17T09:01:28Z
Artem Boldariev
A rare issue was found while working on !5309. The unit test suite (`doh_test`) revealed a situation in which `session->handle` got detached too early in `http_send_outgoing()`, the function which takes data from nghttp2 and sends it via the underlying connection.
As a result, when we reached the call to `isc_nm_send()`
```
session->sending++;
isc_nm_send(session->handle, &send->data, http_writecb, send);
return (true);
}
```
the `session->handle` was `NULL` triggering an assert:
```
void
isc_nm_send(isc_nmhandle_t *handle, isc_region_t *region, isc_nm_cb_t cb,
void *cbarg) {
REQUIRE(VALID_NMHANDLE(handle));
```
September 2021 (9.16.21, 9.16.21-S1, 9.17.18)
Artem Boldariev
https://gitlab.isc.org/isc-projects/bind9/-/issues/331
Further refactoring of functions in lib/dns/zoneverify.c
2021-08-31T13:33:10Z
Michał Kępień
Certain review comments in !291 were related to code which was not introduced by that MR, but rather just moved around. The following comments should thus be addressed:
- [x] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/merge_requests/291#note_12167): (+1 comment)
> This block is little bit confusing (triggering warning condition on ISC_R_SUCCESS), is there a reason why not to move the whole block before the `break` where it really logically belongs?
- [x] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/merge_requests/291#note_12176): (+1 comment)
> Also this sort of calls for a helper static functions similar to `innsec3params()`.
- [x] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/merge_requests/291#note_12178): (+1 comment)
> Perhaps use `= { 0 };` and remove memset?
- [x] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/merge_requests/291#note_12180): (+1 comment)
> Use `sizeof(set_algorithms)` instead of arbitrary number.
- [x] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/merge_requests/291#note_12181): (+1 comment)
> `= { 0 };`
- [x] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/merge_requests/291#note_12182): (+3 comments)
> `goto done;` as in previous functions instead of cut&copy code?
- [x] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/merge_requests/291#note_12184): (+1 comment)
> This seems awfully similar to just calling `!chain_equal()`.
- [x] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/merge_requests/291#note_12185): (+1 comment)
> This is so fragile.
- [x] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/merge_requests/291#note_12186): (+1 comment)
> Perhaps `255` should be a constant with a descriptive name?
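Several of the comments above (zero-initializing with `= { 0 }` instead of `memset()`, deriving bounds with `sizeof` instead of an arbitrary number, and naming magic constants) can be illustrated together. The sketch below is not code from `lib/dns/zoneverify.c`; the function `count_distinct_algorithms()` and the constant `MAX_ALGORITHMS` are hypothetical, and only the array name `set_algorithms` is borrowed from the review comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical named constant instead of a bare number in the code. */
#define MAX_ALGORITHMS 256

/* Count the distinct algorithm numbers present in algs[0..nalgs-1]. */
size_t
count_distinct_algorithms(const unsigned char *algs, size_t nalgs) {
    /* Zero-initialized at declaration, so no separate memset() call. */
    bool set_algorithms[MAX_ALGORITHMS] = { 0 };
    size_t count = 0;

    assert(algs != NULL || nalgs == 0);

    /* An unsigned char index is at most 255, so it is always in range. */
    for (size_t i = 0; i < nalgs; i++) {
        set_algorithms[algs[i]] = true;
    }

    /* The loop bound is derived from the array itself via sizeof. */
    for (size_t i = 0;
         i < sizeof(set_algorithms) / sizeof(set_algorithms[0]); i++) {
        if (set_algorithms[i]) {
            count++;
        }
    }
    return count;
}
```

Because the bound follows the array declaration, changing `MAX_ALGORITHMS` cannot leave a stale number behind in the loop.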
September 2021 (9.16.21, 9.16.21-S1, 9.17.18)
https://gitlab.isc.org/isc-projects/bind9/-/issues/2691
Remove native PKCS#11 support from BIND 9.17/9.18+
2022-01-19T11:20:49Z
Ondřej Surý
This is a tracking issue to remove the native PKCS#11 implementation from BIND 9 source code (including the documentation).
October 2021 (9.11.36, 9.11.36-S1, 9.16.22, 9.16.22-S1, 9.17.19)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/2401
use netmgr for dispatch
2022-01-26T11:33:41Z
Evan Hunt
Convert the dns/dispatch module (and its callers, dns/request and dns/resolver) to use the network manager.
This involves, first, refactoring dispatch to remove code that's no longer being used; moving all the calls to the isc_socket API into dispatch.c and adding functions such as `dns_dispatch_connect()` and `dns_dispatch_send()` to call them instead; replacing the isc_socket calls with isc_nm calls and then revising the dispatch API as needed to work with the new architecture.
When this is done, there will be only one remaining use of isc_socket in `named` - the route socket that's used for interface scanning. This will also complete the netmgr conversion of `delv`, `mdig`, and `nsupdate`.
October 2021 (9.11.36, 9.11.36-S1, 9.16.22, 9.16.22-S1, 9.17.19)
Evan Hunt
https://gitlab.isc.org/isc-projects/bind9/-/issues/719
Make isc_results static
2021-10-07T06:48:12Z
Witold Krecicki
Currently there's a dynamic list of results handled by lib/isc/result.c, and e.g. isc_result_totext requires a lock.
Since we don't have any external users now, and nobody will be adding any results from the outside, we can move all the result codes from libdns, ns, isccfg, etc. to libisc - and make the list static.
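A static result table could look roughly like the sketch below. The `result_t` enum, its members and `result_totext()` are hypothetical names chosen for illustration, not the real `isc_result_t` definitions; the point is only that a `const` table fixed at compile time can be read from any thread without taking a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; R_NRESULTS is the number of codes. */
typedef enum {
    R_SUCCESS = 0,
    R_NOMEMORY,
    R_TIMEDOUT,
    R_NRESULTS
} result_t;

/* Immutable table built at compile time; no registration, no lock. */
static const char *const result_text[R_NRESULTS] = {
    [R_SUCCESS] = "success",
    [R_NOMEMORY] = "out of memory",
    [R_TIMEDOUT] = "timed out",
};

/* Look up the text for a result code without any synchronization. */
const char *
result_totext(result_t result) {
    if ((int)result < 0 || (int)result >= (int)R_NRESULTS) {
        return "(result code out of range)";
    }
    /* Every code below R_NRESULTS must have an entry in the table. */
    assert(result_text[result] != NULL);
    return result_text[result];
}
```

Adding a result code then means adding one enum member and one table entry in the same file, rather than registering a range at run time.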
October 2021 (9.11.36, 9.11.36-S1, 9.16.22, 9.16.22-S1, 9.17.19)
https://gitlab.isc.org/isc-projects/bind9/-/issues/88
Make all BIND libraries private
2021-10-05T10:13:20Z
Ondřej Surý
BIND currently exports a number of libraries, but there are virtually no external projects that we are aware of that use those libraries. Keeping the ABI and API stable is a big burden, and we are exploring the possibility of merging all the libraries into a tightly-coupled private library that wouldn't be used outside of BIND (and its tools), effectively making those libraries private.
BIND 9.13/9.14 would be the first release to drop the libraries.
BIND 9.11 ESV would keep those libraries until 2022, so any external users would have enough time to migrate to other DNS libraries.
Known external users of libisc and friends:
* ISC DHCP (will continue using BIND 9.11 libraries)
* dnsperf (either use BIND 9.11 libraries, or make it an ISC project)
October 2021 (9.11.36, 9.11.36-S1, 9.16.22, 9.16.22-S1, 9.17.19)
https://gitlab.isc.org/isc-projects/bind9/-/issues/2926
use netmgr for route sockets and remove isc_socket
2022-01-21T13:44:36Z
Evan Hunt
The last remaining use of `isc_socket` and `isc_socketmgr` in BIND is for the netlink/route sockets that are used to scan for interface changes.
The libuv documentation indicates that any socket that honors the datagram contract can be passed to `uv_udp_open()`, so we should be able to make the netmgr do this instead.
November 2021 (9.16.23, 9.16.23-S1, 9.17.20)
Evan Hunt
https://gitlab.isc.org/isc-projects/bind9/-/issues/2843
EC_KEY has been deprecated on OpenSSL 3.0.0
2022-01-19T11:20:50Z
Mark Andrews
Need to work out what the replacement code needs to be, as builds fail on strict systems.
November 2021 (9.16.23, 9.16.23-S1, 9.17.20)
Arаm Sаrgsyаn
https://gitlab.isc.org/isc-projects/bind9/-/issues/828
version limits are ineffective when rolling logfiles with timestamp suffix
2023-03-28T10:03:34Z
Evan Hunt
The logfileconfig test is so badly written, I think it would be better to do it over than try to fix it.
Update: while working on the test, it turned out that timestamp logfiles that should have been removed when rolling, weren't. This bug needs fixing too.
November 2021 (9.16.23, 9.16.23-S1, 9.17.20)
Evan Hunt
https://gitlab.isc.org/isc-projects/bind9/-/issues/1265
BIND 9.14 option synth-from-dnssec causing high CPU consumption and degraded client experience
2022-02-18T12:12:52Z
Cathy Almond
### Summary
As reported in [Support ticket #15338](https://support.isc.org/Ticket/Display.html?id=15338)
### BIND version used
```
BIND 9.14.6 (Stable Release) <id:efd3496>
running on Linux x86_64 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019
built by make with '--prefix=/opt/bind' '--sysconfdir=/etc' '--disable-linux-caps' '--enable-dnsrps'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-39)
compiled with OpenSSL version: OpenSSL 1.0.2k 26 Jan 2017
linked to OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
threads support is enabled
default paths:
named configuration: /etc/named.conf
rndc configuration: /etc/rndc.conf
DNSSEC root key: /etc/bind.keys
nsupdate session key: /opt/bind/var/run/named/session.key
named PID file: /opt/bind/var/run/named/named.pid
named lock file: /opt/bind/var/run/named/named.lock
```
### Steps to reproduce
This is an in-house resolver (one of a pair behind a load-balancer). DNSSEC and DNSSEC-validation are both disabled:
```
dnssec-enable no;
dnssec-validation no;
```
### What is the current *bug* behavior?
After upgrading one server of the pair from BIND 9.10 to 9.14, the upgraded server went from running named at around 20% CPU consumption to 150%. It is also slower for clients: "it is slow after CPU usage reach 150%, even when I query the cached data on it, it takes more than 200ms to respond"
### What is the expected *correct* behavior?
As good (or better) performance than before upgrading
### Relevant configuration files
See above.
### Relevant logs and/or screenshots
There was nothing unusual or different in any of the logging, QPS or any of the usual suspects.
Of note, in 9.14, 'dnssec-enable' is no longer a functioning option - and, confirmed in the PCAPs, this server is setting the DO bit on queries to authoritative servers and receiving DNSSEC material with query responses (which will be being cached).
We captured a series of operating stack snapshots using pstack, which showed a surprising number of worker threads calling `find_coveringnsec()`.
Notably on this server and in this environment, it was expected that there will be a high proportion of negative responses to clients: "there are a lot of invalid/NXDOMAIN dns queries".
Speculatively, we added:
`synth-from-dnssec no;`
With this new configuration option, performance returned to normal.
Per the ARM:
```
synth-from-dnssec
Synthesize answers from cached NSEC, NSEC3 and other RRsets that have been proved to be correct using DNSSEC. The default is yes.
Note:
• DNSSEC validation must be enabled for this option to be effective.
This initial implementation only covers synthesis of answers from NSEC records. Synthesis from NSEC3 is planned for the future. This will also be controlled by synth-from-dnssec.
```
I would have expected that to mean that the option would be disabled if DNSSEC-validation is disabled, but it could be interpreted to mean that the option doesn't do anything useful (which makes sense - as you wouldn't want to use unvalidated NSEC RRsets for this). But the high performance penalty was nevertheless surprising.
The problem may have been more significant in this case due to the notably high proportion of negative cached RRsets/pseudo-RRsets.
### Possible fixes
N/A (but there's a clear workaround - disable this feature)
December 2021 (9.16.24, 9.16.24-S1, 9.17.21)