BIND issueshttps://gitlab.isc.org/isc-projects/bind9/-/issues2021-09-02T09:40:28Zhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2638Run internal tasks on top of network manager worker loops2021-09-02T09:40:28ZOndřej SurýRun internal tasks on top of network manager worker loopsAfter the networking manager was introduced, the existing taskmgr kept its own set of worker threads competing with the netmgr threads. This issue is about moving the tasks to run on top of netmgr loops while keeping the existing interfa...After the networking manager was introduced, the existing taskmgr kept its own set of worker threads competing with the netmgr threads. This issue is about moving the tasks to run on top of netmgr loops while keeping the existing interface.
---
The primary merge requests implementing this change are:
- !4918
- !4983
The above changes required some follow-up tweaks, which were implemented in:
- !4980
- !4981
- !4982
- !5006
- !5008
- !5009
- !5010May 2021 (9.11.32, 9.11.32-S1, 9.16.16, 9.16.16-S1, 9.17.13)Ondřej SurýOndřej Surýhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2654Create isc_managers API2021-05-11T08:58:27ZOndřej SurýCreate isc_managers APIAll the BIND 9 binaries have the same sequence of taskmgr/netmgr/timermgr/... ctors and dtors. Move this into a separate API, so we don't duplicate the code at various places.All the BIND 9 binaries have the same sequence of taskmgr/netmgr/timermgr/... ctors and dtors. Move this into a separate API, so we don't duplicate the code at various places.May 2021 (9.11.32, 9.11.32-S1, 9.16.16, 9.16.16-S1, 9.17.13)Ondřej SurýOndřej Surýhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2690Remove Windows support for BIND 9.17/9.18+2022-01-21T13:43:04ZOndřej SurýRemove Windows support for BIND 9.17/9.18+This is a tracking issue to remove the Windows port from BIND 9 source code for the next stable release.This is a tracking issue to remove the Windows port from BIND 9 source code for the next stable release.July 2021 (9.11.34, 9.11.34-S1, 9.16.19, 9.16.19-S1, 9.17.16)Ondřej SurýOndřej Surýhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2691Remove native PKCS#11 support from BIND 9.17/9.18+2022-01-19T11:20:49ZOndřej SurýRemove native PKCS#11 support from BIND 9.17/9.18+This a tracking issue to remove the native PKCS#11 implementation from BIND 9 source code (including the documentation).This a tracking issue to remove the native PKCS#11 implementation from BIND 9 source code (including the documentation).October 2021 (9.11.36, 9.11.36-S1, 9.16.22, 9.16.22-S1, 9.17.19)Ondřej SurýOndřej Surýhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2697Further taskmgr refactoring2022-03-01T09:56:48ZOndřej SurýFurther taskmgr refactoringJust dumping notes/ideas:
* schedule events directly onto the worker loops, not tasks
* thus events will have to attach to task (or just reference count running events)
* as the last event schedule conditional task cleanup (move the logi...Just dumping notes/ideas:
* schedule events directly onto the worker loops, not tasks
* thus events will have to attach to task (or just reference count running events)
* as the last event schedule conditional task cleanup (move the logic from task_run there)
* use isc_queue_t to store task events instead of locked LIST to remove contention
This should further simplify the taskmgr logic.Not plannedOndřej SurýOndřej Surýhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2732Zone dumping is blocking the networking IO2021-06-07T12:43:40ZOndřej SurýZone dumping is blocking the networking IOJune 2021 (9.11.33, 9.11.33-S1, 9.16.17/9.16.18, 9.16.17-S1/9.16.18-S1, 9.17.14/9.17.15)Ondřej SurýOndřej Surýhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2788Use tolower()/toupper()/isupper() from ctype.h2021-07-08T08:37:51ZOndřej SurýUse tolower()/toupper()/isupper() from ctype.hThis was suggested by Rick and my initial answer was but it is locale aware, so a) it will mess the names when in non-POSIX locale and b) it must be slower.
Turns out that a) can be easily fixed (any program that doesn't call `setlocale...This was suggested by Rick and my initial answer was but it is locale aware, so a) it will mess the names when in non-POSIX locale and b) it must be slower.
Turns out that a) can be easily fixed (any program that doesn't call `setlocale()` is running with **POSIX** locale) and b) isn't true. In fact, using the `tolower()` was fastest out of all the implementations we used or considered.July 2021 (9.11.34, 9.11.34-S1, 9.16.19, 9.16.19-S1, 9.17.16)Ondřej SurýOndřej Surýhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2843EC_KEY has been deprecated on OpenSSL 3.0.02022-01-19T11:20:50ZMark AndrewsEC_KEY has been deprecated on OpenSSL 3.0.0Need to workout what the replacement code needs to be as builds fail on strict systems.Need to workout what the replacement code needs to be as builds fail on strict systems.November 2021 (9.16.23, 9.16.23-S1, 9.17.20)Arаm SаrgsyаnArаm Sаrgsyаnhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2851A rare crash in the DoH code caused by an assert2021-08-17T09:01:28ZArtem BoldarievA rare crash in the DoH code caused by an assertA rare issue was found while working on !5309. The unit test suite (`doh_test`) revealed a situation when `session->handle` got detached too early in the `http_send_outgoing()`, the function which takes data from nghttp2 and sends it via...A rare issue was found while working on !5309. The unit test suite (`doh_test`) revealed a situation when `session->handle` got detached too early in the `http_send_outgoing()`, the function which takes data from nghttp2 and sends it via the underlying connection.
As a result, when we have reached the call to `isc_nm_send()`
```
session->sending++;
isc_nm_send(session->handle, &send->data, http_writecb, send);
return (true);
}
```
the `session->handle` was `NULL` triggering an assert:
```
void
isc_nm_send(isc_nmhandle_t *handle, isc_region_t *region, isc_nm_cb_t cb,
void *cbarg) {
REQUIRE(VALID_NMHANDLE(handle));
```September 2021 (9.16.21, 9.16.21-S1, 9.17.18)Artem BoldarievArtem Boldarievhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2860The DoH endpoint URL in dig might get prepared in wrong format2021-08-31T13:41:21ZArtem BoldarievThe DoH endpoint URL in dig might get prepared in wrong formatOne of the causes why the problem #2858 appeared is that the URL in the DoH code might get generated in wrong format at least when IPv6 is used. In particular, in the issue `dig` was trying to connect to `https://::1:444/dns-query` inste...One of the causes why the problem #2858 appeared is that the URL in the DoH code might get generated in wrong format at least when IPv6 is used. In particular, in the issue `dig` was trying to connect to `https://::1:444/dns-query` instead of `https://[::1]:444/dns-query`. Obviously, this issue prevents `dig` from querying servers via its IPv6 addresses (querying hostnames available via IPv6 works fine).
Also, it will always generate a URL starting with `https://` regardless of the fact if we use HTTP with encryption or not.
So, while the fix for the problem reported is already prepared (!5319), this issue needs to be taken care of as well. The code which prepared the URL needs to be revisited.September 2021 (9.16.21, 9.16.21-S1, 9.17.18)Artem BoldarievArtem Boldarievhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2926use netmgr for route sockets and remove isc_socket2022-01-21T13:44:36ZEvan Huntuse netmgr for route sockets and remove isc_socketThe last remaining use of `isc_socket` and `isc_socketmgr` in BIND is for the netlink/route sockets that are used to scan for interface changes.
The libuv documentation indicates that any socket that honors the datagram contract can be ...The last remaining use of `isc_socket` and `isc_socketmgr` in BIND is for the netlink/route sockets that are used to scan for interface changes.
The libuv documentation indicates that any socket that honors the datagram contract can be passed to `uv_udp_open()`, so we should be able to make the netmgr do this instead.November 2021 (9.16.23, 9.16.23-S1, 9.17.20)Evan HuntEvan Hunthttps://gitlab.isc.org/isc-projects/bind9/-/issues/2951assertion failure in req_senddone2022-01-21T07:32:54ZMark Andrewsassertion failure in req_senddoneJob [#2037401](https://gitlab.isc.org/isc-projects/bind9/-/jobs/2037401) failed for c177c33c2773d7efa4b5af5460fd354450529d05:
```
D:checkds:Core was generated by `/builds/isc-projects/bind9/bin/named/.libs/named -D checkds-ns9 -X named.l...Job [#2037401](https://gitlab.isc.org/isc-projects/bind9/-/jobs/2037401) failed for c177c33c2773d7efa4b5af5460fd354450529d05:
```
D:checkds:Core was generated by `/builds/isc-projects/bind9/bin/named/.libs/named -D checkds-ns9 -X named.lock -'.
D:checkds:Program terminated with signal SIGABRT, Aborted.
D:checkds:#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
D:checkds:[Current thread is 1 (Thread 0x7f81a97fa700 (LWP 1941))]
D:checkds:#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
D:checkds:#1 0x00007f81b069e535 in __GI_abort () at abort.c:79
D:checkds:#2 0x0000000000420cac in assertion_failed (file=0x7f81b12ee461 "request.c", line=<optimized out>, type=<optimized out>, cond=<optimized out>) at main.c:237
D:checkds:#3 0x00007f81b1372b7a in isc_assertion_failed (file=0x2 <error: Cannot access memory at address 0x2>, line=-1451276416, line@entry=1029, type=type@entry=isc_assertiontype_require, cond=0x7f81b06b37bb <__GI_raise+267> "H\213\214$\b\001") at assertions.c:47
D:checkds:#4 0x00007f81b123b4e2 in req_senddone (eresult=<optimized out>, region=<optimized out>, arg=<optimized out>) at request.c:1029
D:checkds:#5 0x00007f81b11727f6 in send_done (handle=0x7f819db70d80, result=ISC_R_CANCELED, cbarg=<optimized out>) at dispatch.c:1770
D:checkds:#6 0x00007f81b1363b7f in isc__nm_async_sendcb (worker=<optimized out>, ev0=ev0@entry=0x7f81a5843580) at netmgr/netmgr.c:2788
D:checkds:#7 0x00007f81b13603c6 in process_netievent (worker=worker@entry=0x7f81adc4a030, ievent=ievent@entry=0x7f81a5843580) at netmgr/netmgr.c:960
D:checkds:#8 0x00007f81b135baf1 in process_queue (worker=0x7f81adc4a030, type=NETIEVENT_NORMAL) at netmgr/netmgr.c:998
D:checkds:#9 process_all_queues (worker=0x7f81adc4a030) at netmgr/netmgr.c:746
D:checkds:#10 async_cb (handle=0x7f81adc4a390) at netmgr/netmgr.c:775
D:checkds:#11 0x00007f81b0c976d8 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
D:checkds:#12 0x00007f81b0ca6530 in uv.io_poll () from /usr/lib/x86_64-linux-gnu/libuv.so.1
D:checkds:#13 0x00007f81b0c97ff5 in uv_run () from /usr/lib/x86_64-linux-gnu/libuv.so.1
D:checkds:#14 0x00007f81b135bc39 in nm_thread (worker0=0x7f81adc4a030) at netmgr/netmgr.c:681
D:checkds:#15 0x00007f81b13a1aea in isc__trampoline_run (arg=0x1948b00) at trampoline.c:185
D:checkds:#16 0x00007f81b0844fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
D:checkds:#17 0x00007f81b07754cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
```
```
Thread 1 (Thread 0x7f81a97fa700 (LWP 1941)):
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
set = {__val = {16387, 0, 0, 0, 3905522674892552757, 140195000463360, 140195000513668, 140195000691568, 140195006326720, 140194998247424, 0, 140194998247424, 140194871224000, 140194942605624, 140194942605616, 5}}
pid = <optimized out>
tid = <optimized out>
ret = <optimized out>
#1 0x00007f81b069e535 in __GI_abort () at abort.c:79
save_stage = 1
act = {__sigaction_handler = {sa_handler = 0x7f81b0c5e900, sa_sigaction = 0x7f81b0c5e900}, sa_mask = {__val = {140194987098112, 140194987177249, 140194988119184, 140195000149089, 140194871199616, 31, 1483, 140194175012544, 13, 140194993269024, 17674366533170704384, 206158430256, 4575977, 140195000149089, 14, 206158430248}}, sa_flags = -1451275824, sa_restorer = 0xe}
sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x0000000000420cac in assertion_failed (file=0x7f81b12ee461 "request.c", line=<optimized out>, type=<optimized out>, cond=<optimized out>) at main.c:237
tracebuf = {0x420cc4 <assertion_failed+148>, 0x7f81b1372b7a <isc_assertion_failed+10>, 0x7f81b123b4e2 <req_senddone+450>, 0x7f81b11727f6 <send_done+54>, 0x7f81b1363b7f <isc__nm_async_sendcb+143>, 0x7f81b13603c6 <process_netievent+1222>, 0x7f81b135baf1 <async_cb+145>, 0x7f81b0c976d8, 0x7f81b0ca6530 <uv.io_poll+800>, 0x7f81b0c97ff5 <uv_run+277>, 0x7f81b135bc39 <nm_thread+121>, 0x7f81b13a1aea <isc__trampoline_run+74>, 0x7f81b0844fa3 <start_thread+243>, 0x7f81b07754cf <clone+63>, 0x7f81b07754cf <clone+63>, 0x0, 0x0, 0x405, 0x7f81b12ee461, 0x7f81a576f990, 0x7f81b0679668, 0x7f819c2b4e00, 0x7f81adc4a030, 0x7f81b18e2b00 <_dl_fixup+208>, 0x5, 0xffff00001f80, 0x7f816e473000, 0xa0, 0x7f819ca05748, 0x7f81a97f4e10, 0x0, 0x7f81a97f9080, 0x7f819ca05708, 0x7f81b043efaf, 0x7f81ade5f740, 0x7f81adea5040, 0x7f81adedc6c0, 0x7f81adedc6c0, 0x7f81adedc6c0, 0x7f81adedd340, 0x7f81adedd340, 0x7f81adedd440, 0xdededededededede, 0xdededededededede, 0xff00ffff00ff00, 0xff00ff00ff00ff, 0xff000000, 0x0, 0x3036346536313866, 0x756873203a303034, 0x0, 0x10, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7325732500732520, 0x7325732573257325, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4200000042, 0x4200000042, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f81adc09000, 0x7f816e460400, 0x7f81b138437b <isc__mem_put+171>, 0x0, 0x7f816e460400, 0x7f819da00000, 0x7f816e460410, 0x0, 0x1, 0x7f816e460560, 0x7f816e460410, 0x2, 0x800000000000000e, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f81a97f4e70, 0x7f81b1171054 <tcp_recv+1828>, 0x7f81a56e2000, 0x700350a1f640002, 0x0 <repeats 15 times>, 0x10, 0xffffffffffffffff, 0xffffffffffffffff, 0x700350a1f640002, 0x0, 0x0, 0x0, 0x0, 0xf547f6b817b13000, 0x0, 0x7f81adc09000, 0x7f81a97f4e10, 0x7f81b13ac7a8, 0x7b0, 0x7f81adc09008}
nframes = <optimized out>
#3 0x00007f81b1372b7a in isc_assertion_failed (file=0x2 <error: Cannot access memory at address 0x2>, line=-1451276416, line@entry=1029, type=type@entry=isc_assertiontype_require, cond=0x7f81b06b37bb <__GI_raise+267> "H\213\214$\b\001") at assertions.c:47
No locals.
#4 0x00007f81b123b4e2 in req_senddone (eresult=<optimized out>, region=<optimized out>, arg=<optimized out>) at request.c:1029
request = <optimized out>
#5 0x00007f81b11727f6 in send_done (handle=0x7f819db70d80, result=ISC_R_CANCELED, cbarg=<optimized out>) at dispatch.c:1770
resp = 0x0
#6 0x00007f81b1363b7f in isc__nm_async_sendcb (worker=<optimized out>, ev0=ev0@entry=0x7f81a5843580) at netmgr/netmgr.c:2788
ievent = <optimized out>
sock = 0x7f816e473000
uvreq = 0x0
eresult = ISC_R_CANCELED
#7 0x00007f81b13603c6 in process_netievent (worker=worker@entry=0x7f81adc4a030, ievent=ievent@entry=0x7f81a5843580) at netmgr/netmgr.c:960
No locals.
#8 0x00007f81b135baf1 in process_queue (worker=0x7f81adc4a030, type=NETIEVENT_NORMAL) at netmgr/netmgr.c:998
waiting = 4
ievent = 0x7f81a5843580
stop = <optimized out>
#9 process_all_queues (worker=0x7f81adc4a030) at netmgr/netmgr.c:746
result = <optimized out>
type = 3
reschedule = true
type = <optimized out>
result = <optimized out>
#10 async_cb (handle=0x7f81adc4a390) at netmgr/netmgr.c:775
worker = 0x7f81adc4a030
#11 0x00007f81b0c976d8 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
No symbol table info available.
#12 0x00007f81b0ca6530 in uv.io_poll () from /usr/lib/x86_64-linux-gnu/libuv.so.1
No symbol table info available.
#13 0x00007f81b0c97ff5 in uv_run () from /usr/lib/x86_64-linux-gnu/libuv.so.1
No symbol table info available.
#14 0x00007f81b135bc39 in nm_thread (worker0=0x7f81adc4a030) at netmgr/netmgr.c:681
r = <optimized out>
worker = 0x7f81adc4a030
mgr = 0x7f81adc45000
#15 0x00007f81b13a1aea in isc__trampoline_run (arg=0x1948b00) at trampoline.c:185
trampoline = 0x1948b00
result = <optimized out>
#16 0x00007f81b0844fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
ret = <optimized out>
pd = <optimized out>
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140194871224064, -130959809494454272, 140721831580654, 140721831580655, 140194871224064, 26512128, 84985927855435776, 84967749757497344}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#17 0x00007f81b07754cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
```January 2022 (9.18.0)https://gitlab.isc.org/isc-projects/bind9/-/issues/2953Resolver issues with refactored dispatch code2023-11-02T17:02:19ZMichał KępieńResolver issues with refactored dispatch codeThis issue attempts to describe various issues with resolver behavior
found after merging !4601 (#2401). Most of these issues are
intermittent, so it is important to keep track of them somewhere in
order to not forget that they exist. ...This issue attempts to describe various issues with resolver behavior
found after merging !4601 (#2401). Most of these issues are
intermittent, so it is important to keep track of them somewhere in
order to not forget that they exist. We should get to the bottom of all
of these issues before we release BIND 9.18.0.
1. [x] **Recursive Perflab tests cause the resolver to stop responding.**
This issue might be the simplest to start with because the behavior
observed seems to be consistent rather than intermittent. Namely,
all Perflab jobs which test a resolver seem to crank out a response
rate of some 70-120 kQPS at the beginning of the test and then...
the resolver stops responding indefinitely. While Perflab was not
designed with recursive tests in mind and therefore we can treat its
recursive results with a grain of salt, it certainly should not be
reporting zeros all over the place.
- https://perflab.isc.org/#/config/run/5bf195dd83ba91a870b2976f/
- https://perflab.isc.org/#/config/run/5cd6a166643076f6c1f6c26f/
- https://perflab.isc.org/#/config/run/5db74b6264458967f762143a/
- https://perflab.isc.org/#/config/run/5db74b7264458967f762143b/
- https://perflab.isc.org/#/config/run/5db74c2764458967f7621440/
- https://perflab.isc.org/#/config/run/5db74c3464458967f7621441/
(Resolved by !5500.)
2. [x] **`respdiff` tests are *sometimes* slow.**
Ever since we merged the dispatch branch, the `respdiff` tests
started failing *intermittently* for `main` (and only `main`)
because of timeouts.
- [job 2016337][1]: pass, ~2m30s per each 10,000 queries
- [job 2016622][2]: pass, ~2m45s per each 10,000 queries
- [job 2017990][3]: pass, ~2m30s per each 10,000 queries
- [job 2020093][4]: fail, 7+ minutes per each 10,000 queries
- [job 2023057][5]: fail, 16+ minutes per each 10,000 queries
- [job 2023490][6]: pass, ~2m40s per each 10,000 queries
I do not think varying CI runner stress can be blamed for this, not
for discrepancies this large. It also never happened before merging
!4601, AFAIK.
3. [x] **A lot of "stress" test graph indicate growing memory use.** #3002
While testing October BIND 9 releases, one of the 1-hour "stress"
tests ran in recursive mode for BIND 9.17.19 yielded a graph which
indicates that memory use growth over time might be an issue.
https://wiki.isc.org/bin/viewfile/QA/BindQaResults_9_11_36?filename=bind-9.17.19-linux-amd64-recursive-1h.png;rev=1
However, that phenomenon was not observable for other OS/arch
combinations this specific code revision was tested with.
It was also not observable on the *same* OS/arch combination for a
very similar code revision (the code differences should not have any
effect on memory use patterns):
https://wiki.isc.org/bin/viewfile/QA/BindQaResults_9_11_36?filename=bind-9.17.19-linux-amd64-recursive-1h.png;rev=2
Pre-release tests run for BIND 9.17.20 confirmed that memory leaks
are a common thing when `named` is used as a recursive resolver.
More details are available in #3002.
The "stress" tests are run on isolated VMs and despite being pretty
synthetic (fixed traffic pattern, everything happens on one machine,
etc.), they have a history of being very stable, so typical issues
like test host load varying over time etc. are not a factor here.
4. [x] **Lame servers with IPv6 unreachable cause hang on shutdown.** #2927
5. [x] **resolver test fails intermittently** #3013
See https://gitlab.isc.org/isc-projects/bind9/-/jobs/2054296
```
I:resolver:query count error: 6 NS records: expected queries 10, actual 11
I:resolver:failed
```
6. [x] **Assertion failed in `dns_resolver_logfetch()`** #2962
7. [x] **Assertion failed in `dns_dispatch_gettcp()`** #2963
8. [x] **Assertion failed in `dns_resolver_destroyfetch()`** #2969
9. [x] **ThreadSanitizer issues with adb** #2978 #2979
10. [x] **fctx_cancelquery() attempts to process a query which has already been freed** #3018
11. [x] **premature TCP connection closure leaks fetch contexts (hang on shutdown)** #3026
12. [ ] **validator loops can cause shutdown hang** #3033
13. [ ] **ADB finds for a broken zone may cause fetch contexts to hang** #3037
14. [ ] **ASAN error in fctx_cancelquery()** #3102
I decided to open a single issue for all of the above problems because I
sense they are somehow related and I hope that fixing the root cause of
one of them will eliminate the other ones as well.
[1]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2016337
[2]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2016622
[3]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2017990
[4]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2020093
[5]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2023057
[6]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2023490Not plannedhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2959Remove support for signed 32-bit time_t2023-11-02T17:02:20ZOndřej SurýRemove support for signed 32-bit time_tNow there are couple of requirements:
* All user space must be compiled with a 64-bit time_t, which are supported in the musl-1.2 and glibc-2.32 releases, along with installed kernel headers from linux-5.6 or higher.
See for details: h...Now there are couple of requirements:
* All user space must be compiled with a 64-bit time_t, which are supported in the musl-1.2 and glibc-2.32 releases, along with installed kernel headers from linux-5.6 or higher.
See for details: https://lkml.org/lkml/2020/1/29/355?anz=webNot plannedhttps://gitlab.isc.org/isc-projects/bind9/-/issues/2969Assertion failure in dns_resolver_destroyfetch()2022-05-16T10:19:55ZEvan HuntAssertion failure in dns_resolver_destroyfetch()This often seems to occur in the `forward` system test, while chasing DS records. It looks as if `fctx_cancelqueries()` is being run at the same time by two threads.
```
#2 0x000055d09e875e47 in assertion_failed (file=0x7fc3ca45280b "r...This often seems to occur in the `forward` system test, while chasing DS records. It looks as if `fctx_cancelqueries()` is being run at the same time by two threads.
```
#2 0x000055d09e875e47 in assertion_failed (file=0x7fc3ca45280b "resolver.c",
line=10603, type=isc_assertiontype_require,
cond=0x7fc3ca453c78 "((fctx) != ((void *)0) && ((const isc__magic_t *)(fctx))->magic == ((('F') << 24 | ('!') << 16 | ('!') << 8 | ('!'))))") at main.c:236
#3 0x00007fc3ca4e3c8d in isc_assertion_failed (
file=0x7fc3ca45280b "resolver.c", line=10603,
type=isc_assertiontype_require,
cond=0x7fc3ca453c78 "((fctx) != ((void *)0) && ((const isc__magic_t *)(fctx))->magic == ((('F') << 24 | ('!') << 16 | ('!') << 8 | ('!'))))")
at assertions.c:47
#4 0x00007fc3ca377303 in dns_resolver_destroyfetch (fetchp=0x7fc3c2bf4e08)
at resolver.c:10603
#5 0x00007fc3ca3ae320 in fetch_callback_ds (task=0x7fc3b6f55b80, event=0x0)
at validator.c:620
#6 0x00007fc3ca5126f9 in task_run (task=0x7fc3b6f55b80) at task.c:827
#7 0x00007fc3ca51290e in isc_task_run (task=0x7fc3b6f55b80) at task.c:907
#8 0x00007fc3ca4c2018 in isc__nm_async_task (worker=0x7fc3c7049df0,
ev0=0x7fc3b5a13780) at netmgr/netmgr.c:834
#9 0x00007fc3ca4c229c in process_netievent (worker=0x7fc3c7049df0,
ievent=0x7fc3b5a13780) at netmgr/netmgr.c:913
#10 0x00007fc3ca4c2dd5 in process_queue (worker=0x7fc3c7049df0,
type=NETIEVENT_TASK) at netmgr/netmgr.c:1007
#11 0x00007fc3ca4c1e59 in process_all_queues (worker=0x7fc3c7049df0)
at netmgr/netmgr.c:753
#12 0x00007fc3ca4c1edc in async_cb (handle=0x7fc3c704a150)
at netmgr/netmgr.c:782
#13 0x00007fc3c9df1a51 in uv__async_io (loop=0x7fc3c7049e00,
w=<optimized out>, events=<optimized out>) at src/unix/async.c:163
#14 0x00007fc3c9e035e5 in uv__io_poll (loop=loop@entry=0x7fc3c7049e00,
timeout=<optimized out>) at src/unix/linux-core.c:462
#15 0x00007fc3c9df21ec in uv_run (loop=0x7fc3c7049e00, mode=UV_RUN_DEFAULT)
at src/unix/core.c:389
```May 2022 (9.16.29, 9.16.29-S1, 9.18.3, 9.19.1)Evan HuntEvan Hunthttps://gitlab.isc.org/isc-projects/bind9/-/issues/3059Follow-up from "Draft: Resolve #3055 by examining RTM_NEWADDR, RTM_DELADDR me...2021-12-20T11:58:16ZEvan HuntFollow-up from "Draft: Resolve #3055 by examining RTM_NEWADDR, RTM_DELADDR messages contents"The following discussion from !5638 should be addressed:
- [ ] @marka started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5638#note_254568): (+3 comments)
> We could also just take the RTM_NEWADDR and...The following discussion from !5638 should be addressed:
- [ ] @marka started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5638#note_254568): (+3 comments)
> We could also just take the RTM_NEWADDR and RTM_DELADDR content and add/delete the interfaces individually rather than scanning at all. Leave scanning for startup / reconfiguration. Queue up events while scanning then process any queued events at the end.
Also, see #3064.https://gitlab.isc.org/isc-projects/bind9/-/issues/3063dnssec-verify detect and support multiple cores2023-11-02T17:02:21ZDaniel Stirnimanndnssec-verify detect and support multiple cores### Description
We use `dnssec-verify` (from BIND 9.16) to validate large DNSSEC-signed zones. I noticed that on a multi core processor (eg 16 cores) always only one cpu is used. I guess, validation time could be speed up a lot if all a...### Description
We use `dnssec-verify` (from BIND 9.16) to validate large DNSSEC-signed zones. I noticed that on a multi core processor (eg 16 cores) always only one cpu is used. I guess, validation time could be speed up a lot if all available cores would be used.
### Request
Make `dnssec-verify` use all available cores automatically for operations for which this is possible eg. signature verification.
`dnssec-signzone` already automatically detects and uses all available cores and even has an argument switch to specify an specific number (`man dnssec-signzone`). I think something like this would be very useful:
```
-n ncpus
This option specifies the number of threads to use. By default, one thread is started for each detected CPU.
```Not plannedhttps://gitlab.isc.org/isc-projects/bind9/-/issues/3113Follow-up from "allow dns_clientinfo to store client ECS data"2023-02-27T14:15:08ZEvan HuntFollow-up from "allow dns_clientinfo to store client ECS data"The following discussion from !5555 should be addressed:
- [ ] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5555#note_257882): (+1 comment)
> Honestly, I would not change `dns_clientinf...The following discussion from !5555 should be addressed:
- [ ] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5555#note_257882): (+1 comment)
> Honestly, I would not change `dns_clientinfo_init()`, but would rather add extra function to set the ecs in the structure:
>
> ```suggestion:-2+0
> void
> dns_clientinfo_init(dns_clientinfo_t *ci, void *data, void *versionp);
>
> void
> dns_clientinfo_set_ecs(dns_clientinfo_t *ci, dns_ecs_t *ecs);
> ```
>
> That way the amount of changes to the other code would be minimized as if I count correctly, it's only three places, where we set the `ecs` data in the `dns_clientinfo` structure.March 2023 (9.16.39, 9.16.39-S1, 9.18.13, 9.18.13-S1, 9.19.11)https://gitlab.isc.org/isc-projects/bind9/-/issues/3122DoT stops working after "rndc reconfigure" when running named as non-root2024-01-11T14:13:01ZsthaugDoT stops working after "rndc reconfigure" when running named as non-root<!--
If the bug you are reporting is potentially security-related - for example,
if it involves an assertion failure or other crash in `named` that can be
triggered repeatedly - then please do *NOT* report it here, but send an
email to [...<!--
If the bug you are reporting is potentially security-related - for example,
if it involves an assertion failure or other crash in `named` that can be
triggered repeatedly - then please do *NOT* report it here, but send an
email to [security-officer@isc.org](security-officer@isc.org).
-->
### Summary
DoT stops working after "rndc reconfigure" when running named as non-root
### BIND version used
```
BIND 9.18.0 (Stable Release) <id:8db45af>
running on FreeBSD amd64 12.3-STABLE FreeBSD 12.3-STABLE r371270 DNS_VIMAGE
built by make with '--disable-linux-caps' '--localstatedir=/var' '--sysconfdir=/usr/local/etc/namedb' '--with-dlopen=yes' '--with-libxml2' '--with-openssl=/usr' '--enable-dnsrps' '--with-readline=libedit' '--enable-dnstap' '--disable-fixed-rrset' '--disable-geoip' '--without-maxminddb' '--without-gssapi' '--with-libidn2=/usr/local' '--with-json-c' '--disable-largefile' '--with-lmdb=/usr/local' '--disable-querytrace' '--enable-tcp-fastopen' '--prefix=/usr/local' '--mandir=/usr/local/man' '--disable-silent-rules' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd12.2' 'build_alias=amd64-portbld-freebsd12.2' 'CC=cc' 'CFLAGS=-O2 -pipe -DLIBICONV_PLUG -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing ' 'LDFLAGS= -L/usr/local/lib -ljson-c -fstack-protector-strong ' 'LIBS=-L/usr/local/lib' 'CPPFLAGS=-DLIBICONV_PLUG -isystem /usr/local/include' 'CPP=cpp' 'PKG_CONFIG=pkgconf' 'READLINE_CFLAGS=-L/usr/local/lib'
compiled by CLANG FreeBSD Clang 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
compiled with OpenSSL version: OpenSSL 1.1.1m-freebsd 14 Dec 2021
linked to OpenSSL version: OpenSSL 1.1.1m-freebsd 14 Dec 2021
compiled with libuv version: 1.41.0
linked to libuv version: 1.41.0
compiled with libnghttp2 version: 1.44.0
linked to libnghttp2 version: 1.44.0
compiled with libxml2 version: 2.9.12
linked to libxml2 version: 20912
compiled with json-c version: 0.15
linked to json-c version: 0.15
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
compiled with protobuf-c version: 1.4.0
linked to protobuf-c version: 1.4.0
threads support is enabled
default paths:
named configuration: /usr/local/etc/namedb/named.conf
rndc configuration: /usr/local/etc/namedb/rndc.conf
DNSSEC root key: /usr/local/etc/namedb/bind.keys
nsupdate session key: /var/run/named/session.key
named PID file: /var/run/named/named.pid
named lock file: /var/run/named/named.lock
```
Also tried 9.17.18 (Installed from FreeBSD package) and 9.17.22 (compiled from source).
### Steps to reproduce
Start named with the following command line:
/usr/local/sbin/named -t /var/named -u bind -c /usr/local/etc/namedb/named.conf
and with the following named.conf file:
```
options {
directory "/usr/local/etc/namedb/working";
pid-file "/var/run/named/pid";
dump-file "/var/dump/named_dump.db";
statistics-file "/var/stats/named.stats";
listen-on { 193.75.110.2; 127.0.0.1; };
listen-on port 853 tls dotas2116 { 193.75.110.2; 127.0.0.1; };
interface-interval 0;
recursion yes;
max-cache-size 1500M;
minimal-any yes;
minimal-responses yes;
querylog yes;
allow-query { 194.19.2.0/24; 193.75.110.0/24; 127.0.0.1; };
};
tls dotas2116 {
cert-file "/usr/local/etc/namedb/fullchain.pem";
key-file "/usr/local/etc/namedb/privkey.pem";
protocols { TLSv1.2; TLSv1.3; };
};
```
### What is the current *bug* behavior?
After doing "rndc reconfigure" named no longer listens to TCP port 853. This is visible in the log:
```
Jan 31 16:29:08 nslum named[42849]: no longer listening on 127.0.0.1#853
Jan 31 16:29:08 nslum named[42849]: listening on IPv4 interface lo0, 127.0.0.1#853
Jan 31 16:29:08 nslum named[42849]: creating TLS socket: permission denied
Jan 31 16:29:08 nslum named[42849]: creating IPv4 interface lo0 failed; interface ignored
Jan 31 16:29:08 nslum named[42849]: no longer listening on 193.75.110.2#853
Jan 31 16:29:08 nslum named[42849]: listening on IPv4 interface ixl1.15, 193.75.110.2#853
Jan 31 16:29:08 nslum named[42849]: creating TLS socket: permission denied
Jan 31 16:29:08 nslum named[42849]: creating IPv4 interface ixl1.15 failed; interface ignored
```
Using "dig +tls" to test results in:
;; Connection to 193.75.110.2#853(193.75.110.2) for vg.no failed: connection refused.
and using the FreeBSD "sockstat" command shows named is not listening to TCP port 853.
### What is the expected *correct* behavior?
The expected behavior is that "dig +tls" works, and resolves names normally. This works right after startup - and using the "sockstat command I can see that named is listening to TCP port 853:
```
Jan 31 16:27:10 nslum named[42849]: listening on IPv4 interface lo0, 127.0.0.1#853
Jan 31 16:27:10 nslum named[42849]: listening on IPv4 interface ixl1.15, 193.75.110.2#853
```
### Relevant configuration files
See named.conf above.
### Relevant logs and/or screenshots
(Paste any relevant logs - please use code blocks (```) to format console
output, logs, and code, as it's very hard to read otherwise.)
### Possible fixes
(If you can, link to the line of code that might be responsible for the
problem.)May 2022 (9.16.29, 9.16.29-S1, 9.18.3, 9.19.1)Artem BoldarievArtem Boldarievhttps://gitlab.isc.org/isc-projects/bind9/-/issues/3168Refactor recursion tracking in lib/ns/query.c2022-10-10T07:40:54ZMichał KępieńRefactor recursion tracking in lib/ns/query.cWhile #3147 was fixed with a change that should be safe to backport
(!5870 only touches statistics-related code), a number of problems in
`lib/ns/query.c` were identified while that problem was being
investigated:
1. [x] (!5882) Code d...While #3147 was fixed with a change that should be safe to backport
(!5870 only touches statistics-related code), a number of problems in
`lib/ns/query.c` were identified while that problem was being
investigated:
1. [x] (!5882) Code duplication.
2. [x] (!5883) Logically unrelated bits of code (e.g. prefetch and RPZ)
share common code paths (e.g. `prefetch_done()`) even
though they do not need to.
3. [x] (!5884) Various features able to initiate recursion use common
memory locations (e.g. `client->fetchhandle`,
`client->query.fetch`), which forces them all to
"cooperate" with each other even though they are all
logically separate.
4. [x] (!5885) One of the overloaded variables is
`client->recursionquota`, which causes recursion quota
to be handled inconsistently.
5. [x] (!5886) The `check_recursionquota()` function needs to be
cleaned up.
Unlike #3147, IMHO addressing the above issues is 9.19+ material due to
the amount of changes involved.July 2022 (9.16.31, 9.16.31-S1, 9.18.5, 9.19.3)Michał KępieńMichał Kępień