BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues
2024-02-29T15:26:01Z

https://gitlab.isc.org/isc-projects/bind9/-/issues/3810
Replace system test runner with pytest
2024-02-29T15:26:01Z
Tom Krizek

The legacy solution for running system tests has evolved over the years and is currently a mix of shell and Perl scripts intermingled with the build system, while some of the system tests utilize pytest. Implementing a more consistent solution using just pytest as a runner could bring the following benefits:
- better test run isolation (i.e. artifacts from previous run don't interfere with current test run)
- more precise control over test selection (running just a single test case)
- getting rid of perl+shell glue scripts
- a simpler and more standard way to run and parallelize test runs
- solid foundation for future extensions (e.g. wrapping test execution inside a network/pid namespace)
For a transitional period, the legacy test framework should be supported, since it would be difficult to replace everything at once. The pytest runner should be available in 9.18+; it would be prudent to keep the legacy runner support until 9.16 reaches EOL. By that time, we should have enough insight to determine whether pytest is a suitable replacement, and we can then drop the legacy runner from supported branches.
Migration plan for moving to pytest runner and dropping the legacy runner support:
- Phase I - pytest runner development, legacy runner supported
- [x] initial implementation of the pytest runner (#3978, !6809)
- [x] support out-of-tree tests (#4246)
- [x] resolve support on CI systems with old pytest (OpenBSD, CentOS 7) (!8193)
- [x] implement any missing (and desired) features from legacy runner (#4252)
- [x] configure `make check` to invoke pytest (#4262)
- Phase II - deprecating legacy runner - 9.19-only
- [ ] remove legacy runner control script(s) - legacy.run.sh, get_ports.sh ...
- [ ] remove no longer needed scripts from system tests (e.g. clean.sh)
- [ ] remove conf.sh(.common) and declare variables in pytest only
- [ ] remove the Makefile entanglement
- [ ] declare python and pytest-xdist as required dependencies for tests + document
- [ ] address any `FUTURE` comments in the pytest runner code
- Phase III - cleanup after legacy runner
- [ ] rewrite start.pl/stop.pl to python (related https://gitlab.isc.org/isc-projects/bind9/-/issues/3198)
- [ ] rewrite remaining setup/teardown perl&shell scripts to python
- [ ] rewrite setup.sh/prereq.sh system tests scripts to pytest fixtures
- [ ] ensure system test documentation is up to date

BIND 9.19.x
Tom Krizek

https://gitlab.isc.org/isc-projects/bind9/-/issues/3262
Offloaded RPZ processing needs 'shuttingdown' signal
2023-11-02T17:05:04Z
Ondřej Surý

When the database is shut down during the threadpool processing of the RPZ, we would crash:
```
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007f99990958f3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007f99990486a6 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007f99990327d3 in __GI_abort () at abort.c:79
#4 0x0000000000415baa in assertion_failed (file=<optimized out>, line=<optimized out>, type=<optimized out>, cond=<optimized out>) at main.c:237
#5 0x00007f9999af07fa in isc_assertion_failed (file=file@entry=0x7f9999a42890 "db.c", line=line@entry=581, type=type@entry=isc_assertiontype_require,
cond=cond@entry=0x7f9999a49528 "nodep != ((void *)0) && *nodep != ((void *)0)") at assertions.c:49
#6 0x00007f99998f2bef in dns_db_detachnode (db=<optimized out>, nodep=nodep@entry=0x7f99739f98f0) at db.c:581
#7 0x00007f99999ca3b2 in update_nodes (rpz=rpz@entry=0x7f99928d1400, newnodes=<optimized out>) at rpz.c:1762
#8 0x00007f99999cace8 in update_rpz_cb (data=0x7f99928d1400) at rpz.c:1942
#9 0x00007f99993fce94 in uv__queue_work (w=0x7f9907839600) at /usr/src/libuv-v1.43.0/src/threadpool.c:326
#10 0x00007f99993fc61c in worker (arg=0x0) at /usr/src/libuv-v1.43.0/src/threadpool.c:122
#11 0x00007f9999093b1a in start_thread (arg=<optimized out>) at pthread_create.c:443
#12 0x00007f99991178e4 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
```
Related MR with reverts:
* https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/6091
* https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/6092
Original issue:
* https://gitlab.isc.org/isc-projects/bind9/-/issues/3190
Original MRs with offloaded RPZ:
* https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5938
* https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/6072
* https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/6074

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4592
Improve the isc_heap resize algorithm
2024-03-06T08:38:23Z
Ondřej Surý

The current isc_heap resizing algorithm grows the array that holds the heap elements by 1024 (there is an argument to `isc_heap_create()`, but either the default (1024) or an explicit 1024 is used everywhere).

May 2024 (9.18.27, 9.18.27-S1, 9.19.24)

https://gitlab.isc.org/isc-projects/bind9/-/issues/3948
Remove the artificial limit on max zone keys
2023-03-15T09:37:28Z
Ondřej Surý

The `struct dns_update_state` contains the member `dst_key_t *zone_keys[DNS_MAXZONEKEYS];`, which limits the number of zone keys to `32`. This seems enough, but since we already pass a memory context to `lib/dns/zone.c:dns__zone_findkeys()`, `lib/dns/dnssec.c:dns_dnssec_findzonekeys()`, and `lib/dns/update.c:find_zone_keys()` and return the number of found keys in `&nkeys`, we could just as well allocate the array in `dns_dnssec_findzonekeys()`: call `dns_rdataset_count()` first, allocate an array large enough to hold all the possible keys, and then shrink it to the actual number of keys.
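
The count-allocate-shrink idea for the zone keys array could be sketched roughly as follows. This is only an illustration under loud assumptions: plain `int` values stand in for `dst_key_t *` pointers, `malloc()`/`realloc()` stand in for the memory context, and every `demo_` name is hypothetical rather than part of BIND 9.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical predicate type: decides whether a candidate is a usable key. */
typedef int (*demo_match_fn)(int candidate);

/* Hypothetical sample predicate used only for illustration. */
static int
demo_is_even(int candidate) {
	return (candidate % 2) == 0;
}

/*
 * Allocate room for every candidate (the upper bound a count call
 * would provide), keep only the matching ones, then shrink the array
 * to the number actually found. Returns NULL when nothing matched or
 * when allocation failed; *nkeys holds the number of stored keys.
 */
static int *
demo_find_keys(const int *candidates, size_t ncandidates,
	       demo_match_fn match, size_t *nkeys) {
	int *keys = NULL;
	int *shrunk = NULL;

	*nkeys = 0;
	if (ncandidates == 0) {
		return NULL;
	}

	keys = malloc(ncandidates * sizeof(keys[0]));
	if (keys == NULL) {
		return NULL;
	}

	for (size_t i = 0; i < ncandidates; i++) {
		if (match(candidates[i])) {
			keys[(*nkeys)++] = candidates[i];
		}
	}

	if (*nkeys == 0) {
		free(keys);
		return NULL;
	}

	/* Shrinking may fail; the original, larger block is still valid then. */
	shrunk = realloc(keys, *nkeys * sizeof(keys[0]));
	return shrunk != NULL ? shrunk : keys;
}
```

The caller owns the returned array and frees it when done, so no compile-time limit such as `DNS_MAXZONEKEYS` is involved.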
Alternatively, this could be converted to `ISC_LIST()` instead of a static array.

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/3063
dnssec-verify detect and support multiple cores
2023-11-02T17:02:21Z
Daniel Stirnimann

### Description
We use `dnssec-verify` (from BIND 9.16) to validate large DNSSEC-signed zones. I noticed that on a multi-core processor (e.g. 16 cores) only one CPU is ever used. I guess validation could be sped up a lot if all available cores were used.
### Request
Make `dnssec-verify` automatically use all available cores for operations where this is possible, e.g. signature verification.
`dnssec-signzone` already automatically detects and uses all available cores and even has a command-line option to specify a specific number of threads (`man dnssec-signzone`). I think something like this would be very useful:
```
-n ncpus
This option specifies the number of threads to use. By default, one thread is started for each detected CPU.
```

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/2959
Remove support for signed 32-bit time_t
2023-11-02T17:02:20Z
Ondřej Surý

Now there are a couple of requirements:
* All user space must be compiled with a 64-bit time_t, which is supported in the musl-1.2 and glibc-2.32 releases, along with installed kernel headers from linux-5.6 or higher.

For details, see: https://lkml.org/lkml/2020/1/29/355?anz=web

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/1771
Refactor how we load librpz.so
2023-11-02T16:58:14Z
Ondřej Surý

Currently, there are three ways in which `librpz.so` can be linked into BIND 9. In BIND 9.17, `dlopen()` is mandatory (via libltdl), so this needs a little bit of refactoring.

Not planned
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/1739
Restore the GSS.framework (and KRB5.framework) on macOS
2023-11-02T16:51:56Z
Ondřej Surý

!985 refactored GSSAPI usage and temporarily dropped the use of GSS.framework (using the gssapi and krb5 libraries instead of frameworks). Use of GSS.framework on macOS needs to be restored.

Not planned
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/1692
Use include-what-you-use to trim down included header files
2023-11-02T16:51:55Z
Ondřej Surý

[include-what-you-use](https://include-what-you-use.org/) is a great tool that can be used to trim down the header files that we include in every file.
> "Include what you use" means this: for every symbol (type, function variable, or macro) that you use in foo.cc, either foo.cc or foo.h should #include a .h file that exports the declaration of that symbol. The include-what-you-use tool is a program that can be built with the clang libraries in order to analyze #includes of source files to find include-what-you-use violations, and suggest fixes for them.
>
> The main goal of include-what-you-use is to remove superfluous #includes. It does this both by figuring out what #includes are not actually needed for this file (for both .cc and .h files), and replacing #includes with forward-declares when possible.
Why is this useful?
Excerpt from [Why Include What You Use?](https://github.com/include-what-you-use/include-what-you-use/blob/master/docs/WhyIWYU.md)
* Faster Compiles
* Fewer Recompiles
* Allow Refactoring
* Self-documentation
* Dependency Cutting

Not planned
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/1550
[ISC-support #15905] rndc stop issued after server (or single zone) rndc reload and during ixfr-from-differences processing leaves the .jnl file corrupted
2024-03-13T21:08:58Z
Cathy Almond

From Support ticket [#15905](https://support.isc.org/Ticket/Display.html?id=15905)
9.11.x (but I don't anticipate that the BIND version makes any difference)
This problem is readily reproducible (and, I suspect, occurs because "rndc stop" doesn't recognise that the zone is effectively 'dynamic' because it has been reloaded with 'ixfr-from-differences yes;').
Here's what is being done, per the server logs. First the reload (the outcome is also the same with 'reload zone'):
```
09-Jan-2020 14:50:29.157 general: info: received control channel command 'reload' ============>>> RELOAD COMMAND
09-Jan-2020 14:50:29.179 general: info: loading configuration from '/etc/named.conf'
... etcetera
09-Jan-2020 14:50:29.202 general: info: reloading configuration succeeded
09-Jan-2020 14:50:29.202 general: info: reloading zones succeeded
09-Jan-2020 14:50:29.202 general: notice: all zones loaded
09-Jan-2020 14:50:29.202 general: notice: running
```
Note that the reload has completed, as far as the logging is concerned, but, it would appear that the regeneration of the .jnl files via 'ixfr-from-differences yes;' has not (high CPU use by named - suggests that it is busy doing this).
Then the 'rndc stop' is issued - and it completes almost immediately (no evidence that it is waiting for the processing to complete), in fact, it seems to log that it has aborted a zone reload, even though the previous logging said that the reload *had* completed:
```
09-Jan-2020 14:50:33.211 general: info: received control channel command 'stop -p'
09-Jan-2020 14:50:33.212 general: info: shutting down: flushing changes
.. etcetera (just the logs of the various sockets being closed here)
09-Jan-2020 14:50:33.216 general: error: zone test.com/IN: loading from master file dynamic/test.com.zone failed: operation canceled
09-Jan-2020 14:50:33.216 general: error: zone test.com/IN: not loaded due to errors.
09-Jan-2020 14:50:34.265 general: notice: exiting
```
And then after this, restarting named - the zone can no longer be loaded - the journal file does not tally with the zone itself:
```
...
09-Jan-2020 14:51:28.141 general: error: zone test.com/IN: journal rollforward failed: journal out of sync with zone
09-Jan-2020 14:51:28.141 general: error: zone test.com/IN: not loaded due to errors.
09-Jan-2020 14:51:28.142 general: notice: all zones loaded
09-Jan-2020 14:51:28.142 general: notice: running
```
======
Something has gone badly wrong during the 'rndc stop' - which is supposed to be a graceful shutdown of named. I'm assuming that the problem is that the .jnl file itself is corrupt, rather than that something has happened to the zone file on disk - but will ask for more data to confirm this.
```
stop [-p]
Stop the server, making sure any recent changes made through dynamic update
or IXFR are first saved to the master files of the updated zones. If -p is
specified named’s process id is returned. This allows an external process
to determine when named had completed stopping.
```
Now, since ixfr-from-differences processing could take an age to complete, I don't think it's reasonable to wait forever on the rndc stop. Possibly one solution would be to have a (configurable?) timeout, after which any pending ixfr-from-differences .jnl file generation is terminated and the incomplete .jnl file discarded/removed. After all, the administrator in this scenario has just presented named with a new zone file that it asked named to load - so we know that the full copy of the zone on disk should be good and valid - we just loaded from it.
====
The workaround, if this happens, is presumably to manually discard the corrupted .jnl file and restart named.

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/1156
isc_refcount_increment0 is wrong, the code using it needs refactoring
2023-11-02T16:42:12Z
Ondřej Surý

`isc_refcount_increment0()` does two things and that's wrong.
1. The first purpose is to bump the value from `0` -> `1` making the object referenced.
2. The second purpose is to increment the reference counter.
This has several problems:
1. You can't check whether the previous value really was `0`.
2. When object becomes dereferenced with `isc_refcount_decrement() == 1`, the `isc_refcount_increment0()` can make it referenced again while destroying the object.
There are several things that we could do about it:
1. Don't use isc_refcount API when it's not reference counting, prepare similar API for object counting (isc_objcount?)
2. Use `isc_refcount_init()` when initializing object for the first time (note that `isc_refcount_init()` is not atomic)
3. Always initialize the value to `1` and adjust the code that destroys the object
4. Split `isc_refcount_increment0()` to `isc_refcount_incfirst()` and existing `isc_refcount_increment()`
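
As a rough illustration of option 4, the split could look like the sketch below. It uses C11 atomics and a hypothetical `demo_refcount_t` type; none of the `demo_` names are the real `isc_refcount` API, and the sketch by itself does not address the race with an object that is already being destroyed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter type; not the real isc_refcount_t. */
typedef struct {
	atomic_uint_fast32_t refs;
} demo_refcount_t;

/* Non-atomic initialization, meant for a freshly created object only. */
static void
demo_refcount_init(demo_refcount_t *r, uint_fast32_t value) {
	atomic_init(&r->refs, value);
}

/*
 * Performs only the 0 -> 1 transition. Returns false if the counter
 * was not 0, so the caller learns whether the previous value really
 * was 0.
 */
static bool
demo_refcount_incfirst(demo_refcount_t *r) {
	uint_fast32_t expected = 0;
	return atomic_compare_exchange_strong(&r->refs, &expected, 1);
}

/* Ordinary increment; returns the previous value (expected to be >= 1). */
static uint_fast32_t
demo_refcount_increment(demo_refcount_t *r) {
	return atomic_fetch_add(&r->refs, 1);
}

/* Ordinary decrement; a return value of 1 means the last reference is gone. */
static uint_fast32_t
demo_refcount_decrement(demo_refcount_t *r) {
	return atomic_fetch_sub(&r->refs, 1);
}
```

With this shape, a second `demo_refcount_incfirst()` on an already referenced object fails instead of silently acting as a plain increment.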
Nevertheless, the overloading of the API for `<1, MAX>` and `<0, MAX>` reference counting is wrong.

Not planned
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/1113
Use per-query GeoIP2 entry cache
2023-11-02T16:42:11Z
Evan Hunt

An idea suggested by @ondrej in relation to https://gitlab.isc.org/isc-projects/bind9/merge_requests/2031#note_64941:
We can improve the GeoIP2 code by storing a copy of the `MMDB_entry` for each database we've consulted in the `ns_client` object. When we need to make another query to the same database for the same client address (or client ECS address, on 9.11), we already know it's going to get the same answer, so we can keep it and reuse it.
This is currently done with thread-specific state memory in lib/dns/geoip2.c, but would be simpler this way.

Not planned
Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/733
Rewrite various logging functions to variadic macros...
2023-11-02T16:32:29Z
Ondřej Surý

There's a lot of pre-C99 code that defines extra logging functions in different places in the code, like this:
```
static void
manager_log(isc__socketmgr_t *sockmgr,
	    isc_logcategory_t *category, isc_logmodule_t *module, int level,
	    const char *fmt, ...) ISC_FORMAT_PRINTF(5, 6);

static void
manager_log(isc__socketmgr_t *sockmgr,
	    isc_logcategory_t *category, isc_logmodule_t *module, int level,
	    const char *fmt, ...)
{
	char msgbuf[2048];
	va_list ap;

	if (! isc_log_wouldlog(isc_lctx, level))
		return;

	va_start(ap, fmt);
	vsnprintf(msgbuf, sizeof(msgbuf), fmt, ap);
	va_end(ap);

	isc_log_write(isc_lctx, category, module, level,
		      "sockmgr %p: %s", sockmgr, msgbuf);
}
```
With C99, this could be rewritten using variadic macros like this:
```
/* fmt must be a string literal; C99 requires at least one variadic argument. */
#define manager_log(sockmgr, category, module, level, fmt, ...) \
	do { \
		if (isc_log_wouldlog(isc_lctx, level)) { \
			isc_log_write(isc_lctx, category, module, level, \
				      "sockmgr %p: " fmt, sockmgr, __VA_ARGS__); \
		} \
	} while (0)
```
Using variadic macros would lead to having fewer functions.
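
For anyone who wants to try the pattern outside the BIND 9 tree, here is a minimal self-contained sketch. The `demo_` functions are hypothetical stand-ins for `isc_log_wouldlog()` and `isc_log_write()`, the message is captured in a buffer instead of being written to a log, and a manager name printed with `%s` replaces the `%p` pointer so that the output is predictable.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the ISC logging context and functions. */
static int demo_threshold = 3;
static char demo_last[256];

/* Returns true when a message at this level would be logged. */
static bool
demo_wouldlog(int level) {
	return level <= demo_threshold;
}

/* Formats the message into demo_last instead of a real log channel. */
static void
demo_write(int level, const char *fmt, ...) {
	va_list ap;

	(void)level;
	va_start(ap, fmt);
	vsnprintf(demo_last, sizeof(demo_last), fmt, ap);
	va_end(ap);
}

/* Helper for inspecting the most recently captured message. */
static bool
demo_last_is(const char *expected) {
	return strcmp(demo_last, expected) == 0;
}

/*
 * The prefix is joined to fmt by string literal concatenation, so fmt
 * must be a literal. do { } while (0) keeps the macro safe inside
 * unbraced if/else statements.
 */
#define demo_manager_log(mgr, level, fmt, ...) \
	do { \
		if (demo_wouldlog(level)) { \
			demo_write(level, "sockmgr %s: " fmt, mgr, \
				   __VA_ARGS__); \
		} \
	} while (0)
```

A call such as `demo_manager_log("m1", 1, "accepted %d connections", 4);` formats the prefixed message only when the level passes the threshold check, which is the same short-circuit the original function performed by hand.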
@joey, could you take care of it please?

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/2411
Gracefully shutdown the TLS connections in TLSDNS using SSL_shutdown
2021-09-02T12:42:57Z
Ondřej Surý

`SSL_shutdown` needs a bit of back and forth on the network channel, so right now we are doing an ungraceful shutdown by tearing down the underlying TCP connection. This should be fixed to behave like a good netizen.

Not planned
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/3877
Support _FORTIFY_SOURCE=3
2023-02-16T10:57:26Z
Tony Finch

Recent versions of clang, gcc, and glibc support `_FORTIFY_SOURCE=3` which adds support for tracking sizes of allocations at run time in a way that can be checked by `memmove()` and friends. To make use of the new fortification level, allocation functions need attributes indicating which argument is the size (`__alloc_size__`) and other functions need to tell the compiler which arguments are pointer, size pairs (`__access__`). For more details see https://developers.redhat.com/articles/2023/02/06/how-improve-application-security-using-fortifysource3#

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/2953
Resolver issues with refactored dispatch code
2023-11-02T17:02:19Z
Michał Kępień

This issue attempts to describe various issues with resolver behavior
found after merging !4601 (#2401). Most of these issues are
intermittent, so it is important to keep track of them somewhere in
order to not forget that they exist. We should get to the bottom of all
of these issues before we release BIND 9.18.0.
1. [x] **Recursive Perflab tests cause the resolver to stop responding.**
This issue might be the simplest to start with because the behavior
observed seems to be consistent rather than intermittent. Namely,
all Perflab jobs which test a resolver seem to crank out a response
rate of some 70-120 kQPS at the beginning of the test and then...
the resolver stops responding indefinitely. While Perflab was not
designed with recursive tests in mind and therefore we can treat its
recursive results with a grain of salt, it certainly should not be
reporting zeros all over the place.
- https://perflab.isc.org/#/config/run/5bf195dd83ba91a870b2976f/
- https://perflab.isc.org/#/config/run/5cd6a166643076f6c1f6c26f/
- https://perflab.isc.org/#/config/run/5db74b6264458967f762143a/
- https://perflab.isc.org/#/config/run/5db74b7264458967f762143b/
- https://perflab.isc.org/#/config/run/5db74c2764458967f7621440/
- https://perflab.isc.org/#/config/run/5db74c3464458967f7621441/
(Resolved by !5500.)
2. [x] **`respdiff` tests are *sometimes* slow.**
Ever since we merged the dispatch branch, the `respdiff` tests
started failing *intermittently* for `main` (and only `main`)
because of timeouts.
- [job 2016337][1]: pass, ~2m30s per each 10,000 queries
- [job 2016622][2]: pass, ~2m45s per each 10,000 queries
- [job 2017990][3]: pass, ~2m30s per each 10,000 queries
- [job 2020093][4]: fail, 7+ minutes per each 10,000 queries
- [job 2023057][5]: fail, 16+ minutes per each 10,000 queries
- [job 2023490][6]: pass, ~2m40s per each 10,000 queries
I do not think varying CI runner stress can be blamed for this, not
for discrepancies this large. It also never happened before merging
!4601, AFAIK.
3. [x] **A lot of "stress" test graphs indicate growing memory use.** #3002
While testing October BIND 9 releases, one of the 1-hour "stress"
tests run in recursive mode for BIND 9.17.19 yielded a graph which
indicates that memory use growth over time might be an issue.
https://wiki.isc.org/bin/viewfile/QA/BindQaResults_9_11_36?filename=bind-9.17.19-linux-amd64-recursive-1h.png;rev=1
However, that phenomenon was not observable for other OS/arch
combinations this specific code revision was tested with.
It was also not observable on the *same* OS/arch combination for a
very similar code revision (the code differences should not have any
effect on memory use patterns):
https://wiki.isc.org/bin/viewfile/QA/BindQaResults_9_11_36?filename=bind-9.17.19-linux-amd64-recursive-1h.png;rev=2
Pre-release tests run for BIND 9.17.20 confirmed that memory leaks
are a common thing when `named` is used as a recursive resolver.
More details are available in #3002.
The "stress" tests are run on isolated VMs and despite being pretty
synthetic (fixed traffic pattern, everything happens on one machine,
etc.), they have a history of being very stable, so typical issues
like test host load varying over time etc. are not a factor here.
4. [x] **Lame servers with IPv6 unreachable cause hang on shutdown.** #2927
5. [x] **resolver test fails intermittently** #3013
See https://gitlab.isc.org/isc-projects/bind9/-/jobs/2054296
```
I:resolver:query count error: 6 NS records: expected queries 10, actual 11
I:resolver:failed
```
6. [x] **Assertion failed in `dns_resolver_logfetch()`** #2962
7. [x] **Assertion failed in `dns_dispatch_gettcp()`** #2963
8. [x] **Assertion failed in `dns_resolver_destroyfetch()`** #2969
9. [x] **ThreadSanitizer issues with adb** #2978 #2979
10. [x] **fctx_cancelquery() attempts to process a query which has already been freed** #3018
11. [x] **premature TCP connection closure leaks fetch contexts (hang on shutdown)** #3026
12. [ ] **validator loops can cause shutdown hang** #3033
13. [ ] **ADB finds for a broken zone may cause fetch contexts to hang** #3037
14. [ ] **ASAN error in fctx_cancelquery()** #3102
I decided to open a single issue for all of the above problems because I
sense they are somehow related and I hope that fixing the root cause of
one of them will eliminate the other ones as well.
[1]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2016337
[2]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2016622
[3]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2017990
[4]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2020093
[5]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2023057
[6]: https://gitlab.isc.org/isc-projects/bind9/-/jobs/2023490

Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/3059
Follow-up from "Draft: Resolve #3055 by examining RTM_NEWADDR, RTM_DELADDR messages contents"
2021-12-20T11:58:16Z
Evan Hunt

The following discussion from !5638 should be addressed:
- [ ] @marka started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5638#note_254568): (+3 comments)
> We could also just take the RTM_NEWADDR and RTM_DELADDR content and add/delete the interfaces individually rather than scanning at all. Leave scanning for startup / reconfiguration. Queue up events while scanning then process any queued events at the end.
Also, see #3064.

https://gitlab.isc.org/isc-projects/bind9/-/issues/4559
Convert DNS_GETDB_ into struct with 1-bit long booleans
2024-02-02T13:23:23Z
Ondřej Surý

See https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/8683#note_433377 for details:
> Going step further, I think it can very well be a struct with booleans. It should cost nothing because compiler is not stupid nowadays and it will make decoding values in coredumps easier.
>
> (To be clear - I mean something like this: !6902 (merged))

https://gitlab.isc.org/isc-projects/bind9/-/issues/2697
Further taskmgr refactoring
2022-03-01T09:56:48Z
Ondřej Surý

Just dumping notes/ideas:
* schedule events directly onto the worker loops, not tasks
* thus events will have to attach to task (or just reference count running events)
* as the last event schedule conditional task cleanup (move the logic from task_run there)
* use isc_queue_t to store task events instead of locked LIST to remove contention
This should further simplify the taskmgr logic.

Not planned
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/2450
Follow-up from "Draft: Resolve "XoT xfrin""
2022-05-23T09:10:40Z
Ondřej Surý

The following discussion from !4571 should be addressed:
- [ ] @each started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/4571#note_189777):
> Rebased, squashed, pushed some suggestions, including CHANGES, release note, documentation (which may not be very good, I'm not sure I have the TLS terminology correct), added setter/getter functions for a few things that were left out, and a few other bits and bobs.
>
> I think `dns_transport` should be `isc_transport` so that it can be used with the netmgr and as a parameter to `isc_tlsctx_createserver()`. I almost made that change already but decided to leave it for now so that it would be easier to review the changes I've already made (moving a file makes it harder to read diffs).
>
> We urgently MUST create key and cert files so that we can test with something other than "ephemeral" in the system tests. I suspect non-ephemeral configurations may not work for DoT currently, and I'm certain they don't work for XoT.
>
> I'm about to reveal some possibly-embarrassing ignorance about TLS: does it make _sense_ to reference `tls` statements in `primaries`? `isc_tlsctx_createclient()` doesn't take any parameters, so I'm not sure what `tls <anything-but-ephemeral>` would even do. (I already confirmed it's basically a no-op by configuring it with "tls whatever", a configuration that uses /dev/null for both key and cert files, and it worked fine.)
>
> So, is configuring client-side TLS parameters a thing we want to be able to do but haven't implemented yet? Or is it a thing nobody ever does, and we should revise the syntax so as not to imply it's possible?

Not planned