BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues
2023-06-29T13:04:00Z

https://gitlab.isc.org/isc-projects/bind9/-/issues/4186
Name Buffer Truncation
2023-06-29T13:04:00Z
Markus Vervier
### Summary
A truncation of the name of memory pools was found which might lead to unintended behavior
or incorrect debugging output.
A memory pool structure `isc_mempool` has a member field `name` with a capacity of 16 bytes, as
shown in the following listing from file `lib/isc/mem.c`:
~~~c
struct isc_mempool {
    /* always unlocked */
    unsigned int magic;
    isc_mem_t *mctx;              /*%< our memory context */
    ISC_LINK(isc_mempool_t) link; /*%< next pool in this mem context */
    element *items;               /*%< low water item list */
    size_t size;                  /*%< size of each item on this pool */
    size_t allocated;             /*%< # of items currently given out */
    size_t freecount;             /*%< # of items on reserved list */
    size_t freemax;               /*%< # of items allowed on free list */
    size_t fillcount;             /*%< # of items to fetch on each fill */
    /*%< Stats only. */
    size_t gets; /*%< # of requests to this pool */
    /*%< Debugging only. */
    char name[16]; /*%< printed name in stats reports */
};
~~~
In the function `dns_zonemgr_create()` a string of size 16 without the terminating NUL byte is passed
on to function `isc_mem_setname()`, leading to silent truncation of the last character in that string
as shown in the following listing:
~~~c
for (size_t i = 0; i < zmgr->workers; i++) {
    isc_mem_create(&zmgr->mctxpool[i]);
    isc_mem_setname(zmgr->mctxpool[i], "zonemgr-mctxpool");
    // MARK truncation / off by one (namebuffer is 16 bytes only)
}
~~~
This issue is informational since the truncation has no security implications, but could lead to
incorrect assumptions or functionality defects.
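The off-by-one can be reproduced outside BIND with a minimal sketch. The helpers `copy_name()` and `copy_name_checked()` below are hypothetical and are not BIND's `isc_mem_setname()`; they only assume a 16-byte destination like the `name` member in the listing above. The first silently truncates and reports how many bytes were lost; the second shows what an assertion of the kind recommended in this report might look like.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the 16-byte `name` member from the listing above. */
#define NAME_CAPACITY 16

/*
 * Hypothetical helper (not BIND's isc_mem_setname()): copy `src` into a
 * fixed 16-byte buffer, truncating if needed. Returns the number of
 * bytes cut off, so 0 means the name fit with its terminating NUL.
 */
size_t
copy_name(char dst[NAME_CAPACITY], const char *src) {
    size_t len = strlen(src);
    size_t kept = (len < NAME_CAPACITY) ? len : NAME_CAPACITY - 1;

    memcpy(dst, src, kept);
    dst[kept] = '\0';
    return len - kept;
}

/*
 * Hypothetical strict variant: abort instead of truncating. The name
 * must be non-empty and shorter than 16 bytes (NUL not counted).
 */
void
copy_name_checked(char dst[NAME_CAPACITY], const char *src) {
    size_t len = strlen(src);

    assert(len > 0 && len < NAME_CAPACITY);
    memcpy(dst, src, len + 1); /* includes the terminating NUL */
}
```

With these helpers, `copy_name(buf, "zonemgr-mctxpool")` stores `zonemgr-mctxpoo` and returns 1, because the literal is exactly 16 characters long and the last one has to make room for the NUL byte.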
### BIND version used
BIND 9.19.13 (Development Release) <id:66a3c6b>
### Possible fixes
X41 recommends either increasing the buffer size or shortening the name value, and also adding an
assertion to the `isc_mem_create()` function that ensures the name size is larger than zero and less
than 16 bytes without the terminating NUL byte.
Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4164
Investigate performance impact of UDP_GRO
2023-06-27T07:06:24Z
Petr Špaček (pspacek@isc.org)
### Description
Mention of UDP_GRO in [Linux udp man page](https://man.archlinux.org/man/udp.7) sounds worth investigating:
> #### UDP_GRO (since Linux 5.0)
> Enables UDP receive offload. If enabled, the socket may receive multiple datagrams worth of data as a single large buffer, together with a cmsg(3) that holds the segment size. This option is the inverse of segmentation offload. It reduces receive cost by handling multiple datagrams worth of data as a single large packet in the kernel receive path, even when that exceeds MTU. This option should not be used in code intended to be portable.
More reading:
https://developers.redhat.com/articles/2021/11/05/improve-udp-performance-rhel-85
### Request
- Investigate if UDP receipt is even a bottleneck.
- Investigate if UDP_GRO makes a difference and is worth messing with (I suppose in libuv).
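For experimentation, enabling the option on a plain socket might look like the sketch below. It is not taken from BIND or libuv. The function names are hypothetical, and the fallback values for `SOL_UDP` (17) and `UDP_GRO` (104) are assumptions copied from the Linux headers for systems whose libc headers do not define them. The second helper only does the arithmetic implied by the man page text: a coalesced buffer is split into datagrams of the segment size reported in the control message, with a possibly shorter last one.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef SOL_UDP
#define SOL_UDP 17 /* same value as IPPROTO_UDP */
#endif
#ifndef UDP_GRO
#define UDP_GRO 104 /* assumption: value from Linux <linux/udp.h> */
#endif

/*
 * Hypothetical helper: ask the kernel to coalesce received datagrams.
 * Returns 0 on success, -1 on error (for example on kernels or
 * platforms without UDP_GRO support).
 */
int
enable_udp_gro(int fd) {
    int on = 1;

    return setsockopt(fd, SOL_UDP, UDP_GRO, &on, sizeof(on));
}

/*
 * Hypothetical helper: number of datagrams contained in a coalesced
 * buffer of `total_len` bytes when each segment is `segment_size`
 * bytes long, except possibly the last one. A segment size of 0 is
 * treated as "no coalescing": the buffer is a single datagram.
 */
size_t
gro_segment_count(size_t total_len, size_t segment_size) {
    if (segment_size == 0) {
        return total_len > 0 ? 1 : 0;
    }
    return (total_len + segment_size - 1) / segment_size;
}
```

A benchmark would call `enable_udp_gro()` once per receiving socket and then compare receive-path cost with and without it; whether that is a bottleneck at all is the open question in this request.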
### Links / references
Long-term

https://gitlab.isc.org/isc-projects/bind9/-/issues/4162
SHA-1 removal
2023-06-27T07:27:14Z
Petr Špaček (pspacek@isc.org)
### Description
From [NIST announcement](https://csrc.nist.gov/news/2022/nist-transitioning-away-from-sha-1-for-all-apps):
> As a result, NIST will transition away from the use of SHA-1 for applying cryptographic protection to **all applications** by December 31, 2030
### Request
Don't get caught asleep at the wheel.
### Links / references
Send questions about the transition in an email to sha-1-transition@nist.gov. Visit the [Policy on Hash Functions](https://csrc.nist.gov/projects/hash-functions/nist-policy-on-hash-functions) page to learn more.
Long-term
2030-12-31

https://gitlab.isc.org/isc-projects/bind9/-/issues/4161
Support quantum safe DNSSEC algorithms
2023-06-27T07:27:28Z
Petr Špaček (pspacek@isc.org)
### Description
Reportedly, the US government is going to mandate post-quantum algorithm support from 2026 onward, with no legacy algorithms allowed after 2033.
### Request
Explore how we can integrate quantum safe algorithms for early experimentation.
Many algorithms are already available as an OpenSSL provider here: https://github.com/open-quantum-safe/oqs-provider
### Additional details
* [FALCON implementation in PowerDNS](https://indico.dns-oarc.net/event/42/contributions/902/attachments/871/1601/Post-Quantum%20DNSSEC%20with%20FALCON-512%20and%20PowerDNS(2).pdf)
* [Verisign's presentation](https://indico.dns-oarc.net/event/46/contributions/985/attachments/938/1728/OARC40-ResearchAgendaForAPQCDNSSEC-Final.pdf)
Word of mouth from Red Hat crypto people I talked to:
Right now it seems that NIST might standardize 5 algorithms, with several variants of each algorithm, with the intent to provide 128/256-bit equivalents of security.
Rambling about candidate algorithms for DNSSEC:
- HSS/LMS & XMSS^MT algorithms are extremely susceptible to key reuse. One key reuse ruins the whole thing. Don't use them.
- Falcon-512 has the smallest signatures by a large margin (around 666 bytes). CRYSTALS-Dilithium is built on the same principle but has larger signatures (about 2420 bytes). The problem is, both are reportedly built on shaky ground because we as humankind don't fully understand the math behind them, so the chances of these algorithms being broken in a couple of years are non-negligible.
- The remaining candidate algorithm is SPHINCS+-128. That one is the most solid because it's based on ordinary hashes, which are well understood. The catch is that one signature is about 7856 bytes :exploding_head:
Consequently, this sounds like we need very good, very solid TCP/TLS/QUIC support in client and server, so we are not limited to UDP packet sizes. That's IMHO the only way to go without significantly changing the protocol.
(Or we can go and engineer DNS 2.0 :grinning:)
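The size argument can be made concrete with a rough sketch. The signature sizes are the ones quoted above. The 1232-byte limit is an assumption (the EDNS buffer size recommended by DNS Flag Day 2020), and `fits_in_udp()` is a hypothetical helper that ignores everything except raw byte counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: EDNS buffer size recommended by DNS Flag Day 2020. */
#define UDP_PAYLOAD_LIMIT 1232

/* Approximate signature sizes in bytes, as quoted in the text above. */
enum {
    SIG_FALCON_512 = 666,
    SIG_DILITHIUM = 2420,
    SIG_SPHINCS_PLUS_128 = 7856,
};

/*
 * Hypothetical helper: true if a response carrying `nsigs` signatures
 * of `sig_bytes` each, plus `other_bytes` of remaining message data,
 * stays within the assumed UDP payload limit.
 */
bool
fits_in_udp(size_t sig_bytes, size_t nsigs, size_t other_bytes) {
    return sig_bytes * nsigs + other_bytes <= UDP_PAYLOAD_LIMIT;
}
```

Under these assumptions, a single Falcon-512 signature leaves room for a few hundred bytes of other data, two of them already exceed the limit, and a single Dilithium or SPHINCS+-128 signature does not fit at all.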
### Links / references
Long-term
2026-01-01

https://gitlab.isc.org/isc-projects/bind9/-/issues/4160
doth system test is failing on MacOS - both system's CA store tests based are failing
2023-09-05T21:16:42Z
Mark Andrews
Looks like we are getting a different error state and the expected message is not emitted.
```
S:doth:2023-06-22T16:29:52+1000
T:doth:1:A
A:doth:System test doth
I:doth:PORTS:5300,5301,5302,5303,5304,5305,5306,5307,5308,5309,5310,5311,5312
I:doth:starting servers
I:doth:checking DoT query (with TLS verification using the system's CA store, failure expected) (1)
setup_libs()
setup_system()
create_search_list()
ndots is 1.
timeout is 0.
retries is 3.
get_server_list()
make_server(::1)
dig_query_setup
parse_args()
making new lookup
make_empty_lookup()
make_empty_lookup() = 0x1061e4000->references = 1
main parsing +tls
main parsing +noadd
main parsing +nosea
main parsing +nostat
main parsing +noquest
main parsing +nocmd
main parsing -p
main parsing +tls-ca
main parsing +tls-hostname=srv01.crt01.example.com
main parsing @10.53.0.1
make_server(10.53.0.1)
main parsing .
clone_lookup()
make_empty_lookup()
make_empty_lookup() = 0x1061e5800->references = 1
clone_server_list()
make_server(10.53.0.1)
looking up .
main parsing SOA
main parsing -d
dig_startup()
start_lookup()
setup_lookup(0x1061e5800)
resetting lookup counter.
idn_textname: .
using root origin
recursive query
AD query
add_question()
starting to render the message
add_opt()
done rendering
create query 0x10613c8c0 linked to lookup 0x1061e5800
dighost.c:2141:lookup_attach(0x1061e5800) = 2
dighost.c:2651:new_query(0x10613c8c0) = 1
do_lookup()
start_tcp(0x10613c8c0)
dighost.c:2928:query_attach(0x10613c8c0) = 2
query->servname = 10.53.0.1
dighost.c:3016:query_attach(0x10613c8c0) = 3
isc_tls_create -> 0x106086000
initialize_tls -> success
tls_do_bio: sock->tlsstream.state=0
SSL_do_handshake -> -1
tls_do_bio: tls_try_handshake -> -1
tls_do_bio: tls_status=2 saved_errno=0
tls_do_bio: tls_process_outgoing -> 303
tls_do_bio: sock->tlsstream.state=1
tls_do_bio: tls_status=2 saved_errno=0
tls_do_bio: tls_process_outgoing -> 0
tls_do_bio: SSL_ERROR_WANT_READ
tls_readcb: success
tls_do_bio: sock->tlsstream.state=1
SSL_do_handshake -> -1
tls_do_bio: tls_status=5 saved_errno=0
tls_do_bio: tls_process_outgoing -> 7
tls_do_bio: sock->tlsstream.state=1
tls_do_bio: tls_status=5 saved_errno=0
tls_do_bio: tls_process_outgoing -> 0
tls_do_bio: SSL_ERROR_WANT_READ
tls_readcb: end of file
tls_failed_read_cb: end of file
tls_failed_read_cb: is is TLS counterpart of isc__nm_failed_connect_cb()
tls_call_connect_cb: calling sock->connect_cb(0x10613ed80, end of file, 0x10610a800)
streamdns_transport_connected: end of file -> operation canceled
streamdns_transport_connected: sock->streamdns.tls_verify_error=unable to get local issuer certificate
tcp_connected()
tcp_connected(0x10613ef40, operation canceled, 0x10613c8c0)
dighost.c:3526:lookup_attach(0x1061e5800) = 3
in cancel handler
dighost.c:3546:_cancel_lookup()
canceling pending query 0x10613c8c0, belonging to 0x1061e5800
dighost.c:1729:query_detach(0x10613c8c0) = 2
dighost.c:2774:query_detach(0x10613c8c0) = 1
check_if_done()
list empty
dighost.c:3549:query_detach(0x10613c8c0) = 0
dighost.c:3549:destroy_query(0x10613c8c0) = 0
dighost.c:1687:lookup_detach(0x1061e5800) = 2
dighost.c:3550:lookup_detach(0x1061e5800) = 1
clear_current_lookup()
lookup cleared
dighost.c:1820:lookup_detach(0x1061e5800) = 0
destroy_lookup
freeing server 0x106109e00 belonging to 0x1061e5800
start_lookup()
check_if_done()
list empty
shutting down
shutdown
destroy_lookup
freeing server 0x106108a00 belonging to 0x1061e4000
cancel_all()
destroy_libs()
flush_server_list()
destroy DST lib
Removing log context
Destroy memory
I:doth:failed
I:doth:exit status: 1
I:doth:stopping servers
R:doth:FAIL
E:doth:2023-06-22T16:29:57+1000
FAIL: doth
============================================================================
Testsuite summary for BIND 9.19.15-dev
============================================================================
# TOTAL: 1
# PASS: 0
# SKIP: 0
# XFAIL: 0
# FAIL: 1
# XPASS: 0
# ERROR: 0
============================================================================
See bin/tests/system/run.log
Please report to https://gitlab.isc.org/isc-projects/bind9/-/issues/new?issuable_template=Bug
============================================================================
make[3]: *** [run.log] Error 1
make[2]: *** [check-TESTS] Error 2
make[1]: *** [check-am] Error 2
make: *** [check-recursive] Error 1
```
[debugging diff](/uploads/a1a79a1bb5239c456385dd40fc4a08f8/diff)
Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4158
investigate performance impact of huge pages
2023-06-27T07:27:35Z
Petr Špaček (pspacek@isc.org)
- We do lots and lots of allocation.
- jemalloc supports huge pages to limit metadata overhead and TLB misses
- that together with tuning sysctl `vm.nr_hugepages` can be interesting.
Reading:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/configuring-huge-pages_monitoring-and-managing-system-status-and-performance
Long-term

https://gitlab.isc.org/isc-projects/bind9/-/issues/4153
Run system tests in network namespaces
2023-12-14T15:27:50Z
Tom Krizek
Executing system tests under pytest should support isolation using network namespaces on platforms where it's possible. It would simplify running the tests (no root setup required), prevent any network interference, remove weird quirks with port assignment and make it easier to capture relevant traffic into PCAP.
Not planned
Tom Krizek

https://gitlab.isc.org/isc-projects/bind9/-/issues/4130
SSL version reporting is inconsistent
2023-06-06T17:12:40Z
Michal Nowak
With isc-projects/bind9!7998, the `./configure` script now prints the SSL version but is inconsistent with `named -V` on BSD platforms.
[**OpenBSD**](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3444056/raw)
```
Library versions:
OpenSSL: 2.0.0
```
```
compiled with OpenSSL version: LibreSSL 3.7.2
linked to OpenSSL version: LibreSSL 3.7.2
```
[**FreeBSD 12**](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3444054/raw)
```
Library versions:
OpenSSL:
```
```
compiled with OpenSSL version: OpenSSL 1.1.1q-freebsd 5 Jul 2022
linked to OpenSSL version: OpenSSL 1.1.1q-freebsd 5 Jul 2022
```
[**FreeBSD 13**](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3444055/raw)
```
Library versions:
OpenSSL: 1.1.1t
```
```
compiled with OpenSSL version: OpenSSL 1.1.1t-freebsd 7 Feb 2023
linked to OpenSSL version: OpenSSL 1.1.1t-freebsd 7 Feb 2023
```
Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4127
dangerfile can't traverse GitLab project boundary
2023-06-06T13:13:58Z
Michal Nowak
Job [#3445138](https://gitlab.isc.org/isc-private/bind9/-/jobs/3445138) failed for [c713737cdc6ca2997f75c18ad35715ffb48688e8](https://gitlab.isc.org/isc-private/bind9/-/commit/c713737cdc6ca2997f75c18ad35715ffb48688e8).
`danger-python` crashed when `Backport of MR isc-projects/bind9!7457` was present in the MR description field:
```
$ Backport of MR isc-projects/bind9!7457 ci -f
There was an error when executing dangerfile.py:
GitlabGetError at line 207: 404 Not found
Stacktrace:
File "dangerfile.py", line 207, in <module>
original_mr = proj.mergerequests.get(original_mr_id)
File "/usr/local/lib/python3.9/dist-packages/gitlab/v4/objects/merge_requests.py", line 486, in get
return cast(ProjectMergeRequest, super().get(id=id, lazy=lazy, **kwargs))
File "/usr/local/lib/python3.9/dist-packages/gitlab/exceptions.py", line 338, in wrapped_f
raise error(e.error_message, e.response_code, e.response_body) from e
Failing the build, there is 1 fail.
Feedback: https://gitlab.isc.org/isc-private/bind9/merge_requests/531#note_378750
```
It seems that `danger-python` with the current `dangerfile.py` can't traverse the GitLab project boundary from isc-private to isc-projects (and vice versa) and couldn't look for missing isc-private/bind9!531 MR commits that are in the "upstream" isc-projects/bind9!7457 MR.

https://gitlab.isc.org/isc-projects/bind9/-/issues/4119
delv occasionally hangs in tsan tests
2023-08-03T08:53:22Z
Michal Nowak
Job [#3442535](https://gitlab.isc.org/isc-private/bind9/-/jobs/3442535) failed for [97f8f0991e3879b047073a7e812e453f620d5c85](https://gitlab.isc.org/isc-private/bind9/-/commit/97f8f0991e3879b047073a7e812e453f620d5c85). Also https://gitlab.isc.org/isc-private/bind9/-/jobs/3435673. Locally, I could not reproduce it, and in CI, pytest does not present verbose enough output to identify a stuck check.
The `digdelv` system test is sometimes stuck in TSAN system tests https://gitlab.isc.org/isc-private/bind9/-/jobs/3442535, and the CI times out after an hour as a result.
So far, I have only seen this on BIND 9.18-S.
Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4118
Data race lib/dns/adb.c:1537 in clean_finds_at_name
2023-09-04T09:09:22Z
Michal Nowak
Job [respdiff-long:tsan](https://gitlab.isc.org/isc-private/bind9/-/jobs/3440993) failed for [d2fbe443b833d093f68bf4f5a1736242fc8d18a1](https://gitlab.isc.org/isc-private/bind9/-/commit/d2fbe443b833d093f68bf4f5a1736242fc8d18a1) (~"v9.18-S").
```
WARNING: ThreadSanitizer: data race
Write of size 4 at 0x000000000001 by thread T1 (mutexes: write M1, write M2):
#0 clean_finds_at_name lib/dns/adb.c:1537
#1 fetch_callback lib/dns/adb.c:4009
#2 task_run lib/isc/task.c:815
#3 isc_task_run lib/isc/task.c:896
#4 isc__nm_async_task netmgr/netmgr.c:848
#5 process_netievent netmgr/netmgr.c:920
#6 process_queue netmgr/netmgr.c:1013
#7 process_all_queues netmgr/netmgr.c:767
#8 async_cb netmgr/netmgr.c:796
#9 uv__async_io /usr/src/libuv-v1.44.1/src/unix/async.c:163
#10 isc__trampoline_run lib/isc/trampoline.c:189
Previous read of size 4 at 0x000000000001 by thread T2:
#0 findname lib/dns/resolver.c:3749
#1 fctx_getaddresses lib/dns/resolver.c:3993
#2 fctx_try lib/dns/resolver.c:4390
#3 rctx_nextserver lib/dns/resolver.c:10356
#4 rctx_done lib/dns/resolver.c:10503
#5 resquery_response lib/dns/resolver.c:8511
#6 udp_recv lib/dns/dispatch.c:638
#7 isc__nm_async_readcb netmgr/netmgr.c:2885
#8 isc__nm_readcb netmgr/netmgr.c:2858
#9 udp_recv_cb netmgr/udp.c:650
#10 isc__nm_udp_read_cb netmgr/udp.c:1057
#11 uv__udp_recvmsg /usr/src/libuv-v1.44.1/src/unix/udp.c:303
#12 isc__trampoline_run lib/isc/trampoline.c:189
Location is heap block of size 256 at 0x000000000025 allocated by thread T2:
#0 malloc ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:651
#1 mallocx lib/isc/jemalloc_shim.h:35
#2 mem_get lib/isc/mem.c:343
#3 isc__mem_get lib/isc/mem.c:761
#4 new_adbfind lib/dns/adb.c:1901
#5 dns_adb_createfind lib/dns/adb.c:2934
#6 findname lib/dns/resolver.c:3656
#7 fctx_getaddresses lib/dns/resolver.c:3993
#8 fctx_try lib/dns/resolver.c:4390
#9 rctx_nextserver lib/dns/resolver.c:10356
#10 rctx_done lib/dns/resolver.c:10503
#11 resquery_response lib/dns/resolver.c:8511
#12 udp_recv lib/dns/dispatch.c:638
#13 isc__nm_async_readcb netmgr/netmgr.c:2885
#14 isc__nm_readcb netmgr/netmgr.c:2858
#15 udp_recv_cb netmgr/udp.c:650
#16 isc__nm_udp_read_cb netmgr/udp.c:1057
#17 uv__udp_recvmsg /usr/src/libuv-v1.44.1/src/unix/udp.c:303
#18 isc__trampoline_run lib/isc/trampoline.c:189
Mutex M1 is already destroyed.
Mutex M2 is already destroyed.
Thread T1 (running) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962
#1 isc_thread_create lib/isc/thread.c:73
#2 isc__netmgr_create netmgr/netmgr.c:311
#3 isc_managers_create lib/isc/managers.c:31
#4 create_managers bin/named/main.c:1042
#5 setup bin/named/main.c:1313
#6 main bin/named/main.c:1594
Thread T2 (running) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962
#1 isc_thread_create lib/isc/thread.c:73
#2 isc__netmgr_create netmgr/netmgr.c:311
#3 isc_managers_create lib/isc/managers.c:31
#4 create_managers bin/named/main.c:1042
#5 setup bin/named/main.c:1313
#6 main bin/named/main.c:1594
SUMMARY: ThreadSanitizer: data race lib/dns/adb.c:1537 in clean_finds_at_name
```
Not planned

https://gitlab.isc.org/isc-projects/bind9/-/issues/4112
"serve-stale:check prefetch processing of a stale CNAME target" fails on FreeBSD 13
2023-07-07T09:25:45Z
Michal Nowak
Job [#3435305](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3435305) failed for ff3d25a47f9f969669b2e4f5cde10c50f9cdd171 (~"v9.18").
On FreeBSD 13.2, the `check prefetch processing of a stale CNAME target` check [failed](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3435305) [twice](https://gitlab.isc.org/isc-private/bind9/-/jobs/3431983) in recent days:
```
2023-06-02 01:09:52 INFO:serve-stale I:serve-stale_tmp_q8yamlle:check prefetch processing of a stale CNAME target (214)
2023-06-02 01:09:55 INFO:serve-stale I:serve-stale_tmp_q8yamlle:failed
```
This was expected:
```
target.example. 2 IN A 10.53.0.2
```
But this was the answer:
```
target.example. 30 IN A 10.53.0.2
```
We got a stale answer after client timeout (`; EDE: 3 (Stale Answer): (client timeout)`), query time was 1840 msec. Locally, I get 2 msec and a non-stale answer.
I was unable to reproduce the problem locally.

https://gitlab.isc.org/isc-projects/bind9/-/issues/4104
ZoneQuota stats counter is not counting everything
2024-02-24T07:55:05Z
Ondřej Surý
The `ZoneQuota` counter should count every `fcount_incr()` call that returns `ISC_R_QUOTA`, but it does so in only a single place. The counting should be moved to `fctx_incr()`.
May 2024 (9.18.27, 9.18.27-S1, 9.19.24)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/4102
Use liburcu QSBR flavor
2023-07-26T09:59:54Z
Ondřej Surý
The QSBR flavor is faster, but also requires rcu_quiescent_state() to be called periodically from every RCU thread.
Not planned
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/4100
REQUIRE(((multi) != ((void *)0) && ((const isc__magic_t *)(multi))->magic == ((('q') << 24 | ('p') << 16 | ('m') << 8 | ('v'))))) in qp.c
2023-05-30T08:09:36Z
Michal Nowak
Job [#3424074](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3424074) failed for 2e8ceeea14e336980c9da80449b84ecd16afc7e5.
The `qpmulti_test` unit test failed.
```
[==========] Running 1 test(s).
[ RUN ] qpmulti
qp.c:634: REQUIRE(((multi) != ((void *)0) && ((const isc__magic_t *)(multi))->magic == ((('q') << 24 | ('p') << 16 | ('m') << 8 | ('v'))))) failed, back trace
/builds/isc-projects/bind9/lib/isc/.libs/libisc-9.19.14-dev.so(+0x2ddb2)[0x7f231ea2ddb2]
/builds/isc-projects/bind9/lib/isc/.libs/libisc-9.19.14-dev.so(isc_assertion_failed+0xa)[0x7f231ea2dd2d]
/builds/isc-projects/bind9/lib/dns/.libs/libdns-9.19.14-dev.so(+0xb9fe3)[0x7f231e0b9fe3]
/lib64/liburcu.so.6(+0x37a9)[0x7f231d64e7a9]
/lib64/libpthread.so.0(+0x81da)[0x7f231dbdd1da]
/lib64/libc.so.6(clone+0x43)[0x7f231ceafe73]
../../tests/unit-test-driver.sh: line 36: 13597 Aborted (core dumped) "${TEST_PROGRAM}"
FAIL qpmulti_test (exit status: 134)
```
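The expanded `REQUIRE` in the log is a magic-number validity check: the pointer must be non-NULL and the first word of the object must equal a four-character tag. The sketch below illustrates the general pattern only. `struct multi`, `multi_valid()` and `multi_invalidate()` are hypothetical stand-ins rather than BIND's definitions, and the sketch does not attempt to explain why the check failed in this job.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a 32-bit tag, as in the expanded REQUIRE above. */
#define MAGIC(a, b, c, d)                                         \
    ((uint32_t)(a) << 24 | (uint32_t)(b) << 16 | (uint32_t)(c) << 8 | \
     (uint32_t)(d))

#define QPMULTI_MAGIC MAGIC('q', 'p', 'm', 'v')

/* Hypothetical stand-in for the real structure; only the tag matters. */
struct multi {
    uint32_t magic;
    int payload;
};

/* True only for a non-NULL pointer whose tag matches. */
bool
multi_valid(const struct multi *m) {
    return m != NULL && m->magic == QPMULTI_MAGIC;
}

/* Clearing the tag makes any later validity check on the object fail. */
void
multi_invalidate(struct multi *m) {
    m->magic = 0;
}
```

In this pattern, a function taking such an object would assert `multi_valid(m)` on entry, which is what the failed `REQUIRE` at `qp.c:634` corresponds to.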
There's no core file or full backtrace in the logs.

https://gitlab.isc.org/isc-projects/bind9/-/issues/4099
[PATCH] +shortans
2023-05-30T13:24:54Z
Fredrick Brennan
# Patch
```diff
From 6041dcb60313b5fd81076bd53713b8a53fb95f87 Mon Sep 17 00:00:00 2001
From: Fredrick Brennan <copypaste@kittens.ph>
Date: Sat, 27 May 2023 08:23:45 -0400
Subject: [PATCH] [dig] +shortans
---
bin/dig/dig.c | 48 ++++++++++++++++++++++++++++++++++++------------
bin/dig/dig.rst | 4 ++++
doc/man/dig.1in | 5 +++++
3 files changed, 45 insertions(+), 12 deletions(-)
diff --git a/bin/dig/dig.c b/bin/dig/dig.c
index 694924c0f2..dd9bfcd4a7 100644
--- a/bin/dig/dig.c
+++ b/bin/dig/dig.c
@@ -286,6 +286,8 @@ help(void) {
"short\n"
" form of answers - global "
"option)\n"
+ " +[no]shortans (equivalent to `+noall"
+ "+authority +answer`)\n"
" +[no]showbadcookie (Show BADCOOKIE message)\n"
" +[no]showsearch (Search with intermediate "
"results)\n"
@@ -1901,18 +1903,40 @@ plus_option(char *option, bool is_batchfile, bool *need_clone,
goto invalid_option;
}
switch (cmd[3]) {
- case 'r': /* short */
- FULLCHECK("short");
- short_form = state;
- if (state) {
- printcmd = false;
- lookup->section_additional = false;
- lookup->section_answer = true;
- lookup->section_authority = false;
- lookup->section_question = false;
- lookup->comments = false;
- lookup->stats = false;
- lookup->rrcomments = -1;
+ case 'r': /* shor… */
+ switch(cmd[4]) {
+ case 't': /* short… */
+ switch(cmd[5]) { /* short */
+ case '\0':
+ FULLCHECK("short");
+ short_form = state;
+ if (state) {
+ printcmd = false;
+ lookup->section_additional = false;
+ lookup->section_answer = true;
+ lookup->section_authority = false;
+ lookup->section_question = false;
+ lookup->comments = false;
+ lookup->stats = false;
+ lookup->rrcomments = -1;
+ }
+ break;
+ case 'a': /* shortans */
+ FULLCHECK("shortans");
+ lookup->section_question = !state;
+ lookup->section_authority = state;
+ lookup->section_answer = state;
+ lookup->section_additional = !state;
+ lookup->comments = !state;
+ lookup->stats = !state;
+ printcmd = !state;
+ break;
+ default:
+ goto invalid_option;
+ }
+ break;
+ default:
+ goto invalid_option;
}
break;
case 'w': /* showsearch */
diff --git a/bin/dig/dig.rst b/bin/dig/dig.rst
index a5bfb86556..75237f0ae0 100644
--- a/bin/dig/dig.rst
+++ b/bin/dig/dig.rst
@@ -571,6 +571,10 @@ abbreviation is unambiguous; for example, :option:`+cd` is equivalent to
form. This option always has a global effect; it cannot be set globally and
then overridden on a per-lookup basis.
+.. option:: +shortans, +noshortans
+
+ This option expands to :option:`+noall` :option:`+authority` :option:`+answer`.
+
.. option:: +showbadcookie, +noshowbadcookie
This option toggles whether to show the message containing the
diff --git a/doc/man/dig.1in b/doc/man/dig.1in
index d5f42ed852..1607d7f2ca 100644
--- a/doc/man/dig.1in
+++ b/doc/man/dig.1in
@@ -663,6 +663,11 @@ then overridden on a per\-lookup basis.
.UNINDENT
.INDENT 0.0
.TP
+.B +shortans, +noshortans
+This option expands to \fI\%+noall\fP \fI\%+authority\fP \fI\%+answer\fP\&.
+.UNINDENT
+.INDENT 0.0
+.TP
.B +showbadcookie, +noshowbadcookie
This option toggles whether to show the message containing the
BADCOOKIE rcode before retrying the request or not. The default
--
2.40.1
```
# Detached signature
```gpg
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQS1rLeeEfG/f0nzK7hYUwVpYvFOWAUCZHH3EAAKCRBYUwVpYvFO
WOiHAP9uTERa4rrztKKeqk1TSLkqP5RgDnBbgxcbTkHAt5q7/wEAvffIjE5SUX8P
RpxZ9yS2geRmVXwyLDiS4FjxN3u7vgE=
=i92K
-----END PGP SIGNATURE-----
```

https://gitlab.isc.org/isc-projects/bind9/-/issues/4092
timer.c:223:timerevent_destroy(): fatal error: RUNTIME_CHECK(isc_mutex_unlock((&timer->lock)) == ISC_R_SUCCESS) failed
2023-05-25T07:38:17Z
Michal Nowak
Job [#3411550](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3411550) failed for 66254cf56d7072833db6d8744e6bcef2109b72e2.
BIND 9.18 `task` unit test failed on `unit:gcc:oraclelinux8:amd64`.
```
[==========] Running 11 test(s).
[ RUN ] manytasks
[ OK ] manytasks
[ RUN ] all_events
[ OK ] all_events
[ RUN ] basic
timer.c:223:timerevent_destroy(): fatal error: RUNTIME_CHECK(isc_mutex_unlock((&timer->lock)) == ISC_R_SUCCESS) failed
../../tests/unit-test-driver.sh: line 36: 8595 Aborted (core dumped) "${TEST_PROGRAM}"
I:task_test:Core dump found: ./core.8595
D:task_test:backtrace from ./core.8595 start
[New LWP 8636]
[New LWP 8595]
[New LWP 8637]
[New LWP 8638]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/builds/isc-projects/bind9/tests/isc/.libs/lt-task_test'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f8c5b302aff in raise () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7f8c3bfff700 (LWP 8636))]
Thread 4 (Thread 0x7f8c412fa700 (LWP 8638)):
#0 0x00007f8c5b68846c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x00007f8c5c4af725 in run (uap=0x7f8c591e1000) at timer.c:632
manager = 0x7f8c591e1000
now = {seconds = 1684976709, nanoseconds = 609640403}
result = <optimized out>
__func__ = "run"
#2 0x00007f8c5c4b4b20 in isc__trampoline_run (arg=0x1973730) at trampoline.c:189
trampoline = 0x1973730
result = <optimized out>
#3 0x00007f8c5b6821da in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x00007f8c5b2ede73 in clone () from /lib64/libc.so.6
No symbol table info available.
Thread 3 (Thread 0x7f8c40af9700 (LWP 8637)):
#0 0x00007f8c5b3e4017 in epoll_wait () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f8c5c2460f9 in uv.io_poll () from /lib64/libuv.so.1
No symbol table info available.
#2 0x00007f8c5c234a74 in uv_run () from /lib64/libuv.so.1
No symbol table info available.
#3 0x00007f8c5c47aa6c in nm_thread (worker0=0x7f8c591f75b8) at netmgr/netmgr.c:698
r = <optimized out>
worker = 0x7f8c591f75b8
mgr = 0x7f8c59036000
__func__ = "nm_thread"
#4 0x00007f8c5c4b4b20 in isc__trampoline_run (arg=0x1974330) at trampoline.c:189
trampoline = 0x1974330
result = <optimized out>
#5 0x00007f8c5b6821da in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#6 0x00007f8c5b2ede73 in clone () from /lib64/libc.so.6
No symbol table info available.
Thread 2 (Thread 0x7f8c5ce04140 (LWP 8595)):
#0 0x00007f8c5b3ae9a8 in nanosleep () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f8c5b3dbf48 in usleep () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f8c5c4ac692 in isc__taskmgr_destroy (managerp=managerp@entry=0x607348 <taskmgr>) at task.c:1041
No locals.
#3 0x00007f8c5c49b4b0 in isc_managers_destroy (netmgrp=netmgrp@entry=0x607338 <netmgr>, taskmgrp=taskmgrp@entry=0x607348 <taskmgr>, timermgrp=timermgrp@entry=0x607340 <timermgr>) at managers.c:99
No locals.
#4 0x00000000004052ee in teardown_managers (state=<optimized out>) at isc.c:84
No locals.
#5 0x0000000000404f64 in _teardown (state=<optimized out>) at task_test.c:91
No locals.
#6 0x00007f8c5be1702e in cmocka_run_one_test_or_fixture () from /lib64/libcmocka.so.0
No symbol table info available.
#7 0x00007f8c5be179e0 in _cmocka_run_group_tests () from /lib64/libcmocka.so.0
No symbol table info available.
#8 0x000000000040516b in main () at task_test.c:1408
r = <optimized out>
Thread 1 (Thread 0x7f8c3bfff700 (LWP 8636)):
#0 0x00007f8c5b302aff in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f8c5b2d5ea5 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f8c5c48f5c2 in isc_error_fatal (file=file@entry=0x7f8c5c4c45a6 "timer.c", line=line@entry=223, func=func@entry=0x7f8c5c4d07a0 <__func__.7544> "timerevent_destroy", format=format@entry=0x7f8c5c4c0814 "RUNTIME_CHECK(%s) failed") at error.c:72
args = {{gp_offset = 40, fp_offset = 48, overflow_arg_area = 0x7f8c3bff9d00, reg_save_area = 0x7f8c3bff9c40}}
#3 0x00007f8c5c4af15f in timerevent_destroy (event0=0x7f8c51800b00) at timer.c:225
timer = 0x7f8c591e10a0
event = 0x7f8c51800b00
__func__ = "timerevent_destroy"
#4 0x00007f8c5c48f7e9 in isc_event_free (eventp=eventp@entry=0x7f8c3bff9d48) at event.c:93
event = <optimized out>
#5 0x0000000000403449 in basic_tick (task=<optimized out>, event=<optimized out>) at task_test.c:444
No locals.
#6 0x00007f8c5c4abf17 in task_run (task=0x7f8c591e73c0) at task.c:815
dispatch_count = 0
finished = false
quantum = <optimized out>
event = 0x7f8c51800b00
result = ISC_R_SUCCESS
dispatch_count = <optimized out>
finished = <optimized out>
event = <optimized out>
result = <optimized out>
quantum = <optimized out>
__func__ = "task_run"
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__v = <optimized out>
#7 isc_task_run (task=0x7f8c591e73c0) at task.c:896
No locals.
#8 0x00007f8c5c472579 in isc__nm_async_task (worker=worker@entry=0x7f8c591f7000, ev0=ev0@entry=0x7f8c51805f80) at netmgr/netmgr.c:848
ievent = 0x7f8c51805f80
result = <optimized out>
#9 0x00007f8c5c479d78 in process_netievent (worker=worker@entry=0x7f8c591f7000, ievent=ievent@entry=0x7f8c51805f80) at netmgr/netmgr.c:920
No locals.
#10 0x00007f8c5c47a78e in process_queue (worker=worker@entry=0x7f8c591f7000, type=type@entry=NETIEVENT_TASK) at netmgr/netmgr.c:1013
next = 0x0
ievent = 0x7f8c51805f80
list = {head = 0x0, tail = 0x0}
__func__ = "process_queue"
#11 0x00007f8c5c47b23b in process_all_queues (worker=0x7f8c591f7000) at netmgr/netmgr.c:767
result = <optimized out>
type = 2
reschedule = false
reschedule = <optimized out>
type = <optimized out>
result = <optimized out>
#12 async_cb (handle=0x7f8c591f7360) at netmgr/netmgr.c:796
worker = 0x7f8c591f7000
#13 0x00007f8c5c2342f1 in uv.async_io.part () from /lib64/libuv.so.1
No symbol table info available.
#14 0x00007f8c5c245d15 in uv.io_poll () from /lib64/libuv.so.1
No symbol table info available.
#15 0x00007f8c5c234a74 in uv_run () from /lib64/libuv.so.1
No symbol table info available.
#16 0x00007f8c5c47aa6c in nm_thread (worker0=0x7f8c591f7000) at netmgr/netmgr.c:698
r = <optimized out>
worker = 0x7f8c591f7000
mgr = 0x7f8c59036000
__func__ = "nm_thread"
#17 0x00007f8c5c4b4b20 in isc__trampoline_run (arg=0x1976840) at trampoline.c:189
trampoline = 0x1976840
result = <optimized out>
#18 0x00007f8c5b6821da in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#19 0x00007f8c5b2ede73 in clone () from /lib64/libc.so.6
No symbol table info available.
D:task_test:backtrace from ./core.8595 end
FAIL task_test (exit status: 134)
```
https://gitlab.isc.org/isc-projects/bind9/-/issues/4087
Follow-up from "fix handling of TCP timeouts"
2023-11-02T16:30:30Z
Evan Hunt
The following discussion from !7937 should be addressed:
- [ ] @aram started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/7937#note_375087): (+2 comments)
> While you are addressing Ondřej's comments, would you please also look at something not strictly related to this MR, which caught my eye (cc @ondrej):
>
> ```c
> void
> dns_dispatch_resume(dns_dispentry_t *resp, uint16_t timeout);
> /*%<
> * Reset the read timeout in the socket associated with 'resp' and
> * continue reading.
> *
> * Requires:
> *\li 'resp' is valid.
> */
> ```
>
> The function is supposed to reset the read timeout, but if I am reading the code correctly, both `udp_dispatch_getnext()` and `tcp_dispatch_getnext()` (called by `dns_dispatch_resume()`) potentially can ignore the timeout value if the read operation is already ongoing. Is that by design?
>
> I think it should at least update the `resp->timeout` value with the new one, and probably call `isc_nmhandle_settimeout()` even when already reading, in case the new timeout is smaller than the remaining time of the current one.

Not planned
Evan Hunt
https://gitlab.isc.org/isc-projects/bind9/-/issues/4084
auto-tune transfers-in, transfers-out, transfers-per-ns and friends
2023-05-30T12:53:01Z
Cathy Almond
### Description
The problem, as noted in [Support ticket #21991](https://support.isc.org/Ticket/Display.html?id=21991), is that without testing in a production environment and under specific circumstances (speed of network, configuration of primaries per zone, reachability, rate of zone update propagation and so on), it's hard to know what sane values to give to the options for tuning zone transfers. The types of things that need to be optimised are:
- Speed of synchronisation/completion of refreshes following a secondary server restart
- Effective use of CPU resources so that servers are not idling while there is work that could be done
- No interruption to client services during zone refreshes (for servers that are currently client-facing)
- Effective onward zone update propagation/refreshes (for servers that are intermediaries in the zone update propagation path)
- Speed of propagation of zone updates during normal operation (i.e. not when restarting something...)
### Request
I'd like transfers-in, transfers-out, transfers-per-ns and friends to be able to auto-tune themselves based on knowledge of how the server is performing.
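For context, the options named in this request are set statically in `named.conf` today. The fragment below is only an illustration: it spells out the three options with what I understand to be their documented defaults, which is the kind of fixed setting the request would like `named` to adjust on its own.

```
options {
    transfers-in 10;      // concurrent inbound zone transfers
    transfers-out 10;     // concurrent outbound zone transfers
    transfers-per-ns 2;   // concurrent inbound transfers from any one remote server
};
```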
See other work currently ongoing:
#3883
#3914
### Links / references
Not planned
https://gitlab.isc.org/isc-projects/bind9/-/issues/4065
Could query-source be made best-effort, not preventing startup in case of failure?
2023-06-16T09:03:46Z
Petr Menšík
### Description
Could the specification of outgoing addresses be made non-fatal? Some users try to configure the outgoing address used by named, but are then surprised that it causes problems during startup, because those addresses might not yet be available.
### Request
Would it be possible to specify ``query-source 10.1.2.3 optional;``, which would behave similarly to FREEBIND for listening sockets? If the socket could not be bound, named would just use whatever default address the system provides, but it would still try the configured address first. This would also allow starting with addresses that are not yet present and appear later.
An alternative would be to delay root priming queries until the listen-on machinery detects that the source address is available. That seems a lot more complicated.
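
The best-effort behaviour requested above can be sketched with plain POSIX sockets. This is a minimal illustration, not BIND code: `open_query_socket()` is a hypothetical helper, it handles IPv4 only, and it falls back to an unbound socket on `EADDRNOTAVAIL` instead of setting the Linux `IP_FREEBIND` socket option.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a BIND API): open a UDP socket and try to bind
 * it to the preferred IPv4 source address. If that address is not
 * configured on the host (EADDRNOTAVAIL), leave the socket unbound so the
 * kernel chooses the source address when sending.
 *
 * Returns the socket descriptor, or -1 on a hard failure. '*bound' reports
 * whether the preferred address is actually in use.
 */
int
open_query_socket(const char *preferred, bool *bound) {
	struct sockaddr_in sin;
	int fd;

	*bound = false;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(0); /* let the kernel pick the port */
	if (inet_pton(AF_INET, preferred, &sin.sin_addr) != 1) {
		return -1; /* not a valid IPv4 literal */
	}

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0) {
		return -1;
	}

	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) == 0) {
		*bound = true;
	} else if (errno != EADDRNOTAVAIL) {
		/* Any other bind error is still treated as fatal. */
		close(fd);
		return -1;
	}

	/* On EADDRNOTAVAIL the socket is returned unbound. */
	return fd;
}
```

A caller would log a warning when `*bound` is false and continue starting up. A fuller version would presumably retry the bind once the address appears, which this sketch does not attempt.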
### Links / references
- https://bugzilla.redhat.com/show_bug.cgi?id=2195976