BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues
2023-12-05T15:58:56Z

Investigate the memory spike when the cache is cold
https://gitlab.isc.org/isc-projects/bind9/-/issues/4341
2023-12-05T15:58:56Z
Author: Ondřej Surý
Assignee: Ondřej Surý

XFR (dispatch) doesn't shutdown TCP connection on timeout
https://gitlab.isc.org/isc-projects/bind9/-/issues/4366
2023-12-05T15:50:25Z
Author: Ondřej Surý

After switching the XFR to use `dns_dispatch`, the TCP connection doesn't get cancelled properly when `dns_dispatch_done()` is called; instead, it waits for the TCP connection to time out on the server side.

log more information from pytest assertions in system tests
https://gitlab.isc.org/isc-projects/bind9/-/issues/4452
2023-12-05T15:29:43Z
Author: Tom Krizek

Some Python system tests contain `assert` expressions like
```
assert loaded == expected
```
which provide no useful information in case the check [fails](https://gitlab.isc.org/isc-private/bind9/-/jobs/3820431), e.g.:
```
_______________________ test_zone_timers_secondary_json ________________________
[gw1] linux -- Python 3.11.6 /usr/bin/python3
/builds/isc-private/bind9/bin/tests/system/statschannel/tests_json.py:86: in test_zone_timers_secondary_json
generic.test_zone_timers_secondary(
/builds/isc-private/bind9/bin/tests/system/statschannel/generic.py:94: in test_zone_timers_secondary
check_zone_timers(loaded, expires, refresh, mtime)
/builds/isc-private/bind9/bin/tests/system/statschannel/generic.py:49: in check_zone_timers
check_loaded(loaded, loaded_exp, now)
/builds/isc-private/bind9/bin/tests/system/statschannel/generic.py:38: in check_loaded
assert loaded == expected
E AssertionError
```
An informative assert message with the relevant values/data should be added to the assert statements:
```
assert loaded == expected, f"loaded={loaded}, expected={expected}"
```

Milestone: December 2023 (9.18.21, 9.18.21-S1, 9.19.19)
Assignee: Tom Krizek

NSEC3 iterations considered harmful
https://gitlab.isc.org/isc-projects/bind9/-/issues/2445
2023-12-05T14:58:58Z
Author: Matthijs Mekking (matthijs@isc.org)

### Summary
BIND doesn't limit the number of NSEC3 iterations, allowing any 16-bit unsigned integer value. Seeing a lot of traffic for a zone with a high iteration count can effectively DDoS the resolver.
CVSS Score: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L - 5.3
### BIND version used
Affected versions: 9.11, 9.16
### Steps to reproduce
1. Set up an authoritative server with a DNSSEC signed zone with high iteration count:
`dnssec-signzone -3 - -H 65535 example.`, then configure `named` and start
2. Set up a validating resolver.
3. Run an NXDOMAIN-style attack.
### What is the current *bug* behavior?
Resolver has very low QPS.
### What is the expected *correct* behavior?
Resolver doesn't starve.
### Relevant configuration files
To do.
### Relevant logs and/or screenshots
N/A
### Possible fixes
From the resolver perspective, this situation is actually protocol compliant.
RFC 5155 says:
```
A zone owner MUST NOT use a value higher than shown in the table
below for iterations for the given key size. A resolver MAY treat a
response with a higher value as insecure, after the validator has
verified that the signature over the NSEC3 RR is correct.
+----------+------------+
| Key Size | Iterations |
+----------+------------+
| 1024 | 150 |
| 2048 | 500 |
| 4096 | 2,500 |
+----------+------------+
```
But it also acknowledges this is susceptible to attacks:
```
12.1.4. Using High Iteration Values
Since validators should treat responses containing NSEC3 RRs with
high iteration values as insecure, presence of just one signed NSEC3
RR with a high iteration value in a zone provides attackers with a
possible downgrade attack.
[...]
Using a high number of iterations also introduces an additional
denial-of-service opportunity against servers, since servers must
calculate several hashes per negative or wildcard response.
```
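The two RFC excerpts above can be illustrated with a short Python sketch. It is not BIND code: `max_iterations`, `treat_as_insecure`, and `nsec3_hash` are hypothetical helper names, and rounding in-between key sizes up to the next table row is an assumption made here for illustration. The hash function follows the iterated SHA-1 construction from RFC 5155, which makes the per-response cost visible: `iterations + 1` SHA-1 calls for each name hashed.

```python
import base64
import hashlib

# Iteration limits from the RFC 5155 table quoted above: key size (bits) -> max iterations.
RFC5155_LIMITS = {1024: 150, 2048: 500, 4096: 2500}

# Maps the standard base32 alphabet onto the base32hex alphabet used in NSEC3 owner names.
_B32_TO_B32HEX = bytes.maketrans(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    b"0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def max_iterations(key_bits: int) -> int:
    """Return the limit of the smallest listed key size >= key_bits.

    Assumption: sizes between rows round up, sizes above 4096 use the 4096 row.
    """
    for size in sorted(RFC5155_LIMITS):
        if key_bits <= size:
            return RFC5155_LIMITS[size]
    return RFC5155_LIMITS[4096]


def treat_as_insecure(iterations: int, key_bits: int) -> bool:
    """True if a validator following the RFC's MAY clause would downgrade the answer."""
    return iterations > max_iterations(key_bits)


def nsec3_hash(name: str, salt_hex: str, iterations: int) -> str:
    """Compute the NSEC3 hash of an ASCII name (SHA-1, hash algorithm 1)."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    # Canonical wire format: length-prefixed lowercase labels plus the root label.
    wire = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(wire + salt).digest()
    # Each additional iteration re-hashes the previous digest with the salt appended.
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
    return base64.b32encode(digest).translate(_B32_TO_B32HEX).decode("ascii").lower()
```

With the `-H 65535` value from the reproduction steps, `treat_as_insecure(65535, 2048)` is true, and every name hashed costs 65536 SHA-1 computations.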
Proposed fixes:
When loading a zone (primary server):
- We could error if we try to load a zone with NSEC3 records with too high iteration count.
- Or we could treat it as garbage in/garbage out.
When transferring a zone (secondary server):
- Nothing much we can do about it.
When validating a response from this zone (resolver):
- Treat such NSEC3 records as insecure after validating (as suggested in the RFC).

Milestone: October 2021 (9.11.36, 9.11.36-S1, 9.16.22, 9.16.22-S1, 9.17.19)
Assignee: Michał Kępień

Assertion failure at ENSURE(isc_mempool_getallocated(*namepoolp) == 0) in message.c:4791
https://gitlab.isc.org/isc-projects/bind9/-/issues/4384
2023-12-05T12:50:50Z
Author: Michal Nowak

See #4462 for reproducer.
Job [#3740189](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3740189) failed for cddd9dcb5329165633767fd1e319fbb0700487d2; worked fine 2 MRs prior (at 2728b810). Looks like a shutdown issue.
```
Core was generated by `/builds/isc-projects/bind9/.local/usr/local/sbin/named -f -c ./named.conf'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (threadid=281472872136768, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
Downloading source file /usr/src/debug/glibc-2.37-10.fc38.aarch64/nptl/pthread_kill.c...
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
[Current thread is 1 (Thread 0xffff828ec040 (LWP 24241))]
#0 __pthread_kill_implementation (threadid=281472872136768, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x0000ffff8171d958 [PAC] in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x0000ffff816d4980 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x0000ffff816c0284 [PAC] in __GI_abort () at abort.c:79
#4 0x000000000042efa8 [PAC] in assertion_failed (file=<optimized out>, line=4791, type=isc_assertiontype_ensure, cond=0xffff827a2368 "isc_mempool_getallocated(*namepoolp) == 0") at main.c:234
#5 0x0000ffff828658a4 in isc_assertion_failed (file=file@entry=0xffff827a0998 "message.c", line=line@entry=4791, type=type@entry=isc_assertiontype_ensure, cond=cond@entry=0xffff827a2368 "isc_mempool_getallocated(*namepoolp) == 0") at assertions.c:48
#6 0x0000ffff82678060 in dns_message_destroypools (namepoolp=0xffff80bf8e20, rdspoolp=0xffff80bf8e40) at message.c:4791
#7 0x0000ffff827031ec in dns_resolver__destroy (res=res@entry=0xffff80b21400) at resolver.c:9891
#8 0x0000ffff8270b0e4 in dns_resolver_unref (ptr=0xffff80b21400) at resolver.c:10172
#9 0x0000ffff8270b178 in dns_resolver_detach (ptrp=ptrp@entry=0xffff80070050) at resolver.c:10172
#10 0x0000ffff826185dc in destroy (adb=adb@entry=0xffff80070000) at adb.c:1833
#11 0x0000ffff8261acd4 in dns_adb_unref (ptr=0xffff80070000) at adb.c:1841
#12 0x0000ffff8261ae6c in dns_adb_detach (ptrp=ptrp@entry=0xffffec576580) at adb.c:1841
#13 0x0000ffff8273bf74 in dns_view_detach (viewp=viewp@entry=0xffff8090ae08) at view.c:516
#14 0x0000ffff82738050 in destroy_validator (val=val@entry=0xffff8090ae00) at validator.c:3122
#15 0x0000ffff82738148 in dns_validator_unref (ptr=0xffff8090ae00) at validator.c:3226
#16 0x0000ffff82736020 in dns_validator_detach (ptrp=ptrp@entry=0xffffec576688) at validator.c:3226
#17 0x0000ffff82737ee4 in validator_done_cb (arg=<optimized out>) at validator.c:211
#18 0x0000ffff82865c1c in isc__async_cb (handle=<optimized out>) at async.c:111
#19 0x0000ffff81dda0c0 in uv__async_io (loop=0xffff80482820, w=0xffff804829f0, events=1) at /usr/src/libuv-v1.46.0/src/unix/async.c:176
#20 0x0000ffff81df6d0c in uv__io_poll (loop=0xffff80482820, timeout=11996) at /usr/src/libuv-v1.46.0/src/unix/linux.c:1476
#21 0x0000ffff81ddb084 in uv_run (loop=0xffff80482820, mode=UV_RUN_DEFAULT) at /usr/src/libuv-v1.46.0/src/unix/core.c:447
#22 0x0000ffff8287977c in loop_thread (arg=arg@entry=0xffff80482800) at loop.c:282
#23 0x0000ffff82889204 in thread_body (wrap=0x29878850) at thread.c:85
#24 0x0000ffff8288928c in isc_thread_main (func=func@entry=0xffff82879710 <loop_thread>, arg=<optimized out>) at thread.c:116
#25 0x0000ffff8287a4e0 in isc_loopmgr_run (loopmgr=0xffff808c06c0) at loop.c:454
#26 0x00000000004318a8 in main (argc=<optimized out>, argv=<optimized out>) at main.c:1580
```
[core.24241-backtrace.txt](/uploads/8c4ac88405f3d8fb408f3ce70c590971/core.24241-backtrace.txt)
[named.conf](/uploads/28375b11fdea2ade28463c2c3c437c01/named.conf)
Similar issue: isc-projects/bind9#2188

Milestone: December 2023 (9.18.21, 9.18.21-S1, 9.19.19)
Assignee: Mark Andrews

dig crashes after SIGINT if there are multiple queries
https://gitlab.isc.org/isc-projects/bind9/-/issues/4457
2023-12-04T21:59:30Z
Author: Petr Špaček (pspacek@isc.org)

### Summary
Dig with multiple queries crashes when interrupted.
### BIND version used
* ~"Affects v9.19" : de2009e3c2a
### Steps to reproduce
```
$ dig 2000.delay.getdnsapi.net 3000.delay.getdnsapi.net
```
+ SIGINT before the first query finishes.
### What is the current *bug* behavior?
:boom:
```
signal.c:78: REQUIRE(((signal) != ((void *)0) && ((const isc__magic_t *)(signal))->magic == ((('S') << 24 | ('I') << 16 | ('G') << 8 | (' '))))) failed, back trace
```
### What is the expected *correct* behavior?
No crash.
### Relevant configuration files
None needed.
### Relevant logs and/or screenshots
Full debug log:
<details>
```
$ dig -d 2000.delay.getdnsapi.net 3000.delay.getdnsapi.net
setup_libs()
setup_system()
create_search_list()
ndots is 1.
timeout is 0.
retries is 3.
get_server_list()
make_server(127.0.0.111)
dig_query_setup
parse_args()
making new lookup
make_empty_lookup()
make_empty_lookup() = 0x7f6faa9aa000->references = 1
digrc (open)
main parsing +timeout=5
main parsing +retry=0
main parsing -d
main parsing 2000.delay.getdnsapi.net
clone_lookup()
make_empty_lookup()
make_empty_lookup() = 0x7f6faa9ab800->references = 1
clone_server_list()
looking up 2000.delay.getdnsapi.net
main parsing 3000.delay.getdnsapi.net
clone_lookup()
make_empty_lookup()
make_empty_lookup() = 0x7f6faa9ad000->references = 1
clone_server_list()
looking up 3000.delay.getdnsapi.net
dig_startup()
start_lookup()
setup_lookup(0x7f6faa9ab800)
resetting lookup counter.
cloning server list
clone_server_list()
make_server(127.0.0.111)
idn_textname: 2000.delay.getdnsapi.net
using root origin
recursive query
AD query
add_question()
starting to render the message
add_opt()
done rendering
create query 0x7f6fab09f540 linked to lookup 0x7f6faa9ab800
dighost.c:2141:lookup_attach(0x7f6faa9ab800) = 2
dighost.c:2652:new_query(0x7f6fab09f540) = 1
do_lookup()
start_udp(0x7f6fab09f540)
dighost.c:3255:query_attach(0x7f6fab09f540) = 2
working on lookup 0x7f6faa9ab800, query 0x7f6fab09f540
dighost.c:3300:query_attach(0x7f6fab09f540) = 3
udp_ready()
udp_ready(0x7f6fab09f700, success, 0x7f6fab09f540)
dighost.c:3147:lookup_attach(0x7f6faa9ab800) = 3
dighost.c:3216:query_attach(0x7f6fab09f540) = 4
recving with lookup=0x7f6faa9ab800, query=0x7f6fab09f540, handle=0x7f6fab09f700
recvcount=1
have local timeout of 5000
dighost.c:3094:query_attach(0x7f6fab09f540) = 5
sending a request
sendcount=1
dighost.c:1729:query_detach(0x7f6fab09f540) = 4
dighost.c:3236:query_detach(0x7f6fab09f540) = 3
dighost.c:3237:lookup_detach(0x7f6faa9ab800) = 2
send_done(0x7f6fab09f700, success, 0x7f6fab09f540)
sendcount=0
dighost.c:2729:lookup_attach(0x7f6faa9ab800) = 3
dighost.c:2746:query_detach(0x7f6fab09f540) = 2
dighost.c:2747:lookup_detach(0x7f6faa9ab800) = 2
check_if_done()
list full
pending lookup 0x7f6faa9ad000
^Crecv_done(0x7f6fab09f700, shutting down, 0x7ffe1fb53710, 0x7f6fab09f540)
recvcount=0
dighost.c:3905:lookup_attach(0x7f6faa9ab800) = 3
recv_done: cancel
dighost.c:3913:_cancel_lookup()
canceling pending query 0x7f6fab09f540, belonging to 0x7f6faa9ab800
dighost.c:2775:query_detach(0x7f6fab09f540) = 1
check_if_done()
list full
pending lookup 0x7f6faa9ad000
dighost.c:3915:query_detach(0x7f6fab09f540) = 0
dighost.c:3915:destroy_query(0x7f6fab09f540) = 0
dighost.c:1687:lookup_detach(0x7f6faa9ab800) = 2
dighost.c:3916:lookup_detach(0x7f6faa9ab800) = 1
clear_current_lookup()
lookup cleared
dighost.c:1820:lookup_detach(0x7f6faa9ab800) = 0
destroy_lookup
freeing server 0x7f6fab072a00 belonging to 0x7f6faa9ab800
start_lookup()
setup_lookup(0x7f6faa9ad000)
resetting lookup counter.
cloning server list
clone_server_list()
make_server(127.0.0.111)
idn_textname: 3000.delay.getdnsapi.net
using root origin
recursive query
AD query
add_question()
starting to render the message
add_opt()
done rendering
create query 0x7f6fab09f540 linked to lookup 0x7f6faa9ad000
dighost.c:2141:lookup_attach(0x7f6faa9ad000) = 2
dighost.c:2652:new_query(0x7f6fab09f540) = 1
do_lookup()
start_udp(0x7f6fab09f540)
dighost.c:3255:query_attach(0x7f6fab09f540) = 2
working on lookup 0x7f6faa9ad000, query 0x7f6fab09f540
signal.c:78: REQUIRE(((signal) != ((void *)0) && ((const isc__magic_t *)(signal))->magic == ((('S') << 24 | ('I') << 16 | ('G') << 8 | (' '))))) failed, back trace
/usr/lib/libisc-9.19.19-dev.so(+0x33891)[0x7f6faef42891]
/usr/lib/libisc-9.19.19-dev.so(isc_assertion_failed+0x31)[0x7f6faef427a2]
/usr/lib/libisc-9.19.19-dev.so(isc_signal_stop+0x44)[0x7f6faef71f5b]
/usr/lib/libisc-9.19.19-dev.so(isc_loopmgr_blocking+0x55)[0x7f6faef61f96]
dig(get_address+0x38)[0x55ddd5ed030b]
dig(+0x1b006)[0x55ddd5ecc006]
dig(do_lookup+0xc8)[0x55ddd5ed067b]
dig(start_lookup+0x285)[0x55ddd5ec6e71]
dig(+0x15485)[0x55ddd5ec6485]
dig(+0x15fcc)[0x55ddd5ec6fcc]
dig(+0x1d23b)[0x55ddd5ece23b]
/usr/lib/libisc-9.19.19-dev.so(+0x1e71d)[0x7f6faef2d71d]
/usr/lib/libisc-9.19.19-dev.so(isc__nm_readcb+0x121)[0x7f6faef2d863]
/usr/lib/libisc-9.19.19-dev.so(isc__nm_udp_failed_read_cb+0x12b)[0x7f6faef420b0]
/usr/lib/libisc-9.19.19-dev.so(isc__nm_failed_read_cb+0x89)[0x7f6faef2ac2c]
/usr/lib/libisc-9.19.19-dev.so(isc__nm_udp_shutdown+0x127)[0x7f6faef42723]
/usr/lib/libisc-9.19.19-dev.so(isc__nmsocket_shutdown+0x6e)[0x7f6faef2dd24]
/usr/lib/libisc-9.19.19-dev.so(+0x1edb4)[0x7f6faef2ddb4]
/usr/lib/libuv.so.1(uv_walk+0x9b)[0x7f6fae9a474b]
/usr/lib/libisc-9.19.19-dev.so(+0x1818b)[0x7f6faef2718b]
/usr/lib/libisc-9.19.19-dev.so(isc__async_cb+0x18d)[0x7f6faef42c6b]
/usr/lib/libuv.so.1(+0x9a1b)[0x7f6fae99fa1b]
/usr/lib/libuv.so.1(+0x26d48)[0x7f6fae9bcd48]
/usr/lib/libuv.so.1(uv_run+0x1bf)[0x7f6fae9a4fbf]
/usr/lib/libisc-9.19.19-dev.so(+0x51370)[0x7f6faef60370]
/usr/lib/libisc-9.19.19-dev.so(+0x66f07)[0x7f6faef75f07]
/usr/lib/libisc-9.19.19-dev.so(isc_thread_main+0x62)[0x7f6faef75fc6]
/usr/lib/libisc-9.19.19-dev.so(isc_loopmgr_run+0x187)[0x7f6faef613fb]
dig(dig_startup+0x48)[0x55ddd5ec0624]
dig(main+0x40)[0x55ddd5ec068a]
/usr/lib/libc.so.6(+0x27cd0)[0x7f6fae9f1cd0]
/usr/lib/libc.so.6(__libc_start_main+0x8a)[0x7f6fae9f1d8a]
dig(_start+0x25)[0x55ddd5eb7045]
Aborted (core dumped)
```
</details>

Milestone: December 2023 (9.18.21, 9.18.21-S1, 9.19.19)
Assignee: Mark Andrews

Stale hyperlinks in the ARM
https://gitlab.isc.org/isc-projects/bind9/-/issues/4417
2023-12-04T10:02:12Z
Author: Matthijs Mekking (matthijs@isc.org)

From bind-users:
https://bind9.readthedocs.io/en/v9.18.19/dnssec-guide.html
there's a link to
https://stats.research.icann.org/dns/tld_report/
which is no longer valid. New data seems to be here:
https://ithi.research.icann.org/
ITHI == Identifier Technologies Health Indicators
how many TLDs support DNSSEC ?
https://ithi.research.icann.org/graph-m7.html

Milestone: December 2023 (9.18.21, 9.18.21-S1, 9.19.19)
Assignee: Petr Špaček (pspacek@isc.org)

Switch to IDNA2008 non-transitional processing (and use libidn2 for that)
https://gitlab.isc.org/isc-projects/bind9/-/issues/26
2023-12-01T10:23:29Z
Author: Ondřej Surý

Copied here from https://bugs.isc.org/Ticket/Display.html?id=36101
The most current (and maintained) implementation of IDNA is libidn2, and that is what we should be using. Moreover, the DNS world just needs to bite the bullet, switch to IDNA2008 non-transitional processing, and finally be done with that.
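To illustrate what the switch changes in practice, here is a small Python sketch (not libidn2, and not a full IDNA2008 implementation). The label `faß` is an example chosen here, not taken from the ticket. Python's built-in `idna` codec implements the older IDNA2003 rules, which map `ß` to `ss`, the same result transitional processing gives for this label. Non-transitional processing keeps `ß` and encodes it, so the two modes resolve different names.

```python
label = "faß"

# Python's built-in "idna" codec follows the older IDNA2003 rules, which map
# "ß" to "ss" (the same outcome transitional processing gives for this label).
mapped = label.encode("idna").decode("ascii")

# Non-transitional IDNA2008 keeps "ß". The A-label is built by hand here with
# the "punycode" codec; this skips all IDNA2008 validation, illustration only.
a_label = "xn--" + label.encode("punycode").decode("ascii")

print(mapped)   # fass
print(a_label)  # xn--fa-hia
```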
Firefox, curl and wget have already switched to IDNA2008 non-transitional, and it's not as if IDNA processing in dig/host/nslookup would have security implications the way it does in web browser software.

Milestone: BIND-9.13.0
Assignee: Ondřej Surý

license question: do bind9 based configs require Mozilla License?
https://gitlab.isc.org/isc-projects/bind9/-/issues/4456
2023-11-27T20:54:47Z
Author: PJ Fanning

I'm a developer on the Apache Pekko project, an open source fork of Akka.
One of our mentors has queried if we have a licensing issue for the files in this directory.
https://github.com/apache/incubator-pekko/tree/main/actor-tests/src/test/bind/etc
The configs there are Bind9 configs used in our tests.
Does the Mozilla Public License have to be applied to our config files?
https://gitlab.isc.org/isc-projects/bind9/-/blob/main/LICENSE
The Mozilla Public License is regarded by the ASF as having copyleft implications.
Any advice on the licensing implications of having config files based on this repo would be appreciated.

"max-cache-size" is a no-op since BIND 9.19.16
https://gitlab.isc.org/isc-projects/bind9/-/issues/4340
2023-11-27T13:22:48Z
Author: Michał Kępień

Commit 4db150437e14b28c5b50ae466af9ce502fd73185 (part of !7873)
inadvertently turned the `max-cache-size` option into a no-op by
removing the `isc_mem_setwater()` call from `dns_cache_setcachesize()`.
This was not caught in testing as the current memory use limitation
logic employed in `named` is not stable enough for low `max-cache-size`
values (which is documented) and most tests use the implicit default of
`max-cache-size 90%;`.
Shout-out to @esesterhenn for [spotting this][1], thank you! :medal:
[1]: https://mattermost.isc.org/x41-dsec/pl/e9ja89c8at8wxpyqkjcpoukoyc

Milestone: November 2023 (9.16.45, 9.16.45-S1, 9.18.20, 9.18.20-S1, 9.19.18)
Assignee: Evan Hunt

Implement PROXY protocol
https://gitlab.isc.org/isc-projects/bind9/-/issues/1295
2023-11-27T12:46:04Z
Author: Ondřej Surý

The PowerDNS folks are now considering dropping XPF in favour of the [PROXY protocol](http://www.haproxy.org/download/1.8/doc/proxy-protocol.txt), which is quite reasonable.

Milestone: BIND 9.19.x
Assignee: Artem Boldariev

TSAN error bin/named/controlconf.c related.
https://gitlab.isc.org/isc-projects/bind9/-/issues/2209
2023-11-23T08:14:30Z
Author: Mark Andrews

Job [#1213042](https://gitlab.isc.org/isc-projects/bind9/-/jobs/1213042) failed for ced5041de73f16dbadf1acacbf8c20659efcb6a6:
```
WARNING: ThreadSanitizer: data race
Write of size 8 at 0x000000000001 by thread T1:
#0 isc_nmhandle_detach lib/isc/netmgr/netmgr.c:1258:15
#1 control_command bin/named/controlconf.c:388:3
#2 dispatch lib/isc/task.c:1152:7
#3 run lib/isc/task.c:1344:2
Previous read of size 8 at 0x000000000001 by thread T2:
#0 isc_nm_pauseread lib/isc/netmgr/netmgr.c:1449:33
#1 recv_data lib/isccc/ccmsg.c:109:2
#2 isc__nm_tcp_shutdown lib/isc/netmgr/tcp.c:1157:4
#3 shutdown_walk_cb lib/isc/netmgr/netmgr.c:1515:3
#4 uv_walk <null>
#5 process_queue lib/isc/netmgr/netmgr.c:659:4
#6 process_normal_queue lib/isc/netmgr/netmgr.c:582:10
#7 process_queues lib/isc/netmgr/netmgr.c:590:8
#8 async_cb lib/isc/netmgr/netmgr.c:548:2
#9 <null> <null>
Location is heap block of size 569 at 0x000000000016 allocated by thread T2:
#0 malloc <null>
#1 default_memalloc lib/isc/mem.c:713:8
#2 mem_get lib/isc/mem.c:622:8
#3 isc___mem_get lib/isc/mem.c:1044:9
#4 isc__mem_get lib/isc/mem.c:2432:10
#5 alloc_handle lib/isc/netmgr/netmgr.c:1076:3
#6 isc__nmhandle_get lib/isc/netmgr/netmgr.c:1100:12
#7 isc__nm_async_tcpchildaccept lib/isc/netmgr/tcp.c:486:11
#8 process_queue lib/isc/netmgr/netmgr.c:624:4
#9 process_normal_queue lib/isc/netmgr/netmgr.c:582:10
#10 process_queues lib/isc/netmgr/netmgr.c:590:8
#11 async_cb lib/isc/netmgr/netmgr.c:548:2
#12 <null> <null>
Thread T1 (running) created by main thread at:
#0 pthread_create <null>
#1 isc_thread_create lib/isc/pthreads/thread.c:73:8
#2 isc_taskmgr_create lib/isc/task.c:1434:3
#3 create_managers bin/named/main.c:915:11
#4 setup bin/named/main.c:1223:11
#5 main bin/named/main.c:1523:2
Thread T2 (running) created by main thread at:
#0 pthread_create <null>
#1 isc_thread_create lib/isc/pthreads/thread.c:73:8
#2 isc_nm_start lib/isc/netmgr/netmgr.c:232:3
#3 create_managers bin/named/main.c:909:15
#4 setup bin/named/main.c:1223:11
#5 main bin/named/main.c:1523:2
SUMMARY: ThreadSanitizer: data race lib/isc/netmgr/netmgr.c:1258:15 in isc_nmhandle_detach
```

Milestone: November 2020 (9.11.25, 9.11.25-S1, 9.16.9, 9.16.9-S1, 9.17.7)

BIND 9.18 retries same authoritative nameserver before trying another one
https://gitlab.isc.org/isc-projects/bind9/-/issues/3637
2023-11-22T09:02:06Z
Author: David Zych

<!--
If the bug you are reporting is potentially security-related - for example,
if it involves an assertion failure or other crash in `named` that can be
triggered repeatedly - then please do *NOT* report it here, but send an
email to [security-officer@isc.org](security-officer@isc.org).
-->
### Summary
Given an authoritative zone with multiple nameservers, some of which may currently be unavailable, how should a BIND recursive resolver react when its first query attempt times out?
Empirically, BIND 9.16 will try a _different_ authoritative nameserver next, but BIND 9.18 will try the _same_ one 3 times in a row before moving on to a different one. The BIND 9.18 behavior is significantly less robust.
### BIND version used
```
[root@dmrz-test-rdns ~]# named -V
BIND 9.18.8 (Stable Release) <id:35f5d35>
running on Linux x86_64 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020
built by make with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/opt/isc/isc-bind/root/usr' '--exec-prefix=/opt/isc/isc-bind/root/usr' '--bindir=/opt/isc/isc-bind/root/usr/bin' '--sbindir=/opt/isc/isc-bind/root/usr/sbin' '--sysconfdir=/etc/opt/isc/isc-bind' '--datadir=/opt/isc/isc-bind/root/usr/share' '--includedir=/opt/isc/isc-bind/root/usr/include' '--libdir=/opt/isc/isc-bind/root/usr/lib64' '--libexecdir=/opt/isc/isc-bind/root/usr/libexec' '--localstatedir=/var/opt/isc/isc-bind' '--sharedstatedir=/var/opt/isc/isc-bind/lib' '--mandir=/opt/isc/isc-bind/root/usr/share/man' '--infodir=/opt/isc/isc-bind/root/usr/share/info' '--disable-static' '--enable-dnstap' '--with-pic' '--with-gssapi' '--with-json-c' '--with-libxml2' '--without-lmdb' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' 'LDFLAGS=-Wl,-z,relro -L/opt/isc/isc-bind/root/usr/lib64' 'CPPFLAGS= -I/opt/isc/isc-bind/root/usr/include' 'LT_SYS_LIBRARY_PATH=/usr/lib64' 'PKG_CONFIG_PATH=:/opt/isc/isc-bind/root/usr/lib64/pkgconfig:/opt/isc/isc-bind/root/usr/share/pkgconfig' 'SPHINX_BUILD=/builddir/build/BUILD/bind-9.18.8/sphinx/bin/sphinx-build'
compiled by GCC 4.8.5 20150623 (Red Hat 4.8.5-44)
compiled with OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
linked to OpenSSL version: OpenSSL 1.0.2k-fips 26 Jan 2017
compiled with libuv version: 1.44.2
linked to libuv version: 1.44.2
compiled with libnghttp2 version: 1.33.0
linked to libnghttp2 version: 1.33.0
compiled with libxml2 version: 2.9.1
linked to libxml2 version: 20901
compiled with json-c version: 0.11
linked to json-c version: 0.11
compiled with zlib version: 1.2.7
linked to zlib version: 1.2.7
compiled with protobuf-c version: 1.4.1
linked to protobuf-c version: 1.4.1
threads support is enabled
DNSSEC algorithms: RSASHA1 NSEC3RSASHA1 RSASHA256 RSASHA512 ECDSAP256SHA256 ECDSAP384SHA384
DS algorithms: SHA-1 SHA-256 SHA-384
HMAC algorithms: HMAC-MD5 HMAC-SHA1 HMAC-SHA224 HMAC-SHA256 HMAC-SHA384 HMAC-SHA512
TKEY mode 2 support (Diffie-Hellman): yes
TKEY mode 3 support (GSS-API): yes
default paths:
named configuration: /etc/opt/isc/isc-bind/named.conf
rndc configuration: /etc/opt/isc/isc-bind/rndc.conf
DNSSEC root key: /etc/opt/isc/isc-bind/bind.keys
nsupdate session key: /var/opt/isc/isc-bind/run/named/session.key
named PID file: /var/opt/isc/isc-bind/run/named/named.pid
named lock file: /var/opt/isc/isc-bind/run/named/named.lock
```
### Steps to reproduce
1. Configure an authoritative server with the following authoritative zone, listing one working nameserver IP (itself) plus two others which are "down".
```
example.com. 30 IN SOA ns1.example.com. dns-admin.illinois.edu. 1 3600 900 1209600 30
example.com. 30 IN NS ns1.example.com.
example.com. 30 IN NS ns2.example.com.
example.com. 30 IN NS ns3.example.com.
foo.example.com. 30 IN TXT "test"
ns1.example.com. 30 IN A 18.191.96.154
ns2.example.com. 30 IN A 192.168.0.20
ns3.example.com. 30 IN A 192.168.0.30
```
2. Configure a CentOS 7 instance "dmrz-test-rdns" running isc-bind-bind 9.18.8-1.1.el7 from https://copr.fedorainfracloud.org/coprs/isc/bind/ with a stub zone for example.com:
```
options {
    directory "/var/opt/isc/isc-bind/named/data";
    allow-query { localhost; 10.0.0.0/8; };
    recursion yes;
    dnssec-validation no;
    # enable querylogging at startup
    querylog yes;
};
logging {
    channel default_debug {
        file "named.run";
        print-time yes;
        severity dynamic;
    };
    # local file for query logs
    channel queries_file {
        file "queries.log" versions 5 size 200m;
        severity info;
        print-time yes; print-category yes; print-severity yes;
    };
    category queries { queries_file; };
};
zone "example.com" IN {
    file "example.com";
    type stub;
    masters { 18.191.96.154; };
};
```
3. As root on dmrz-test-rdns, start a packet capture:
```
[root@dmrz-test-rdns data]# tcpdump -i eth0 -Uw /var/tmp/${HOSTNAME}.pcap
```
4. From another workstation in the allow-query ACL, send a test query:
```
$ dig @10.225.64.30 foo.example.com txt
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.10 <<>> @10.225.64.30 foo.example.com txt
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33425
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;foo.example.com. IN TXT
;; ANSWER SECTION:
foo.example.com. 30 IN TXT "test"
;; Query time: 3784 msec
;; SERVER: 10.225.64.30#53(10.225.64.30)
;; WHEN: Tue Nov 01 20:08:08 UTC 2022
;; MSG SIZE rcvd: 61
```
Note the very long query time (over 3 seconds). If you don't see a long query time at first, wait at least 30s (for cache to expire) and try again.
5. Observe named complaining about a timeout:
```
[root@dmrz-test-rdns data]# tail /var/opt/isc/isc-bind/named/data/named.run
01-Nov-2022 20:04:54.508 all zones loaded
01-Nov-2022 20:04:54.508 running
01-Nov-2022 20:08:08.556 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.20#53
01-Nov-2022 20:08:51.441 timed out resolving 'foo.example.com/TXT/IN': 192.168.0.30#53
```
6. Download the packet capture file and examine it in Wireshark, filtering by `dns` protocol.
Observe that after receiving a query from client (10.224.255.16) at 20:08:04, we try 3 times in a row to query the unresponsive 192.168.0.20 before moving on to the working nameserver 18.191.96.154.
On another occasion, we try 192.168.0.30 3 times in a row.
At 20:09:32 we happen to try the working nameserver first and are able to respond to our client quickly.
```
No. Time Source Destination Protocol Length Info
3 2022-11-01 20:08:04.776005 10.224.255.16 10.225.64.30 DNS 86 Standard query 0x8291 TXT foo.example.com OPT
4 2022-11-01 20:08:04.776851 10.225.64.30 192.168.0.20 DNS 98 Standard query 0x90f6 TXT foo.example.com OPT
5 2022-11-01 20:08:05.576960 10.225.64.30 192.168.0.20 DNS 98 Standard query 0x7acc TXT foo.example.com OPT
7 2022-11-01 20:08:06.378009 10.225.64.30 192.168.0.20 DNS 98 Standard query 0x05fe TXT foo.example.com OPT
14 2022-11-01 20:08:08.557405 10.225.64.30 18.191.96.154 DNS 98 Standard query 0x7979 TXT foo.example.com OPT
15 2022-11-01 20:08:08.558188 18.191.96.154 10.225.64.30 DNS 233 Standard query response 0x7979 TXT foo.example.com TXT NS ns3.example.com NS ns2.example.com NS ns1.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
16 2022-11-01 20:08:08.558432 10.225.64.30 10.224.255.16 DNS 103 Standard query response 0x8291 TXT foo.example.com TXT OPT
31 2022-11-01 20:08:47.747803 10.224.255.16 10.225.64.30 DNS 86 Standard query 0xbf02 TXT foo.example.com OPT
32 2022-11-01 20:08:47.748163 10.225.64.30 192.168.0.30 DNS 98 Standard query 0x474b TXT foo.example.com OPT
33 2022-11-01 20:08:48.549188 10.225.64.30 192.168.0.30 DNS 98 Standard query 0x2130 TXT foo.example.com OPT
34 2022-11-01 20:08:49.350213 10.225.64.30 192.168.0.30 DNS 98 Standard query 0x7d5f TXT foo.example.com OPT
37 2022-11-01 20:08:51.441890 10.225.64.30 18.191.96.154 DNS 114 Standard query 0x1ef3 TXT foo.example.com OPT
38 2022-11-01 20:08:51.442705 18.191.96.154 10.225.64.30 DNS 233 Standard query response 0x1ef3 TXT foo.example.com TXT NS ns2.example.com NS ns1.example.com NS ns3.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
39 2022-11-01 20:08:51.442872 10.225.64.30 10.224.255.16 DNS 103 Standard query response 0xbf02 TXT foo.example.com TXT OPT
55 2022-11-01 20:09:32.834068 10.224.255.16 10.225.64.30 DNS 86 Standard query 0x0297 TXT foo.example.com OPT
56 2022-11-01 20:09:32.834680 10.225.64.30 18.191.96.154 DNS 114 Standard query 0x958a TXT foo.example.com OPT
57 2022-11-01 20:09:32.835082 18.191.96.154 10.225.64.30 DNS 233 Standard query response 0x958a TXT foo.example.com TXT NS ns2.example.com NS ns3.example.com NS ns1.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
58 2022-11-01 20:09:32.835211 10.225.64.30 10.224.255.16 DNS 103 Standard query response 0x0297 TXT foo.example.com TXT OPT
```
7. Now retry the experiment with a different recursive resolver "dmrz-test-rdns-916" (10.225.64.119) running isc-bind-bind 9.16.34-1.1.el7 from https://copr.fedorainfracloud.org/coprs/isc/bind-esv/
BIND 9.16 handles the situation much more gracefully; at 20:42:51 we fail to get a response from 192.168.0.30, but instead of retrying the same server we try another one, which happens to be the working one, so we are able to respond successfully to the client in under 1 second.
```
No. Time Source Destination Protocol Length Info
12 2022-11-01 20:42:51.913893 10.224.255.16 10.225.64.119 DNS 86 Standard query 0x29ac TXT foo.example.com OPT
13 2022-11-01 20:42:51.915009 10.225.64.119 192.168.0.30 DNS 98 Standard query 0x19c1 TXT foo.example.com OPT
14 2022-11-01 20:42:52.713887 10.225.64.119 18.191.96.154 DNS 98 Standard query 0x2280 TXT foo.example.com OPT
15 2022-11-01 20:42:52.714303 18.191.96.154 10.225.64.119 DNS 131 Standard query response 0x2280 TXT foo.example.com TXT OPT
16 2022-11-01 20:42:52.714709 10.225.64.119 10.224.255.16 DNS 103 Standard query response 0x29ac TXT foo.example.com TXT OPT
53 2022-11-01 20:44:15.623212 10.224.255.16 10.225.64.119 DNS 86 Standard query 0xcf02 TXT foo.example.com OPT
54 2022-11-01 20:44:15.623584 10.225.64.119 18.191.96.154 DNS 114 Standard query 0xc333 TXT foo.example.com OPT
55 2022-11-01 20:44:15.624183 18.191.96.154 10.225.64.119 DNS 233 Standard query response 0xc333 TXT foo.example.com TXT NS ns1.example.com NS ns3.example.com NS ns2.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
56 2022-11-01 20:44:15.624371 10.225.64.119 10.224.255.16 DNS 103 Standard query response 0xcf02 TXT foo.example.com TXT OPT
68 2022-11-01 20:44:54.321211 10.224.255.16 10.225.64.119 DNS 86 Standard query 0x6f5f TXT foo.example.com OPT
69 2022-11-01 20:44:54.321578 10.225.64.119 192.168.0.20 DNS 98 Standard query 0x1fa5 TXT foo.example.com OPT
70 2022-11-01 20:44:55.120996 10.225.64.119 18.191.96.154 DNS 114 Standard query 0xddae TXT foo.example.com OPT
71 2022-11-01 20:44:55.121400 18.191.96.154 10.225.64.119 DNS 233 Standard query response 0xddae TXT foo.example.com TXT NS ns3.example.com NS ns1.example.com NS ns2.example.com A 18.191.96.154 A 192.168.0.20 A 192.168.0.30 OPT
72 2022-11-01 20:44:55.121577 10.225.64.119 10.224.255.16 DNS 103 Standard query response 0x6f5f TXT foo.example.com TXT OPT
```
In this case, we also do not see any "timed out" log messages in named.run.
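
The difference between the two captures can be quantified with a short script. The following is a minimal sketch, not part of the original report: the lists are transcribed by hand from the first client query in each capture above, and the helper names (`longest_run`, `client_latency`) are hypothetical.

```python
from datetime import datetime
from itertools import groupby

# Destinations of the resolver's upstream queries for the first client
# query in each capture, in the order they appear (transcribed by hand).
BIND_918_UPSTREAM = ["192.168.0.20", "192.168.0.20", "192.168.0.20", "18.191.96.154"]
BIND_916_UPSTREAM = ["192.168.0.30", "18.191.96.154"]


def longest_run(destinations):
    """Return (address, count) for the longest streak of back-to-back
    queries sent to the same address (the first streak wins on ties)."""
    runs = [(addr, len(list(group))) for addr, group in groupby(destinations)]
    return max(runs, key=lambda run: run[1])


def client_latency(query_time, response_time):
    """Seconds between the client's query and the resolver's answer."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(response_time, fmt) - datetime.strptime(query_time, fmt)
    return delta.total_seconds()


print(longest_run(BIND_918_UPSTREAM))
print(longest_run(BIND_916_UPSTREAM))
print(client_latency("20:08:04.776005", "20:08:08.558432"))  # 9.18 capture
print(client_latency("20:42:51.913893", "20:42:52.714709"))  # 9.16 capture
```

With these inputs, the 9.18 resolver sends three consecutive queries to 192.168.0.20 and the client waits about 3.8 seconds, while the 9.16 resolver never repeats an address and answers in about 0.8 seconds.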
### What is the current *bug* behavior?
As shown above, BIND 9.18 consistently tries multiple times in a row to reach the same unresponsive authoritative nameserver IP, and is therefore unable to answer the client in a timely fashion even though the zone has another authoritative nameserver which is working just fine.
### What is the expected *correct* behavior?
Be like BIND 9.16 and retry against a different authoritative nameserver IP when the first one doesn't respond.
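
The expected behavior can be sketched as a selection policy that skips addresses which have already timed out during the current fetch. This is only an illustration of the idea; it is not BIND's actual server-selection logic, and `next_server` is a hypothetical helper.

```python
def next_server(addresses, timed_out):
    """Return the first address that has not timed out in this fetch.

    Falls back to the first address once every candidate has timed out,
    and returns None when there are no candidates at all.
    """
    for address in addresses:
        if address not in timed_out:
            return address
    return addresses[0] if addresses else None


# The three authoritative addresses from the report.
NAMESERVERS = ["192.168.0.20", "18.191.96.154", "192.168.0.30"]

# After 192.168.0.20 fails to respond, the next attempt goes elsewhere.
print(next_server(NAMESERVERS, {"192.168.0.20"}))
```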
aside: this may be related to #3629
January 2023 (9.16.37, 9.16.37-S1, 9.18.11, 9.18.11-S1, 9.19.9)
Ondřej Surý
https://gitlab.isc.org/isc-projects/bind9/-/issues/4443
Files Created with World Read/Write Permissions
2023-11-21T12:50:53Z
Markus Vervier
The function `isc_file_openunique()` tries
to create files with permission mode *0666* as shown in the following listing:
```c
isc_result_t
isc_file_openunique(char *templet, FILE **fp) {
int mode = S_IWUSR | S_IRUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH;
return (isc_file_openuniquemode(templet, mode, fp));
}
```
Unless a more restrictive *umask* is set, this results in the created file being
world-readable and world-writable by any user on the system. The function `isc_file_openunique()`
is involved in the creation of temporary files, zone files, and configuration files.
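
The interaction between the requested mode and the umask can be checked with a small experiment. The sketch below is in Python rather than C and does not call any BIND code; it assumes a POSIX system, and `effective_mode` and `create_with_umask` are hypothetical helpers that request the same *0666* mode as the listing above.

```python
import os
import stat
import tempfile

REQUESTED_MODE = 0o666  # same bits as S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH


def effective_mode(requested, umask):
    """Permission bits that survive: the umask clears the bits it has set."""
    return requested & ~umask


def create_with_umask(directory, umask):
    """Create a file requesting REQUESTED_MODE under the given umask and
    return the permission bits the file actually received."""
    path = os.path.join(directory, f"demo-{umask:03o}")
    previous = os.umask(umask)  # process-wide setting; restored below
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, REQUESTED_MODE)
        os.close(fd)
    finally:
        os.umask(previous)
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    for mask in (0o000, 0o022, 0o077):
        print(f"umask {mask:03o}: predicted {effective_mode(REQUESTED_MODE, mask):03o}, "
              f"actual {create_with_umask(tmp, mask):03o}")
```

Only a umask of 000 leaves the file world-writable; the common 022 reduces it to 0644.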
On nearly all modern systems,
the umask (https://man.freebsd.org/cgi/man.cgi?query=umask&sektion=2) will be restrictive,
mitigating a security impact because it will turn off corresponding bits requested
in the file mode.
https://gitlab.isc.org/isc-projects/bind9/-/issues/4355
Potential cache poisoning due to unexpected recursion instead of following delegation when serve-stale is enabled
2023-11-16T02:27:40Z
Cathy Almond
### Summary
As reported in [Support ticket #22830](https://support.isc.org/Ticket/Display.html?id=22830)
The server is authoritative for some zones as well as supporting recursion for others. Some zones delegate subdomains to other nameservers. For those, the NS RRset in the delegation is unreachable/unresolvable.
With `stale-answer-enable no;` the expected SERVFAIL is returned to the clients querying for names in these subdomains.
With `stale-answer-enable yes;` the resolver appears not to follow the delegation but instead attempts resolution directly from the root nameservers, sometimes providing different answers to the client than those intended by the configuration and delegation (albeit broken).
### BIND version used
```
BIND 9.16.43-Ubuntu (Extended Support Version) <id:de6f1a0>
running on Linux x86_64 5.15.0-1041-aws #46~20.04.1-Ubuntu SMP Wed Jul 19 15:40:00 UTC 2023
built by make with '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=/usr/include' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-silent-rules' '--libdir=/usr/lib/x86_64-linux-gnu' '--libexecdir=/usr/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--disable-dependency-tracking' '--libdir=/usr/lib/x86_64-linux-gnu' '--sysconfdir=/etc/bind' '--with-python=python3' '--localstatedir=/' '--enable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-gost=no' '--with-openssl=/usr' '--with-gssapi=/usr' '--with-libidn2' '--with-json-c' '--with-lmdb=/usr' '--with-gnu-ld' '--with-maxminddb' '--with-atf=no' '--enable-ipv6' '--enable-rrl' '--enable-filter-aaaa' '--disable-native-pkcs11' '--enable-dnstap' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fdebug-prefix-map=/build/bind9-FMDtLY/bind9-9.16.43=. -fstack-protector-strong -Wformat -Werror=format-security -fno-strict-aliasing -fno-delete-null-pointer-checks -DNO_VERSION_DATE -DDIG_SIGCHASE' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2'
compiled by GCC 9.4.0
compiled with OpenSSL version: OpenSSL 1.1.1f 31 Mar 2020
linked to OpenSSL version: OpenSSL 1.1.1f 31 Mar 2020
compiled with libuv version: 1.44.2
linked to libuv version: 1.44.2
compiled with libxml2 version: 2.9.10
linked to libxml2 version: 20910
compiled with json-c version: 0.13.1
linked to json-c version: 0.13.1
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
linked to maxminddb version: 1.4.2
compiled with protobuf-c version: 1.3.3
linked to protobuf-c version: 1.3.3
threads support is enabled
DNSSEC algorithms: RSASHA1 NSEC3RSASHA1 RSASHA256 RSASHA512 ECDSAP256SHA256 ECDSAP384SHA384 ED25519 ED448
DS algorithms: SHA-1 SHA-256 SHA-384
HMAC algorithms: HMAC-MD5 HMAC-SHA1 HMAC-SHA224 HMAC-SHA256 HMAC-SHA384 HMAC-SHA512
TKEY mode 2 support (Diffie-Hellman): yes
TKEY mode 3 support (GSS-API): yes
default paths:
named configuration: /etc/bind/named.conf
rndc configuration: /etc/bind/rndc.conf
DNSSEC root key: /etc/bind/bind.keys
nsupdate session key: //run/named/session.key
named PID file: //run/named/named.pid
named lock file: //run/named/named.lock
geoip-directory: /usr/share/GeoIP
```
### Steps to reproduce
Pasting here from the report to Support team:
Locally setup in-addr.arpa for private /16 network: 59.10.in-addr.arpa
with delegation for /24:
```
$ORIGIN 59.10.in-addr.arpa.
1 NS nss1.example.net.
NS nss2.example.net.
NS nss3.example.net.
```
As these NSs are fake, we can never contact them, and without serve-stale enabled we always receive SERVFAIL.
**But as soon as serve-stale is enabled, named starts trying to recurse from the root**, and we start getting NXDOMAIN (which is a cacheable answer):
```
# dig 1.59.10.in-addr.arpa @127.0.0.1
; <<>> DiG 9.16.43-Ubuntu <<>> 1.59.10.in-addr.arpa @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 21068
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: c6da2a29e4d47c6a01000000651554607e40e4b8c2a75c00 (good)
;; QUESTION SECTION:
;1.59.10.in-addr.arpa. IN A
;; AUTHORITY SECTION:
10.in-addr.arpa. 10800 IN SOA prisoner.iana.org. hostmaster.root-servers.org. 1 604800 60 604800 604800
;; Query time: 156 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Sep 28 10:24:32 UTC 2023
;; MSG SIZE rcvd: 172
```
The same thing with direct zones.
For example:
- Test zone: myctl.com
- Test record: test.myctl.com (on the external NSs it returns 127.0.0.1)
Delegation in the local zone file:
```
$ORIGIN myctl.com.
test NS nss1.example.net.
NS nss2.example.net.
NS nss3.example.net.
```
Without serve-stale enabled, I always get a SERVFAIL answer.
With serve-stale enabled, I get SERVFAIL twice, then recursion starts from the root, and I get an answer from the external nameservers, which are not specified in the local zone file:
```
# dig test.myctl.com. @127.0.0.1
; <<>> DiG 9.16.43-Ubuntu <<>> test.myctl.com. @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 29218
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 42aa29742ab548b00100000065155561f82a157e5a4464ae (good)
;; QUESTION SECTION:
;test.myctl.com. IN A
;; Query time: 60 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Sep 28 10:28:49 UTC 2023
;; MSG SIZE rcvd: 71
```
```
# dig test.myctl.com. @127.0.0.1
; <<>> DiG 9.16.43-Ubuntu <<>> test.myctl.com. @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42780
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: c14a6db123fa69bc010000006515556372206781ce6b989d (good)
;; QUESTION SECTION:
;test.myctl.com. IN A
;; ANSWER SECTION:
test.myctl.com. 300 IN A 127.0.0.1
;; AUTHORITY SECTION:
test.myctl.com. 3600 IN NS nss2.example.net.
test.myctl.com. 3600 IN NS nss1.example.net.
test.myctl.com. 3600 IN NS nss3.example.net.
;; Query time: 4 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Sep 28 10:28:51 UTC 2023
;; MSG SIZE rcvd: 155
```
### What is the current *bug* behavior?
Unexpected recursion from the root down (ignoring the delegation from the local auth parent zones) by the resolver when stale-answer-enable is 'yes'.
### What is the expected *correct* behavior?
Consistent SERVFAIL (by following the delegation NS RRset and being unable to resolve the delegated nameserver names in the parent zone).
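
A reproduction script could check for this by reading the `status:` field from each `dig` header. The sketch below assumes the header format shown in the outputs above; `rcode` and `unexpected_rcodes` are hypothetical helper names.

```python
import re

# Matches the header line dig prints, e.g.
# ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 29218"
HEADER_RE = re.compile(r"->>HEADER<<- opcode: \w+, status: (\w+), id: \d+")


def rcode(dig_output):
    """Return the response code from dig output, or None if no header is found."""
    match = HEADER_RE.search(dig_output)
    return match.group(1) if match else None


def unexpected_rcodes(dig_outputs):
    """Response codes other than SERVFAIL, in the order they were seen."""
    return [code for code in map(rcode, dig_outputs) if code != "SERVFAIL"]


# Header lines copied from the three dig runs in this report.
RUNS = [
    ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 21068",
    ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 29218",
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42780",
]
print(unexpected_rcodes(RUNS))
```

An empty list would correspond to the expected behavior; any other response code indicates that the delegation was bypassed.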
### Relevant configuration files
Relevant snippets from named.conf (for the full configuration, see the Support ticket):
```
options {
directory "/var/cache/bind";
listen-on-v6 {
"none";
};
dnssec-validation no;
minimal-responses no;
qname-minimization off;
stale-answer-enable yes;
stale-answer-client-timeout 1800;
stale-cache-enable yes;
stale-refresh-time 30;
masterfile-format text;
};
zone "59.10.in-addr.arpa" in {
type master;
file "zones/59.10.rev";
forwarders {
};
};
zone "myctl.com" in {
type master;
file "zones/myctl.com";
forwarders {
};
};
zone "localhost" {
type master;
file "/etc/bind/db.local";
};
zone "127.in-addr.arpa" {
type master;
file "/etc/bind/db.127";
};
zone "0.in-addr.arpa" {
type master;
file "/etc/bind/db.0";
};
zone "255.in-addr.arpa" {
type master;
file "/etc/bind/db.255";
};
```
```
# cat /var/cache/bind/zones/myctl.com
$ORIGIN .
$TTL 3600 ; 1 hour
myctl.com IN SOA dc1.example.net. corporate.example.net. (
428210 ; serial
900 ; refresh (15 minutes)
600 ; retry (10 minutes)
86400 ; expire (1 day)
3600 ; minimum (1 hour)
)
NS ns1.myctl.com.
NS ns2.myctl.com.
ns1.myctl.com. IN A 127.0.0.1
ns2.myctl.com. IN A 127.0.0.1
$ORIGIN myctl.com.
test NS nss1.example.net.
NS nss2.example.net.
NS nss3.example.net.
```
```
# cat /var/cache/bind/zones/59.10.rev
$ORIGIN .
$TTL 3600 ; 1 hour
59.10.in-addr.arpa IN SOA dc1.example.net. corporate.example.net. (
428210 ; serial
900 ; refresh (15 minutes)
600 ; retry (10 minutes)
86400 ; expire (1 day)
3600 ; minimum (1 hour)
)
NS ns1.example.net.
NS ns2.example.net.
$ORIGIN 59.10.in-addr.arpa.
1 NS nss1.example.net.
NS nss2.example.net.
NS nss3.example.net.
```
Notably, this server IS authoritative for the parent zones but delegates to an NS RRset that it's not authoritative for and where the names can't be resolved to anything useful.
Therefore the resolver should be attempting to use the delegation NS RRset for these internal-only zones and delegations from them, and not attempting resolution from the root down (but it *DOES* nevertheless attempt that with `stale-answer-enable yes;`).
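
To make the expected lookup path concrete, the following sketch finds the closest enclosing local zone and the delegation point for a query name, using the zone names and delegations from the configuration above. It illustrates the reasoning only and is not how named implements it; all function names are hypothetical.

```python
# Zones this server is authoritative for, per the named.conf snippet.
ZONES = [
    "59.10.in-addr.arpa", "myctl.com", "localhost",
    "127.in-addr.arpa", "0.in-addr.arpa", "255.in-addr.arpa",
]

# Delegation points inside those zones, per the two zone files.
DELEGATIONS = {
    "1.59.10.in-addr.arpa": ["nss1.example.net.", "nss2.example.net.", "nss3.example.net."],
    "test.myctl.com": ["nss1.example.net.", "nss2.example.net.", "nss3.example.net."],
}


def labels(name):
    """Split a domain name into lowercase labels, ignoring a trailing dot."""
    return name.rstrip(".").lower().split(".")


def is_subdomain(name, ancestor):
    """True if name equals ancestor or lies below it."""
    n, a = labels(name), labels(ancestor)
    return len(n) >= len(a) and n[-len(a):] == a


def closest(qname, candidates):
    """Longest candidate that encloses qname, or None."""
    matches = [c for c in candidates if is_subdomain(qname, c)]
    return max(matches, key=lambda c: len(labels(c)), default=None)


for qname in ("test.myctl.com.", "1.59.10.in-addr.arpa."):
    zone = closest(qname, ZONES)
    cut = closest(qname, DELEGATIONS)
    print(f"{qname} zone={zone} delegation={cut} NS={DELEGATIONS.get(cut)}")
```

Both query names fall under a local zone and hit a delegation point, so the only nameservers that should be consulted are the (unreachable) `nss[1-3].example.net`.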
### Why this is potentially a security defect:
Quoting the reporter:
We expect that internal customers will get internal IPs in answers, or get nothing if something is wrong (like broken internal NSs), or get answers from the cache when the NSs configured in the zone file are unavailable but the answers are already cached. But not external IPs or unexpected answers.
Let's assume something goes wrong with our internal NSs (nss[1-3].example.net, as in the example above), and everyone inside the company gets some sort of external IP (or loopback IP) for a requested record. Let it be “supernewfeature.myctl.com” (this is only an example). If everyone inside the company starts running queries against that record and the answer points to an unexpected place, the service there might be overloaded, reply with unexpected responses, or anything else.
When serve-stale is disabled, everyone will get SERVFAIL, and external services will not be impacted.
I also see this as a potential attack vector: an IP address placed on the external nameservers could point at a victim, causing a DDoS, or at some sort of phishing website.
That's why I evaluate this issue as a security issue.
November 2023 (9.16.45, 9.16.45-S1, 9.18.20, 9.18.20-S1, 9.19.18)
Matthijs Mekking
matthijs@isc.org
https://gitlab.isc.org/isc-projects/bind9/-/issues/4260
initial AXFR fails in the inline test
2023-11-15T13:59:57Z
Tom Krizek
In [`system:gcc:tarball`](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3584073) the `inline` test failed:
```
2023-08-15 00:17:47 INFO:inline I:inline_tmp_dffqwzo0:checking that the zone is signed on initial transfer (3)
2023-08-15 00:17:47 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:48 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:49 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:50 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:51 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:52 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:53 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:54 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:55 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:56 INFO:inline Zone contains no DNSSEC keys
2023-08-15 00:17:56 INFO:inline I:inline_tmp_dffqwzo0:failed
2023-08-15 00:17:56 INFO:inline I:inline_tmp_dffqwzo0:checking expired signatures are updated on load (4)
2023-08-15 00:17:56 INFO:inline I:inline_tmp_dffqwzo0:checking removal of private type record via 'rndc signing -clear' (5)
2023-08-15 00:18:06 INFO:inline I:inline_tmp_dffqwzo0:failed
2023-08-15 00:18:06 INFO:inline I:inline_tmp_dffqwzo0:checking private type was properly signed (6)
2023-08-15 00:18:06 INFO:inline I:inline_tmp_dffqwzo0:failed
```
The dig output says the transfer failed:
```
; <<>> DiG 9.19.17-dev <<>> +tcp +dnssec -p 18989 @10.53.0.3 bits. AXFR
; (1 server found)
;; global options: +cmd
; Transfer failed.
```
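
A test helper could detect this condition from the `dig` output alone. This is a minimal sketch, assuming `dig` reports a failed transfer with the `; Transfer failed.` line shown above; `axfr_failed` is a hypothetical name and not part of the BIND system test framework.

```python
def axfr_failed(dig_output):
    """True if any line of the dig output is the transfer-failure marker."""
    return any(line.strip() == "; Transfer failed." for line in dig_output.splitlines())


# The dig output quoted in this report.
SAMPLE = """\
; <<>> DiG 9.19.17-dev <<>> +tcp +dnssec -p 18989 @10.53.0.3 bits. AXFR
; (1 server found)
;; global options: +cmd
; Transfer failed.
"""

print(axfr_failed(SAMPLE))
```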
In [ns3/named.run](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3584073/artifacts/raw/bind-9.19.17-dev/bin/tests/system/inline_tmp_dffqwzo0/ns3/named.run?inline=false), there's a message that `zone transfer setup failed`:
```
15-Aug-2023 00:17:54.716 client @0x7f7a78c46000 (no-peer): allocate new client
15-Aug-2023 00:17:54.716 client @0x7f7a78c46000 10.53.0.1#35491: TCP request
15-Aug-2023 00:17:54.716 client @0x7f7a78c46000 10.53.0.1#35491: using view '_default'
15-Aug-2023 00:17:54.716 client @0x7f7a78c46000 10.53.0.1#35491: request is not signed
15-Aug-2023 00:17:54.716 client @0x7f7a78c46000 10.53.0.1#35491: recursion not available (recursion not enabled for view)
15-Aug-2023 00:17:54.716 query client=0x7f7a78c46000 thread=0x7f7a7a47e700(<unknown-query>): ns_query_start
15-Aug-2023 00:17:54.716 client @0x7f7a78c46000 10.53.0.1#35491 (bits): AXFR request
15-Aug-2023 00:17:54.716 client @0x7f7a78c46000 10.53.0.1#35491 (bits): zone transfer setup failed
15-Aug-2023 00:17:54.716 client @0x7f7a78c46000 10.53.0.1#35491 (bits): reset client
15-Aug-2023 00:17:54.716 query client=0x7f7a78c46000 thread=0x7f7a7a47e700(bits/AXFR): query_reset
```
November 2023 (9.16.45, 9.16.45-S1, 9.18.20, 9.18.20-S1, 9.19.18)
https://gitlab.isc.org/isc-projects/bind9/-/issues/3394
[CVE-2022-2795] Processing large delegations may severely degrade resolver performance
2023-11-15T08:47:02Z
Yehuda Afek
### CVE-specific actions
- [x] [Assign a CVE identifier](#note_306871)
- [x] [Determine CVSS score](#note_306873)
- [x] [Determine the range of BIND versions affected (including the Subscription Edition)](#note_306877)
- [x] [Determine whether workarounds for the problem exist](#note_306880)
- [x] [Create a draft of the security advisory and put the information above in there](https://portal.document360.io/956e37e2-5ec0-4942-8b27-35533899f099/document/v1/view/802f6c03-dbcc-438b-a252-b5ee436c1b03)
- [x] Prepare a detailed description of the problem which should include the following by default:
  - [instructions for reproducing the problem (a system test is good enough)](isc-private/bind9!430)
  - [explanation of code flow which triggers the problem (a system test is *not* good enough)](#note_306898)
- [x] Prepare a private merge request containing the following items in separate commits:
  - [a test for the issue (may be moved to a separate merge request for deferred merging)](isc-private/bind9!430)
  - [a fix for the issue](isc-private/bind9!431)
  - [documentation updates (`CHANGES`, release notes, anything else applicable)](isc-private/bind9!431)
- [x] Ensure the merge request from the previous step is reviewed by SWENG staff and has no outstanding discussions
- [x] Ensure the documentation changes introduced by the merge request addressing the problem are reviewed by Support and Marketing staff
- [x] Prepare backports of the merge request addressing the problem for all affected (and still maintained) BIND branches (backporting might affect the issue's scope and/or description)
- [x] Prepare a standalone patch for the last stable release of each affected (and still maintained) BIND branch
### Release-specific actions
- [x] Create/update the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle
- [x] Reserve a block of `CHANGES` placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined
- [x] Ensure the merge requests containing CVE fixes are merged into `security-*` branches in CVE identifier order
### Post-disclosure actions
- [x] Merge a regression test reproducing the bug into all affected (and still maintained) BIND branches
---
### Paper
[NXRedirect_-_Attack_Complexity_DDoS_attack_on_DNS_Recursive_Resolvers_WithNames.pdf](/uploads/87d6de4ff8e9372614249ac1affbe9bd/NXRedirect_-_Attack_Complexity_DDoS_attack_on_DNS_Recursive_Resolvers_WithNames.pdf)
September 2022 (9.16.33, 9.16.33-S1, 9.18.7, 9.19.5)
Michał Kępień
https://gitlab.isc.org/isc-projects/bind9/-/issues/4055
[CVE-2023-2828] named's configured cache size limit can be significantly exceeded
2023-11-15T08:47:01Z
Shoham Danino
| Quick Links | :link: |
| ------------------------ | ------------------------------------------------------------------------------|
| Incident Manager: | @michal |
| Deputy Incident Manager: | @aram |
| Public Disclosure Date: | 2023-06-21 |
| CVSS Score: | [7.5][cvss_score] |
| Security Advisory: | isc-private/printing-press!54 |
| Mattermost Channel: | [CVE-2023-2828: max-cache-size can be significantly exceeded][mattermost_url] |
| Support Ticket: | N/A |
| Release Checklist: | https://gitlab.isc.org/isc-projects/bind9/-/issues/4123 |
| Post-mortem Etherpad: | [postmortem-2023-06][postmortem_url] |
[cvss_score]: https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H&version=3.1
[mattermost_url]: https://mattermost.isc.org/isc/channels/cve-2023-2828-max-cache-size
[postmortem_url]: https://pad.isc.org/p/postmortem-2023-06
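
For reference, the 7.5 score follows from the CVSS v3.1 base-score equations applied to the vector in the `cvss_score` link above. The sketch below implements those equations with the metric weights from the CVSS v3.1 specification; the function names are illustrative, and temporal and environmental metrics are not handled.

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a vector without the version prefix."""
    m = dict(part.split(":") for part in vector.split("/"))
    scope = m["S"]
    iss = 1 - (1 - WEIGHTS["C"][m["C"]]) * (1 - WEIGHTS["I"][m["I"]]) * (1 - WEIGHTS["A"][m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * PR_WEIGHTS[scope][m["PR"]] * WEIGHTS["UI"][m["UI"]])
    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


print(base_score("AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H"))
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 3.89; their sum, 7.48, rounds up to 7.5.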
:bulb: **Click [here][checklist_explanations] (internal resource) for general information about the security incident handling process.**
[checklist_explanations]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations
### Earlier Than T-5
- [x] [:link:][step_deputy] **(IM)** Pick a Deputy Incident Manager
- [x] [:link:][step_respond] **(IM)** [Respond to the bug reporter](#note_373642)
- [x] [:link:][step_etherpad] **(IM)** [Create an Etherpad for post-mortem][postmortem_url]
- [x] [:link:][step_public_mrs] **(SwEng)** Ensure there are no public merge requests which inadvertently disclose the issue
- [x] [:link:][step_assign_cve_id] **(IM)** [Assign a CVE identifier](#note_374026)
- [x] [:link:][step_note_cve_info] **(SwEng)** Update this issue with the assigned CVE identifier and the CVSS score
- [x] [:link:][step_versions_affected] **(SwEng)** [Determine the range of product versions affected (including the Subscription Edition)](#note_374064)
- [x] [:link:][step_workarounds] **(SwEng)** [Determine whether workarounds for the problem exist](#note_374750)
- [x] [:link:][step_coordinate] **(SwEng)** ~~If necessary, coordinate with other parties~~
- [x] [:link:][step_earliest] **(Support)** Prepare and send out "earliest" notifications
- [x] [:link:][step_advisory_mr] **(Support)** [Create a merge request for the Security Advisory and include all readily available information in it](isc-private/printing-press!54)
- [x] [:link:][step_reproducer_mr] **(SwEng)** [Prepare a private merge request containing a system test reproducing the problem](isc-private/bind9!519)
- [x] [:link:][step_notify_support] **(SwEng)** [Notify Support when a reproducer is ready](https://mattermost.isc.org/isc/pl/bfjbhijg1jnqzxoix8q6uhsp7e)
- [x] [:link:][step_code_analysis] **(SwEng)** [Prepare a detailed explanation of the code flow triggering the problem](#note_372558)
- [x] [:link:][step_fix_mr] **(SwEng)** [Prepare a private merge request with the fix](isc-private/bind9!520)
- [x] [:link:][step_review_fix] **(SwEng)** [Ensure the merge request with the fix is reviewed and has no outstanding discussions](https://gitlab.isc.org/isc-private/bind9/-/merge_requests/520#note_378278)
- [x] [:link:][step_review_docs] **(Support)** [Review the documentation changes introduced by the merge request with the fix](https://mattermost.isc.org/isc/pl/ys9ngz8hpffitrdefochzi1ayh)
- [x] [:link:][step_backports] **(SwEng)** Prepare backports of the merge request addressing the problem for all affected ([and](https://gitlab.isc.org/isc-private/bind9/-/merge_requests/527) [still](https://gitlab.isc.org/isc-private/bind9/-/merge_requests/528) [maintained](https://gitlab.isc.org/isc-private/bind9/-/merge_requests/529)) branches of a given product
- [x] [:link:][step_finish_advisory] **(Support)** [Finish preparing the Security Advisory](https://mattermost.isc.org/isc/pl/nt6beetinpnwzg8678sb5fi79r)
- [x] [:link:][step_meta_issue] **(QA)** [Create (or update) the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle](https://gitlab.isc.org/isc-private/bind9/-/issues/68)
- [x] [:link:][step_changes] **(QA)** (BIND 9 only) [Reserve a block of `CHANGES` placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined](isc-projects/bind9!8010)
- [x] [:link:][step_merge_fixes] **(QA)** Merge the CVE fixes in CVE identifier order
- [x] [:link:][step_patches] **(QA)** [Prepare a standalone patch for the last stable release of each affected (and still maintained) product branch](https://mattermost.isc.org/isc/pl/apq5r8ir7ffnb8br3bdpkqbg5a)
- [x] [:link:][step_asn_releases] **(QA)** [Prepare ASN releases (as outlined in the Release Checklist)](https://mattermost.isc.org/isc/pl/gbe5fz3bypfixqt67b85tapruc)
### At T-5
- [x] [:link:][step_send_asn] **(Support)** [Send ASN to eligible customers](https://mattermost.isc.org/isc/pl/43d9p7bou7g79e1zbnp91fooxr)
- [x] [:link:][step_preannouncement] **(Support)** [(BIND 9 only) Send a pre-announcement email to the *bind-announce* mailing list to alert users that the upcoming release will include security fixes](isc-private/printing-press!57)
### At T-4
- [x] [:link:][step_verify_asn] **(Support)** Verify that all ASN-eligible customers have received the notification email
### At T-1
- [x] [:link:][step_check_customers] **(Support)** Verify that any new or reinstated customers have received the notification email
- [x] [:link:][step_packager_emails] **(First IM)** [Send notifications to OS packagers](https://gitlab.isc.org/isc-private/printing-press/-/merge_requests/58#note_382746)
### On the Day of Public Disclosure
- [x] [:link:][step_clearance] **(IM)** [Grant Support clearance to proceed with public release](https://mattermost.isc.org/isc/pl/kx7cyamrwiysdf3wqgq7qwo4me)
- [x] [:link:][step_publish] **(Support)** Publish the releases (as outlined in the release checklist)
- [x] [:link:][step_matrix] **(Support)** (BIND 9 only) Update vulnerability matrix in the Knowledge Base
- [x] [:link:][step_publish_advisory] **(Support)** Bump Document Version for the Security Advisory and publish it in the Knowledge Base
- [x] [:link:][step_notifications] **(First IM)** [Send notification emails to third parties](https://gitlab.isc.org/isc-private/printing-press/-/merge_requests/61#note_383415)
- [x] [:link:][step_mitre] **(First IM)** [Advise MITRE about the disclosed CVEs](https://gitlab.isc.org/isc-private/printing-press/-/merge_requests/54#note_383433)
- [x] [:link:][step_merge_advisory] **(First IM)** Merge the Security Advisory merge request
- [x] [:link:][step_embargo_end] **(IM)** [Inform original reporter (if external) that the security disclosure process is complete](#note_383448)
- [x] [:link:][step_customers] **(Support)** Inform customers a fix has been released
### After Public Disclosure
- [x] [:link:][step_postmortem] **(First IM)** [Organize post-mortem meeting and make sure it happens](https://mattermost.isc.org/isc/pl/8bh5bx4betgbxx67zoxfa3zioc)
- [x] [:link:][step_tickets] **(Support)** Close support tickets
- [x] [:link:][step_regression] **(QA)** Merge a regression test reproducing the bug into all affected (and still maintained) branches
[step_deputy]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#pick-a-deputy-incident-manager
[step_respond]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#respond-to-the-bug-reporter
[step_etherpad]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#create-an-etherpad-for-post-mortem
[step_public_mrs]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#ensure-there-are-no-public-merge-requests-which-inadvertently-disclose-the-issue
[step_assign_cve_id]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#assign-a-cve-identifier
[step_note_cve_info]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#update-this-issue-with-the-assigned-cve-identifier-and-the-cvss-score
[step_versions_affected]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#determine-the-range-of-product-versions-affected-including-the-subscription-edition
[step_workarounds]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#determine-whether-workarounds-for-the-problem-exist
[step_coordinate]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#if-necessary-coordinate-with-other-parties
[step_earliest]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-and-send-out-earliest-notifications
[step_advisory_mr]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#create-a-merge-request-for-the-security-advisory-and-include-all-readily-available-information-in-it
[step_reproducer_mr]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-private-merge-request-containing-a-system-test-reproducing-the-problem
[step_notify_support]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#notify-support-when-a-reproducer-is-ready
[step_code_analysis]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-detailed-explanation-of-the-code-flow-triggering-the-problem
[step_fix_mr]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-private-merge-request-with-the-fix
[step_review_fix]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#ensure-the-merge-request-with-the-fix-is-reviewed-and-has-no-outstanding-discussions
[step_review_docs]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#review-the-documentation-changes-introduced-by-the-merge-request-with-the-fix
[step_backports]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-backports-of-the-merge-request-addressing-the-problem-for-all-affected-and-still-maintained-branches-of-a-given-product
[step_finish_advisory]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#finish-preparing-the-security-advisory
[step_meta_issue]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#create-or-update-the-private-issue-containing-links-to-fixes-reproducers-for-all-cves-fixed-in-a-given-release-cycle
[step_changes]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-reserve-a-block-of-changes-placeholders-once-the-complete-set-of-vulnerabilities-fixed-in-a-given-release-cycle-is-determined
[step_merge_fixes]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#merge-the-cve-fixes-in-cve-identifier-order
[step_patches]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-standalone-patch-for-the-last-stable-release-of-each-affected-and-still-maintained-product-branch
[step_asn_releases]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-asn-releases-as-outlined-in-the-release-checklist
[step_send_asn]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#send-asn-to-eligible-customers
[step_preannouncement]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-send-a-pre-announcement-email-to-the-bind-announce-mailing-list-to-alert-users-that-the-upcoming-release-will-include-security-fixes
[step_verify_asn]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#verify-that-all-asn-eligible-customers-have-received-the-notification-email
[step_check_customers]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#verify-that-any-new-or-reinstated-customers-have-received-the-notification-email
[step_packager_emails]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#send-notifications-to-os-packagers
[step_clearance]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#grant-support-clearance-to-proceed-with-public-release
[step_publish]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#publish-the-releases-as-outlined-in-the-release-checklist
[step_matrix]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-update-vulnerability-matrix-in-the-knowledge-base
[step_publish_advisory]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bump-document-version-for-the-security-advisory-and-publish-it-in-the-knowledge-base
[step_notifications]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#send-notification-emails-to-third-parties
[step_mitre]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#advise-mitre-about-the-disclosed-cves
[step_merge_advisory]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#merge-the-security-advisory-merge-request
[step_embargo_end]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#inform-original-reporter-if-external-that-the-security-disclosure-process-is-complete
[step_customers]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#inform-customers-a-fix-has-been-released
[step_postmortem]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#organize-post-mortem-meeting-and-make-sure-it-happens
[step_tickets]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#close-support-tickets
[step_regression]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#merge-a-regression-test-reproducing-the-bug-into-all-affected-and-still-maintained-branches
---
### Summary
This vulnerability causes a DNS resolver's cache memory usage to grow far beyond the configured maximum cache size. It is triggered when the resolver receives around 20,000 requests over a period of several minutes to hours.
For example, at a rate of 250 QPS, 1000MB of RAM is used after 80 seconds when the maximum cache size is configured to 32MB (example results are attached to this message).
### BIND version used
BIND 9.16.40 (Extended Support Version) <id:113a865>
### Steps to reproduce
We reproduced the NRDelegationAttack with some changes. For more details, see: https://www.usenix.org/system/files/sec23fall-prepub-309-afek.pdf
1. Set the maximum cache size to 32MB
in named.conf.options (attached example: [named.conf.options](/uploads/bd5dadb4674f120dbe96fd6ca5d060ba/named.conf.options)):
```
options {
...
max-cache-size 32m;
...
};
```
2. Run the resolver: `named -g -c /etc/named.conf`
3. Run psrecord for testing the RAM usage of the resolver: `psrecord NAMED_PID --interval 1 --plot OUTPUT_FILE.png`
4. Option a:
> Send up to 50,000 DNS queries for names under my domain (shoham-shani.online); my authoritative server's IP address is 74.234.116.29:
`dig shoham{count}.shoham-shani.online. @resolver_ip` (where count runs from 0 to 49,999). You can also use dnsperf: `dnsperf -d queries.txt -s resolver_ip -v -Q 250`; an example queries.txt is attached: [queries.txt](/uploads/3a710d8c8416c83ce0469dc7d6b7c92a/queries.txt)
> Alternatively, you can provide us with a test resolver that you want us to attack, and we will perform the attack from our client side at a time we agree on.
5. Option b:
> Create a zone file (example attached) that returns 1500 referrals per request. You can use this script to generate it:

    # Write 1500 NS referrals for each shoham{i} name.
    with open('zonfile.txt', 'w') as f:
        for i in range(1, 50000):
            for j in range(0, 1500):
                print(f'shoham{i} 8600 IN NS attack{j}.auth{j}.shoham.store.', file=f)

[shoham-shani.online_zonefile_example.txt](/uploads/1cce5b5b5a7118205bde4f199e1c3404/shoham-shani.online_zonefile_example.txt)
> Create another zone file that answers all shoham{i}.shoham.store. requests:
For example: `* IN A 127.0.0.1`
[shoham.store_zonefile_example_copy.txt](/uploads/1fbda03a3a24fdffe95df466d3b031e6/shoham.store_zonefile_example_copy.txt)
> Send 50,000 DNS queries as described in option a.
6. Take a dump of the cache and examine its size: `rndc dumpdb -cache`
7. Stop the resolver service and download the OUTPUT_FILE.png image, examining RAM usage.
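The query list for step 4 can be generated rather than written by hand. The following is a minimal sketch, assuming dnsperf's usual one-query-per-line input of `<name> <type>`; the query type `A` and the helper name `make_queries` are assumptions for illustration, since the contents of the attached queries.txt are not shown here.

```python
def make_queries(count=50_000, domain="shoham-shani.online", qtype="A"):
    """Return dnsperf input lines, one '<name> <type>' pair per query.

    Names follow the shoham{count} pattern from step 4 (count from 0 to
    count - 1). The record type is an assumption, not taken from the report.
    """
    return [f"shoham{i}.{domain}. {qtype}" for i in range(count)]


def write_queries(path="queries.txt", **kwargs):
    """Write the generated query lines to a file usable with `dnsperf -d`."""
    with open(path, "w") as f:
        f.write("\n".join(make_queries(**kwargs)) + "\n")
```

Calling `write_queries()` produces a queries.txt that can be passed to the dnsperf command from step 4.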
Note: we tested the bug with max-cache-size set to 32MB, 64MB and 1GB, and at rates of 1, 5, 10, 13, 100 and 250 QPS (all the results are attached).
For 1 QPS I got 440MB RAM used after 8,000 seconds for max-cache-size 32MB
![attack_1_qps](/uploads/4e918e5100db74bae11edf5df8a485bd/attack_1_qps.png)
For 5 QPS I got 840MB RAM used after 4,000 seconds for max-cache-size 32MB
![attack_5_qps](/uploads/c55932c784c402d868af2d306efb6795/attack_5_qps.png)
For 10 QPS I got 840MB RAM used after 2,000 seconds for max-cache-size 32MB
![attack_10_qps](/uploads/d4293561ddc6ba22d8ee203fb2244249/attack_10_qps.png)
For 100 QPS I got 840MB RAM used after 200 seconds for max-cache-size 32MB
![attack_100_qps](/uploads/a18150c0b47480e8ff9d5df5d68afce4/attack_100_qps.png)
For 250 QPS I got 1000MB RAM used after 80 seconds for max-cache-size 32MB
![attack_250_qps](/uploads/116feeb2217d76b84aa3427a0a52db32/attack_250_qps.png)
For 13 QPS I got 1550MB RAM used after 1150 seconds for max-cache-size 1000MB
![attack_1GB_cache](/uploads/bc7a472599212d5b1feed1563ee014bb/attack_1GB_cache.jpeg)
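The figures above can be cross-checked against the "around 20,000 requests" estimate in the summary by multiplying each rate by its duration. This is a small sketch using only the numbers reported above; the RAM values are whole-process measurements from psrecord, so the overshoot ratio is only a rough indicator of cache growth.

```python
# (QPS, seconds elapsed, RAM used in MB, max-cache-size in MB), as reported above.
RUNS = [
    (1, 8000, 440, 32),
    (5, 4000, 840, 32),
    (10, 2000, 840, 32),
    (100, 200, 840, 32),
    (250, 80, 1000, 32),
    (13, 1150, 1550, 1000),
]


def summarize(runs):
    """Compute queries sent and RAM / max-cache-size ratio for each run."""
    rows = []
    for qps, seconds, ram_mb, cache_mb in runs:
        rows.append({
            "qps": qps,
            "queries": qps * seconds,          # total queries sent so far
            "overshoot": ram_mb / cache_mb,    # RAM used relative to the limit
        })
    return rows


if __name__ == "__main__":
    for row in summarize(RUNS):
        print(f"{row['qps']:>4} QPS: {row['queries']:>6} queries, "
              f"{row['overshoot']:.2f}x the configured limit")
```

Four of the five 32MB runs correspond to exactly 20,000 queries, which suggests the memory growth tracks the number of queries sent rather than the query rate.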
### What is the current *bug* behavior?
1. The cache grows beyond the configured limit, so an ever-increasing amount of memory is allocated. If the machine runs out of available memory, the resolver service crashes.
2. The referral list buffer is never freed, so the memory allocated for buffers keeps growing (dns_rdataslab_fromrdataset function in line 272 of the rdataslab.c file).
3. It seems the cache size calculation does not consider referral answers from authoritative nameservers, although they are stored in the cache.
4. High and low watermarks are set incorrectly to 0, which means the resolver is unaware that the memory usage exceeds the maximum level and does not reduce it.
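To illustrate why zeroed watermarks matter (item 4), here is a toy model of watermark-based cache cleaning. It is a hypothetical sketch and not BIND's implementation: the class `ToyCache`, its eviction order, and the convention that a high watermark of 0 disables cleaning are all assumptions made for illustration.

```python
from collections import OrderedDict


class ToyCache:
    """Toy cache that evicts oldest entries once usage passes hi_water."""

    def __init__(self, hi_water, lo_water):
        self.hi_water = hi_water
        self.lo_water = lo_water
        self.entries = OrderedDict()  # key -> size, oldest first
        self.in_use = 0

    def add(self, key, size):
        # Replace any existing entry so in_use is not double-counted.
        self.in_use -= self.entries.pop(key, 0)
        self.entries[key] = size
        self.in_use += size
        # Assumption: hi_water == 0 means "no limit", so cleaning never runs.
        if self.hi_water > 0 and self.in_use > self.hi_water:
            while self.entries and self.in_use > self.lo_water:
                _, freed = self.entries.popitem(last=False)
                self.in_use -= freed


if __name__ == "__main__":
    bounded = ToyCache(hi_water=32, lo_water=24)
    unbounded = ToyCache(hi_water=0, lo_water=0)
    for i in range(100):
        bounded.add(i, 1)
        unbounded.add(i, 1)
    print(bounded.in_use, unbounded.in_use)
```

In this model the cache with non-zero watermarks stays at or below its high watermark, while the one with both watermarks at 0 grows with every insertion, which matches the symptom described above.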
### What is the expected *correct* behavior?
1. The cache size should not exceed the configured maximum.
2. The buffer (rawbuf in the rdataslab.c file) should be freed.
3. The cache size calculation should take the referral list into account, and entries should be deleted when the cache exceeds the configured limit.
### Relevant configuration files
configuration file:
> named.conf.options
zonefiles:
> shoham-shani.online_zonefile_example.txt
> shoham.store_zonefile_example copy.txt
### Relevant logs and/or screenshots
Tests are attached
### Possible fixes
1. Free the buffer:
```
free_rawbuf:
isc_mem_put(mctx, rawbuf, buflen);
```
Please add the following people to this issue:
Yehuda Afek, yehuda.afek@gmail.com /cc @afek
Anat Bremler-Barr, anat.bremlerbarr@gmail.com /cc @anat_bremler_barr
Yuval Shavitt, shavitt@eng.tau.ac.il

June 2023 (9.16.42, 9.16.42-S1, 9.18.16, 9.18.16-S1, 9.19.14)
Tom Krizek

https://gitlab.isc.org/isc-projects/bind9/-/issues/4392
dispatch_newtcp check failed on OL8 FIPS
2023-11-15T08:33:37Z
Michal Nowak

The new `dispatch_newtcp` check of the `dispatch` unit test failed on OL8 FIPS job [#3750824](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3750824) (8983bf8ed23f99abaef837395435e1eed5d692fe).
```
[==========] Running 10 test(s).
[ RUN ] dispatch_gettcp
[ OK ] dispatch_gettcp
[ RUN ] dispatch_newtcp
ASSERT: eresult == ISC_R_SUCCESS
[ LINE ] --- dispatch_test.c:477
```
```
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ff137074ea5 in __GI_abort () at abort.c:79
#2 0x00007ff137fd39b7 in exit_test (quit_application=1) at /usr/src/debug/cmocka-1.1.5-1.el8.x86_64/src/cmocka.c:404
#3 0x00007ff137fd3a2e in _fail (file=0x4048d0 "dispatch_test.c", line=477) at /usr/src/debug/cmocka-1.1.5-1.el8.x86_64/src/cmocka.c:2196
#4 0x000000000040384f in stop_listening (arg=<optimized out>) at dispatch_test.c:598
#5 0x00007ff1382342c7 in tcp_connected (handle=<optimized out>, eresult=<optimized out>, arg=<optimized out>) at dispatch.c:1851
#6 0x00007ff138a58de3 in streamdns_call_connect_cb (sock=0x7ff133035c00, handle=0x7ff133010380, result=result@entry=ISC_R_TIMEDOUT) at netmgr/streamdns.c:275
#7 0x00007ff138a59153 in streamdns_transport_connected (handle=0x7ff133010a80, result=ISC_R_TIMEDOUT, cbarg=<optimized out>) at netmgr/streamdns.c:364
#8 0x00007ff138a57964 in isc___nm_connectcb (arg=arg@entry=0x7ff1331f5700) at netmgr/netmgr.c:1735
#9 0x00007ff138a57a66 in isc__nm_connectcb (sock=sock@entry=0x7ff133036600, uvreq=uvreq@entry=0x7ff1331f5700, eresult=eresult@entry=ISC_R_TIMEDOUT, async=async@entry=false) at netmgr/netmgr.c:1750
#10 0x00007ff138a580f7 in isc__nm_failed_connect_cb (sock=sock@entry=0x7ff133036600, req=req@entry=0x7ff1331f5700, eresult=ISC_R_TIMEDOUT, async=async@entry=false) at netmgr/netmgr.c:1011
#11 0x00007ff138a5e0d1 in tcp_connect_cb (uvreq=<optimized out>, status=-125) at netmgr/tcp.c:215
#12 0x00007ff138825276 in uv__stream_destroy (stream=stream@entry=0x7ff133036bc8) at src/unix/stream.c:464
#13 0x00007ff138819ae8 in uv__finish_close (handle=0x7ff133036bc8) at src/unix/core.c:287
#14 uv__run_closing_handles (loop=0x7ff133032020) at src/unix/core.c:317
#15 uv_run (loop=loop@entry=0x7ff133032020, mode=mode@entry=UV_RUN_DEFAULT) at src/unix/core.c:395
#16 0x00007ff138a7a6dd in loop_thread (arg=arg@entry=0x7ff133032000) at loop.c:282
#17 0x00007ff138a89399 in thread_body (wrap=0x10ebcc0) at thread.c:85
#18 0x00007ff138a89413 in isc_thread_main (func=func@entry=0x7ff138a7a652 <loop_thread>, arg=0x7ff133032000) at thread.c:116
#19 0x00007ff138a7b665 in isc_loopmgr_run (loopmgr=0x7ff13302f000) at loop.c:454
#20 0x00000000004027d6 in make_dispatchset (dispatchmgr=<optimized out>, ndisps=941470056, dsetp=0x7ff137fd39d0 <exception_handler>) at dispatch_test.c:254
#21 0x00007ff137fd5f37 in cmocka_run_one_test_or_fixture (function_name=0x404950 "dispatch_newtcp", test_func=0x4027b2 <make_dispatchset+42>, setup_func=setup_func@entry=0x0,
teardown_func=teardown_func@entry=0x0, state=state@entry=0x10e4f10, heap_check_point=heap_check_point@entry=0x0) at /usr/src/debug/cmocka-1.1.5-1.el8.x86_64/src/cmocka.c:2801
#22 0x00007ff137fd68a1 in cmocka_run_one_tests (test_state=0x10e4f00) at /usr/src/debug/cmocka-1.1.5-1.el8.x86_64/src/cmocka.c:2909
#23 _cmocka_run_group_tests (group_name=0x40493a "tests", tests=<optimized out>, num_tests=<optimized out>, group_setup=<optimized out>, group_teardown=0x0)
at /usr/src/debug/cmocka-1.1.5-1.el8.x86_64/src/cmocka.c:3040
#24 0x000000000040454c in setup_workers (state=<optimized out>) at isc.c:67
#25 0x00007ff13708de45 in __libc_start_main (main=0x404501 <main+80>, argc=1, argv=0x7ffeb9a1f1e8, init=0x404820 <__libc_csu_init+80>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffeb9a1f1d8)
at ../csu/libc-start.c:302
#26 0x000000000040259e in register_tm_clones ()
#27 0x00007ffeb9a1f1d8 in ?? ()
#28 0x00007ff138eea100 in ?? () from /lib64/ld-linux-x86-64.so.2
#29 0x0000000000000001 in ?? ()
#30 0x00007ffeb9a1ffa6 in ?? ()
#31 0x0000000000000000 in ?? ()
```
[bt.all.txt](/uploads/101a857cba9c5edb31946920f38b3b85/bt.all.txt)

November 2023 (9.16.45, 9.16.45-S1, 9.18.20, 9.18.20-S1, 9.19.18)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/4418
rbtdb.c:582: INSIST(!cds_lfht_destroy(rbtdb->common.update_listeners, ((void *)0))) failed
2023-11-14T16:06:32Z
Michal Nowak

Employing `rr` chaos mode on system tests of the `main` branch, I got a shutdown crash in `catz`:
```
Core was generated by `/home/newman/isc/ws/bind9/bin/named/.libs/named -D catz_tmp_8__h0mh7-ns4 -m rec'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
[Current thread is 1 (Thread 0x756a4d1e8680 (LWP 187005))]
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007ff3f03228f3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007ff3f02d1afe in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007ff3f02ba87f in __GI_abort () at abort.c:79
#4 0x0000000000417b4a in assertion_failed (file=0x7fffea179e8b "rbtdb.c", line=582, type=isc_assertiontype_insist, cond=0x7fffea178fb8 "!cds_lfht_destroy(rbtdb->common.update_listeners, ((void *)0))") at main.c:234
#5 0x000022cf17dba56a in isc_assertion_failed (file=file@entry=0x7fffea179e8b "rbtdb.c", line=line@entry=582, type=type@entry=isc_assertiontype_insist, cond=cond@entry=0x7fffea178fb8 "!cds_lfht_destroy(rbtdb->common.update_listeners, ((void *)0))") at assertions.c:48
#6 0x00007fffea086b75 in free_rbtdb (rbtdb=rbtdb@entry=0x12675bcbd000, log=log@entry=true) at rbtdb.c:582
#7 0x00007fffea0879fe in dns__rbtdb_destroy (arg=0x12675bcbd000) at rbtdb.c:646
#8 0x00007fffea00d708 in dns__catz_done_cb (data=0x3aff36a446c0) at catz.c:2527
#9 0x000022cf17de247d in isc__after_work_cb (req=<optimized out>, status=0) at work.c:42
#10 0x00007ff3f02268a9 in uv__work_done (handle=0x3aff36a77a50) at src/threadpool.c:329
#11 0x00007ff3f021de63 in uv__async_io (loop=0x3aff36a779a0, w=<optimized out>, events=<optimized out>) at src/unix/async.c:176
#12 0x00007ff3f023bfae in uv__io_poll (loop=0x3aff36a779a0, timeout=<optimized out>) at src/unix/linux.c:1476
#13 0x00007ff3f0223558 in uv_run (loop=loop@entry=0x3aff36a779a0, mode=mode@entry=UV_RUN_DEFAULT) at src/unix/core.c:447
#14 0x000022cf17dcce8c in loop_thread (arg=arg@entry=0x3aff36a77980) at loop.c:282
#15 0x000022cf17ddc1d1 in thread_body (wrap=0x3aff36a9b2e0) at thread.c:85
#16 thread_run (wrap=0x3aff36a9b2e0) at thread.c:100
#17 0x00007ff3f0320947 in start_thread (arg=<optimized out>) at pthread_create.c:444
#18 0x00007ff3f03a6764 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
```
[core.187002-backtrace.txt](/uploads/9403d7b3b309b58248bf58361c766c80/core.187002-backtrace.txt)
[named.run](/uploads/abf3926d07d2f5571e3c1153cbcec69c/named.run)
[core.187002.gz](/uploads/ccf6c2cbe6165d1e7be26cf3f78a00c1/core.187002.gz)
[rr_trace.txz](/uploads/9fb36e52080e2d97c45c0a3c78e33d3d/rr_trace.txz) (`rr pack` for `rr replay`, if needed, from Fedora 38 but should work everywhere)

December 2023 (9.18.21, 9.18.21-S1, 9.19.19)
Arаm Sаrgsyаn