BIND issues - https://gitlab.isc.org/isc-projects/bind9/-/issues (feed updated 2021-10-04T20:14:01Z)

Issue #1395: Convert dns_dispatchmgr_t->buffers to atomic / reference counter
https://gitlab.isc.org/isc-projects/bind9/-/issues/1395
Mark Andrews - 2021-10-04T20:14:01Z
Milestone: BIND 9.17 Backburner

Issue #1550: [ISC-support #15905] rndc stop issued after server (or single zone) rndc reload and during ixfr-from-differences processing leaves the .jnl file corrupted
https://gitlab.isc.org/isc-projects/bind9/-/issues/1550
Cathy Almond - 2024-03-13T21:08:58Z

From Support ticket [#15905](https://support.isc.org/Ticket/Display.html?id=15905)

9.11.x (but I don't anticipate that the BIND version makes any difference)

This problem is readily reproducible (and, I suspect, occurs because "rndc stop" doesn't recognise that the zone is effectively 'dynamic' because it has been reloaded with 'ixfr-from-differences yes;').
Here's what is being done, per the server logs. First the reload (the outcome is also the same with 'reload zone'):
```
09-Jan-2020 14:50:29.157 general: info: received control channel command 'reload' ============>>> RELOAD COMMAND
09-Jan-2020 14:50:29.179 general: info: loading configuration from '/etc/named.conf'
... etcetera
09-Jan-2020 14:50:29.202 general: info: reloading configuration succeeded
09-Jan-2020 14:50:29.202 general: info: reloading zones succeeded
09-Jan-2020 14:50:29.202 general: notice: all zones loaded
09-Jan-2020 14:50:29.202 general: notice: running
```
Note that the reload has completed, as far as the logging is concerned, but it would appear that the regeneration of the .jnl files via 'ixfr-from-differences yes;' has not (high CPU use by named suggests that it is busy doing this).
Then the 'rndc stop' is issued - and it completes almost immediately (no evidence that it is waiting for the processing to complete); in fact, it seems to log that it has aborted a zone reload, even though the previous logging said that the reload *had* completed:
```
09-Jan-2020 14:50:33.211 general: info: received control channel command 'stop -p'
09-Jan-2020 14:50:33.212 general: info: shutting down: flushing changes
.. etcetera (just the logs of the various sockets being closed here)
09-Jan-2020 14:50:33.216 general: error: zone test.com/IN: loading from master file dynamic/test.com.zone failed: operation canceled
09-Jan-2020 14:50:33.216 general: error: zone test.com/IN: not loaded due to errors.
09-Jan-2020 14:50:34.265 general: notice: exiting
```
And then, after restarting named, the zone can no longer be loaded - the journal file does not tally with the zone itself:
```
...
09-Jan-2020 14:51:28.141 general: error: zone test.com/IN: journal rollforward failed: journal out of sync with zone
09-Jan-2020 14:51:28.141 general: error: zone test.com/IN: not loaded due to errors.
09-Jan-2020 14:51:28.142 general: notice: all zones loaded
09-Jan-2020 14:51:28.142 general: notice: running
```
======
Something has gone badly wrong during the 'rndc stop' - which is supposed to be a graceful shutdown of named. I'm assuming that the problem is that the .jnl file itself is corrupt, rather than that something has happened to the zone file on disk - but will ask for more data to confirm this.
```
stop [-p]
Stop the server, making sure any recent changes made through dynamic update
or IXFR are first saved to the master files of the updated zones. If -p is
specified, named's process ID is returned. This allows an external process
to determine when named has completed stopping.
```
Now, since ixfr-from-differences processing could take an age to complete, I don't think it's reasonable to wait forever on the rndc stop. Possibly one solution would be to have a (configurable?) timeout, after which any pending ixfr-from-differences .jnl file generation is terminated and the incomplete .jnl file discarded/removed. After all, the administrator in this scenario has just presented named with a new zone file and asked named to load it - so we know that the full copy of the zone on disk should be good and valid - we just loaded from it.
====
The workaround, if this happens, is presumably to manually discard the corrupted .jnl file and restart named.

Milestone: Not planned

Issue #1573: Rewrite qmin ans.py services
https://gitlab.isc.org/isc-projects/bind9/-/issues/1573
Witold Krecicki - 2021-10-05T07:43:44Z
Milestone: Long-term; Assignee: Witold Krecicki

Issue #1609: Use native rwlock implementation
https://gitlab.isc.org/isc-projects/bind9/-/issues/1609
Ondřej Surý - 2023-11-01T13:36:50Z

As discovered by @wpk and perflab, the pthread_rwlock API gives BIND 9 better performance on modern systems than our native adaptive rwlock implementation.
We should drop our own adaptive rwlock implementation in favor of the system-native rwlock implementation on both POSIX and Windows platforms.

Milestone: March 2023 (9.16.39, 9.16.39-S1, 9.18.13, 9.18.13-S1, 9.19.11); Assignee: Ondřej Surý

Issue #1642: lib/dns/zone.c refactoring
https://gitlab.isc.org/isc-projects/bind9/-/issues/1642
Ondřej Surý - 2023-11-02T16:51:54Z

While reviewing an MR that touches `lib/dns/zone.c`, I have noticed several things:
- [ ] `zone_postload()` needs to be refactored; the function spans from line 4696 to line 5290 (i.e. ~600 lines)
- [ ] `zone_rekey()` also needs to be refactored, as it spans from line 19317 to line 19734
- [ ] `keyfetch_done()` spans from line 9993 to line 10647 (~650 lines)
- [ ] `zone_nsec3chain()` ~900 lines
- [ ] The locked parameter in `zone_locked()` should be removed in favor of just locking the zone prior to the call
- [ ] There's probably a missing lock in `dns_zone_catz_enable_db()`
- [ ] There's probably a missing lock around the `dns_zone_rpz_disable_db()` and `dns_zone_catz_disable_db()` calls
- [ ] There's a lot of unlocked access to dns_zone_t members in `zone_nsec3chain()` and `zone_sign()` at the top of the function
- [ ] There's unlocked access to dns_zone_t members in `zone_maintenance()` (`zone->view`, `zone->type`, `zone->masters`, ...)
- [ ] There's unlocked access to `zone->maxrecords` in `dns_zone_getmaxrecords()` and `dns_zone_setmaxrecords()`
- [ ] There's unlocked access to `zone->origin`, `zone->flags`, `zone->dblock`, and other members of dns_zone_t in `zone_notify()`
- [ ] `zone_rekey()` contains unlocked access to `zone->mctx`, `zone->origin`, `zone->rdclass` and other members
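For the unlocked-accessor items in the list above (e.g. `zone->maxrecords`), the fix is fairly mechanical. Here is a minimal sketch of the locked getter/setter pattern, using a plain pthread mutex and a hypothetical `toy_zone_t` type rather than BIND's actual `dns_zone_t` and zone-locking macros:

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for dns_zone_t: only the fields
 * needed to illustrate the locked-accessor pattern. */
typedef struct {
	pthread_mutex_t lock;
	uint32_t maxrecords;
} toy_zone_t;

/* Take the zone lock around every access, so a concurrent reader can
 * never observe a torn or stale value. */
static uint32_t
toy_zone_getmaxrecords(toy_zone_t *zone) {
	pthread_mutex_lock(&zone->lock);
	uint32_t v = zone->maxrecords;
	pthread_mutex_unlock(&zone->lock);
	return v;
}

static void
toy_zone_setmaxrecords(toy_zone_t *zone, uint32_t value) {
	pthread_mutex_lock(&zone->lock);
	zone->maxrecords = value;
	pthread_mutex_unlock(&zone->lock);
}
```

In the real code the same shape would use the zone's existing lock rather than a private mutex; for a single word-sized field, a C11 atomic would be an equally valid alternative.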
The list here is neither complete nor necessarily correct. Some of the struct members might be read-only (e.g. protected by reference counting), but the struct comment says "/* Locked */", so if they are truly read-only, perhaps casting them to `const` would be the right thing to do. Also, I mostly checked only static functions and looked only at `dns_zone_t *` accesses.

Milestone: Not planned

Issue #1660: Review tcpdns closing
https://gitlab.isc.org/isc-projects/bind9/-/issues/1660
Witold Krecicki - 2020-07-01T20:34:08Z

Closing a TCPDNS socket is tricky; review it thoroughly to make sure there are no races (e.g. between closing the socket and closing socket timers).

Milestone: July 2020 (9.11.21, 9.11.21-S1, 9.16.5, 9.17.3); Assignee: Witold Krecicki

Issue #1673: lib/isc/pk11.c depends on libdns headers
https://gitlab.isc.org/isc-projects/bind9/-/issues/1673
Ondřej Surý - 2020-04-15T11:52:00Z

There's a circular dependency: `lib/isc/pk11.c` requires `<dst/result.h>`, making the libisc build depend on libdns headers. This is weird and wrong and needs to be fixed.

Milestone: April 2020 (9.11.18, 9.16.2, 9.17.1); Assignee: Ondřej Surý

Issue #1692: Use include-what-you-use to trim down included header files
https://gitlab.isc.org/isc-projects/bind9/-/issues/1692
Ondřej Surý - 2023-11-02T16:51:55Z

[include-what-you-use](https://include-what-you-use.org/) is a great tool that can be used to trim down the header files that we include in every file.
> "Include what you use" means this: for every symbol (type, function, variable, or macro) that you use in foo.cc, either foo.cc or foo.h should #include a .h file that exports the declaration of that symbol. The include-what-you-use tool is a program that can be built with the clang libraries in order to analyze #includes of source files to find include-what-you-use violations, and suggest fixes for them.
>
> The main goal of include-what-you-use is to remove superfluous #includes. It does this both by figuring out what #includes are not actually needed for this file (for both .cc and .h files), and replacing #includes with forward-declares when possible.
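To make the forward-declaration substitution concrete, here is a small illustrative C unit (hypothetical `struct bar` and names, not BIND code) showing why a header that only passes pointers around does not need the complete type:

```c
#include <stddef.h>

/* What a header can look like after IWYU replaces an #include with a
 * forward declaration: declaring functions that only take or return
 * pointers does not require the full definition of the struct. */
struct bar;                            /* forward declaration */
size_t bar_count(const struct bar *b); /* prototype compiles fine */

/* Only the implementation (normally a separate .c file that would
 * include the full header) needs the complete type to dereference it. */
struct bar {
	size_t n;
};

size_t
bar_count(const struct bar *b) {
	return b->n;
}
```

Every translation unit that merely passes `struct bar *` through now avoids recompiling whenever the struct's internals change, which is the "Fewer Recompiles" benefit listed below.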
Why is this useful?
Excerpt from [Why Include What You Use?](https://github.com/include-what-you-use/include-what-you-use/blob/master/docs/WhyIWYU.md)
* Faster Compiles
* Fewer Recompiles
* Allow Refactoring
* Self-documentation
* Dependency Cutting

Milestone: Not planned; Assignee: Ondřej Surý

Issue #1702: Make isc_quota and isc_quota_cb opaque
https://gitlab.isc.org/isc-projects/bind9/-/issues/1702
Witold Krecicki - 2023-10-31T13:50:45Z

Issue #1723: Replace fputs() with fprintf()
https://gitlab.isc.org/isc-projects/bind9/-/issues/1723
Michał Kępień - 2020-05-04T10:37:46Z

The following discussion from !985 should be addressed:
- [ ] @michal started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/985#note_120581): (+3 comments)
> Replacing `fputs()` with `fprintf()` sounds like something to do
> tree-wide in another issue (may be possible with Coccinelle)?

Milestone: May 2020 (9.11.19, 9.11.19-S1, 9.14.12, 9.16.3)

Issue #1739: Restore the GSS.framework (and KRB5.framework) on macOS
https://gitlab.isc.org/isc-projects/bind9/-/issues/1739
Ondřej Surý - 2023-11-02T16:51:56Z

!985 has refactored GSSAPI usage and temporarily dropped the use of GSS.framework (using the gssapi and krb5 libraries instead of frameworks). Use of GSS.framework on macOS needs to be restored.

Milestone: Not planned; Assignee: Ondřej Surý

Issue #1740: The gssapi and krb5 headers pollute the public API
https://gitlab.isc.org/isc-projects/bind9/-/issues/1740
Ondřej Surý - 2021-03-22T11:02:58Z

The GSSAPI usage currently exports the gssapi.h headers to the public API of libdns. The use of GSSAPI and KRB5 needs to become opaque, so that it does not pollute the libdns public API and there's no need to include the headers in downstream users of the library.

Milestone: March 2021 (9.11.29, 9.11.29-S1, 9.16.13, 9.16.13-S1, 9.17.11); Assignee: Ondřej Surý

Issue #1758: Clean the duplicate and not-used code from libirs
https://gitlab.isc.org/isc-projects/bind9/-/issues/1758
Ondřej Surý - 2020-04-29T12:16:17Z

The libirs contains:
* reimplementations of `getnameinfo()`, `getaddrinfo()`, `freeaddrinfo()`, and `gai_strerror()` - we can drop these
* the `irs_dnsconf` API, which is experimental and not used anywhere
That leaves us with the irs_resconf and irs_context APIs.

Milestone: May 2020 (9.11.19, 9.11.19-S1, 9.14.12, 9.16.3); Assignee: Ondřej Surý

Issue #1763: Implement and improve the PKCS#11 code
https://gitlab.isc.org/isc-projects/bind9/-/issues/1763
Ondřej Surý - 2020-05-01T14:26:27Z

This is an umbrella issue for !3326, !3330, !3029, and !3467.

Milestone: May 2020 (9.11.19, 9.11.19-S1, 9.14.12, 9.16.3); Assignee: Ondřej Surý

Issue #1771: Refactor how we load librpz.so
https://gitlab.isc.org/isc-projects/bind9/-/issues/1771
Ondřej Surý - 2023-11-02T16:58:14Z

Currently, there are three ways in which `librpz.so` can be linked into BIND 9. In BIND 9.17, `dlopen()` is mandatory (via libltdl), so this needs a little refactoring.

Milestone: Not planned; Assignee: Ondřej Surý

Issue #1776: BIND 9.16 and cache node locks for name cleaning vs. 'the thundering herd'
https://gitlab.isc.org/isc-projects/bind9/-/issues/1776
Cathy Almond - 2021-10-05T12:07:29Z

From [Support ticket #16212](https://support.isc.org/Ticket/Display.html?id=16212)
During investigations of intermittent 'brownouts' - periods in which named seemingly stops actioning client queries for a short period and then resumes processing a second or two later (yes, delays of seconds, not ms) - we 'caught' one interesting scenario on BIND 9.16 in which it appeared that the vast majority of the active threads (both netmgr and taskmgr, so both client queries being answered from cache AND client queries for which recursion had just taken place) were competing for the same cache node lock.
The pstack output demonstrating the problem was automatically triggered by monitoring for anomalies in inbound versus outbound network traffic.
The symptoms when this issue occurs are that:
* Outbound client-facing traffic rates plummet (well below the proportion that you would expect to see if it was only cache misses not being serviced)
* Recursive query rates plummet too
* CPU use increases - but in user space not in system space
* Recursive clients backlog increases (and may hit the limit)
* Fetch limits may be triggered (we suspect this and its predecessor are symptoms, not causes; however, triggering fetch limits will exacerbate the situation, both from the client perspective and through increased traffic rates as clients retry/re-send)
What we saw in the pstacks was that the majority of netmgr threads (these answer directly from cache) were attempting to get a write lock on the node - for example:
```
Thread 74 (Thread 0x7f3ff366e700 (LWP 11713)):
#0 isc_rwlock_lock (rwl=rwl@entry=0x7f3f59523980, type=type@entry=isc_rwlocktype_write) at rwlock.c:57
#1 0x000000000051d826 in decrement_reference (rbtdb=rbtdb@entry=0x7f3fc6457010, node=node@entry=0x7f3eace34510, least_serial=least_serial@entry=0, nlock=nlock@entry=isc_rwlocktype_read, tlock=tlock@entry=isc_rwlocktype_none, pruning=pruning@entry=false) at rbtdb.c:2040
#2 0x00000000005215bf in detachnode (db=0x7f3fc6457010, targetp=targetp@entry=0x7f3ff366da88) at rbtdb.c:5352
#3 0x00000000005217be in rdataset_disassociate (rdataset=<optimized out>) at rbtdb.c:8691
#4 0x00000000005657e8 in dns_rdataset_disassociate (rdataset=rdataset@entry=0x7f3fad30cf28) at rdataset.c:111
#5 0x00000000004ebb21 in msgresetnames (first_section=0, msg=0x7f3fad2e1a50, msg@entry=0x7f3fad30b5f0) at message.c:438
#6 msgreset (msg=msg@entry=0x7f3fad2e1a50, everything=everything@entry=false) at message.c:524
#7 0x00000000004ec95a in dns_message_reset (msg=0x7f3fad2e1a50, intent=intent@entry=1) at message.c:760
#8 0x00000000004797ba in ns_client_endrequest (client=0x7f3fae5b8550) at client.c:229
#9 ns__client_reset_cb (client0=0x7f3fae5b8550) at client.c:1586
#10 0x0000000000632989 in isc_nmhandle_unref (handle=handle@entry=0x7f3fae5b83e0) at netmgr.c:1158
#11 0x0000000000632c30 in isc__nm_uvreq_put (req0=req0@entry=0x7f3ff366dbb8, sock=<optimized out>) at netmgr.c:1291
#12 0x00000000006357c4 in udp_send_cb (req=<optimized out>, status=<optimized out>) at udp.c:465
#13 0x00007f3ff5375153 in uv__udp_run_completed () from /lib64/libuv.so.1
#14 0x00007f3ff53754d3 in uv__udp_io () from /lib64/libuv.so.1
#15 0x00007f3ff5367c43 in uv_run () from /lib64/libuv.so.1
#16 0x0000000000632fda in nm_thread (worker0=0x138e3e0) at netmgr.c:481
#17 0x00007f3ff4f39e65 in start_thread () from /lib64/libpthread.so.0
#18 0x00007f3ff484488d in clone () from /lib64/libc.so.6
```
A handful of threads are attempting to get a read lock on the same node - for example:
```
Thread 59 (Thread 0x7f3feab0e700 (LWP 11734)):
#0 0x00007f3ff4f3d144 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#1 0x000000000063cc6e in isc_rwlock_lock (rwl=0x7f3f59523980, type=type@entry=isc_rwlocktype_read) at rwlock.c:48
#2 0x00000000005129c6 in rdataset_getownercase (rdataset=<optimized out>, name=0x7f3feaaffde0) at rbtdb.c:9770
#3 0x000000000056620a in towiresorted (rdataset=rdataset@entry=0x7f3ec42dee70, owner_name=owner_name@entry=0x7f3ec42dd0a0, cctx=<optimized out>, target=<optimized out>, order=<optimized out>, order_arg=order_arg@entry=0x7f3ec42b8718, partial=true, options=1, countp=0x7f3feab005dc, state=<optimized out>) at rdataset.c:444
#4 0x0000000000566e3f in dns_rdataset_towirepartial (rdataset=rdataset@entry=0x7f3ec42dee70, owner_name=owner_name@entry=0x7f3ec42dd0a0, cctx=<optimized out>, target=<optimized out>, order=<optimized out>, order_arg=order_arg@entry=0x7f3ec42b8718, options=<optimized out>, options@entry=1, countp=<optimized out>, countp@entry=0x7f3feab005dc, state=<optimized out>, state@entry=0x0) at rdataset.c:565
#5 0x00000000004ecc71 in dns_message_rendersection (msg=0x7f3ec42b8550, sectionid=sectionid@entry=1, options=options@entry=6) at message.c:2086
#6 0x00000000004780f3 in ns_client_send (client=client@entry=0x7f3ec5d4b510) at client.c:555
#7 0x0000000000485b7c in query_send (client=0x7f3ec5d4b510) at query.c:552
#8 0x000000000048de23 in ns_query_done (qctx=qctx@entry=0x7f3feab09a70) at query.c:10921
#9 0x000000000048f76d in query_respond (qctx=0x7f3feab09a70) at query.c:7414
#10 query_prepresponse (qctx=qctx@entry=0x7f3feab09a70) at query.c:9913
#11 0x000000000049181c in query_gotanswer (qctx=qctx@entry=0x7f3feab09a70, res=res@entry=0) at query.c:6836
#12 0x0000000000493a22 in query_lookup (qctx=qctx@entry=0x7f3feab09a70) at query.c:5617
#13 0x00000000004950f6 in query_zone_delegation (qctx=0x7f3feab09a70) at query.c:8003
#14 query_delegation (qctx=qctx@entry=0x7f3feab09a70) at query.c:8031
#15 0x0000000000491a1a in query_gotanswer (qctx=qctx@entry=0x7f3feab09a70, res=res@entry=65565) at query.c:6842
#16 0x0000000000493a22 in query_lookup (qctx=qctx@entry=0x7f3feab09a70) at query.c:5617
#17 0x0000000000494036 in ns__query_start (qctx=qctx@entry=0x7f3feab09a70) at query.c:5493
#18 0x000000000048de05 in ns_query_done (qctx=qctx@entry=0x7f3feab09a70) at query.c:10853
#19 0x0000000000492420 in query_dname (qctx=<optimized out>) at query.c:9806
#20 query_gotanswer (qctx=qctx@entry=0x7f3feab09a70, res=res@entry=65568) at query.c:6872
#21 0x0000000000493a22 in query_lookup (qctx=qctx@entry=0x7f3feab09a70) at query.c:5617
#22 0x00000000004950f6 in query_zone_delegation (qctx=0x7f3feab09a70) at query.c:8003
#23 query_delegation (qctx=qctx@entry=0x7f3feab09a70) at query.c:8031
#24 0x0000000000491a1a in query_gotanswer (qctx=qctx@entry=0x7f3feab09a70, res=res@entry=65565) at query.c:6842
#25 0x0000000000493a22 in query_lookup (qctx=qctx@entry=0x7f3feab09a70) at query.c:5617
#26 0x0000000000494036 in ns__query_start (qctx=qctx@entry=0x7f3feab09a70) at query.c:5493
#27 0x000000000048de05 in ns_query_done (qctx=qctx@entry=0x7f3feab09a70) at query.c:10853
#28 0x0000000000492420 in query_dname (qctx=<optimized out>) at query.c:9806
#29 query_gotanswer (qctx=qctx@entry=0x7f3feab09a70, res=res@entry=65568) at query.c:6872
#30 0x0000000000493a22 in query_lookup (qctx=qctx@entry=0x7f3feab09a70) at query.c:5617
#31 0x00000000004950f6 in query_zone_delegation (qctx=0x7f3feab09a70) at query.c:8003
#32 query_delegation (qctx=qctx@entry=0x7f3feab09a70) at query.c:8031
#33 0x0000000000491a1a in query_gotanswer (qctx=qctx@entry=0x7f3feab09a70, res=res@entry=65565) at query.c:6842
#34 0x0000000000493a22 in query_lookup (qctx=qctx@entry=0x7f3feab09a70) at query.c:5617
#35 0x0000000000494036 in ns__query_start (qctx=qctx@entry=0x7f3feab09a70) at query.c:5493
#36 0x0000000000494b26 in query_setup (client=client@entry=0x7f3ec5d4b510, qtype=<optimized out>) at query.c:5217
#37 0x0000000000497056 in ns_query_start (client=client@entry=0x7f3ec5d4b510) at query.c:11318
#38 0x000000000047b101 in ns__client_request (handle=<optimized out>, region=<optimized out>, arg=<optimized out>) at client.c:2209
#39 0x0000000000635462 in udp_recv_cb (handle=<optimized out>, nrecv=48, buf=0x7f3feab0ab00, addr=<optimized out>, flags=<optimized out>) at udp.c:329
#40 0x00007f3ff53755db in uv__udp_io () from /lib64/libuv.so.1
#41 0x00007f3ff53779c8 in uv__io_poll () from /lib64/libuv.so.1
#42 0x00007f3ff5367c70 in uv_run () from /lib64/libuv.so.1
#43 0x0000000000632fda in nm_thread (worker0=0x13926e8) at netmgr.c:481
#44 0x00007f3ff4f39e65 in start_thread () from /lib64/libpthread.so.0
#45 0x00007f3ff484488d in clone () from /lib64/libc.so.6
```
Meanwhile, the threads run by taskmgr (this bunch would have recursed) were attempting to get write locks (unsurprisingly, although depending on the node and the client query, I guess it's also possible that one might want to get a read lock):
Here's a writer:
```
Thread 50 (Thread 0x7f3fe587b700 (LWP 11746)):
#0 isc_rwlock_lock (rwl=rwl@entry=0x7f3f59523980, type=type@entry=isc_rwlocktype_write) at rwlock.c:57
#1 0x000000000051d826 in decrement_reference (rbtdb=rbtdb@entry=0x7f3fc6457010, node=node@entry=0x7f3eace34510, least_serial=least_serial@entry=0, nlock=nlock@entry=isc_rwlocktype_read, tlock=tlock@entry=isc_rwlocktype_none, pruning=pruning@entry=false) at rbtdb.c:2040
#2 0x00000000005215bf in detachnode (db=0x7f3fc6457010, targetp=0x7f3fe587acc0) at rbtdb.c:5352
#3 0x00000000004bdd83 in dns_db_detachnode (db=<optimized out>, nodep=nodep@entry=0x7f3fe587acc0) at db.c:588
#4 0x00000000004804cb in qctx_clean (qctx=qctx@entry=0x7f3fe587a830) at query.c:5097
#5 0x000000000048db5a in ns_query_done (qctx=qctx@entry=0x7f3fe587a830) at query.c:10834
#6 0x000000000048f76d in query_respond (qctx=0x7f3fe587a830) at query.c:7414
#7 query_prepresponse (qctx=qctx@entry=0x7f3fe587a830) at query.c:9913
#8 0x000000000049181c in query_gotanswer (qctx=qctx@entry=0x7f3fe587a830, res=res@entry=0) at query.c:6836
#9 0x0000000000496870 in query_resume (qctx=0x7f3fe587a830) at query.c:6134
#10 fetch_callback (task=<optimized out>, event=0x7f3ead5c9c18) at query.c:5716
#11 0x000000000064007a in dispatch (threadid=<optimized out>, manager=<optimized out>) at task.c:1152
#12 run (queuep=<optimized out>) at task.c:1344
#13 0x00007f3ff4f39e65 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f3ff484488d in clone () from /lib64/libc.so.6
```
In this particular instance, every single one of the legacy i/o-handler threads was twiddling its thumbs (sitting on epoll_wait()) - which is probably not too surprising if no taskmgr workers are sending out queries to auth servers?
Doing stats on this particular capture (74 threads: 24x netmgr, 24x taskmgr, 24x legacy i/o, plus one each for the main and timer threads), we have:

* 33 instances of `isc_rwlock_lock (rwl=rwl@entry=0x7f3f59523980`
* 31 instances of `rbtdb=rbtdb@entry=0x7f3fc6457010`
* 30 instances of `node=node@entry=0x7f3eace34510`
It might be possible to prove from the pstack output whether this is a series of different names all attached to the same node, versus a single expiring name that all of the threads are attempting to clean up simultaneously.
Either way, the locking is not working well in this situation - there appears to be a lot of spinning in user space.
Hypotheses being tendered currently include:
* This scenario has always potentially existed, but using pthread-rwlocks amplifies it considerably
* Could this be a case where prefetching (enabled with default settings in this example) hits a surprise edge case?
* Is it possible we're seeing the after-effects of another delay which has resulted in late client query-response processing for something that has a very short TTL in cache?
* Is this a scenario where a client comes along and queries near-simultaneously (and probably quite innocently) for a lot of similar names under the same domain/apex very close to the time where they would all be naturally expiring from cache?
* Could it be that TTL=0 handling has broken in 9.16 with the introduction of netmgr (noting that TTL=0 responses from auth servers would be expected to be available solely to the clients that recursed and waited for the fetch completion - not to anyone who came along after the fetch had populated cache for the waiting client request to be fulfilled - this should all be in taskmgr and none of it in netmgr)?
* Do we perhaps have too many threads running (detected CPUs = 24)?

Milestone: BIND 9.19.x; Assignee: Ondřej Surý

Issue #1778: Cleanup the final remnants of platform.h
https://gitlab.isc.org/isc-projects/bind9/-/issues/1778
Ondřej Surý - 2021-10-05T12:07:45Z

There are still a few remaining bits in the `platform.h` header that we need to remove to finally get rid of the header.

Milestone: BIND 9.17 Backburner; Assignee: Ondřej Surý

Issue #1892: Reuse nmsockets in TCP
https://gitlab.isc.org/isc-projects/bind9/-/issues/1892
Witold Krecicki - 2022-12-14T09:00:17Z

We currently don't reuse isc_nmsocket_t sockets at all, destroying them after a connection is closed. That's a performance hit for TCP. We should put semi-ready (allocated + cond/mutex initialized) objects on a stack for reuse, just like we do with uvreqs and handles.

Milestone: BIND 9.19.x

Issue #1912: Refactor `fctx->client` to store just preformatted text
https://gitlab.isc.org/isc-projects/bind9/-/issues/1912
Ondřej Surý - 2020-06-05T14:21:14Z

Per https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/3575#note_133991

Milestone: BIND 9.17 Backburner

Issue #1920: [netmgr] tcpdns ineffective
https://gitlab.isc.org/isc-projects/bind9/-/issues/1920
Ondřej Surý - 2020-12-09T09:53:34Z

While reviewing some other stuff, in `lib/isc/netmgr/tcp.c` there's this code:
```
t->region = (isc_region_t){ .base = isc_mem_get(t->mctx,
region->length + 2),
.length = region->length + 2 };
*(uint16_t *)t->region.base = htons(region->length);
memmove(t->region.base + 2, region->base, region->length);
```
1) the memory returned by `isc_mem_get()` isn't guaranteed to be aligned, so this is an unaligned write to memory.
2) doing `memmove()` just to add two bytes at the beginning of the buffer is inefficient.
Neither is to be fixed in this MR, but it should be fixed. The easiest way would be to rewrite the I/O functions to work with `iovec`-like buffers, similar to what `uv_write()` does with `uv_buf_t[]`. Then it would be easy to just pass two buffers here, one with the length and one with the payload.

Milestone: BIND 9.17 Backburner
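As a standalone sketch (hypothetical helper name, not the actual netmgr code) of how the unaligned store in point 1 could be avoided even before any buffer rewrite: `memcpy()` has no alignment requirement, unlike the `*(uint16_t *)base = htons(...)` store quoted above, which is undefined behaviour when `base` is not 2-byte aligned.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Write the 2-byte DNS-over-TCP length prefix in network byte order
 * at an arbitrary (possibly unaligned) address, alignment-safely. */
static void
put_dns_length_prefix(unsigned char *base, uint16_t msglen) {
	uint16_t be = htons(msglen);
	memcpy(base, &be, sizeof(be));
}
```

The fuller fix the issue suggests would remove the `memmove()` as well: since `uv_write()` accepts an array of `uv_buf_t`s, the two-byte length could be sent as its own tiny buffer ahead of the untouched payload buffer.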