BIND issues
https://gitlab.isc.org/isc-projects/bind9/-/issues
Updated: 2024-02-09T10:56:23Z

https://gitlab.isc.org/isc-projects/bind9/-/issues/4491
Use RCU instead of rwlock in isc_log unit
Updated: 2024-02-09T10:56:23Z
Author: Ondřej Surý

While adding some extra logging for debugging purposes, I've noticed that the RWLOCK in the isc_log unit can be replaced with RCU. I think this would be a nice introductory issue for @aydin.

Milestone: March 2024 (9.16.49, 9.16.49-S1, 9.18.25, 9.18.25-S1, 9.19.22)
Assignee: Aydın Mercan

https://gitlab.isc.org/isc-projects/bind9/-/issues/4459
[CVE-2023-50868] Preparing an NSEC3 closest encloser proof can exhaust CPU resources
Updated: 2024-03-28T14:11:11Z
Author: Petr Špaček (pspacek@isc.org)

| Quick Links | :link: |
| ------------------------ | ------------------------------------------------------------------------------ |
| Incident Manager: | @pspacek |
| Deputy Incident Manager: | @ebf |
| Public Disclosure Date: | 2024-02-13 |
| CVSS Score: | [7.5][cvss_score] |
| Security Advisory: | isc-private/printing-press!93 |
| Mattermost Channel: | [CVE-2023-50868: NSEC3 closest encloser proof can exhaust CPU][mattermost_url] |
| Support Ticket: | N/A |
| Release Checklist: | #4555 |
[cvss_score]: https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H&version=3.1
[mattermost_url]: https://mattermost.isc.org/isc/channels/cve-2023-50868-nsec3-closest-encloser-proof-can-exhaust-cpu
:bulb: **Click [here][checklist_explanations] (internal resource) for general information about the security incident handling process.**
[checklist_explanations]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations
### Earlier Than T-5
- [x] [:link:][step_deputy] **(IM)** Pick a Deputy Incident Manager
- :no_entry_sign: [:link:][step_respond] **(IM)** Respond to the bug reporter - found internally by @pspacek
- [x] [:link:][step_public_mrs] **(SwEng)** Ensure there are no public merge requests which inadvertently disclose the issue
- [x] [:link:][step_assign_cve_id] **(IM)** Assign a CVE identifier
- [x] [:link:][step_note_cve_info] **(SwEng)** Update this issue with the assigned CVE identifier and the CVSS score
- [x] [:link:][step_versions_affected] **(SwEng)** Determine the range of product versions affected (including the Subscription Edition)
- [x] [:link:][step_workarounds] **(SwEng)** Determine whether workarounds for the problem exist
- [x] [:link:][step_coordinate] **(SwEng)** :warning: Coordinate with other parties :warning:
- [x] [:link:][step_earliest_prepare] **(Support)** ~~Prepare "earliest" notification text and hand it off to Marketing~~
- [x] [:link:][step_earliest_send] **(Marketing)** ~~Update "earliest" notification document in SF portal and send bulk email to earliest customers~~
- [x] [:link:][step_advisory_mr] **(Support)** [Create a merge request for the Security Advisory and include all readily available information in it](isc-private/printing-press!93)
- [x] [:link:][step_reproducer_mr] **(SwEng)** ~~[Prepare a private merge request containing a system test reproducing the problem](#note_434474)~~
- [x] [:link:][step_notify_support] **(SwEng)** ~~Notify Support when a reproducer is ready~~
- [x] [:link:][step_code_analysis] **(SwEng)** [Prepare a detailed explanation of the code flow triggering the problem](#note_434480)
- [x] [:link:][step_fix_mr] **(SwEng)** ~~[Prepare a private merge request with the fix](#note_434483)~~
- [x] [:link:][step_review_fix] **(SwEng)** ~~[Ensure the merge request with the fix is reviewed and has no outstanding discussions](#note_434483)~~
- [x] [:link:][step_review_docs] **(Support)** ~~[Review the documentation changes introduced by the merge request with the fix](#note_434483)~~
- [x] [:link:][step_backports] **(SwEng)** ~~[Prepare backports of the merge request addressing the problem for all affected (and still maintained) branches of a given product](#note_434483)~~
- [x] [:link:][step_finish_advisory] **(Support)** Finish preparing the Security Advisory
- [x] [:link:][step_meta_issue] **(QA)** Create (or update) the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle
- [x] [:link:][step_changes] **(QA)** (BIND 9 only) Reserve a block of `CHANGES` placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined
- [x] [:link:][step_merge_fixes] **(QA)** ~~[Merge the CVE fixes in CVE identifier order](#note_434483)~~
- [x] [:link:][step_patches] **(QA)** ~~[Prepare a standalone patch for the last stable release of each affected (and still maintained) product branch](#note_434483)~~
- [x] [:link:][step_asn_releases] **(QA)** Prepare ASN releases (as outlined in the Release Checklist)
### At T-5
- [x] [:link:][step_asn_documents] **(Marketing)** Update the text on the T-5 (from the Printing Press project) and "earliest" ASN documents in the SF portal
- [x] [:link:][step_asn_links] **(Marketing)** (BIND 9 only) Update the BIND -S information document in SF with download links to the new versions
- [x] [:link:][step_asn_send] **(Marketing)** Bulk email eligible customers to check the SF portal
- [x] [:link:][step_preannouncement] **(Marketing)** (BIND 9 only) Send a pre-announcement email to the *bind-announce* mailing list to alert users that the upcoming release will include security fixes
### At T-1
- [x] [:link:][step_packager_emails] **(First IM)** Send notifications to OS packagers
### On the Day of Public Disclosure
- [x] [:link:][step_clearance] **(IM)** [Grant QA & Marketing clearance to proceed with public release](https://mattermost.isc.org/isc/pl/rxzn1b4upbnjxrbq75dqx1m96o)
- [x] [:link:][step_publish] **(QA/Marketing)** Publish the releases (as outlined in the release checklist)
- [x] [:link:][step_matrix] **(Support)** (BIND 9 only) Add the new CVEs to the vulnerability matrix in the Knowledge Base
- [x] [:link:][step_publish_advisory] **(Support)** Bump Document Version for the Security Advisory and publish it in the Knowledge Base
- [x] [:link:][step_notifications] **(First IM)** Send notification emails to third parties
- [x] [:link:][step_mitre] **(First IM)** ~~[Advise MITRE about the disclosed CVEs](#note_436522)~~
- [x] [:link:][step_merge_advisory] **(First IM)** Merge the Security Advisory merge request
- [x] [:link:][step_embargo_end] **(IM)** Inform original reporter (if external) that the security disclosure process is complete
- [x] [:link:][step_asn_clear] **(Marketing)** Update the SF portal to clear the ASN
- [x] [:link:][step_customers] **(Marketing)** Email ASN recipients that the embargo is lifted
### After Public Disclosure
- [x] [:link:][step_regression] **(QA)** ~~[Merge a regression test reproducing the bug into all affected (and still maintained) branches](#note_434474)~~
[step_deputy]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#pick-a-deputy-incident-manager
[step_respond]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#respond-to-the-bug-reporter
[step_public_mrs]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#ensure-there-are-no-public-merge-requests-which-inadvertently-disclose-the-issue
[step_assign_cve_id]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#assign-a-cve-identifier
[step_note_cve_info]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#update-this-issue-with-the-assigned-cve-identifier-and-the-cvss-score
[step_versions_affected]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#determine-the-range-of-product-versions-affected-including-the-subscription-edition
[step_workarounds]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#determine-whether-workarounds-for-the-problem-exist
[step_coordinate]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#if-necessary-coordinate-with-other-parties
[step_earliest_prepare]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-earliest-notification-text-and-hand-it-off-to-marketing
[step_earliest_send]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#update-earliest-notification-document-in-sf-portal-and-send-bulk-email-to-earliest-customers
[step_advisory_mr]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#create-a-merge-request-for-the-security-advisory-and-include-all-readily-available-information-in-it
[step_reproducer_mr]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-private-merge-request-containing-a-system-test-reproducing-the-problem
[step_notify_support]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#notify-support-when-a-reproducer-is-ready
[step_code_analysis]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-detailed-explanation-of-the-code-flow-triggering-the-problem
[step_fix_mr]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-private-merge-request-with-the-fix
[step_review_fix]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#ensure-the-merge-request-with-the-fix-is-reviewed-and-has-no-outstanding-discussions
[step_review_docs]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#review-the-documentation-changes-introduced-by-the-merge-request-with-the-fix
[step_backports]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-backports-of-the-merge-request-addressing-the-problem-for-all-affected-and-still-maintained-branches-of-a-given-product
[step_finish_advisory]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#finish-preparing-the-security-advisory
[step_meta_issue]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#create-or-update-the-private-issue-containing-links-to-fixes-reproducers-for-all-cves-fixed-in-a-given-release-cycle
[step_changes]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-reserve-a-block-of-changes-placeholders-once-the-complete-set-of-vulnerabilities-fixed-in-a-given-release-cycle-is-determined
[step_merge_fixes]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#merge-the-cve-fixes-in-cve-identifier-order
[step_patches]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-standalone-patch-for-the-last-stable-release-of-each-affected-and-still-maintained-product-branch
[step_asn_releases]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-asn-releases-as-outlined-in-the-release-checklist
[step_asn_documents]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#update-the-text-on-the-t-5-from-the-printing-press-project-and-earliest-asn-documents-in-the-sf-portal
[step_asn_links]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-update-the-bind-s-information-document-in-sf-with-download-links-to-the-new-versions
[step_asn_send]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bulk-email-eligible-customers-to-check-the-sf-portal
[step_preannouncement]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-send-a-pre-announcement-email-to-the-bind-announce-mailing-list-to-alert-users-that-the-upcoming-release-will-include-security-fixes
[step_packager_emails]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#send-notifications-to-os-packagers
[step_clearance]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#grant-qa-marketing-clearance-to-proceed-with-public-release
[step_publish]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#publish-the-releases-as-outlined-in-the-release-checklist
[step_matrix]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-add-the-new-cves-to-the-vulnerability-matrix-in-the-knowledge-base
[step_publish_advisory]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bump-document-version-for-the-security-advisory-and-publish-it-in-the-knowledge-base
[step_notifications]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#send-notification-emails-to-third-parties
[step_mitre]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#advise-mitre-about-the-disclosed-cves
[step_merge_advisory]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#merge-the-security-advisory-merge-request
[step_embargo_end]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#inform-original-reporter-if-external-that-the-security-disclosure-process-is-complete
[step_asn_clear]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#update-the-sf-portal-to-clear-the-asn
[step_customers]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#email-asn-recipients-that-the-embargo-is-lifted
[step_regression]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#merge-a-regression-test-reproducing-the-bug-into-all-affected-and-still-maintained-branches
### Reproducer
1. Sign an empty zone with NSEC3, 150 iterations, and the same NSEC3 salt for good measure:
   - [local.testiscorg.ch.zone](/uploads/b4a147bdabff809350e0a7a7b802758e/local.testiscorg.ch.zone)
   - [Klocal.testiscorg.ch.+014+01043.key](/uploads/27aa1a99ac52e271ae1bf618c7fc4138/Klocal.testiscorg.ch.+014+01043.key)
   - [Klocal.testiscorg.ch.+014+01043.private](/uploads/346295e7f71ed644dd44bb93e52ea531/Klocal.testiscorg.ch.+014+01043.private)
   - `dnssec-signzone -u -3 0122345678912345 -H 150 -e 20380101000000 -S -o local.testiscorg.ch -O full -z local.testiscorg.ch.zone Klocal.testiscorg.ch.+014+01043`
   - :point_right_tone1: [local.testiscorg.ch.zone.signed](/uploads/ba12811b13cc749084b6c1cef0c3a04a/local.testiscorg.ch.zone.signed)
2. Run an auth with the zone:
   - [auth.conf](/uploads/51139fed8b8efe23eb58b82ce4b82379/auth.conf)
   - `named -g -c auth.conf`
3. Run a resolver with the zone:
   - [resolver.conf](/uploads/2b9a661105397636c55f9c5be13d8855/resolver.conf)
   - `named -g -c resolver.conf`
4. Run the attack using dnsperf:
   - [randlabels.py](/uploads/30b54afbe090da16c06855f5561755df/randlabels.py)
   - `python randlabels.py | dnsperf -s 127.0.0.1 -S1`
### Observed behavior
At around 200 QPS, one CPU is maxed out. Tweaking dnsperf parameters can max out all CPUs at ~200 QPS per core.
### Problem
For NSEC3, we have to hash all the labels between QNAME and the zone name to find a matching NSEC3 RR in the authority section. This inflates the number of hashes to potentially ~ `127 labels * <NSEC3 iterations> * <number of NSEC3 RRs in the message>`.
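The estimate above can be made concrete with a small sketch. It is not BIND's code: the function names are hypothetical, only hash algorithm 1 (SHA-1) is handled, names are assumed to be plain ASCII without escapes, and the query name in the demo is made up. The zone name, salt, and iteration count are taken from the reproducer. The hash follows RFC 5155 section 5, where the iteration count is the number of *additional* digest rounds, so each name costs `iterations + 1` SHA-1 calls.

```python
import base64
import hashlib

# Map the standard base32 alphabet to base32hex (RFC 4648), which NSEC3
# owner names use.
_TO_B32HEX = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    "0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def split_labels(name):
    """Lowercased labels of a plain ASCII name (no escape handling)."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def to_wire(name):
    """Canonical wire format: length-prefixed labels plus the root label."""
    return b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in split_labels(name)
    ) + b"\x00"


def nsec3_hash(name, salt, iterations):
    """SHA-1 NSEC3 hash: one salted digest plus `iterations` more rounds."""
    digest = hashlib.sha1(to_wire(name) + salt).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
    return base64.b32encode(digest).decode("ascii").translate(_TO_B32HEX).lower()


def candidate_names(qname, zone):
    """QNAME and each of its ancestors, down to the zone apex."""
    q, z = split_labels(qname), split_labels(zone)
    if len(z) > len(q) or q[len(q) - len(z):] != z:
        raise ValueError("qname is not within zone")
    return [".".join(q[i:]) for i in range(len(q) - len(z) + 1)]


def worst_case_digests(labels, iterations, nsec3_rrs):
    """Upper bound in the spirit of the estimate above, counting SHA-1 calls."""
    return labels * (iterations + 1) * nsec3_rrs


if __name__ == "__main__":
    salt = bytes.fromhex("0122345678912345")  # salt from the reproducer command
    zone = "local.testiscorg.ch"
    # Hypothetical query name; every candidate must be hashed by the validator.
    for name in candidate_names("x.y.z." + zone, zone):
        print(name, nsec3_hash(name, salt, 150))
    print(worst_case_digests(127, 150, 1))
```

With 127 labels, 150 iterations, and a single NSEC3 RR, the bound is already 127 * 151 = 19177 SHA-1 calls for one response, which is consistent with a few hundred queries per second saturating a core.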
We have to cap this somehow. Coordination with other vendors is needed because current versions of BIND, Unbound, Knot Resolver, and PowerDNS are affected. This seems like a protocol issue, so other vendors are most likely also affected; see the NSEC3 algorithm here: https://datatracker.ietf.org/doc/html/rfc5155#section-8.3

Milestone: February 2024 (9.16.47/9.16.48, 9.16.47/9.16.48-S1, 9.18.23/9.18.24, 9.18.23/9.18.24-S1, 9.19.21)

https://gitlab.isc.org/isc-projects/bind9/-/issues/4451
Cache overmem setting is not reset
Updated: 2023-12-06T18:29:37Z
Author: Mark Andrews

Set a low cache size, then send the server a query stream. Once the cache fills, the server does not recover.
```
options {
    listen-on port 5555 { 127.0.0.1; };
    listen-on-v6 port 5555 { ::1; };
    pid-file none;
    max-cache-size 1M;
};
```

Milestone: December 2023 (9.18.21, 9.18.21-S1, 9.19.19)

https://gitlab.isc.org/isc-projects/bind9/-/issues/4411
QPDB lite
Updated: 2024-03-11T09:12:31Z
Author: Petr Špaček (pspacek@isc.org)

Data structure replacement: Replace `dns_rbt_` calls in RBTDB with single-threaded qptrie-equivalents.
Assumptions:
- We will use the structure's self-cleaning ability instead of the wonky RBT cleaning.
- It is not hard and can be done before 9.20 cutoff.
Test plan:
- Write a shim that acts like the current `dns_rbt_` API, then fuzz the old RBT in lockstep with the new qptrie-based API and check that the API behavior is consistent. If the two match, this should be a relatively low-risk operation.

Milestone: March 2024 (9.16.49, 9.16.49-S1, 9.18.25, 9.18.25-S1, 9.19.22)
Assignee: Matthijs Mekking (matthijs@isc.org)

https://gitlab.isc.org/isc-projects/bind9/-/issues/4383
[CVE-2023-6516] Specific recursive query patterns may lead to an out-of-memory condition
Updated: 2024-03-28T14:03:40Z
Author: Peter Davies

| Quick Links | :link: |
| ------------------------ | -------------------------------------------------------------------------------- |
| Incident Manager: | @michal |
| Deputy Incident Manager: | @chuck |
| Public Disclosure Date: | 2024-02-13 |
| CVSS Score: | [7.5][cvss_score] |
| Security Advisory: | isc-private/printing-press!80 |
| Mattermost Channel: | [CVE-2023-6516: cache tree pruning may exhaust available memory][mattermost_url] |
| Support Ticket: | [SF#1406][support_ticket] |
| Release Checklist: | #4515 & #4555 |
[cvss_score]: https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H&version=3.1
[mattermost_url]: https://mattermost.isc.org/isc/channels/cve-2023-6516
[support_ticket]: https://isc.lightning.force.com/lightning/r/Case/5007V00002ZRUsMQAX/view
:bulb: **Click [here][checklist_explanations] (internal resource) for general information about the security incident handling process.**
[checklist_explanations]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations
### Earlier Than T-5
- [x] [:link:][step_deputy] **(IM)** Pick a Deputy Incident Manager
- [x] [:link:][step_respond] **(IM)** [Respond to the bug reporter](https://mattermost.isc.org/isc/pl/98sf37irnpb4jxj3fxdcu8c7so)
- [x] [:link:][step_public_mrs] **(SwEng)** Ensure there are no public merge requests which inadvertently disclose the issue
- [x] [:link:][step_assign_cve_id] **(IM)** [Assign a CVE identifier](#note_421371)
- [x] [:link:][step_note_cve_info] **(SwEng)** Update this issue with the assigned CVE identifier and the CVSS score
- [x] [:link:][step_versions_affected] **(SwEng)** [Determine the range of product versions affected (including the Subscription Edition)](#note_421380)
- [x] [:link:][step_workarounds] **(SwEng)** [Determine whether workarounds for the problem exist](#note_421384)
- [x] [:link:][step_coordinate] ~~**(SwEng)** If necessary, coordinate with other parties~~
- [x] [:link:][step_earliest_prepare] **(Support)** [Prepare "earliest" notification text and hand it off to Marketing](https://mattermost.isc.org/isc/pl/ixtdyafg7bgx5kr54omaphd1aw)
- [x] [:link:][step_earliest_send] **(Marketing)** [Update "earliest" notification document in SF portal and send bulk email to earliest customers](https://mattermost.isc.org/isc/pl/5r94b8qgo7ramx3m3cy54wsy8e)
- [x] [:link:][step_advisory_mr] **(Support)** [Create a merge request for the Security Advisory and include all readily available information in it](isc-private/printing-press!80)
- [x] [:link:][step_reproducer_mr] ~~**(SwEng)** [Prepare a private merge request containing a system test reproducing the problem](#note_421405)~~
- [x] [:link:][step_notify_support] **(SwEng)** [Notify Support when a reproducer is ready](https://mattermost.isc.org/isc/pl/w61138331pryjgr79mjpa88qby)
- [x] [:link:][step_code_analysis] **(SwEng)** [Prepare a detailed explanation of the code flow triggering the problem](#note_416989)
- [x] [:link:][step_fix_mr] **(SwEng)** [Prepare a private merge request with the fix](isc-private/bind9!619)
- [x] [:link:][step_review_fix] **(SwEng)** [Ensure the merge request with the fix is reviewed and has no outstanding discussions](https://gitlab.isc.org/isc-private/bind9/-/merge_requests/621#note_426816)
- [x] [:link:][step_review_docs] **(Support)** [Review the documentation changes introduced by the merge request with the fix](https://mattermost.isc.org/isc/pl/gmtdcmamsirgmjupeff77rtqoc)
- [x] [:link:][step_backports] **(SwEng)** Prepare backports of the merge request addressing the problem for all affected (and still maintained) branches of a given product
- [x] [:link:][step_finish_advisory] **(Support)** [Finish preparing the Security Advisory](https://gitlab.isc.org/isc-private/printing-press/-/merge_requests/80#note_426994)
- [x] [:link:][step_meta_issue] **(QA)** [Create (or update) the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle](#4486)
- [x] [:link:][step_changes] **(QA)** (BIND 9 only) [Reserve a block of `CHANGES` placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined](!8625)
- [x] [:link:][step_merge_fixes] **(QA)** Merge the CVE fixes in CVE identifier order
- [x] [:link:][step_patches] **(QA)** Prepare a standalone patch for the last stable release of each affected (and still maintained) product branch
- [x] [:link:][step_asn_releases] **(QA)** Prepare ASN releases (as outlined in the Release Checklist)
### At T-5
- [x] [:link:][step_asn_documents] **(Marketing)** Update the text on the T-5 (from the Printing Press project) and "earliest" ASN documents in the SF portal
- [x] [:link:][step_asn_links] **(Marketing)** (BIND 9 only) Update the BIND -S information document in SF with download links to the new versions
- [x] [:link:][step_asn_send] **(Marketing)** Bulk email eligible customers to check the SF portal
- [x] [:link:][step_preannouncement] **(Marketing)** (BIND 9 only) Send a pre-announcement email to the *bind-announce* mailing list to alert users that the upcoming release will include security fixes
### At T-1
- [x] [:link:][step_packager_emails] **(First IM)** Send notifications to OS packagers
### On the Day of Public Disclosure
- [x] [:link:][step_clearance] **(IM)** [Grant QA & Marketing clearance to proceed with public release](https://mattermost.isc.org/isc/pl/rc7ffqr3q7dopb7rzs8zo4ggph)
- [x] [:link:][step_publish] **(QA/Marketing)** Publish the releases (as outlined in the release checklist)
- [x] [:link:][step_matrix] **(Support)** (BIND 9 only) Add the new CVEs to the vulnerability matrix in the Knowledge Base
- [x] [:link:][step_publish_advisory] **(Support)** Bump Document Version for the Security Advisory and publish it in the Knowledge Base
- [x] [:link:][step_notifications] **(First IM)** Send notification emails to third parties
- [x] [:link:][step_mitre] **(First IM)** Advise MITRE about the disclosed CVEs
- [x] [:link:][step_merge_advisory] **(First IM)** Merge the Security Advisory merge request
- [x] [:link:][step_embargo_end] **(IM)** Inform original reporter (if external) that the security disclosure process is complete
- [x] [:link:][step_asn_clear] **(Marketing)** Update the SF portal to clear the ASN
- [x] [:link:][step_customers] **(Marketing)** Email ASN recipients that the embargo is lifted
### After Public Disclosure
- [x] [:link:][step_regression] **(QA)** ~~[Merge a regression test reproducing the bug into all affected (and still maintained) branches](#note_421405)~~
---
Version: 9.16.38-S1
Note: you might want to handle this case as a security bug.
We've noticed that BIND 9.16 (tested with 9.16.38-S1) can consume much more memory than max-cache-size under certain conditions. Specifically, it can be reproduced in my test environment as follows:
- run named with the attached configuration (named.conf)
- run another named instance on the same host (named-auth.conf and example.zone). The first instance forwards all recursive queries to the second instance.
- run the attached script (cachetest.py; you need python and dnspython)
- watch memory footprint of the first instance
Although max-cache-size is set to 256MB, the process memory footprint far exceeds that value. At around 1.3GB, the statistics channel indeed shows that the cache uses far more memory than 256MB:
```json
{
  "id":"0x7fdad09fe630",
  "name":"cache",
  "references":8,
  "total":7132065424,
  "inuse":1009703195,
  "maxinuse":1009703195,
  "malloced":1021440461,
  "maxmalloced":1021440461,
  "pools":0,
  "hiwater":234881024,
  "lowater":201326592
}
```
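To put the numbers above in perspective, here is a minimal sketch that converts the reported `hiwater`, `lowater` and `inuse` values to MiB and shows how far `inuse` is above `hiwater`. The helper name `to_mib` is just for illustration; the values are copied from the statistics-channel output.

```python
import json

# Values copied from the statistics-channel output above.
stats = json.loads("""
{
  "inuse": 1009703195,
  "hiwater": 234881024,
  "lowater": 201326592
}
""")

MIB = 1024 * 1024


def to_mib(n_bytes):
    """Convert a byte count to mebibytes (illustrative helper)."""
    return n_bytes / MIB


hiwater_mib = to_mib(stats["hiwater"])
lowater_mib = to_mib(stats["lowater"])
inuse_mib = to_mib(stats["inuse"])

# How many times larger the in-use memory is than the high-water mark.
overshoot = stats["inuse"] / stats["hiwater"]

print(f"hiwater: {hiwater_mib:.0f} MiB, lowater: {lowater_mib:.0f} MiB")
print(f"inuse:   {inuse_mib:.0f} MiB ({overshoot:.1f}x hiwater)")
```

In other words, the cache memory context holds several times more than the high-water mark at which cleaning is supposed to kick in.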
Also, `rndc dumpdb` indicates that only a few entries exist in the cache.
```
grep 192.0.2.1 named_dump.db | wc -l
13
```
And, when we stop named, it takes about 3 minutes to complete shutdown:
```
20-Oct-2023 20:56:45.292 stopping command channel on 127.0.0.1#953
20-Oct-2023 20:59:49.988 exiting
```
Our analysis concluded that this is because:
- a lot of "leaf" cache entries are purged due to overmemory condition (the python script's query pattern is chosen to cause it)
- a large number of "prune_tree" events are sent to the rbtdb's task
- but these events are not handled fast enough, so many rbt nodes are kept in memory while even more are added by new queries
We are not sure exactly why the event handling is so slow, but we confirmed that a patch (attached, cache.patch) that prevents excessive sending of the task events helps avoid the problem.
Interestingly, BIND 9.18.19-S1 didn't show this problem in my experiment. I've not figured out why.
You'll probably want to prevent the problem for 9.16, either with the patch or in some other way. We'd also appreciate an explanation of why it doesn't happen in 9.18.
[cache.patch](/uploads/5bcb9ee04f3d7bd4d47cbf4fbaf1b107/cache.patch)
[named.conf](/uploads/01ed1d679811c66887143a0a2c5f9bed/named.conf)
[named-auth.conf](/uploads/60d4b0aeea5dd99979247a1b251dd97e/named-auth.conf)
[example.zone](/uploads/3939ca6ebb44749948d29e56c50f07d5/example.zone)
[cachetest.py](/uploads/748af9aa4e792d8a668483110e342fda/cachetest.py)

January 2024 (9.16.46, 9.16.46-S1, 9.18.22, 9.18.22-S1, 9.19.20) (❗RECALLED❗)
Michał Kępień

https://gitlab.isc.org/isc-projects/bind9/-/issues/4367
Offload AXFR and IXFR processing
2023-11-07T09:39:15Z
Ondřej Surý

IXFR processing can take a really long time, especially when IXFR changesets are large. Offload the processing so that it doesn't block the networking threads.

November 2023 (9.16.45, 9.16.45-S1, 9.18.20, 9.18.20-S1, 9.19.18)

https://gitlab.isc.org/isc-projects/bind9/-/issues/4326
Reduce adb names locking contention
2023-11-07T09:15:39Z
Ondřej Surý

With our new userspace tracing probes, we were able to pinpoint the source of the contention:
| fn | type | count | min | max | sum | avg |
|-------------------------------|--------|---------|-------|---------|-------------|---------|
| dns_adb_createfind | mutex | 1597343 | 5772 | 210015 | 10540878621 | 6599 |
| dns_adb_agesrtt | mutex | 806318 | 5772 | 236964 | 5284236828 | 6553 |
| dns__rbtdb_detachnode | rwlock | 928583 | 4290 | 310557 | 4803076356 | 5172 |
| dns_adb_destroyfind | mutex | 556873 | 5811 | 192972 | 3703797591 | 6651 |
| dns_adb_createfind | rwlock | 556873 | 4329 | 172419 | 2802134985 | 5031 |
| dns__rbtdb_addrdataset | rwlock | 462176 | 4290 | 274560 | 2295845409 | 4967 |
| cache_find | rwlock | 439323 | 4290 | 280293 | 2276739816 | 5182 |
| isc__mempool_destroy | mutex | 271276 | 5811 | 204321 | 1926270294 | 7100 |
| isc__mempool_create | mutex | 271276 | 5850 | 141609 | 1885239369 | 6949 |
| fctx_cancelqueries | mutex | 189589 | 5850 | 144027 | 1301757054 | 6866 |
| dns__rbtdb_nodefullname | rwlock | 229742 | 4368 | 187512 | 1157715000 | 5039 |
| dns_adb_getcookie | mutex | 132632 | 5928 | 139035 | 1137403254 | 8575 |
| find_deepest_zonecut | rwlock | 224027 | 4290 | 156936 | 1105641693 | 4935 |
| dns__rbtdb_findnodeintree | rwlock | 181563 | 4368 | 168246 | 948466155 | 5223 |
| reactivate_node | rwlock | 181581 | 4329 | 163917 | 928872165 | 5115 |
| dns_ntatable_covered | rwlock | 158872 | 4407 | 176280 | 857866503 | 5399 |
| fctx__done.constprop.0 | mutex | 109350 | 5889 | 122031 | 750861813 | 6866 |
| dns_adb_setudpsize | mutex | 68673 | 6318 | 222495 | 682022757 | 9931 |
| dns_adb_adjustsrtt | mutex | 80787 | 5928 | 165360 | 606351837 | 7505 |
| fctx_cancelquery | mutex | 80787 | 6006 | 141921 | 573609972 | 7100 |
| fctx_query | mutex | 80787 | 5967 | 137943 | 552959550 | 6844 |
| rctx_done | mutex | 80778 | 5889 | 118326 | 549924336 | 6807 |
| resquery_destroy | mutex | 80787 | 5850 | 129090 | 541517847 | 6703 |
| activeempty | rwlock | 107788 | 4251 | 139464 | 521984073 | 4842 |
| dns_resolver_destroyfetch | mutex | 54899 | 6006 | 126360 | 447498324 | 8151 |
| validated | mutex | 58506 | 5850 | 127608 | 442221507 | 7558 |
| fctx_start | mutex | 54666 | 5928 | 135954 | 429194220 | 7851 |
| dns_resolver_createfetch | mutex | 53692 | 6006 | 140088 | 390963261 | 7281 |
| resquery_response | mutex | 51786 | 6123 | 155922 | 382768581 | 7391 |
| dns_aclelement_match | rwlock | 62649 | 4329 | 120900 | 371844018 | 5935 |
| get_attached_fctx | mutex | 53692 | 5967 | 120627 | 367318809 | 6841 |
| fetch_callback | mutex | 49034 | 5772 | 142896 | 364519662 | 7434 |
| get_attached_and_locked_entry | mutex | 53556 | 5928 | 117000 | 361141638 | 6743 |
| zone_find | rwlock | 71262 | 4251 | 182871 | 347663121 | 4878 |
| dns__rbtdb_currentversion | rwlock | 69986 | 4368 | 111696 | 346155108 | 4946 |
| cache_findzonecut | rwlock | 64631 | 4329 | 226278 | 335597886 | 5192 |
| clean_namehooks | mutex | 51702 | 5811 | 124995 | 332844369 | 6437 |
| release_fctx | rwlock | 53459 | 4446 | 112476 | 278282199 | 5205 |
| get_attached_and_locked_entry | rwlock | 53556 | 4368 | 98592 | 271329786 | 5066 |
| get_attached_fctx | rwlock | 53692 | 4329 | 709020 | 266725095 | 4967 |
| delete_callback | rwlock | 44974 | 4368 | 49569 | 212821830 | 4732 |
| ns_query_cancel | mutex | 21337 | 6045 | 118404 | 196894893 | 9227 |
| prune_tree | rwlock | 22934 | 4290 | 94029 | 172068039 | 7502 |
| purge_stale_entries | mutex | 21807 | 5928 | 130143 | 147269265 | 6753 |
| mutex_lock | mutex | 11905 | 6006 | 1802736 | 145038075 | 12182 |
| dns_adb_changeflags | mutex | 20232 | 5889 | 110916 | 138579753 | 6849 |
| ns_client_recursing | mutex | 15949 | 6357 | 110526 | 135737472 | 8510 |
| dns_adb_setcookie | mutex | 16890 | 6084 | 139503 | 126144447 | 7468 |
| rdataset_getownercase | rwlock | 23833 | 4329 | 122538 | 125803704 | 5278 |
| cds_wfcq_dequeue_blocking | mutex | 15988 | 6240 | 187980 | 122401383 | 7655 |
| shutdown_names | mutex | 18901 | 5967 | 49569 | 121754802 | 6441 |
| clean_finds_at_name | mutex | 14878 | 5850 | 116922 | 100255584 | 6738 |
| find_coveringnsec | rwlock | 18681 | 4485 | 110487 | 100217130 | 5364 |
| resume_qmin | mutex | 12414 | 6006 | 122928 | 95990232 | 7732 |
| fctx_finddone | mutex | 12404 | 5889 | 100425 | 91508859 | 7377 |
| zone_shutdown | rwlock | 104 | 5382 | 6337149 | 90942969 | 874451 |
| dns_adb_ednsto | mutex | 9616 | 6435 | 74568 | 90021906 | 9361 |
| je_malloc_mutex_lock_slow | mutex | 32 | 18564 | 6550284 | 78068913 | 2439653 |
| dns_zonemgr_releasezone | rwlock | 208 | 4836 | 4666077 | 66756261 | 320943 |
| ns_client_qnamereplace | mutex | 7387 | 6162 | 89661 | 56466618 | 7644 |
| isc_log_doit | mutex | 3463 | 5889 | 87477 | 23883873 | 6896 |
| isc_log_doit | rwlock | 3463 | 4524 | 70863 | 20813169 | 6010 |
| dns_adb_getudpsize | mutex | 2118 | 6786 | 91767 | 20512557 | 9684 |
| destroy | mutex | 106 | 6942 | 1579539 | 16286439 | 153645 |
| zone_maintenance | mutex | 732 | 5928 | 133653 | 13287963 | 18152 |
| rdataset_settrust | rwlock | 1176 | 4446 | 48048 | 6342882 | 5393 |
| zone__settimer | mutex | 317 | 6045 | 394290 | 5682846 | 17926 |
| zone_postload | rwlock | 207 | 4680 | 67431 | 4023786 | 19438 |
| dns_adb_plainresponse | mutex | 380 | 6708 | 53859 | 4017819 | 10573 |
(The table continues with less important stuff...)
If you look closely, the cumulative time we spend in the adb mutexes is huge:
| fn | type | count | min | max | sum | avg |
|-------------------------------|--------|---------|-------|---------|-------------|---------|
| dns_adb_createfind | mutex | 1597343 | 5772 | 210015 | 10540878621 | 6599 |
| dns_adb_agesrtt | mutex | 806318 | 5772 | 236964 | 5284236828 | 6553 |
| dns_adb_destroyfind | mutex | 556873 | 5811 | 192972 | 3703797591 | 6651 |
| dns_adb_getcookie | mutex | 132632 | 5928 | 139035 | 1137403254 | 8575 |
| dns_adb_setudpsize | mutex | 68673 | 6318 | 222495 | 682022757 | 9931 |
| dns_adb_adjustsrtt | mutex | 80787 | 5928 | 165360 | 606351837 | 7505 |
| dns_adb_changeflags | mutex | 20232 | 5889 | 110916 | 138579753 | 6849 |
| dns_adb_setcookie | mutex | 16890 | 6084 | 139503 | 126144447 | 7468 |
| dns_adb_ednsto | mutex | 9616 | 6435 | 74568 | 90021906 | 9361 |
| dns_adb_getudpsize | mutex | 2118 | 6786 | 91767 | 20512557 | 9684 |
| dns_adb_plainresponse | mutex | 380 | 6708 | 53859 | 4017819 | 10573 |
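As a quick cross-check of the rows above, the following sketch totals the `sum` column for the `dns_adb_*` mutex rows and prints each function's share. The values are copied from the table; the unit is whatever the tracing probes reported (the table does not state it), and the variable names are only illustrative.

```python
# Cumulative wait per function, copied from the "sum" column of the
# dns_adb_* mutex rows above. The unit is not stated in the table.
adb_mutex_wait = {
    "dns_adb_createfind": 10540878621,
    "dns_adb_agesrtt": 5284236828,
    "dns_adb_destroyfind": 3703797591,
    "dns_adb_getcookie": 1137403254,
    "dns_adb_setudpsize": 682022757,
    "dns_adb_adjustsrtt": 606351837,
    "dns_adb_changeflags": 138579753,
    "dns_adb_setcookie": 126144447,
    "dns_adb_ednsto": 90021906,
    "dns_adb_getudpsize": 20512557,
    "dns_adb_plainresponse": 4017819,
}

total = sum(adb_mutex_wait.values())

# Print the functions from most to least cumulative wait, with their share.
for fn, waited in sorted(adb_mutex_wait.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{fn:24} {waited:>12} {waited / total:6.1%}")

print(f"{'total':24} {total:>12}")
```

`dns_adb_createfind` alone accounts for almost half of the cumulative adb mutex wait in this sample.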
This is something that is definitely worth addressing.

November 2023 (9.16.45, 9.16.45-S1, 9.18.20, 9.18.20-S1, 9.19.18)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/4325
Creating and destroying pools locking is contended
2023-10-04T13:26:28Z
Ondřej Surý

According to recent measurements, `isc_mempool_create()` and `isc_mempool_destroy()` are among the top 10 most contended locks (measured as the total time spent waiting for a lock).

November 2023 (9.16.45, 9.16.45-S1, 9.18.20, 9.18.20-S1, 9.19.18)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/4306
Implement incremental hashing
2023-11-09T10:56:18Z
Ondřej Surý

When computing the hash for the hashtables (both isc_ht and isc_hashmap), we sometimes use `dns_name_t` + something (something could be class, type or arbitrary value).
Currently, this requires copying the content of `dns_name_t` into a packed structure with the "something" appended, so we can hash that as a single `uint8_t` array.
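To illustrate the difference between hashing a packed copy and feeding the pieces one at a time, here is a small sketch using Python's `hashlib` as a stand-in. This is not BIND's hash function or API; the example name and the 16-bit "type" value are made-up inputs. The point is only that an incremental interface produces the same digest without building the combined buffer.

```python
import hashlib
import struct

# Hypothetical inputs: a wire-format name and a 16-bit "something" (a type).
name = b"\x03www\x07example\x03com\x00"
rdtype = struct.pack("!H", 1)

# One-shot: build a packed copy of name + type, then hash it as one array.
packed = name + rdtype
one_shot = hashlib.sha256(packed).digest()

# Incremental: feed the pieces separately; no packed copy is needed.
h = hashlib.sha256()
h.update(name)
h.update(rdtype)
incremental = h.digest()

# Both approaches yield the same digest.
print(one_shot == incremental)
```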
Incremental hashing would allow "fractured" keys.

November 2023 (9.16.45, 9.16.45-S1, 9.18.20, 9.18.20-S1, 9.19.18)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/4264
Improve locking in dns_nta
2023-11-01T11:34:06Z
Evan Hunt

After converting the NTA to use a QP-trie instead of an RBT, the locking can be cleaned up. (See [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/7811#note_395512) in !7811.)
BIND 9.19.x
Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/4234
[CVE-2023-4408] Parsing large DNS messages may cause excessive CPU load
2024-03-28T12:11:21Z
Shoham Danino

| Quick Links | :link: |
| ------------------------ | ------------------------------------------------------------------------------ |
| Incident Manager: | @matthijs |
| Deputy Incident Manager: | @michal |
| Public Disclosure Date: | 2024-02-13 |
| CVSS Score: | [7.5][cvss_score] |
| Security Advisory: | isc-private/printing-press!76 |
| Mattermost Channel: | [CVE-2023-4408: O(n²) complexity in DNS message parsing logic][mattermost_url] |
| Support Ticket: | N/A |
| Release Checklist: | #4515 & #4555 |
[cvss_score]: https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H&version=3.1
[mattermost_url]: https://mattermost.isc.org/isc/channels/cve-2023-4408
:bulb: **Click [here][checklist_explanations] (internal resource) for general information about the security incident handling process.**
[checklist_explanations]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations
### Earlier Than T-5
- [x] [:link:][step_deputy] **(IM)** Pick a Deputy Incident Manager
- [x] [:link:][step_respond] **(IM)** [Respond to the bug reporter](#note_394703)
- [x] [:link:][step_public_mrs] **(SwEng)** Ensure there are no public merge requests which inadvertently disclose the issue
- [x] [:link:][step_assign_cve_id] **(IM)** [Assign a CVE identifier](#note_396170)
- [x] [:link:][step_note_cve_info] **(SwEng)** Update this issue with the assigned CVE identifier and the CVSS score
- [x] [:link:][step_versions_affected] **(SwEng)** [Determine the range of product versions affected (including the Subscription Edition)](#note_396176)
- [x] [:link:][step_workarounds] **(SwEng)** [Determine whether workarounds for the problem exist](#note_396179)
- [X] [:link:][step_coordinate] **(SwEng)** [If necessary, coordinate with other parties](#note_397738)
- [x] [:link:][step_earliest_prepare] **(Support)** Prepare "earliest" notification text and hand it off to Marketing
- [x] [:link:][step_earliest_send] **(Marketing)** Update "earliest" notification document in SF portal and send bulk email to earliest customers
- [x] [:link:][step_advisory_mr] **(Support)** [Create a merge request for the Security Advisory and include all readily available information in it](isc-private/printing-press!76)
- [x] [:link:][step_reproducer_mr] ~~**(SwEng)** [Prepare a private merge request containing a system test reproducing the problem](#note_416143)~~
- [x] [:link:][step_notify_support] **(SwEng)** [Notify Support when a reproducer is ready](https://mattermost.isc.org/isc/pl/i5jke6nbhffizdduy4opewc56w)
- [x] [:link:][step_code_analysis] **(SwEng)** [Prepare a detailed explanation of the code flow triggering the problem](#note_416180)
- [x] [:link:][step_fix_mr] **(SwEng)** [Prepare a private merge request with the fix](isc-private/bind9!560)
- [x] [:link:][step_review_fix] **(SwEng)** Ensure the merge request with the fix is reviewed and has no outstanding discussions
- [x] [:link:][step_review_docs] **(Support)** [Review the documentation changes introduced by the merge request with the fix](#note_417455)
- [x] [:link:][step_backports] **(SwEng)** Prepare backports of the merge request addressing the problem [for](isc-private/bind9!585) [all](isc-private/bind9!586) [affected](isc-private/bind9!587) (and still maintained) branches of a given product
- [x] [:link:][step_finish_advisory] **(Support)** [Finish preparing the Security Advisory](https://mattermost.isc.org/isc/pl/76qusomtej88tjkqagiu1hfroy)
- [x] [:link:][step_meta_issue] **(QA)** [Create (or update) the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle](#4486)
- [x] [:link:][step_changes] **(QA)** (BIND 9 only) [Reserve a block of `CHANGES` placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined](!8625)
- [x] [:link:][step_merge_fixes] **(QA)** Merge the CVE fixes in CVE identifier order
- [x] [:link:][step_patches] **(QA)** Prepare a standalone patch for the last stable release of each affected (and still maintained) product branch
- [x] [:link:][step_asn_releases] **(QA)** Prepare ASN releases (as outlined in the Release Checklist)
### At T-5
- [x] [:link:][step_asn_documents] **(Marketing)** Update the text on the T-5 (from the Printing Press project) and "earliest" ASN documents in the SF portal
- [x] [:link:][step_asn_links] **(Marketing)** (BIND 9 only) Update the BIND -S information document in SF with download links to the new versions
- [x] [:link:][step_asn_send] **(Marketing)** Bulk email eligible customers to check the SF portal
- [x] [:link:][step_preannouncement] **(Marketing)** (BIND 9 only) Send a pre-announcement email to the *bind-announce* mailing list to alert users that the upcoming release will include security fixes
### At T-1
- [x] [:link:][step_packager_emails] **(First IM)** Send notifications to OS packagers
### On the Day of Public Disclosure
- [x] [:link:][step_clearance] **(IM)** [Grant QA & Marketing clearance to proceed with public release](https://mattermost.isc.org/isc/pl/hg9aypmxnjnjzc7ktwfo6xcpky)
- [x] [:link:][step_publish] **(QA/Marketing)** Publish the releases (as outlined in the release checklist)
- [x] [:link:][step_matrix] **(Support)** (BIND 9 only) Add the new CVEs to the vulnerability matrix in the Knowledge Base
- [x] [:link:][step_publish_advisory] **(Support)** Bump Document Version for the Security Advisory and publish it in the Knowledge Base
- [x] [:link:][step_notifications] **(First IM)** Send notification emails to third parties
- [x] [:link:][step_mitre] **(First IM)** Advise MITRE about the disclosed CVEs
- [x] [:link:][step_merge_advisory] **(First IM)** Merge the Security Advisory merge request
- [x] [:link:][step_embargo_end] **(IM)** [Inform original reporter (if external) that the security disclosure process is complete](#note_436560)
- [x] [:link:][step_asn_clear] **(Marketing)** Update the SF portal to clear the ASN
- [x] [:link:][step_customers] **(Marketing)** Email ASN recipients that the embargo is lifted
### After Public Disclosure
- [x] [:link:][step_regression] **(QA)** ~~[Merge a regression test reproducing the bug into all affected (and still maintained) branches](#note_416143)~~
[step_deputy]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#pick-a-deputy-incident-manager
[step_respond]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#respond-to-the-bug-reporter
[step_public_mrs]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#ensure-there-are-no-public-merge-requests-which-inadvertently-disclose-the-issue
[step_assign_cve_id]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#assign-a-cve-identifier
[step_note_cve_info]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#update-this-issue-with-the-assigned-cve-identifier-and-the-cvss-score
[step_versions_affected]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#determine-the-range-of-product-versions-affected-including-the-subscription-edition
[step_workarounds]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#determine-whether-workarounds-for-the-problem-exist
[step_coordinate]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#if-necessary-coordinate-with-other-parties
[step_earliest_prepare]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-earliest-notification-text-and-hand-it-off-to-marketing
[step_earliest_send]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#update-earliest-notification-document-in-sf-portal-and-send-bulk-email-to-earliest-customers
[step_advisory_mr]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#create-a-merge-request-for-the-security-advisory-and-include-all-readily-available-information-in-it
[step_reproducer_mr]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-private-merge-request-containing-a-system-test-reproducing-the-problem
[step_notify_support]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#notify-support-when-a-reproducer-is-ready
[step_code_analysis]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-detailed-explanation-of-the-code-flow-triggering-the-problem
[step_fix_mr]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-private-merge-request-with-the-fix
[step_review_fix]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#ensure-the-merge-request-with-the-fix-is-reviewed-and-has-no-outstanding-discussions
[step_review_docs]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#review-the-documentation-changes-introduced-by-the-merge-request-with-the-fix
[step_backports]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-backports-of-the-merge-request-addressing-the-problem-for-all-affected-and-still-maintained-branches-of-a-given-product
[step_finish_advisory]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#finish-preparing-the-security-advisory
[step_meta_issue]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#create-or-update-the-private-issue-containing-links-to-fixes-reproducers-for-all-cves-fixed-in-a-given-release-cycle
[step_changes]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-reserve-a-block-of-changes-placeholders-once-the-complete-set-of-vulnerabilities-fixed-in-a-given-release-cycle-is-determined
[step_merge_fixes]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#merge-the-cve-fixes-in-cve-identifier-order
[step_patches]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-a-standalone-patch-for-the-last-stable-release-of-each-affected-and-still-maintained-product-branch
[step_asn_releases]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#prepare-asn-releases-as-outlined-in-the-release-checklist
[step_asn_documents]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#update-the-text-on-the-t-5-from-the-printing-press-project-and-earliest-asn-documents-in-the-sf-portal
[step_asn_links]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-update-the-bind-s-information-document-in-sf-with-download-links-to-the-new-versions
[step_asn_send]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bulk-email-eligible-customers-to-check-the-sf-portal
[step_preannouncement]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-send-a-pre-announcement-email-to-the-bind-announce-mailing-list-to-alert-users-that-the-upcoming-release-will-include-security-fixes
[step_packager_emails]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#send-notifications-to-os-packagers
[step_clearance]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#grant-qa-marketing-clearance-to-proceed-with-public-release
[step_publish]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#publish-the-releases-as-outlined-in-the-release-checklist
[step_matrix]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bind-9-only-add-the-new-cves-to-the-vulnerability-matrix-in-the-knowledge-base
[step_publish_advisory]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#bump-document-version-for-the-security-advisory-and-publish-it-in-the-knowledge-base
[step_notifications]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#send-notification-emails-to-third-parties
[step_mitre]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#advise-mitre-about-the-disclosed-cves
[step_merge_advisory]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#merge-the-security-advisory-merge-request
[step_embargo_end]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#inform-original-reporter-if-external-that-the-security-disclosure-process-is-complete
[step_asn_clear]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#update-the-sf-portal-to-clear-the-asn
[step_customers]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#email-asn-recipients-that-the-embargo-is-lifted
[step_regression]: https://gitlab.isc.org/isc-private/isc-wiki/-/wikis/Security-Incident-Handling-Checklist-Explanations#merge-a-regression-test-reproducing-the-bug-into-all-affected-and-still-maintained-branches
---
Hi,
Continuing our research on CVE-2023-2828 ([issue/4055](https://gitlab.isc.org/isc-projects/bind9/-/issues/4055)), we discovered another possible vulnerability in BIND 9 resolvers, where a single malicious CNAME query can cause over 27,000,000 calls to the `dns_name_equal` function and over 1,000,000,000 CPU instructions.
In the `getsection` function in message.c, before a new name is appended to a section, there is a check for whether that name is already present in the section.
```
...
if (!dns_name_equal(dns_rootname, name) ||
    sectionid != DNS_SECTION_ADDITIONAL ||
    msg->opt != NULL)
{
	DO_ERROR(DNS_R_FORMERR);
}
...
for (count = 0; count < msg->counts[sectionid]; count++) {
...
```
The function checks whether each name is already present in the section by comparing the new name against all the previous names with the _dns_name_equal_ function, which results in a quadratic number of function calls.
For CNAME queries, the BIND resolver has a resolution limit of 17, so for each malicious CNAME query the resolver executes 17 queries to the authoritative name server.
I tested an answer with a CNAME chain 1800 RRsets long; as a result, the dns_name_equal function is called i-1 times for RRset i, for i = n to n-17.
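The quadratic growth described above can be reproduced with a toy model. The sketch below is not BIND's code: it models a section as a plain Python list and counts equality comparisons when each new name is checked against all names already present. The function name `count_comparisons` is hypothetical.

```python
def count_comparisons(names):
    """Append names to a section, scanning the existing entries first.

    Returns the number of name comparisons performed (toy model of a
    linear-scan duplicate check, not BIND's implementation).
    """
    section = []
    comparisons = 0
    for name in names:
        for existing in section:
            comparisons += 1
            if existing == name:
                break  # duplicate found, do not append again
        else:
            section.append(name)
    return comparisons


n = 1800
names = [f"shoham{i}.shoham.fun." for i in range(1, n + 1)]
total = count_comparisons(names)

# For n distinct names, the i-th name is compared against i-1 earlier names.
print(total, n * (n - 1) // 2)
```

For a single message with 1800 distinct names, this model performs n(n-1)/2 comparisons, which is where the roughly n²/2 term per response comes from.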
In the experiment, I observed 27,304,801 calls to the _dns_name_equal_ function, and my calculation indeed predicts a similar number:
`∑ n^2 / 2` for n from 1784 to 1800 = 27,295,948
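The estimate can be checked with a few lines of Python. The sketch evaluates the `n^2 / 2` approximation used above and, for comparison, the exact pairwise count `n(n-1)/2` for the same 17 values of n; the variable names are only illustrative.

```python
first, last = 1784, 1800

# Approximation used in the report: n^2 / 2 per response.
approx = sum(n * n for n in range(first, last + 1)) / 2

# Exact number of pairwise comparisons for n distinct names: n(n-1)/2.
exact = sum(n * (n - 1) // 2 for n in range(first, last + 1))

observed = 27_304_801  # call count reported from the Callgrind run

print(f"approx:   {approx:,.0f}")
print(f"exact:    {exact:,}")
print(f"observed: {observed:,}")
```

Both figures are within about 0.1% of the observed call count.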
I used Valgrind and KCachegrind to verify this. Here is the Valgrind output file: [callgrind.out.cname_shoham1.shoham.fun](/uploads/09aef4edae0384e93d1a891781dcfc20/callgrind.out.cname_shoham1.shoham.fun)
You can see here the number of function calls and the CPU instructions:
![fig_callgrind.out.cname_shoham1.shoham.fun](/uploads/e878d48174b9a8ebf0a8549eb8391e04/fig_callgrind.out.cname_shoham1.shoham.fun.png)
And here is an example of the answer section I respond with:
![fig_shoham1.shoham.fun_pcap](/uploads/f13a330dac95e9e63483635c97bf075d/fig_shoham1.shoham.fun_pcap.png)
How to reproduce the vulnerability, and additional technical information:
For this experiment, I used my Azure environment with 3 machines: a client, a resolver, and an authoritative name server.
All machines have an Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz (x64) with 2 vCPUs and 8 GiB RAM, and run Linux (Ubuntu 20.04).
I used BIND 9.16.42 with no configuration changes. Here is the config.log file: [config.log](/uploads/3e4d410a1d2ea1c4185b0faaf492f229/config.log)
and the named.conf.options file: [named.conf.options](/uploads/12e35dc45738b7b2a29b16c2092902db/named.conf.options)
On the authoritative server, I have a CNAME chain with 1800 RRsets (from shoham1.shoham.fun to shoham1800.shoham.fun). The zone file is attached here: [shoham.fun.forward](/uploads/c4279fc7a7997435c0fedf8268195edc/shoham.fun.forward)
The client issued a single query using this command: `dig shoham1.shoham.fun. @<my_resolver_ip>`
To reproduce the attack:
- Start the resolver under Valgrind using the following command: `valgrind --tool=callgrind named -g -c /etc/named.conf`
- From the client, query shoham1.shoham.fun using this command: `dig shoham1.shoham.fun. @<your_resolver_ip>`
- Stop the resolver running under Valgrind (Ctrl+C will work)
- Open the Callgrind file using QCachegrind: `qcachegrind ./callgrind.out.<the resolver PID>`

You can also set up your own authoritative server with the attached zone file or with any long CNAME chain; here is a script for that:
```
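# Write zone file records for a 1800-name CNAME chain:
# shoham1 -> shoham2 -> ... -> shoham1800, which ends in an A record.
with open('zonfile.txt', 'w') as f:
    for i in range(1, 1801):
        if i % 1800 != 0:
            print(f'shoham{i} 86400 IN CNAME shoham{i + 1}', file=f)
        else:
            print(f'shoham{i} 86400 IN A 1.1.1.1', file=f)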
```
Please don't hesitate to ask any questions or request any additional information.
Thanks,
Shoham Danino, Anat Bremler-Barr, Yehuda Afek and Yuval Shavitt

January 2024 (9.16.46, 9.16.46-S1, 9.18.22, 9.18.22-S1, 9.19.20) (❗RECALLED❗)
Matthijs Mekking matthijs@isc.org

https://gitlab.isc.org/isc-projects/bind9/-/issues/4228
ThreadSanitizer: heap-use-after-free dispatch.c:1188 in dns_dispatch_createtcp
2023-08-31T15:15:06Z
Arаm Sаrgsyаn

A new TSAN report triggered by the `doth` system test: https://gitlab.isc.org/isc-projects/bind9/-/jobs/3550989
[tsan.named.151843](/uploads/6bde0c5faf19ff7aac5cc2f001d78562/tsan.named.151843)
[core.151843-backtrace.txt](/uploads/e742c88c78f8330bcfd7ea0ef23c76f3/core.151843-backtrace.txt)

September 2023 (9.16.44, 9.16.44-S1, 9.18.19, 9.18.19-S1, 9.19.17)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/4223
dns_badcache uses yet another own hash table implementation
2023-08-02T08:14:15Z
Ondřej Surý

...replace it with isc_hashmap (or isc_ht)

August 2023 (9.16.43, 9.16.43-S1, 9.18.18, 9.18.18-S1, 9.19.16)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/4196
Possible race between RPZ and catalog zones db update notify callback registrations
2023-08-02T08:53:52Z
Arаm Sаrgsyаn

This is a follow-up issue from an MM discussion.
Excerpt from the discussion:
>```
>If the updatenotify in catz cb runs at the same time as RPZ updatenotify (this could happen) the list becomes trashed.
>...
>The register / unregistere...This is a follow-up issue from an MM discussion.
Excerpt from the discussion:
>```
>If the updatenotify in catz cb runs at the same time as RPZ updatenotify (this could happen) the list becomes trashed.
>...
>The register / unregistered can run in parallel.
>```
August 2023 (9.16.43, 9.16.43-S1, 9.18.18, 9.18.18-S1, 9.19.16)

https://gitlab.isc.org/isc-projects/bind9/-/issues/4185
The recursive CNAME resolving can block the thread for long time
2023-07-28T11:55:05Z
Ondřej Surý
It was noted that `query_cname()` chaining inside `ns_query` can take quite a lot of time which blocks the other clients (on the same thread) increasing the latency for waiting clients.
![perf.script.cut.combined.stacks-21.svg](/uploads/aea0afd5f79d27d43d83dab687e24b79/perf.script.cut.combined.stacks-21.svg)
August 2023 (9.16.43, 9.16.43-S1, 9.18.18, 9.18.18-S1, 9.19.16)

https://gitlab.isc.org/isc-projects/bind9/-/issues/4173
Improve the documentation on the NOFOLLOW mode in the dns_resolver mode
2023-06-30T07:19:53Z
Ondřej Surý
The following discussion from !6267 should be addressed:
- [ ] @ondrej started a [discussion](https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/6267#note_384658): (+3 comments)
> I think the commit message for this could do with a little bit more explanation. The code in resolver is hard to follow as is - and the result from `rctx_answer()` actually propagates down below to `rctx_done()` where it ends up in `fctx_done_detach/fctx__done`. But that still doesn't really explain where the `DNS_R_DELEGATION` is consumed in the end. Is that in `ns_query` or anywhere else?
>
> It's also not helpful that the code in `resquery_response()` has inverted logic.
>
> And furthermore `DNS_FETCHOPT_NOFOLLOW` is very poorly explained (the `/*%< Don't follow delegations */` simply doesn't do it). Could we take this as an opportunity to actually explain somewhere in the code in more words what it does and why it is needed?
July 2023 (9.18.17, 9.18.17-S1, 9.19.15)

https://gitlab.isc.org/isc-projects/bind9/-/issues/4149
huge waste of space in lib/isc/result.c leading to large libisc.so
2023-06-20T14:10:21Z
Andreas Kinzler
When switching to bind9 9.18 on my Gentoo systems, I was wondering about the huge increase in *.so sizes. I tracked the problem down to very inefficient space usage in lib/isc/result.c. The compiled object file is >5 MB just because of the 2 lookup tables. I converted the file back to switch/case and the final shared library libisc-9.18.15.so went down to <600 KB from over 5800 KB.
July 2023 (9.18.17, 9.18.17-S1, 9.19.15)

https://gitlab.isc.org/isc-projects/bind9/-/issues/4096
use uv_now() to get consistent time when running netmgr callbacks
2023-07-28T12:09:32Z
Evan Hunt
The following discussion from !7937 should be addressed:
> As we are working with libuv timeouts, it would be better to use `uv_now()` (perhaps disguised as `isc_nmhandle_now()`) instead of `isc_time_now()` - that way you would not have to worry about a consistent value - that would be provided by libuv.
>
> However, as we need to backport this to 9.18, such changes should be a separate commit or even separate MR.
> `isc_nmhandle_now()` (or maybe `isc_loop_now()`) sounds like a good idea but should be a different issue.
August 2023 (9.16.43, 9.16.43-S1, 9.18.18, 9.18.18-S1, 9.19.16)
Evan Hunt

https://gitlab.isc.org/isc-projects/bind9/-/issues/4086
Data race netmgr/netmgr.c:596 in isc___nmsocket_prep_destroy
2023-07-28T12:38:53Z
Michal Nowak
When testing the [AWS autoscaler](https://gitlab.isc.org/isc-projects/bind9/-/jobs/3406310), this TSAN error popped up in the `stress` system test on `system:gcc:tsan`.
```
WARNING: ThreadSanitizer: data race
Write of size 1 at 0x000000000001 by main thread (mutexes: write M2, write M2):
#0 isc___nmsocket_prep_destroy netmgr/netmgr.c:596 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#1 isc___nmsocket_detach netmgr/netmgr.c:660 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#2 nmhandle__destroy netmgr/netmgr.c:931 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#3 nmhandle_destroy netmgr/netmgr.c:966
#4 isc_nmhandle_unref netmgr/netmgr.c:982
#5 isc_nmhandle_detach netmgr/netmgr.c:982 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#6 dispentry_destroy lib/dns/dispatch.c:469 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#7 dns_dispentry_unref lib/dns/dispatch.c:488 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#8 dns_dispentry_detach lib/dns/dispatch.c:488 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#9 dns_dispatch_done lib/dns/dispatch.c:1797 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#10 request_cancel lib/dns/request.c:805 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#11 dns_request_cancel lib/dns/request.c:818 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#12 dns_requestmgr_shutdown lib/dns/request.c:192 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#13 dns_view_detach lib/dns/view.c:478 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#14 load_configuration bin/named/server.c:9729 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#15 loadconfig bin/named/server.c:10305 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#16 reload bin/named/server.c:10331 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#17 named_server_reloadcommand bin/named/server.c:10659 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#18 named_control_docommand bin/named/control.c:250 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#19 control_command bin/named/controlconf.c:401 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#20 isc__async_cb lib/isc/async.c:112 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#21 uv__async_io /usr/src/libuv-v1.44.1/src/unix/async.c:163 (BuildId: c935a505813749d6b0282805da62fd79140fd966)
#22 thread_body lib/isc/thread.c:88 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#23 isc_thread_main lib/isc/thread.c:119 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#24 isc_loopmgr_run lib/isc/loop.c:452 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#25 main bin/named/main.c:1532 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
Previous write of size 1 at 0x000000000001 by thread T1:
#0 isc___nmsocket_prep_destroy netmgr/netmgr.c:596 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#1 udp_close_cb netmgr/udp.c:942 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#2 uv__finish_close /usr/src/libuv-v1.44.1/src/unix/core.c:308 (BuildId: c935a505813749d6b0282805da62fd79140fd966)
#3 thread_body lib/isc/thread.c:88 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#4 thread_run lib/isc/thread.c:103 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
Location is heap block of size 2136 at 0x000000000029 allocated by thread T1:
#0 malloc <null> (BuildId: 0dfec843367c3385ed82c09439308cfa1112e54e)
#1 mallocx lib/isc/jemalloc_shim.h:65 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#2 mem_get lib/isc/mem.c:305
#3 isc__mem_get lib/isc/mem.c:674 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#4 isc_nm_udpconnect netmgr/udp.c:794 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#5 udp_dispatch_connect lib/dns/dispatch.c:1980 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#6 dns_dispatch_connect lib/dns/dispatch.c:2083 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#7 dns_request_create lib/dns/request.c:680 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#8 notify_send_toaddr lib/dns/zone.c:12324 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#9 isc__async_cb lib/isc/async.c:112 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#10 uv__async_io /usr/src/libuv-v1.44.1/src/unix/async.c:163 (BuildId: c935a505813749d6b0282805da62fd79140fd966)
#11 thread_body lib/isc/thread.c:88 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#12 thread_run lib/isc/thread.c:103 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
Mutex M2 (0x000000000038) created at:
#0 pthread_mutex_init <null> (BuildId: 0dfec843367c3385ed82c09439308cfa1112e54e)
#1 dns_requestmgr_create lib/dns/request.c:147 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#2 dns_view_createresolver lib/dns/view.c:624 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#3 configure_view bin/named/server.c:4739 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#4 load_configuration bin/named/server.c:9193 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#5 loadconfig bin/named/server.c:10305 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#6 reload bin/named/server.c:10331 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#7 named_server_reloadcommand bin/named/server.c:10659 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#8 named_control_docommand bin/named/control.c:250 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#9 control_command bin/named/controlconf.c:401 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#10 isc__async_cb lib/isc/async.c:112 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#11 uv__async_io /usr/src/libuv-v1.44.1/src/unix/async.c:163 (BuildId: c935a505813749d6b0282805da62fd79140fd966)
#12 thread_body lib/isc/thread.c:88 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#13 isc_thread_main lib/isc/thread.c:119 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#14 isc_loopmgr_run lib/isc/loop.c:452 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#15 main bin/named/main.c:1532 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
Mutex M2 (0x000000000044) created at:
#0 pthread_mutex_init <null> (BuildId: 0dfec843367c3385ed82c09439308cfa1112e54e)
#1 dns_requestmgr_create lib/dns/request.c:150 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#2 dns_view_createresolver lib/dns/view.c:624 (BuildId: 13eb6cd62dc8cfe9ad825c1457c71cc82be5efb4)
#3 configure_view bin/named/server.c:4739 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#4 load_configuration bin/named/server.c:9193 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#5 loadconfig bin/named/server.c:10305 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#6 reload bin/named/server.c:10331 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#7 named_server_reloadcommand bin/named/server.c:10659 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#8 named_control_docommand bin/named/control.c:250 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#9 control_command bin/named/controlconf.c:401 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
#10 isc__async_cb lib/isc/async.c:112 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#11 uv__async_io /usr/src/libuv-v1.44.1/src/unix/async.c:163 (BuildId: c935a505813749d6b0282805da62fd79140fd966)
#12 thread_body lib/isc/thread.c:88 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#13 isc_thread_main lib/isc/thread.c:119 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#14 isc_loopmgr_run lib/isc/loop.c:452 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#15 main bin/named/main.c:1532 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
Thread T1 'isc-loop-0002' (running) created by main thread at:
#0 pthread_create <null> (BuildId: 0dfec843367c3385ed82c09439308cfa1112e54e)
#1 isc_thread_create lib/isc/thread.c:142 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#2 isc_loopmgr_run lib/isc/loop.c:446 (BuildId: f82fc18af69228f16693926bb890ef35731d4a11)
#3 main bin/named/main.c:1532 (BuildId: a8d11c463d35223877314591808060f2a02dd191)
SUMMARY: ThreadSanitizer: data race netmgr/netmgr.c:596 in isc___nmsocket_prep_destroy
```
[named.run](/uploads/07f857da0d094997e9369acb1b99a2ab/named.run)
August 2023 (9.16.43, 9.16.43-S1, 9.18.18, 9.18.18-S1, 9.19.16)
Ondřej Surý

https://gitlab.isc.org/isc-projects/bind9/-/issues/4045
glue-cache scales very poorly on multi-CPU systems
2023-06-14T09:43:01Z
Petr Špaček pspacek@isc.org
### Summary
The glue cache scales very poorly and suffers from lock contention. It eventually improves max QPS by 1/3 on a 16-thread system with a delegation-heavy workload, but maxing out QPS on that system takes over 300 seconds of query load.
### BIND version used
* ~"Affects v9.18": v9.18.14
* Other versions were not tested, but are assumed to be affected
### Steps to reproduce
* configure a delegation-heavy zone, e.g. SE from https://zonedata.iis.se/
* issue queries which hit delegations, preferably unique: [querydb.xz](/uploads/b3b8bf5b546acb28148133b9f854a738/querydb.xz)
### What is the current *bug* behavior?
Initially QPS is very low, and adding CPUs does not improve performance. BIND gradually builds the glue cache and overall QPS improves.
### What is the expected *correct* behavior?
* Initial QPS should not be that low.
* Adding CPUs should improve performance, also initially.
### Workaround
```
options {
glue-cache no;
};
```
This provides more predictable performance but incurs a ~1/3 performance hit (in terms of max QPS).
### Benchmarks
* 16-thread machine in AWS, type c5n.4xlarge
* BIND v9.18.4 with glue-cache on / off
* SE zone serial 2021122008
* client kxdpgun: `kxdpgun -t 5 -Q $QPS -i querydb 10.10.126.46 -p 5300`
* 5-second tests, QPS in the table below is average
* Individual lines in table are successive tests
* Each step starts with the same query set (so successive tests repeat some of the queries)
* QPS step is +50k QPS
* QPS is incremented only if response rate was >= 99 %
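
The stepping procedure in the list above can be sketched as a small loop. This is only an illustration, not the harness used for the measurements: `ramp_qps`, `max_sustained`, the `measure` callback and the toy `WarmingServer` are hypothetical names, and the `max_attempts` / `max_runs` stop rules are assumptions, since the list does not say when a level counts as "max reached".

```python
from typing import Callable, List, Optional, Tuple

Run = Tuple[int, int]  # (offered QPS, response rate in percent)


def ramp_qps(measure: Callable[[int], int],
             start_qps: int = 50_000,
             step: int = 50_000,
             threshold: int = 99,
             max_attempts: int = 20,
             max_runs: int = 1_000) -> List[Run]:
    """Repeat a test at the current QPS until the response rate reaches
    `threshold`, then raise the load by `step`.

    Assumption: a level is abandoned after `max_attempts` consecutive
    failed runs; `max_runs` is a hard cap on the total number of runs.
    """
    log: List[Run] = []
    qps = start_qps
    attempts = 0  # consecutive failed runs at the current level
    while attempts < max_attempts and len(log) < max_runs:
        rate = measure(qps)  # one benchmark run, e.g. 5 seconds of load
        log.append((qps, rate))
        if rate >= threshold:
            qps += step
            attempts = 0
        else:
            attempts += 1
    return log


def max_sustained(log: List[Run], threshold: int = 99) -> Optional[int]:
    """Highest QPS level that reached the threshold, or None."""
    passed = [qps for qps, rate in log if rate >= threshold]
    return max(passed) if passed else None


class WarmingServer:
    """Toy stand-in for a server whose capacity grows as a cache warms up.

    The numbers are invented for the example and do not model BIND.
    """

    def __init__(self, capacity: int = 40_000, gain: int = 30_000,
                 ceiling: int = 120_000) -> None:
        self.capacity = capacity
        self.gain = gain
        self.ceiling = ceiling

    def __call__(self, qps: int) -> int:
        rate = min(100, 100 * self.capacity // qps)
        # Every run warms the cache a little, up to a fixed ceiling.
        self.capacity = min(self.ceiling, self.capacity + self.gain)
        return rate


if __name__ == "__main__":
    runs = ramp_qps(WarmingServer(), max_attempts=3)
    for qps, rate in runs:
        print(f"{qps:>7} QPS  {rate:>3} %")
    print("max sustained:", max_sustained(runs))
```

In a real run, `measure` would wrap one invocation of the load generator and parse its reported response rate; that part is deliberately left out here.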
You can see that `glue-cache yes;` requires significant warm-up time and eventually provides up to 1/3 higher **max** QPS than the configuration with `glue-cache no;`. The problem is the ridiculously long warm-up phase.
`glue-cache no` config hits max QPS right away without any warm-up.
Raw data - each line is one 5-second benchmark:
| QPS (glue-cache yes) | Response rate (glue-cache yes) | | QPS (glue-cache no) | Response rate (glue-cache no) |
|----------------|---------------|--|---------------|---------------|
| 50000 | 77 % | | 300000 | 99 % |
| 50000 | 90 % | | 350000 | 97 % |
| 50000 | 79 % | | 350000 | 96 % |
| 50000 | 99 % | | 350000 | 96 % |
| 100000 | 69 % | | 350000 | 97 % |
| 100000 | 80 % | | 350000 | max reached |
| 100000 | 99 % | | | |
| 150000 | 74 % | | | |
| 150000 | 77 % | | | |
| 150000 | 83 % | | | |
| 150000 | 96 % | | | |
| 150000 | 99 % | | | |
| 200000 | 79 % | | | |
| 200000 | 80 % | | | |
| 200000 | 82 % | | | |
| 200000 | 82 % | | | |
| 200000 | 17 % | | | |
| 200000 | 22 % | | | |
| 200000 | 28 % | | | |
| 200000 | 39 % | | | |
| 200000 | 62 % | | | |
| 200000 | 99 % | | | |
| 250000 | 82 % | | | |
| 250000 | 83 % | | | |
| 250000 | 84 % | | | |
| 250000 | 85 % | | | |
| 250000 | 87 % | | | |
| 250000 | 90 % | | | |
| 250000 | 95 % | | | |
| 250000 | 99 % | | | |
| 300000 | 85 % | | | |
| 300000 | 85 % | | | |
| 300000 | 86 % | | | |
| 300000 | 86 % | | | |
| 300000 | 87 % | | | |
| 300000 | 88 % | | | |
| 300000 | 90 % | | | |
| 300000 | 93 % | | | |
| 300000 | 98 % | | | |
| 300000 | 99 % | | | |
| 350000 | 86 % | | | |
| 350000 | 87 % | | | |
| 350000 | 87 % | | | |
| 350000 | 87 % | | | |
| 350000 | 88 % | | | |
| 350000 | 88 % | | | |
| 350000 | 89 % | | | |
| 350000 | 90 % | | | |
| 350000 | 92 % | | | |
| 350000 | 94 % | | | |
| 350000 | 98 % | | | |
| 350000 | 99 % | | | |
| 400000 | 88 % | | | |
| 400000 | 88 % | | | |
| 400000 | 88 % | | | |
| 400000 | 88 % | | | |
| 400000 | 89 % | | | |
| 400000 | 89 % | | | |
| 400000 | 90 % | | | |
| 400000 | 90 % | | | |
| 400000 | 91 % | | | |
| 400000 | 92 % | | | |
| 400000 | 93 % | | | |
| 400000 | 95 % | | | |
| 400000 | 98 % | | | |
| 400000 | 99 % | | | |
| 450000 | 82 % | | | |
| 450000 | 82 % | | | |
| 450000 | 84 % | | | |
| 450000 | 83 % | | | |
| 450000 | 83 % | | | |
| 450000 | 83 % | | | |
| 450000 | 84 % | | | |
| 450000 | 84 % | | | |
| 450000 | 85 % | | | |
| 450000 | 85 % | | | |
| 450000 | 86 % | | | |
| 450000 | 85 % | | | |
| 450000 | 90 % | | | |
| 450000 | 88 % | | | |
| 450000 | 91 % | | | |
| 450000 | 92 % | | | |
| 450000 | max reached | | | |
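
The warm-up cost can be read off the `glue-cache yes` column by counting runs. A rough sketch, assuming each run lasts exactly 5 seconds and ignoring any pause between runs; the run counts are tallied from the table above, and `RUNS_PER_LEVEL` and `warmup_seconds` are names made up for this example:

```python
RUN_SECONDS = 5  # length of one benchmark run, as stated in the setup list

# Number of runs at each QPS level, up to and including the run that
# reached a 99 % response rate (tallied from the `glue-cache yes` column).
RUNS_PER_LEVEL = {
    50_000: 4,
    100_000: 3,
    150_000: 5,
    200_000: 10,
    250_000: 8,
    300_000: 10,
    350_000: 12,
    400_000: 14,
}


def warmup_seconds(runs_per_level, run_seconds=RUN_SECONDS):
    """Map each QPS level to the cumulative load time needed to sustain it."""
    elapsed = 0
    reached = {}
    for qps in sorted(runs_per_level):
        elapsed += runs_per_level[qps] * run_seconds
        reached[qps] = elapsed
    return reached


if __name__ == "__main__":
    for qps, seconds in warmup_seconds(RUNS_PER_LEVEL).items():
        print(f"{qps:>7} QPS sustained after {seconds:>3} s of load")
```

Under these assumptions the 400000 QPS level is first sustained after 66 runs, i.e. about 330 seconds of load, which is consistent with the "over 300 seconds" in the summary. The 16 further runs at 450000 QPS never reached 99 %.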
Flame chart with sleeper + waker threads generated by [offwaketime.py](https://github.com/iovisor/bcc/blob/d27fd5a7bc8a37679cd3bc0bbdb63f713b0be36f/tools/offwaketime.py):
![offcpu.svg](/uploads/096d197619ed8064a45d330828f78b23/offcpu.svg)
(Sorry for missing stack frames, but you get the point.)
June 2023 (9.16.42, 9.16.42-S1, 9.18.16, 9.18.16-S1, 9.19.14)