[CVE-2024-1737] BIND's database will be slow if a very large number of RRs exist at the same name
Quick Links | |
---|---|
Incident Manager: | @mnowak |
Deputy Incident Manager: | @greg |
Public Disclosure Date: | 2024-07-23 |
CVSS Score: | 7.5 |
Security Advisory: | isc-private/printing-press!110 |
Mattermost Channel: | CVE-2024-1737 |
Support Ticket: | N/A |
Release Checklist: | #4735 (closed) |
Earlier Than T-5
- 🔗 (IM) Pick a Deputy Incident Manager
- 🔗 (IM) Respond to the bug reporter
- 🔗 (SwEng) Ensure there are no public merge requests which inadvertently disclose the issue
- 🔗 (IM) Assign a CVE identifier
- 🔗 (SwEng) Update this issue with the assigned CVE identifier and the CVSS score
- 🔗 (SwEng) Determine the range of product versions affected (including the Subscription Edition)
- 🔗 (SwEng) Determine whether workarounds for the problem exist
- 🔗 (SwEng) If necessary, coordinate with other parties
- 🔗 (Support) Prepare "earliest" notification text and hand it off to Marketing
- 🔗 (Marketing) Update "earliest" notification document in SF portal and send bulk email to earliest customers
- 🔗 (Support) Create a merge request for the Security Advisory and include all readily available information in it
- 🔗 (SwEng) Prepare a private merge request containing a system test reproducing the problem
- 🔗 (SwEng) Notify Support when a reproducer is ready
- 🔗 (SwEng) Prepare a detailed explanation of the code flow triggering the problem
- 🔗 (SwEng) Prepare a private merge request with the fix
- 🔗 (SwEng) Ensure the merge request with the fix is reviewed and has no outstanding discussions
- 🔗 (Support) Review the documentation changes introduced by the merge request with the fix
- 🔗 (SwEng) Prepare backports of the merge request addressing the problem for all affected (and still maintained) branches of a given product
- 🔗 (Support) Finish preparing the Security Advisory
- 🔗 (QA) Create (or update) the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle
- 🔗 (QA) (BIND 9 only) Reserve a block of `CHANGES` placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined
- 🔗 (QA) Merge the CVE fixes in CVE identifier order
- 🔗 (QA) Prepare a standalone patch for the last stable release of each affected (and still maintained) product branch
- 🔗 (QA) Prepare ASN releases (as outlined in the Release Checklist)
At T-5
- 🔗 (Marketing) Update the text on the T-5 (from the Printing Press project) and "earliest" ASN documents in the SF portal
- 🔗 (Marketing) (BIND 9 only) Update the BIND -S information document in SF with download links to the new versions
- 🔗 (Marketing) Bulk email eligible customers to check the SF portal
- 🔗 (Marketing) (BIND 9 only) Send a pre-announcement email to the bind-announce mailing list to alert users that the upcoming release will include security fixes
At T-1
- 🔗 (First IM) Send notifications to OS packagers
On the Day of Public Disclosure
- 🔗 (IM) Grant QA & Marketing clearance to proceed with public release
- 🔗 (QA/Marketing) Publish the releases (as outlined in the release checklist)
- 🔗 (Support) (BIND 9 only) Add the new CVEs to the vulnerability matrix in the Knowledge Base
- 🔗 (Support) Bump Document Version for the Security Advisory and publish it in the Knowledge Base
- 🔗 (First IM) Send notification emails to third parties
- 🔗 (First IM) Advise MITRE about the disclosed CVEs
- 🔗 (First IM) Merge the Security Advisory merge request
- 🔗 (IM) Inform original reporter (if external) that the security disclosure process is complete
- 🔗 (Marketing) Update the SF portal to clear the ASN
- 🔗 (Marketing) Email ASN recipients that the embargo is lifted
After Public Disclosure
- 🚫 🔗 (QA) Merge a regression test reproducing the bug into all affected (and still maintained) branches
Summary
- RBTDB handling of nodes with many RRs of the same type seems to be inefficient. This causes a significant performance drop when a node holds a very large number of RRs.
- RBTDB handling of nodes with many RR types also seems to be inefficient. (This is speculation based on my accidentally passing through rbtdb.c.) This causes a significant performance drop when a node holds a very large number of types.
It seems like an attack vector:
- for secondaries or systems exposed to updates from not-really-trusted parties
- for resolvers which can be queried by an attacker
BIND version used
- ~"Affects v9.16": 5c327f20
- ~"Affects v9.18": v9_18_3
- ~"Affects v9.19": b18b0e41
Steps to reproduce
TL;DR: Start an authoritative server with a zone that has lots of RRs of the same type at a single name.
"slow" variant with many RRs on one node
dig example.com SOA example.com NS > many.db
for I3 in $(seq 0 255); do for I2 in $(seq 0 128); do echo "big 0 IN A 0.0.$I3.$I2"; done; done >> many.db
time named-checkzone example.com many.db
zone example.com/IN: loaded serial 2022040444
OK
real 0m39.496s
user 0m39.407s
sys 0m0.030s
- named.conf
named -g -c named.conf -n1
yes 'big.example.com A' | dnsperf -l 5
Statistics:
Queries sent: 1666
Queries completed: 1666 (100.00%)
Queries lost: 0 (0.00%)
Response codes: NOERROR 1666 (100.00%)
Average packet size: request 33, response 497
Run time (s): 5.246397
Queries per second: 317.551264
Average Latency (s): 0.307935 (min 0.003612, max 0.373912)
Latency StdDev (s): 0.063756
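To see how the load time grows with the number of RRs at one name, the generation step above can be parameterized. The following is a minimal sketch, not part of the original report: `gen_many_rrs` and `base.db` are hypothetical names, and the address scheme simply mirrors the `0.0.$I3.$I2` pattern used above (so it yields distinct records for counts up to 65536).

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the original report):
# print N distinct A records, all at the single owner name "big".
gen_many_rrs() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        # Spread the counter over the last two octets: 0.0.<i/256>.<i%256>
        echo "big 0 IN A 0.0.$((i / 256)).$((i % 256))"
        i=$((i + 1))
    done
}

# Possible usage, assuming named-checkzone is installed and a hypothetical
# file base.db already holds the SOA and NS records for example.com:
#
#   for n in 1000 2000 4000 8000; do
#       { cat base.db; gen_many_rrs "$n"; } > many.db
#       time named-checkzone example.com many.db
#   done
```

If the load time roughly quadruples each time the record count doubles, that would point at per-record work that scales with the size of the existing RRset, which is what the summary suspects.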
fast variant
Same configuration, different zone file content.
dig example.com SOA example.com NS > many.db
for I3 in $(seq 0 255); do for I2 in $(seq 0 128); do echo "big$I2$I3 0 IN A 0.0.$I3.$I2"; done; done >> many.db
time named-checkzone example.com many.db
zone example.com/IN: loaded serial 2022040444
OK
real 0m0.327s
user 0m0.317s
sys 0m0.010s
while true; do for I3 in $(seq 0 255); do for I2 in $(seq 0 128); do echo "big$I2$I3 A"; done; done; done | dnsperf -l 5
Statistics:
Queries sent: 163332
Queries completed: 163332 (100.00%)
Queries lost: 0 (0.00%)
Response codes: NXDOMAIN 163332 (100.00%)
Average packet size: request 25, response 100
Run time (s): 5.002245
Queries per second: 32651.739369
Average Latency (s): 0.002966 (min 0.000034, max 0.005476)
Latency StdDev (s): 0.000589
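One detail of the fast-variant loop: `big$I2$I3` joins two numbers without a separator, so a few pairs map to the same owner name (for example, I2=1, I3=23 and I2=12, I3=3 both give `big123`). That does not change the overall picture, but a variant with strictly one RR per name can be sketched as follows. The function name `gen_spread_rrs` and the `big-$i2-$i3` naming scheme are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the original report):
# same record count and addresses as the fast variant, but with a
# separator in the owner name so every name carries exactly one A record.
gen_spread_rrs() {
    for i3 in $(seq 0 255); do
        for i2 in $(seq 0 128); do
            echo "big-$i2-$i3 0 IN A 0.0.$i3.$i2"
        done
    done
}
```

When querying such a zone, the fully qualified form (for example `big-0-0.example.com A`) would be the one that matches the zone data. The dnsperf run above sent unqualified names and every response was NXDOMAIN, so its QPS figure measures negative answers.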
See #3403 for the second reproducer, but there's no magic sauce - just add many RRTYPEs under a single name.
What is the current bug behavior?
QPS drops by a factor of roughly 100 between the fast and slow variants.
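The factor can be checked directly from the two "Queries per second" lines reported by dnsperf above. A small sketch (the `slowdown` helper is a hypothetical name, not part of any BIND or dnsperf tooling):

```shell
#!/bin/sh
# Hypothetical helper: ratio of fast QPS to slow QPS, one decimal place.
slowdown() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", fast / slow }'
}

# Values copied from the dnsperf statistics in the two runs above.
slowdown 32651.739369 317.551264   # prints 102.8
```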
What is the expected correct behavior?
QPS should not drop this sharply just because many RRs share a single name.
Relevant logs and/or screenshots
See also #3403.