[CVE-2024-1975] SIG(0) can be used to exhaust CPU resources
Quick Links | |
---|---|
Incident Manager: | @pspacek |
Deputy Incident Manager: | @peterd |
Public Disclosure Date: | 2024-07-23 |
CVSS Score: | 7.5 |
Security Advisory: | isc-private/printing-press!102 |
Mattermost Channel: | CVE-2024-1975 |
Support Ticket: | N/A |
Release Checklist: | #4735 (closed) |
Earlier Than T-5
- 🔗 (IM) Pick a Deputy Incident Manager
- 🚫 🔗 (IM) Respond to the bug reporter - found internally
- 🔗 (SwEng) Ensure there are no public merge requests which inadvertently disclose the issue
- 🔗 (IM) Assign a CVE identifier
- 🔗 (SwEng) Update this issue with the assigned CVE identifier and the CVSS score
- 🔗 (SwEng) Determine the range of product versions affected (including the Subscription Edition)
- 🔗 (SwEng) Determine whether workarounds for the problem exist
- 🔗 (SwEng) If necessary, coordinate with other parties
- 🔗 (Support) Prepare "earliest" notification text and hand it off to Marketing
- 🔗 (Marketing) Update "earliest" notification document in SF portal and send bulk email to earliest customers
- 🔗 (Support) Create a merge request for the Security Advisory and include all readily available information in it
- 🚫 🔗 (SwEng) Prepare a private merge request containing a system test reproducing the problem - a performance issue, no automated test
- 🔗 (SwEng) Notify Support when a reproducer is ready
- 🔗 (SwEng) Prepare a detailed explanation of the code flow triggering the problem
- 🔗 (SwEng) Prepare a private merge request with the fix + one for removal from older branches
- 🔗 (SwEng) Ensure the merge request with the fix is reviewed and has no outstanding discussions
- 🔗 (Support) Review the documentation changes introduced by the merge request with the fix
- 🔗 (SwEng) Prepare backports of the merge request addressing the problem for all affected (and still maintained) branches of a given product
- 🔗 (Support) Finish preparing the Security Advisory
- 🔗 (QA) Create (or update) the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle
- 🔗 (QA) (BIND 9 only) Reserve a block of CHANGES placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined
- 🔗 (QA) Merge the CVE fixes in CVE identifier order
- 🔗 (QA) Prepare a standalone patch for the last stable release of each affected (and still maintained) product branch
- 🔗 (QA) Prepare ASN releases (as outlined in the Release Checklist)
At T-5
- 🔗 (Marketing) Update the text on the T-5 (from the Printing Press project) and "earliest" ASN documents in the SF portal
- 🔗 (Marketing) (BIND 9 only) Update the BIND -S information document in SF with download links to the new versions
- 🔗 (Marketing) Bulk email eligible customers to check the SF portal
- 🔗 (Marketing) (BIND 9 only) Send a pre-announcement email to the bind-announce mailing list to alert users that the upcoming release will include security fixes
At T-1
- 🔗 (First IM) Send notifications to OS packagers
On the Day of Public Disclosure
- 🔗 (IM) Grant QA & Marketing clearance to proceed with public release
- 🔗 (QA/Marketing) Publish the releases (as outlined in the release checklist)
- 🔗 (Support) (BIND 9 only) Add the new CVEs to the vulnerability matrix in the Knowledge Base
- 🔗 (Support) Bump Document Version for the Security Advisory and publish it in the Knowledge Base
- 🔗 (First IM) Send notification emails to third parties
- 🔗 (First IM) Advise MITRE about the disclosed CVEs
- 🔗 (First IM) Merge the Security Advisory merge request
- 🚫 🔗 (IM) Inform original reporter (if external) that the security disclosure process is complete
- 🔗 (Marketing) Update the SF portal to clear the ASN
- 🔗 (Marketing) Email ASN recipients that the embargo is lifted
After Public Disclosure
- 🚫 🔗 (QA) Merge a regression test reproducing the bug into all affected (and still maintained) branches
Summary
Authoritative servers with a KEY RR in a zone, and validating resolvers, are vulnerable to trivial CPU exhaustion attacks using the SIG(0) protocol.
BIND versions affected
All versions with SIG(0) support. Tested configuration:
BIND 9.19.19-dev (Development Release) <id:de2009e>
compiled by GCC 13.2.1 20230801
compiled with OpenSSL version: OpenSSL 3.1.4 24 Oct 2023
linked to OpenSSL version: OpenSSL 3.1.4 24 Oct 2023
DNSSEC algorithms: RSASHA1 NSEC3RSASHA1 RSASHA256 RSASHA512 ECDSAP256SHA256 ECDSAP384SHA384 ED25519 ED448
DS algorithms: SHA-1 SHA-256 SHA-384
HMAC algorithms: HMAC-MD5 HMAC-SHA1 HMAC-SHA224 HMAC-SHA256 HMAC-SHA384 HMAC-SHA512
TKEY mode 2 support (Diffie-Hellman): no
TKEY mode 3 support (GSS-API): yes
- Affects v9.19: c8c0f4bb
- Affects v9.18: 44e4b5cb
- Affects v9.16: 161d69ab
- Affects v9.11 (EoL): v9.11.37-S1
Preconditions and assumptions
- The server must be configured with at least one KEY RR in a zone, OR a KEY RR must be in the cache with trust level secure (a DNSSEC-validated record).
- Example configuration: https://jpmens.net/2010/12/01/securing-dynamic-dns-updates-ddns-with-sig0/
- Essentially any zone with a KEY RR in it should suffice. An attacker who is able to add a KEY RR can prepare the conditions for the attack themselves.
- The attacker is able to find the DNS name of the KEY RR.
Attacker's abilities
The attacker simply needs to send a signed message which refers to an existing KEY RR in one of the zones OR in the cache. The signature can be invalid (any syntactically valid message is sufficient).
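When inspecting captured traffic it can help to tell such messages apart from ordinary queries. The following is a minimal sketch, not BIND code: it assumes the RFC 2931 convention that a SIG(0) record is a SIG RR (type 24) whose "type covered" field is 0, and as a simplification it scans all record sections instead of checking only the end of the additional section. The function names `has_sig0` and `_skip_name` are made up for this illustration.

```python
import struct

SIG_TYPE = 24  # RR type code for SIG


def _skip_name(msg: bytes, off: int) -> int:
    """Return the offset just past the domain name starting at `off`."""
    while True:
        if off >= len(msg):
            raise ValueError("truncated name")
        length = msg[off]
        if length == 0:                 # root label terminates the name
            return off + 1
        if (length & 0xC0) == 0xC0:     # compression pointer: 2 bytes, ends the name
            return off + 2
        if length & 0xC0:
            raise ValueError("unsupported label type")
        off += 1 + length               # ordinary label


def has_sig0(msg: bytes) -> bool:
    """Heuristically report whether a DNS message carries a SIG(0) record."""
    if len(msg) < 12:
        return False
    _id, _flags, qd, an, ns, ar = struct.unpack("!6H", msg[:12])
    off = 12
    try:
        for _ in range(qd):
            off = _skip_name(msg, off) + 4          # skip QTYPE and QCLASS
        for _ in range(an + ns + ar):
            off = _skip_name(msg, off)
            rtype, _rclass, _ttl, rdlen = struct.unpack_from("!HHIH", msg, off)
            off += 10
            if rtype == SIG_TYPE and rdlen >= 2:
                (type_covered,) = struct.unpack_from("!H", msg, off)
                if type_covered == 0:
                    return True
            off += rdlen
    except (ValueError, struct.error):
        return False                                 # malformed or truncated message
    return False
```

The check says nothing about whether the signature is valid; it only shows that the message asks the server to perform a SIG(0) verification.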
Impact
- Inability to respond to legitimate queries,
- CPU resource exhaustion on the server.
The impact grows with the number of KEY RRs at the same name: it seems we try every KEY RR until we find a match or exhaust all options.
Values were measured on a build from the bind-9.18 branch and are listed in thousands of QPS, with named running on a single CPU core to make measurement easier.
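The suspected "try every key" behaviour can be illustrated with a small sketch. This is not the BIND code path; `verify_with_any_key` and the `verify` callback are hypothetical stand-ins for an expensive signature check, used only to show why a bogus signature costs one verification per KEY RR at the name.

```python
def verify_with_any_key(keys, verify):
    """Try each key in turn; return (matched, attempts).

    `verify(key)` stands in for an expensive public-key signature check.
    """
    attempts = 0
    for key in keys:
        attempts += 1
        if verify(key):
            return True, attempts   # stop at the first key that verifies
    return False, attempts          # a bogus signature exhausts every key


# Hypothetical key set: eight KEY RRs at the same owner name.
keys = [f"key-{i}" for i in range(8)]
matched, attempts = verify_with_any_key(keys, lambda key: False)
print(matched, attempts)
```

With a signature that never verifies, the number of attempts equals the number of keys, so the worst-case cost per attack message would scale linearly with the size of the KEY RRset.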
State | QPS [k] |
---|---|
legitimate cache hits - before the attack | 70 |
attack | 1 |
legitimate traffic when the attack is ongoing | 10 |
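The relative cost of one attack message can be estimated from the table above. The arithmetic below is a rough sketch that assumes the single CPU core is the only bottleneck and that the cost of a legitimate cache hit stays constant during the attack; the variable names are illustrative.

```python
# Measured values from the table, converted from thousands of QPS.
baseline_legit_qps = 70_000       # legitimate cache hits before the attack
attack_qps = 1_000                # attack messages processed per second
legit_during_attack_qps = 10_000  # legitimate traffic while the attack runs

# Legitimate queries displaced by the attack traffic.
displaced_qps = baseline_legit_qps - legit_during_attack_qps

# CPU cost of one attack message, in units of "one legitimate query".
cost_ratio = displaced_qps / attack_qps

# Share of the core consumed by SIG(0) processing.
cpu_share = displaced_qps / baseline_legit_qps

print(f"one attack message costs about {cost_ratio:.0f} legitimate queries")
print(f"attack traffic consumes about {cpu_share:.0%} of the core")
```

Under these assumptions, 1k attack messages per second displace 60k legitimate queries per second, which corresponds to a per-message cost of roughly 60 ordinary queries.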
Steps to reproduce
- Beware: Test over UDP. TCP also suffers from #4481 (closed), which throws results off.
- Configure BIND to be authoritative for a zone containing a KEY RR.
- Start the BIND server with the following command (single thread to simplify measurement):
named -g -c named.conf -n1
- Simulate legitimate clients using:
yes '. A' | dnsperf -S1 -O suppress=timeout -c 256
- Simulate attack traffic using the attached script and data file:
python udploop.py 127.0.0.1 53 sig0-bad-query.tcpdns --report-interval 1
Please note that the binary file containing the queries does NOT carry a valid signature. It is a well-formed message that pretends to be signed with the KEY RR from the zone file, but the signature itself is invalid.
What is the current bug behavior?
- Legitimate QPS drops significantly. Per message, the impact on legitimate traffic QPS is roughly 60x higher than that of an ordinary query without SIG(0).
- Side effect: log spam, with one log line per attacker message.
What is the expected correct behavior?
Probably a limit on the resources we are willing to spend on SIG(0), or on expensive crypto in general.
Possibly also a smaller TCP buffer size (or smaller buffers elsewhere; I don't know exactly where the messages are being buffered).
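One way to picture a resource limit of this kind is a small quota on concurrent signature verifications. The sketch below only illustrates the idea and is not the fix adopted in BIND; the class `VerificationQuota`, the function `handle_signed_request`, and the returned status strings are all hypothetical.

```python
import threading


class VerificationQuota:
    """Caps how many expensive verifications may run at the same time."""

    def __init__(self, limit: int):
        self._slots = threading.BoundedSemaphore(limit)

    def try_acquire(self) -> bool:
        # Non-blocking: refuse immediately when the quota is used up.
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        self._slots.release()


def handle_signed_request(quota: VerificationQuota, verify) -> str:
    """Run `verify()` only if a quota slot is free; otherwise shed the request."""
    if not quota.try_acquire():
        return "dropped"            # cheap rejection, no crypto performed
    try:
        return "ok" if verify() else "badsig"
    finally:
        quota.release()             # always give the slot back


quota = VerificationQuota(limit=1)
print(handle_signed_request(quota, lambda: False))
```

The design choice here is to reject excess signed requests cheaply so that unsigned queries keep a bounded share of CPU time; the appropriate limit would have to be determined by measurement.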
Relevant logs
This particular variant of the attack logs one message per query:
request has invalid signature: RRSIG failed to verify (BADSIG)
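Because each attack message produces one such line, counting them gives a rough estimate of how many attack messages the server processed. The helper below is a minimal sketch that matches only the message text quoted above; the surrounding log line format (timestamps, client address) is not assumed, and the second sample line is invented as a non-matching example.

```python
MARKER = "request has invalid signature"


def count_invalid_signature_lines(lines) -> int:
    """Count log lines that contain the invalid-signature message."""
    return sum(1 for line in lines if MARKER in line)


# Illustrative input; in practice, pass an open log file instead of a list.
sample = [
    "request has invalid signature: RRSIG failed to verify (BADSIG)",
    "some unrelated log line",
    "request has invalid signature: RRSIG failed to verify (BADSIG)",
]
print(count_invalid_signature_lines(sample))
```

The function accepts any iterable of strings, so it can be applied directly to an open file object for a named log.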