Authoritative NSEC3 zones are slow when queried with lots of labels
Summary
Queries for QNAMEs with a large number of labels, targeting authoritative zones signed with NSEC3, cause a significant QPS drop. Essentially it's a random subdomain attack against the auth server, which I thought was not relevant.
When comparing single-label QNAMEs with 100-label QNAMEs, the QPS drops to about 1/4 of the original QPS, assuming a zone with 0 extra NSEC3 iterations. The drop happens mainly when DO=1 is set.
Extra NSEC3 iterations make the problem more pronounced. With 150 iterations, performance drops to about 1/8 or so.
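For reference, here is a minimal sketch of the NSEC3 hash from RFC 5155 (SHA-1, hash algorithm 1), showing how extra iterations multiply the hashing work per hashed name. It only illustrates the hash itself and is not meant to describe where BIND actually spends its time. The helper names `to_wire` and `nsec3_hash` are made up for this example.

```python
import base64
import hashlib

# Translation from the standard base32 alphabet to base32hex (RFC 4648),
# which is the encoding used for NSEC3 owner names.
_B32_TO_B32HEX = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    "0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def to_wire(name: str) -> bytes:
    """Encode a domain name in canonical (lowercase) wire format."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    wire = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return wire + b"\x00"  # terminating root label


def nsec3_hash(name: str, salt_hex: str, iterations: int):
    """Return (base32hex hash, number of SHA-1 calls) for one owner name.

    `iterations` is the number of *extra* iterations, as in the NSEC3PARAM
    record, so the total number of SHA-1 calls is iterations + 1.
    """
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(to_wire(name) + salt).digest()
    calls = 1
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
        calls += 1
    encoded = base64.b32encode(digest).decode("ascii").translate(_B32_TO_B32HEX)
    return encoded.lower(), calls
```

With the salt from the reproduction steps, `nsec3_hash("local.testiscorg.ch", "0122345678912345", 0)` performs one SHA-1 call, while the same call with 150 iterations performs 151.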
BIND version used
- Affects v9.19: de2009e3 - 53 vs 14 k QPS
- Affects v9.18: 6817bf12 - 43 vs 11 k QPS
- Affects v9.16: 161d69ab - 50 vs 12 k QPS
- Affects v9.11 (EoL): v9.11.37-S1 - 5.4 vs 3.7 k QPS
- Other versions were not tested
Steps to reproduce
- Sign an empty zone with NSEC3, 0 iterations, some random salt:
- local.testiscorg.ch.zone
- Klocal.testiscorg.ch.+014+01043.key
- Klocal.testiscorg.ch.+014+01043.private
dnssec-signzone -u -3 0122345678912345 -H 0 -e 20380101000000 -S -o local.testiscorg.ch -O full -z local.testiscorg.ch.zone Klocal.testiscorg.ch.+014+01043
- 👉🏻 local.testiscorg.ch.zone.signed
- Run an auth with the zone:
- auth.conf
named -g -c auth.conf
- Run attack using dnsperf, utilizing single CPU core for simplicity:
- randlabels.py
python randlabels.py | dnsperf -s 10.53.0.2 -S1 -D
- The -D option (DO=1) is very important. The drop is not that significant when queries have DO=0.
Compare with randnames.py - single random label.
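The attached scripts are not reproduced here, so below is a rough sketch of what a generator along the lines of randlabels.py might look like. It is not the actual attachment. It assumes single-character random labels (so that 100 labels still fit into the 255-octet name limit), the zone name from the steps above, and the dnsperf input format of one "name type" pair per line. The function names are hypothetical.

```python
import random
import string

ZONE = "local.testiscorg.ch"  # zone name taken from the reproduction steps
ALPHABET = string.ascii_lowercase + string.digits


def random_qname(num_labels: int, zone: str = ZONE) -> str:
    """Build a QNAME with `num_labels` random one-character labels under `zone`."""
    labels = [random.choice(ALPHABET) for _ in range(num_labels)]
    name = ".".join(labels + [zone]) + "."
    # Wire length: one length octet per label plus its characters, plus the root label.
    wire_len = sum(len(label) + 1 for label in name.rstrip(".").split(".")) + 1
    if wire_len > 255:
        raise ValueError(f"name too long for DNS: {wire_len} octets")
    return name


def query_lines(count: int, num_labels: int, qtype: str = "A"):
    """Yield `count` lines in dnsperf input format: '<name> <type>'."""
    for _ in range(count):
        yield f"{random_qname(num_labels)} {qtype}"


def write_queries(out, count: int, num_labels: int) -> None:
    """Write the generated queries to a file-like object such as sys.stdout."""
    for line in query_lines(count, num_labels):
        out.write(line + "\n")
```

Calling `write_queries(sys.stdout, 100000, 100)` from a small wrapper script would produce input for the 100-label case, and `write_queries(sys.stdout, 100000, 1)` the single-label case for comparison.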
What is the current bug behavior?
QPS drops to about 1/4 of the single-label QPS.
What is the expected correct behavior?
Preferably no drop.
Relevant logs and/or screenshots
Here's a CPU flame chart from the main branch:
Possible fixes
Edited by Nicki Křížek