[CVE-2022-0667] assertion failure on delayed DS lookup
CVE-specific actions
- Assign a CVE identifier
- Determine CVSS score: 7.0 - AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H/E:F/RL:O/RC:C
- Determine the range of BIND versions affected (including the Subscription Edition): 9.18.0
- Determine whether workarounds for the problem exist: no workaround exists
- Create a draft of the security advisory and put the information above in there
- Prepare a detailed description of the problem which should include the following by default:
  - instructions for reproducing the problem (a system test is good enough)
  - explanation of code flow which triggers the problem (a system test is not good enough)
- Prepare a private merge request containing the following items in separate commits:
  - a test for the issue (may be moved to a separate merge request for deferred merging)
  - a fix for the issue
  - documentation updates (CHANGES, release notes, anything else applicable)
- Ensure the merge request from the previous step is reviewed by SWENG staff and has no outstanding discussions
- Ensure the documentation changes introduced by the merge request addressing the problem are reviewed by Support and Marketing staff
- Prepare backports of the merge request addressing the problem for all affected (and still maintained) BIND branches (backporting might affect the issue's scope and/or description)
- Prepare a standalone patch for the last stable release of each affected (and still maintained) BIND branch
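As a cross-check of the 7.0 score recorded in the checklist, the arithmetic can be sketched in C. This assumes the vector is a CVSS v3.1 vector (the issue does not state the version) and uses the v3.1 metric weights; the function names are hypothetical and only the Scope: Unchanged case is handled.

```c
#include <assert.h>

/*
 * CVSS v3.1 "Roundup": smallest value with one decimal place that is
 * greater than or equal to the input. Integer arithmetic is used to
 * avoid floating-point surprises; non-negative inputs are assumed.
 */
double
cvss_roundup(double input) {
	assert(input >= 0.0);
	long scaled = (long)(input * 100000.0 + 0.5);
	if (scaled % 10000 == 0) {
		return (double)scaled / 100000.0;
	}
	return (double)(scaled / 10000 + 1) / 10.0;
}

/* Base score for Scope: Unchanged, given the numeric metric weights. */
double
cvss_base_unchanged(double av, double ac, double pr, double ui,
		    double c, double i, double a) {
	double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
	double impact = 6.42 * iss;
	double exploitability = 8.22 * av * ac * pr * ui;
	if (impact <= 0.0) {
		return 0.0;
	}
	double sum = impact + exploitability;
	return cvss_roundup(sum < 10.0 ? sum : 10.0);
}

/* Temporal score: base score scaled by E, RL and RC, then rounded up. */
double
cvss_temporal(double base, double e, double rl, double rc) {
	return cvss_roundup(base * e * rl * rc);
}
```

With the v3.1 weights AV:N = 0.85, AC:L = 0.77, PR:N = 0.85, UI:N = 0.85, C:N = I:N = 0, A:H = 0.56, calling `cvss_base_unchanged(0.85, 0.77, 0.85, 0.85, 0.0, 0.0, 0.56)` gives a base score of 7.5. Applying E:F = 0.97, RL:O = 0.95 and RC:C = 1.0 through `cvss_temporal()` gives 7.0, which matches the score above.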
Release-specific actions
- Create/update the private issue containing links to fixes & reproducers for all CVEs fixed in a given release cycle
- Reserve a block of CHANGES placeholders once the complete set of vulnerabilities fixed in a given release cycle is determined
- Ensure the merge requests containing CVE fixes are merged into security-* branches in CVE identifier order
Post-disclosure actions
- Merge a regression test reproducing the bug into all affected (and still maintained) BIND branches
Summary
BIND crashes under heavy cache-miss load when configured to forward queries to another recursor.
BIND version used
Affects v9.18: v9.18.0
Other versions are under investigation.
Steps to reproduce
Well, that's a problem. It happens only under load, when forwarding, and only when some timeouts happen.
What is the current bug behavior?
Crash on assert:
assertion_failed (file=0x7fea3376b05b "resolver.c", line=7117, type=isc_assertiontype_insist, cond=0x7fea3376b3a6 "__v > 0")
GDB
(gdb) bt
#0 0x00007fea3288ad22 in raise () from /usr/lib/libc.so.6
#1 0x00007fea32874862 in abort () from /usr/lib/libc.so.6
#2 0x000055fad8cb1821 in assertion_failed (file=0x7fea3376b05b "resolver.c", line=7117, type=isc_assertiontype_insist, cond=0x7fea3376b3a6 "__v > 0") at main.c:238
#3 0x00007fea337f88da in isc_assertion_failed (file=0x7fea3376b05b "resolver.c", line=7117, type=isc_assertiontype_insist, cond=0x7fea3376b3a6 "__v > 0") at assertions.c:49
#4 0x00007fea3368552f in fctx__detach (fctxp=0x7fea2c8f4a38, file=0x7fea3376b05b "resolver.c", line=7275, func=0x7fea3376f3f0 <__func__.5> "resume_dslookup") at resolver.c:7117
#5 0x00007fea33685c78 in resume_dslookup (task=0x7fea285a85c0, event=0x0) at resolver.c:7275
#6 0x00007fea33826e35 in task_run (task=0x7fea285a85c0) at task.c:820
#7 0x00007fea3382705d in isc_task_run (task=0x7fea285a85c0) at task.c:900
#8 0x00007fea337d7b30 in isc__nm_async_task (worker=0x7fea302a9770, ev0=0x7fea28055200) at netmgr/netmgr.c:837
#9 0x00007fea337d7dc4 in process_netievent (worker=0x7fea302a9770, ievent=0x7fea28055200) at netmgr/netmgr.c:916
#10 0x00007fea337d88f9 in process_queue (worker=0x7fea302a9770, type=NETIEVENT_TASK) at netmgr/netmgr.c:1010
#11 0x00007fea337d797c in process_all_queues (worker=0x7fea302a9770) at netmgr/netmgr.c:756
#12 0x00007fea337d7a00 in async_cb (handle=0x7fea302a9ad0) at netmgr/netmgr.c:785
#13 0x00007fea32c2f92d in ?? () from /usr/lib/libuv.so.1
#14 0x00007fea32c4bd0e in ?? () from /usr/lib/libuv.so.1
#15 0x00007fea32c35438 in uv_run () from /usr/lib/libuv.so.1
#16 0x00007fea337d75ab in nm_thread (worker0=0x7fea302a9770) at netmgr/netmgr.c:691
#17 0x00007fea33830a60 in isc__trampoline_run (arg=0x7fea30277ca0) at trampoline.c:187
#18 0x00007fea32a23259 in start_thread () from /usr/lib/libpthread.so.0
#19 0x00007fea3294c5e3 in clone () from /usr/lib/libc.so.6
(gdb) frame
#4 0x00007fea3368552f in fctx__detach (fctxp=0x7fea2c8f4a38, file=0x7fea3376b05b "resolver.c", line=7275, func=0x7fea3376f3f0 <__func__.5> "resume_dslookup") at resolver.c:7117
(gdb) p /x fctx->references
$2 = 0xffffffffffffffff
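The failed condition "__v > 0" and the counter value 0xffffffffffffffff fit together if the counter is unsigned and the check is applied to the value it held before the decrement: a detach on a counter that is already 0 fails the check and leaves the counter wrapped around to its maximum. The following is a minimal C sketch of that wraparound, not BIND's code. `demo_fctx_t`, `demo_detach()` and `demo_detach_checked()` are hypothetical names, and the idea that the check looks at the pre-decrement value is an assumption drawn from the two values above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a fetch context; this is not BIND's structure. */
typedef struct {
	_Atomic uint64_t references;
} demo_fctx_t;

/*
 * Drop one reference and return the value the counter held BEFORE
 * the decrement. Unsigned atomic subtraction wraps around silently.
 */
uint64_t
demo_detach(demo_fctx_t *fctx) {
	return atomic_fetch_sub(&fctx->references, 1);
}

/*
 * Like demo_detach(), but abort on an unbalanced detach. This is an
 * assumed analogue of the failing "__v > 0" check, for illustration.
 */
void
demo_detach_checked(demo_fctx_t *fctx) {
	uint64_t prev = demo_detach(fctx);
	assert(prev > 0);
	(void)prev; /* silence unused-variable warnings under NDEBUG */
}
```

Starting from a single reference, the first `demo_detach()` returns 1 and leaves the counter at 0. A second, unbalanced `demo_detach()` returns 0 and leaves the counter at `UINT64_MAX`, the same bit pattern gdb printed; `demo_detach_checked()` would abort at that point. In other words, the state above is what one detach too many would look like.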
What is the expected correct behavior?
Well, no crash :trollface:
Relevant configuration files
- named.conf
- 127.0.0.112 does recursion
I suspect that all tracebacks had the function resume_dslookup in them. Is this function special in some way?
Relevant logs and/or screenshots
Original logs were lost, sorry. Generally the log had lots and lots of timeouts and also some "shut down hung fetch" messages.
Here is the content of fctx:
(gdb) p *fctx
$5 = {magic = 1176576289, res = 0x0, fname = {name = {magic = 1145983854,
ndata = 0x7fea21692120 "\001\061\001\060\001\060\001\060\001\061\001\060\001\071\001\061\001\060\001\060\001\066\001\062\003ip6\004arpa", length = 34, labels = 15, attributes = 1, offsets = 0x7fea21692060 "",
buffer = 0x7fea216920e0, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, list = {head = 0x0,
tail = 0x0}},
offsets = "\000\002\004\006\b\n\f\016\020\022\024\026\030\034!", '\000' <repeats 112 times>, buffer = {
magic = 1114990113, base = 0x7fea21692120, length = 255, used = 34, current = 0, active = 0, link = {
prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, mctx = 0x0, autore = false},
data = "\001\061\001\060\001\060\001\060\001\061\001\060\001\071\001\061\001\060\001\060\001\066\001\062\003ip6\004arpa", '\000' <repeats 221 times>}, name = 0x7fea21692010, type = 43, options = 0, bucketnum = 133,
dbucketnum = 4294967295, info = 0x0, mctx = 0x0, now = 1644339489, task = 0x7fea285a85c0,
references = 18446744073709551615, state = fetchstate_done, want_shutdown = true, cloned = false,
spilled = false, control_event = {ev_size = 104, ev_attributes = 0, ev_tag = 0x0, ev_type = 262144,
ev_action = 0x7fea3367d90f <fctx_doshutdown>, ev_arg = 0x7fea21692000, ev_sender = 0x0, ev_destroy = 0x0,
ev_destroy_arg = 0x0, ev_link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, ev_ratelink = {
prev = 0xffffffffffffffff, next = 0xffffffffffffffff}}, link = {prev = 0xffffffffffffffff,
next = 0xffffffffffffffff}, events = {head = 0x0, tail = 0x0}, dfname = {name = {magic = 1145983854,
ndata = 0x7fea21692400 "", length = 1, labels = 1, attributes = 1, offsets = 0x7fea21692340 "",
buffer = 0x7fea216923c0, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, list = {head = 0x0,
tail = 0x0}}, offsets = '\000' <repeats 127 times>, buffer = {magic = 1114990113, base = 0x7fea21692400,
length = 255, used = 1, current = 0, active = 0, link = {prev = 0xffffffffffffffff,
next = 0xffffffffffffffff}, mctx = 0x0, autore = false}, data = '\000' <repeats 254 times>},
domain = 0x7fea216922f0, nameservers = {magic = 1145983826, methods = 0x0, link = {prev = 0xffffffffffffffff,
next = 0xffffffffffffffff}, rdclass = 0, type = 0, ttl = 0, trust = 0, covers = 0, attributes = 0,
count = 4294967295, resign = 0, private1 = 0x0, private2 = 0x0, private3 = 0x0, privateuint4 = 0,
private5 = 0x0, private6 = 0x0, private7 = 0x0}, attributes = 392, timer = 0x0, expires = {
seconds = 1644339499, nanoseconds = 261051224}, expires_try_stale = {seconds = 0, nanoseconds = 0},
next_timeout = {seconds = 1644339496, nanoseconds = 897675111}, final = {seconds = 1644339501,
nanoseconds = 261051224}, interval = {seconds = 1, nanoseconds = 200000000}, qmessage = 0x0, queries = {
head = 0x0, tail = 0x0}, finds = {head = 0x0, tail = 0x0}, find = 0x0, altfinds = {head = 0x0, tail = 0x0},
altfind = 0x0, forwaddrs = {head = 0x0, tail = 0x0}, altaddrs = {head = 0x0, tail = 0x0}, forwarders = {
head = 0x0, tail = 0x0}, fwdpolicy = dns_fwdpolicy_only, bad = {head = 0x0, tail = 0x0}, edns = {head = 0x0,
tail = 0x0}, bad_edns = {head = 0x0, tail = 0x0}, validator = 0x0, validators = {head = 0x0, tail = 0x0},
cache = 0x0, adb = 0x0, ns_ttl_ok = false, ns_ttl = 0, qc = 0x0, minimized = false, qmin_labels = 1,
qmin_warning = ISC_R_SUCCESS, ip6arpaskip = false, forwarding = true, qminfname = {name = {magic = 1145983854,
ndata = 0x7fea216927b8 "\001\061\001\060\001\060\001\060\001\061\001\060\001\071\001\061\001\060\001\060\001\066\001\062\003ip6\004arpa", length = 34, labels = 15, attributes = 1, offsets = 0x7fea216926f8 "",
buffer = 0x7fea21692778, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, list = {head = 0x0,
tail = 0x0}},
offsets = "\000\002\004\006\b\n\f\016\020\022\024\026\030\034!", '\000' <repeats 112 times>, buffer = {
magic = 1114990113, base = 0x7fea216927b8, length = 255, used = 34, current = 0, active = 0, link = {
prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, mctx = 0x0, autore = false},
data = "\001\061\001\060\001\060\001\060\001\061\001\060\001\071\001\061\001\060\001\060\001\066\001\062\003ip6\004arpa", '\000' <repeats 221 times>}, qminname = 0x7fea216926a8, qmintype = 43, qminfetch = 0x0, qminrrset = {
magic = 1145983826, methods = 0x0, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff},
rdclass = 0, type = 0, ttl = 0, trust = 0, covers = 0, attributes = 0, count = 4294967295, resign = 0,
private1 = 0x0, private2 = 0x0, private3 = 0x0, privateuint4 = 0, private5 = 0x0, private6 = 0x0,
private7 = 0x0}, qmindcfname = {name = {magic = 1145983854, ndata = 0x7fea21692a50 "", length = 1,
labels = 1, attributes = 1, offsets = 0x7fea21692990 "", buffer = 0x7fea21692a10, link = {
prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, list = {head = 0x0, tail = 0x0}},
offsets = '\000' <repeats 127 times>, buffer = {magic = 1114990113, base = 0x7fea21692a50, length = 255,
used = 1, current = 0, active = 0, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff},
mctx = 0x0, autore = false}, data = '\000' <repeats 254 times>}, qmindcname = 0x7fea21692940, pending = 0,
restarts = 1, timeouts = 0, nsfname = {name = {magic = 1145983854,
ndata = 0x7fea21692122 "\001\060\001\060\001\060\001\061\001\060\001\071\001\061\001\060\001\060\001\066\001\062\003ip6\004arpa", length = 32, labels = 14, attributes = 1, offsets = 0x7fea21692bb8 "",
buffer = 0x7fea21692c38, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, list = {head = 0x0,
tail = 0x0}}, offsets = "\000\002\004\006\b\n\f\016\020\022\024\026\032\037", '\000' <repeats 113 times>,
buffer = {magic = 1114990113, base = 0x7fea21692c78, length = 255, used = 0, current = 0, active = 0, link = {
prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, mctx = 0x0, autore = false},
data = '\000' <repeats 254 times>}, nsname = 0x7fea21692b68, nsfetch = 0x0, nsrrset = {magic = 1145983826,
methods = 0x0, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, rdclass = 0, type = 0, ttl = 0,
trust = 0, covers = 0, attributes = 0, count = 4294967295, resign = 0, private1 = 0x0, private2 = 0x0,
private3 = 0x0, privateuint4 = 0, private5 = 0x0, private6 = 0x0, private7 = 0x0}, nqueries = 0,
rand_buf = 0, rand_bits = 0, result = ISC_R_CANCELED, vresult = ISC_R_SUCCESS, exitline = 7176, start = {
seconds = 1644339489, nanoseconds = 261051224}, duration = 12003253, logged = false, querysent = 5,
referrals = 0, lamecount = 0, quotacount = 0, neterr = 1, badresp = 1, adberr = 0, findfail = 0, valfail = 0,
timeout = false, addrinfo = 0x7fea02c8ba20, depth = 0, clientstr = "<unknown>", '\000' <repeats 53 times>}
And here are logs from other runs which crashed on the same line:
Edited by Michal Nowak