Unbounded Token Parsing
Summary
During code review, it was found that the lexer code in file lib/isc/lex.c is affected by crashes and endless loops when parsing very long strings.
Technical Details
In particular, the function pushandgrow(), which is used to enlarge buffers during parsing, can lead to a failed allocation on 32-bit architectures and a subsequent abort() call:
static isc_result_t
pushandgrow(isc_lex_t *lex, inputsource *source, int c) {
	if (isc_buffer_availablelength(source->pushback) == 0) {
		isc_buffer_t *tbuf = NULL;
		unsigned int oldlen;
		isc_region_t used;
		isc_result_t result;

		oldlen = isc_buffer_length(source->pushback);
		isc_buffer_allocate(lex->mctx, &tbuf, oldlen * 2);
		isc_buffer_usedregion(source->pushback, &used);
		result = isc_buffer_copyregion(tbuf, &used);
		INSIST(result == ISC_R_SUCCESS);
		tbuf->current = source->pushback->current;
		isc_buffer_free(&source->pushback);
		source->pushback = tbuf;
	}
	isc_buffer_putuint8(source->pushback, (uint8_t)c);
	return (ISC_R_SUCCESS);
}
When parsing a zone file with a very long token, the lexer enters the function pushandgrow() repeatedly and doubles the buffer allocation each time the buffer space is exhausted. This quickly leads to a failed allocation on 32-bit architectures, since the limited address space caps a single allocation at 4 GB at most. In practice, the limit is considerably lower for most allocators.
An example is shown in the following listing:
$ ./bin/dnssec/dnssec-signzone-gdb db.long
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/user/src-new/bind9/bin/dnssec/.libs/dnssec-signzone...
gdb-peda$ r
Starting program: /home/user/src-new/bind9/bin/dnssec/.libs/dnssec-signzone db.long
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
jemalloc_shim.h:68: INSIST(ptr != ((void *)0)) failed, back trace
/home/user/src-new/bind9/lib/isc/.libs/libisc-9.19.18-dev.so(+0x27859)[0xf7f76859]
/home/user/src-new/bind9/lib/isc/.libs/libisc-9.19.18-dev.so(isc_assertion_failed+0x26)[0xf7f7678a]
/home/user/src-new/bind9/lib/isc/.libs/libisc-9.19.18-dev.so(+0x4222a)[0xf7f9122a]
/home/user/src-new/bind9/lib/isc/.libs/libisc-9.19.18-dev.so(+0x4238b)[0xf7f9138b]
/home/user/src-new/bind9/lib/isc/.libs/libisc-9.19.18-dev.so(isc__mem_get+0x58)[0xf7f923d0]
/home/user/src-new/bind9/lib/isc/.libs/libisc-9.19.18-dev.so(+0x390f5)[0xf7f880f5]
/home/user/src-new/bind9/lib/isc/.libs/libisc-9.19.18-dev.so(+0x39fb8)[0xf7f88fb8]
/home/user/src-new/bind9/lib/isc/.libs/libisc-9.19.18-dev.so(isc_lex_gettoken+0x2e0)[0xf7f89334]
/home/user/src-new/bind9/lib/dns/.libs/libdns-9.19.18-dev.so(+0x771c9)[0xf7c771c9]
/home/user/src-new/bind9/lib/dns/.libs/libdns-9.19.18-dev.so(+0x78fb7)[0xf7c78fb7]
/home/user/src-new/bind9/lib/dns/.libs/libdns-9.19.18-dev.so(dns_master_loadfile+0x8b)[0xf7c8066c]
/home/user/src-new/bind9/lib/dns/.libs/libdns-9.19.18-dev.so(dns_db_load+0xbd)[0xf7c403e8]
/home/user/src-new/bind9/bin/dnssec/.libs/dnssec-signzone(+0xc13c)[0x5656113c]
/home/user/src-new/bind9/bin/dnssec/.libs/dnssec-signzone(main+0xfaa)[0x565646a6]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0x106)[0xf7a32e46]
/home/user/src-new/bind9/bin/dnssec/.libs/dnssec-signzone(_start+0x31)[0x56559e51]
Program received signal SIGABRT, Aborted.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x2
ECX: 0xffffbfdc --> 0x0
EDX: 0x0
ESI: 0x8
EDI: 0x0
EBP: 0xffffbfdc --> 0x0
ESP: 0xffffbfc0 --> 0xffffbfdc --> 0x0
EIP: 0xf7fd0559 (<__kernel_vsyscall+9>: pop ebp)
EFLAGS: 0x200246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0xf7fd0553 <__kernel_vsyscall+3>: mov ebp,esp
0xf7fd0555 <__kernel_vsyscall+5>: sysenter
0xf7fd0557 <__kernel_vsyscall+7>: int 0x80
=> 0xf7fd0559 <__kernel_vsyscall+9>: pop ebp
0xf7fd055a <__kernel_vsyscall+10>: pop edx
0xf7fd055b <__kernel_vsyscall+11>: pop ecx
0xf7fd055c <__kernel_vsyscall+12>: ret
0xf7fd055d: nop
[------------------------------------stack-------------------------------------]
0000| 0xffffbfc0 --> 0xffffbfdc --> 0x0
0004| 0xffffbfc4 --> 0x0
0008| 0xffffbfc8 --> 0xffffbfdc --> 0x0
0012| 0xffffbfcc --> 0xf7a48e02 (<__GI_raise+194>: mov eax,DWORD PTR [esp+0x10c])
0016| 0xffffbfd0 --> 0xf7fe26e9 (<_dl_fixup+9>: add ebx,0x1a917)
0020| 0xffffbfd4 --> 0xf7fc8000 --> 0x78e18
0024| 0xffffbfd8 --> 0xf7fad5a8 (", back trace")
0028| 0xffffbfdc --> 0x0
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGABRT
0xf7fd0559 in __kernel_vsyscall ()
gdb-peda$ bt
#0 0xf7fd0559 in __kernel_vsyscall ()
#1 0xf7a48e02 in __libc_signal_restore_set (set=0xffffbfdc) at ../sysdeps/unix/sysv/linux/internal-signals.h:86
#2 __GI_raise (sig=0x6) at ../sysdeps/unix/sysv/linux/raise.c:48
#3 0xf7a31306 in __GI_abort () at abort.c:79
#4 0xf7f76792 in isc_assertion_failed (file=0xf7fb2917 "jemalloc_shim.h", line=0x44, type=isc_assertiontype_insist,
cond=0xf7fb2904 "ptr != ((void *)0)") at assertions.c:49
#5 0xf7f9122a in mallocx (size=0x8000002c, flags=0x0) at jemalloc_shim.h:68
#6 0xf7f9138b in mem_get (ctx=0x56583b60, size=0x8000002c, flags=0x0) at mem.c:305
#7 0xf7f923d0 in isc__mem_get (ctx=0x56583b60, size=0x8000002c, flags=0x0) at mem.c:744
#8 0xf7f880f5 in isc_buffer_allocate (mctx=0x56583b60, dbufp=0xffffc334, length=0x80000000) at ./include/isc/buffer.h:1085
#9 0xf7f88fb8 in pushandgrow (lex=0x5658bf30, source=0x565832e0, c=0x41) at lex.c:327
#10 0xf7f89334 in isc_lex_gettoken (lex=0x5658bf30, options=0x137, tokenp=0xffffcd44) at lex.c:451
#11 0xf7c771c9 in gettoken (lex=0x5658bf30, options=0x137, token=0xffffcd44, eol=0x1, callbacks=0xffffce84) at master.c:341
#12 0xf7c78fb7 in load_text (lctx=0x5658b520) at master.c:1090
#13 0xf7c8066c in dns_master_loadfile (master_file=0xffffd549 "db.long", top=0x56582c60, origin=0x56582c60, zclass=0x1,
options=0x0, resign=0x0, callbacks=0xffffce84, include_cb=0x0, include_arg=0x0, mctx=0x56583b60, format=dns_masterformat_text,
maxttl=0x0) at master.c:2637
#14 0xf7c403e8 in dns_db_load (db=0x56582c50, filename=0xffffd549 "db.long", format=dns_masterformat_text, options=0x0) at db.c:316
#15 0x5656113c in loadzone (file=0xffffd549 "db.long", origin=0xffffd549 "db.long", rdclass=0x1, db=0x5656d5cc <gdb>)
at dnssec-signzone.c:2579
#16 0x565646a6 in main (argc=0x0, argv=0xffffd38c) at dnssec-signzone.c:3862
#17 0xf7a32e46 in __libc_start_main (main=0x565636fc <main>, argc=0x2, argv=0xffffd384, init=0x565669b0 <__libc_csu_init>,
fini=0x56566a10 <__libc_csu_fini>, rtld_fini=0xf7fe3230 <_dl_fini>, stack_end=0xffffd37c) at ../csu/libc-start.c:308
#18 0x56559e51 in _start ()
gdb-peda$
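The length=0x80000000 request in frame #8 is the kind of value that repeated doubling produces. As a rough illustration, the following sketch counts how many doublings it takes to get there. The helper name doublings_until() and the starting size of 1024 bytes used in the note below are hypothetical and not taken from the BIND sources.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many doublings a buffer of `initial` bytes needs before it
 * is at least `target` bytes long. The simulation uses 64-bit arithmetic
 * and restricts `target` to 2^32, so it cannot overflow itself.
 * Hypothetical helper for illustration only; not part of lib/isc/lex.c.
 */
static unsigned int
doublings_until(uint64_t initial, uint64_t target) {
	unsigned int steps = 0;
	uint64_t len = initial;

	assert(initial > 0);
	assert(target <= (UINT64_C(1) << 32));

	while (len < target) {
		len *= 2;
		steps++;
	}
	return steps;
}
```

Under the assumed starting size of 1024 bytes, doublings_until(1024, 0x80000000) yields 21, so only a few dozen reallocations separate a small buffer from a 2 GB request.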
As a proof of concept, a zone file triggering this behavior can be created using the command:
perl -e 'print "AA"x0x80000000' > db.long
The tool bin/dnssec/dnssec-signzone can then be used to trigger the crash. It shares the same code path with named, and attackers could try to provide malicious zone files to it.
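On 64-bit architectures, the parsing code will enter an endless loop due to unsigned integer truncation. Note that oldlen in pushandgrow() is an unsigned int, which is 32 bits wide on common 64-bit platforms, so the expression oldlen * 2 wraps around to 0 once the buffer has reached 0x80000000 bytes.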
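The wraparound of a 32-bit size calculation can be reproduced in isolation. The following is a minimal sketch, assuming a 32-bit unsigned int as found on common 64-bit platforms. The helper names are hypothetical and the code is not taken from lib/isc/lex.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Mirrors a size computation of the form `oldlen * 2` carried out in a
 * 32-bit unsigned type. Unsigned arithmetic wraps modulo 2^32.
 */
static uint32_t
doubled_len32(uint32_t oldlen) {
	return oldlen * 2u;
}

/*
 * Returns true if doubling actually yields a larger buffer size.
 * Once this returns false, a grow step can no longer make progress.
 */
static bool
doubling_makes_progress(uint32_t oldlen) {
	assert(oldlen > 0); /* zero-length buffers are outside this sketch */
	return doubled_len32(oldlen) > oldlen;
}
```

For oldlen = 0x40000000 the doubled size is still 0x80000000, but for oldlen = 0x80000000 the result wraps to 0 and doubling_makes_progress() returns false.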
Arithmetic overflows caused by large amounts of data are notoriously hard to find using general-purpose dynamic analysis methods such as fuzzing.
Solution Advice
We recommend limiting the maximum token size to a sane amount. Furthermore, we recommend performing all allocation size calculations in the lexer through the new isc_mem_callocate() function, which can detect arithmetic overflows. Besides the example shown above, other parts of the lexer code could be prone to similar bugs, such as the following function:
static isc_result_t
grow_data(isc_lex_t *lex, size_t *remainingp, char **currp, char **prevp) {
	char *tmp;

	tmp = isc_mem_get(lex->mctx, lex->max_token * 2 + 1);
	memmove(tmp, lex->data, lex->max_token + 1);
	*currp = tmp + (*currp - lex->data);
	if (*prevp != NULL) {
		*prevp = tmp + (*prevp - lex->data);
	}
	isc_mem_put(lex->mctx, lex->data, lex->max_token + 1);
	lex->data = tmp;
	*remainingp += lex->max_token;
	lex->max_token *= 2;
	return (ISC_R_SUCCESS);
}
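One way to combine both recommendations is to compute the next buffer size in a single helper that rejects overflowing or oversized requests before any allocation takes place. The following is a minimal sketch and not the upstream fix. The name next_token_size() and the 1 MiB limit SKETCH_MAX_TOKEN are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound for a single token; not a BIND constant. */
#define SKETCH_MAX_TOKEN ((size_t)1 << 20) /* 1 MiB */

/*
 * Compute the next size for a token buffer that currently holds `cur`
 * bytes. Returns false and leaves *nextp untouched if `cur` is zero,
 * if doubling would overflow size_t, or if the doubled size would
 * exceed SKETCH_MAX_TOKEN.
 */
static bool
next_token_size(size_t cur, size_t *nextp) {
	size_t next;

	assert(nextp != NULL);

	if (cur == 0 || cur > SIZE_MAX / 2) {
		return false; /* doubling would not grow or would overflow */
	}
	next = cur * 2;
	if (next > SKETCH_MAX_TOKEN) {
		return false; /* token exceeds the configured limit */
	}
	*nextp = next;
	return true;
}
```

A caller such as grow_data() or pushandgrow() could then report a parse error for the oversized token when the helper returns false, instead of requesting the allocation.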