Assertion failures in purge_old_interfaces() during shutdown
During pre-release testing of BIND 9.16.27-S1, the rpzextra system test in a single CI job triggered not one, but two crashes at shutdown. The purge_old_interfaces() function is present in the stack traces for both of these crashes:
- bin/tests/system/rpzextra/ns1/core.1251
    #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
    #1 0x00007fcccde7e42a in __GI_abort () at abort.c:89
    #2 0x000056407fce23d5 in assertion_failed (file=<optimized out>, line=<optimized out>, type=<optimized out>, cond=<optimized out>) at ./main.c:274
    #3 0x00007fccd058592a in isc_assertion_failed (file=file@entry=0x7fccd12bc524 "interfacemgr.c", line=line@entry=652, type=type@entry=isc_assertiontype_insist, cond=cond@entry=0x7fccd12bbee0 "(__builtin_expect(!!((ifp) != ((void *)0)), 1) && __builtin_expect(!!(((const isc__magic_t *)(ifp))->magic == ((('I') << 24 | (':') << 16 | ('-') << 8 | (')')))), 1))") at assertions.c:48
    #4 0x00007fccd12909c7 in purge_old_interfaces (mgr=mgr@entry=0x7fccb8000a00) at interfacemgr.c:652
    #5 0x00007fccd12921f8 in ns_interfacemgr_scan0 (verbose=<optimized out>, mgr=0x7fccb8000a00) at interfacemgr.c:1116
    #6 ns_interfacemgr_scan (mgr=0x7fccb8000a00, verbose=verbose@entry=false) at interfacemgr.c:1152
    #7 0x00007fccd1292319 in route_event (task=<optimized out>, event=<optimized out>) at interfacemgr.c:153
    #8 0x00007fccd05b829d in task_run (task=0x564081eeb210) at task.c:851
    #9 isc_task_run (task=0x564081eeb210) at task.c:944
    #10 0x00007fccd059be15 in isc__nm_async_task (ev0=ev0@entry=0x7fcc6c0008d0, worker=0x5640815a4540) at netmgr.c:873
    #11 0x00007fccd05a1948 in process_netievent (worker=worker@entry=0x5640815a4540, ievent=0x7fcc6c0008d0) at netmgr.c:952
    #12 0x00007fccd05a1b8e in process_queue (worker=worker@entry=0x5640815a4540, type=type@entry=NETIEVENT_TASK) at netmgr.c:1021
    #13 0x00007fccd05a2334 in process_all_queues (worker=0x5640815a4540) at netmgr.c:792
    #14 async_cb (handle=0x5640815a48a0) at netmgr.c:821
    #15 0x00007fcccf13bf83 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
    #16 0x00007fcccf13c066 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
    #17 0x00007fcccf14aee8 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
    #18 0x00007fcccf13c924 in uv_run () from /usr/lib/x86_64-linux-gnu/libuv.so.1
    #19 0x00007fccd05a1c2c in nm_thread (worker0=0x5640815a4540) at netmgr.c:727
    #20 0x00007fccd05bae66 in isc__trampoline_run (arg=0x564081633aa0) at trampoline.c:198
    #21 0x00007fccced144a4 in start_thread (arg=0x7fcccb0e6700) at pthread_create.c:456
    #22 0x00007fcccdf32d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
- bin/tests/system/rpzextra/ns2/core.2254
    #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
    #1 0x00007fca9280c42a in __GI_abort () at abort.c:89
    #2 0x000055b9d91d53d5 in assertion_failed (file=<optimized out>, line=<optimized out>, type=<optimized out>, cond=<optimized out>) at ./main.c:274
    #3 0x00007fca94f1392a in isc_assertion_failed (file=file@entry=0x7fca95c4a524 "interfacemgr.c", line=line@entry=652, type=type@entry=isc_assertiontype_insist, cond=cond@entry=0x7fca95c49ee0 "(__builtin_expect(!!((ifp) != ((void *)0)), 1) && __builtin_expect(!!(((const isc__magic_t *)(ifp))->magic == ((('I') << 24 | (':') << 16 | ('-') << 8 | (')')))), 1))") at assertions.c:48
    #4 0x00007fca95c1e9c7 in purge_old_interfaces (mgr=mgr@entry=0x7fca7c000a00) at interfacemgr.c:652
    #5 0x00007fca95c201f8 in ns_interfacemgr_scan0 (verbose=<optimized out>, mgr=0x7fca7c000a00) at interfacemgr.c:1116
    #6 ns_interfacemgr_scan (mgr=0x7fca7c000a00, verbose=verbose@entry=false) at interfacemgr.c:1152
    #7 0x00007fca95c20319 in route_event (task=<optimized out>, event=<optimized out>) at interfacemgr.c:153
    #8 0x00007fca94f4629d in task_run (task=0x55b9dbc74c40) at task.c:851
    #9 isc_task_run (task=0x55b9dbc74c40) at task.c:944
    #10 0x00007fca94f29e15 in isc__nm_async_task (ev0=ev0@entry=0x7fca500008d0, worker=0x55b9db32f540) at netmgr.c:873
    #11 0x00007fca94f2f948 in process_netievent (worker=worker@entry=0x55b9db32f540, ievent=0x7fca500008d0) at netmgr.c:952
    #12 0x00007fca94f2fb8e in process_queue (worker=worker@entry=0x55b9db32f540, type=type@entry=NETIEVENT_TASK) at netmgr.c:1021
    #13 0x00007fca94f30334 in process_all_queues (worker=0x55b9db32f540) at netmgr.c:792
    #14 async_cb (handle=0x55b9db32f8a0) at netmgr.c:821
    #15 0x00007fca93ac9f83 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
    #16 0x00007fca93aca066 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
    #17 0x00007fca93ad8ee8 in ?? () from /usr/lib/x86_64-linux-gnu/libuv.so.1
    #18 0x00007fca93aca924 in uv_run () from /usr/lib/x86_64-linux-gnu/libuv.so.1
    #19 0x00007fca94f2fc2c in nm_thread (worker0=0x55b9db32f540) at netmgr.c:727
    #20 0x00007fca94f48e66 in isc__trampoline_run (arg=0x55b9db3bd650) at trampoline.c:198
    #21 0x00007fca936a24a4 in start_thread (arg=0x7fca8fa74700) at pthread_create.c:456
    #22 0x00007fca928c0d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
In both cases, the same assertion fails:
644     /*%
645      * Remove any interfaces whose generation number is not the current one.
646      */
647     static void
648     purge_old_interfaces(ns_interfacemgr_t *mgr) {
649         ns_interface_t *ifp, *next;
650         LOCK(&mgr->lock);
651         for (ifp = ISC_LIST_HEAD(mgr->interfaces); ifp != NULL; ifp = next) {
652 >>>         INSIST(NS_INTERFACE_VALID(ifp));
653             next = ISC_LIST_NEXT(ifp, link);
654             if (ifp->generation != mgr->generation) {
655                 char sabuf[256];
656                 ISC_LIST_UNLINK(ifp->mgr->interfaces, ifp, link);
657                 isc_sockaddr_format(&ifp->addr, sabuf, sizeof(sabuf));
658                 isc_log_write(IFMGR_COMMON_LOGARGS, ISC_LOG_INFO,
659                               "no longer listening on %s", sabuf);
660                 ns_interface_shutdown(ifp);
661                 ns_interface_detach(&ifp);
662             }
663         }
664         UNLOCK(&mgr->lock);
665     }
In both cases, ifp->magic is 0, but ifp->references varies: it is set to 0 in one case and to 1 in the other.
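For readers unfamiliar with the pattern, the condition quoted in frame #3 is a magic-number validity check: the pointer must be non-NULL and its magic field must equal the four characters 'I', ':', '-', ')' packed into 32 bits. A magic of 0 is consistent with a structure that has already been invalidated. The following is a minimal sketch of that pattern only; every demo_* name is hypothetical and none of this is BIND code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into one 32-bit value, as in the assertion text. */
#define DEMO_MAGIC(a, b, c, d)                                  \
    ((uint32_t)(a) << 24 | (uint32_t)(b) << 16 |                \
     (uint32_t)(c) << 8 | (uint32_t)(d))
#define DEMO_IFACE_MAGIC DEMO_MAGIC('I', ':', '-', ')')

/* Hypothetical stand-in for ns_interface_t (only two fields modelled). */
typedef struct demo_iface {
    uint32_t magic;
    int references;
} demo_iface_t;

/* Non-NULL and carrying the expected magic value. */
static bool
demo_iface_valid(const demo_iface_t *ifp) {
    return (ifp != NULL && ifp->magic == DEMO_IFACE_MAGIC);
}

/* Mark the structure as a live interface. */
static void
demo_iface_init(demo_iface_t *ifp) {
    assert(ifp != NULL);
    ifp->magic = DEMO_IFACE_MAGIC;
    ifp->references = 1;
}

/* Clear the magic so later validity checks fail. */
static void
demo_iface_invalidate(demo_iface_t *ifp) {
    assert(ifp != NULL);
    ifp->magic = 0;
}
```

Once demo_iface_invalidate() has run, demo_iface_valid() returns false regardless of the reference count, which matches the observation that the two core dumps differ in ifp->references but agree on ifp->magic.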
A quick look at the relevant code suggests that these failures could be caused by various imperfections in lib/ns/interfacemgr.c. Specifically, purge_old_interfaces() clearly expects that any ns_interface_t present on the mgr->interfaces linked list is a valid interface at all times; meanwhile, other parts of the code violate that assumption:
- When a failure occurs in ns_interface_create() (e.g. when ns_clientmgr_create() returns ISC_R_SHUTTINGDOWN), the error handling section does not remove ifp from mgr->interfaces (it also does not detach from ifp->mgr).
- If the reference count for ifp drops to zero, ns_interface_destroy() does not remove ifp from mgr->interfaces.
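The invariant at stake in the two points above can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not a proposed patch: the demo_* types and functions are hypothetical, the list is a plain doubly linked list rather than ISC_LIST, there is no locking, and nothing is actually freed (so the demo itself never touches released memory).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_IFACE_MAGIC 0x493a2d29u /* 'I', ':', '-', ')' */

typedef struct demo_mgr demo_mgr_t;
typedef struct demo_iface demo_iface_t;

/* Hypothetical stand-ins for ns_interface_t and ns_interfacemgr_t. */
struct demo_iface {
    uint32_t magic;
    demo_mgr_t *mgr;
    demo_iface_t *prev, *next;
    bool linked;
};

struct demo_mgr {
    demo_iface_t *head;
};

/* Initialize ifp and put it at the head of the manager's list. */
static void
demo_link(demo_mgr_t *mgr, demo_iface_t *ifp) {
    assert(mgr != NULL && ifp != NULL);
    ifp->magic = DEMO_IFACE_MAGIC;
    ifp->mgr = mgr;
    ifp->prev = NULL;
    ifp->next = mgr->head;
    if (mgr->head != NULL) {
        mgr->head->prev = ifp;
    }
    mgr->head = ifp;
    ifp->linked = true;
}

/* Remove ifp from its manager's list if it is still on it. */
static void
demo_unlink(demo_iface_t *ifp) {
    if (!ifp->linked) {
        return;
    }
    if (ifp->prev != NULL) {
        ifp->prev->next = ifp->next;
    } else {
        ifp->mgr->head = ifp->next;
    }
    if (ifp->next != NULL) {
        ifp->next->prev = ifp->prev;
    }
    ifp->prev = ifp->next = NULL;
    ifp->linked = false;
}

/* Models the reported behaviour: invalidated but left on the list. */
static void
demo_destroy_stale(demo_iface_t *ifp) {
    ifp->magic = 0;
}

/* One possible remedy: always unlink before invalidating. */
static void
demo_destroy_unlinked(demo_iface_t *ifp) {
    demo_unlink(ifp);
    ifp->magic = 0;
}

/* Count list entries that a validity assertion would reject. */
static size_t
demo_count_invalid(const demo_mgr_t *mgr) {
    size_t n = 0;
    for (const demo_iface_t *ifp = mgr->head; ifp != NULL; ifp = ifp->next) {
        if (ifp->magic != DEMO_IFACE_MAGIC) {
            n++;
        }
    }
    return n;
}
```

With demo_destroy_stale(), a later walk of the list finds an entry with a zero magic, which is where an INSIST-style check would abort; with demo_destroy_unlinked(), the walk only ever sees valid entries.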
I have not reproduced the exact same crashes locally, but it seems to me that when a route socket event is processed after one of the above code paths is taken, nothing good can happen...
As a side note, I would expect the relevant ns_interface_t structures in the core dumps to be memset() to 0xde, but I believe that recent BIND 9.16.x releases (since !5637 (merged)) no longer use the internal memory allocator by default and therefore ISC_MEMFLAG_FILL is not set.
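To illustrate what a fill-on-free debugging aid would have shown in the core dumps, here is a minimal sketch, assuming the 0xde pattern mentioned above. The demo_* helpers are hypothetical and are not the isc_mem API; note also that an optimizing compiler may elide a memset() that immediately precedes free(), so a real implementation needs more care than this.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_FREE_FILL 0xde /* pattern written over memory being released */

/* Overwrite a buffer with the fill pattern. */
static void
demo_fill(void *ptr, size_t size) {
    assert(ptr != NULL);
    memset(ptr, DEMO_FREE_FILL, size);
}

/* Hypothetical release helper: fill first, then hand back to libc. */
static void
demo_put(void *ptr, size_t size) {
    demo_fill(ptr, size);
    free(ptr);
}
```

With such a fill in effect, a stale structure would typically show a magic of 0xdededede rather than 0, which makes a use-after-free easier to recognize in a debugger.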
I do not think these flaws are exploitable in any way.