"INSIST(s >= size);" assertion failed in mem_putstats()
The following crash happened when the tcp test was run on an Alpine 3.16 Docker image:
D:tcp:Core was generated by `/builds/isc-projects/bind9/bin/named/.libs/lt-named -D tcp-ns1 -X named.lock -m'.
D:tcp:Program terminated with signal SIGABRT, Aborted.
D:tcp:#0 __restore_sigs (set=set@entry=0x7f6d912bb080) at ./arch/x86_64/syscall_arch.h:40
D:tcp:[Current thread is 1 (LWP 26680)]
D:tcp:#0 __restore_sigs (set=set@entry=0x7f6d912bb080) at ./arch/x86_64/syscall_arch.h:40
D:tcp:#1 0x00007f6d93d32561 in raise (sig=sig@entry=6) at src/signal/raise.c:11
D:tcp:#2 0x00007f6d93d08f49 in abort () at src/exit/abort.c:11
D:tcp:#3 0x0000557481282cdc in assertion_failed (file=<optimized out>, line=<optimized out>, type=<optimized out>, cond=0x7f6d937ce5f5 "s >= size") at main.c:237
D:tcp:#4 0x00007f6d9378f8d8 in isc_assertion_failed (file=file@entry=0x7f6d937ce5e7 "mem.c", line=line@entry=422, type=type@entry=isc_assertiontype_insist, cond=cond@entry=0x7f6d937ce5f5 "s >= size") at assertions.c:49
D:tcp:#5 0x00007f6d937a0b66 in mem_putstats (ctx=ctx@entry=0x7f6d8fc12000, ptr=ptr@entry=0x7f6c55248a80, size=size@entry=65535) at mem.c:422
D:tcp:#6 0x00007f6d937a1924 in isc__mem_put (ctx=0x7f6d8fc12000, ptr=0x7f6c55248a80, size=size@entry=65535, alignment=alignment@entry=0, file=file@entry=0x7f6d9351c014 "client.c", line=line@entry=1631) at mem.c:778
D:tcp:#7 0x00007f6d934edbdf in ns__client_reset_cb (client0=0x7f6c55184c00) at client.c:1631
D:tcp:#8 0x00007f6d937795b9 in nmhandle_detach_cb (handlep=handlep@entry=0x7f6c6b524db8) at netmgr/netmgr.c:1802
D:tcp:#9 0x00007f6d9377be57 in isc__nm_async_detach (ev0=0x7f6c6b524d80, worker=0x7f6d9244bf80) at netmgr/netmgr.c:2829
D:tcp:#10 process_netievent (worker=worker@entry=0x7f6d9244bf80, ievent=0x7f6c6b524d80) at netmgr/netmgr.c:947
D:tcp:#11 0x00007f6d9377c13e in process_queue (worker=worker@entry=0x7f6d9244bf80, type=type@entry=NETIEVENT_PRIORITY) at netmgr/netmgr.c:979
D:tcp:#12 0x00007f6d9377ced2 in process_all_queues (worker=0x7f6d9244bf80) at netmgr/netmgr.c:749
D:tcp:#13 async_cb (handle=0x7f6d9244c2e0) at netmgr/netmgr.c:778
D:tcp:#14 0x00007f6d93070b59 in ?? () from /usr/lib/libuv.so.1
D:tcp:#15 0x00007f6d93080834 in uv__io_poll () from /usr/lib/libuv.so.1
D:tcp:#16 0x00007f6d9307113b in uv_run () from /usr/lib/libuv.so.1
D:tcp:#17 0x00007f6d9377c509 in nm_thread (worker0=0x7f6d9244bf80) at netmgr/netmgr.c:687
D:tcp:#18 0x00007f6d937b97c0 in isc__trampoline_run (arg=0x7f6d91fd9ca0) at trampoline.c:189
D:tcp:#19 0x00007f6d93d3f1f5 in start (p=0x7f6d912bef48) at src/thread/pthread_create.c:203
D:tcp:#20 0x00007f6d93d41470 in __clone () at src/thread/x86_64/clone.s:22
This corresponds to the following code location:
411 /*!
412  * Update internal counters after a memory put.
413  */
414 static void
415 mem_putstats(isc_mem_t *ctx, void *ptr, size_t size) {
416     struct stats *stats = stats_bucket(ctx, size);
417     uint_fast32_t s, g;
418
419     UNUSED(ptr);
420
421     s = atomic_fetch_sub_release(&ctx->inuse, size);
422 >>> INSIST(s >= size);
423
424     g = atomic_fetch_sub_release(&stats->gets, 1);
425     INSIST(g >= 1);
426
427     decrement_malloced(ctx, size);
428 }
Failing this assertion seemingly means that there is less tracked memory in use than the size of the allocation that just got released. However, knowing that the particular tcp check that triggers this crash intentionally allocates a lot of memory, I believe this is a case of integer truncation:
$ cat test.c
#include <stdio.h>
#include <stdatomic.h>

int main(void) {
    atomic_size_t inuse = 1L << 33;
    size_t size = 1L << 32;
    atomic_uint_fast32_t s;

    printf("(before) inuse=%lu\n", atomic_load(&inuse));
    printf(" size=%lu\n", size);
    s = atomic_fetch_sub(&inuse, size);
    printf("(after) inuse=%lu\n", atomic_load(&inuse));
    printf(" s=%u\n", s);
    return 0;
}
$ gcc -o test -Wall -Wextra -pedantic test.c
$ ./test
(before) inuse=8589934592
size=4294967296
(after) inuse=4294967296
s=0
Since I do not recall seeing this crash before, I assume that ctx->inuse in this particular job must have hit a particularly small value in its lower 32 bits.
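To make the "small value in its lower 32 bits" condition concrete, here is a minimal sketch of when the truncated check fails. It assumes a 64-bit size_t and that uint_fast32_t is 32 bits wide on the affected platform (the s=0 output above is consistent with that). truncated_check_passes() is a hypothetical helper for illustration, not a BIND 9 function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of BIND 9): mimics the comparison made
 * in mem_putstats() when the previous value of ctx->inuse is stored in
 * a 32-bit variable. Assumes size_t is 64 bits wide.
 */
static bool
truncated_check_passes(size_t inuse_before, size_t size) {
	/* Only the low 32 bits of the previous value survive. */
	uint32_t s = (uint32_t)inuse_before;

	return s >= size;
}
```

With size=65535 as in the backtrace, truncated_check_passes(((size_t)1 << 32) + 100, 65535) returns false even though more than 4 GiB is tracked as in use, while truncated_check_passes(((size_t)1 << 32) + 65535, 65535) returns true. Under these assumptions the check only fails while the low 32 bits of ctx->inuse are below the size being released, which would explain why the crash is rare.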
I believe this issue has only just been introduced, by !5518 (merged).
AFAICT, all it takes to fix this is to make s in mem_putstats() an atomic_size_t.
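A rough sketch of the widened variable, using plain C11 atomics instead of BIND 9's own wrappers. put_keeps_invariant() is a hypothetical stand-in for the ctx->inuse bookkeeping, and it returns the result of the comparison instead of calling INSIST(). Since s is only a local copy of the previous value, the sketch assumes that a plain size_t is wide enough; atomic_size_t would have the same width.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the ctx->inuse update in mem_putstats();
 * the real function uses BIND 9's macros and INSIST().
 */
static bool
put_keeps_invariant(atomic_size_t *inuse, size_t size) {
	/* The previous value is kept at full width, so no bits are lost. */
	size_t s = atomic_fetch_sub_explicit(inuse, size,
					     memory_order_release);

	return s >= size;
}
```

Starting from an inuse value of (size_t)1 << 33, whose lower 32 bits are all zero, releasing 65535 bytes through this function keeps the invariant, whereas the 32-bit variant would have tripped the assertion.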
I do not think exploiting this would be easy to control, but I am making this issue confidential for now since a crash is involved.