Memory context reference leak intermittently triggered by the "stress" system test
The stress system test (not the performance test that is only run in scheduled pipelines!) fails intermittently due to the reference count for one of the memory contexts not being brought down to zero upon shutdown.
Example: https://gitlab.isc.org/isc-projects/bind9/-/jobs/3074540
bin/tests/system/stress/ns3/named.run:
...
17-Jan-2023 01:18:03.406 calling free_rbtdb(zone000004.example)
17-Jan-2023 01:18:03.406 done free_rbtdb(zone000001.example)
17-Jan-2023 01:18:03.410 adjust_quantum: old=200, new=275
17-Jan-2023 01:18:03.410 adjust_quantum: old=400, new=550
17-Jan-2023 01:18:03.410 done free_rbtdb(zone000000.example)
17-Jan-2023 01:18:03.410 done free_rbtdb(zone000004.example)
17-Jan-2023 01:18:03.422 Unregistering DLZ_dlopen driver
17-Jan-2023 01:18:03.422 Unregistering SDLZ driver.
17-Jan-2023 01:18:03.422 Unregistering DLZ driver.
17-Jan-2023 01:18:03.422 exiting
Dump of all outstanding memory allocations:
None.
mem.c:675: REQUIRE(isc_refcount_current(&ctx->references) == 0) failed
Backtrace:
(lldb) bt
* thread #1, name = 'named', stop reason = signal SIGABRT
* frame #0: 0x00007f4e4f2c7ce1 libc.so.6`__GI_raise(sig=<unavailable>) at raise.c:51:1
frame #1: 0x00007f4e4f2b1537 libc.so.6`__GI_abort at abort.c:79:7
frame #2: 0x0000560aa520ee31 named`assertion_failed(file=<unavailable>, line=<unavailable>, type=<unavailable>, cond=<unavailable>) at main.c:236:3
frame #3: 0x00007f4e5161983a libisc-9.18.12-dev.so`isc_assertion_failed(file=<unavailable>, line=675, type=isc_assertiontype_require, cond=<unavailable>) at assertions.c:48:2
frame #4: 0x00007f4e5168ae2d libisc-9.18.12-dev.so`isc__mem_destroy(ctxp=0x0000560aa5d3c200, file=<unavailable>, line=1614) at mem.c:675:2
frame #5: 0x0000560aa520df76 named`main(argc=<unavailable>, argv=0x00007ffc6f3a9ec8) at main.c:1614:2
frame #6: 0x00007f4e4f2b2d0a libc.so.6`__libc_start_main(main=(named`main at main.c:1480), argc=16, argv=0x00007ffc6f3a9ec8, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffc6f3a9eb8) at libc-start.c:308:16
frame #7: 0x0000560aa511ba5a named`_start + 42
The context in question is the main one. It has one outstanding reference.
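To illustrate how the log above can show no outstanding allocations and still trip the reference count check, here is a minimal toy model in C. It is not BIND code: toy_ctx_t and all the toy_* and demo_* names are hypothetical, and the assumption that shutdown drops the creator's reference and then expects the count to be zero is mine, inferred only from the failed REQUIRE and the "None." dump.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory context; NOT BIND's isc_mem_t. */
typedef struct {
	atomic_uint_fast32_t references; /* number of holders of the context */
	size_t inuse;                    /* bytes currently allocated from it */
} toy_ctx_t;

static void
toy_init(toy_ctx_t *ctx) {
	atomic_init(&ctx->references, 1); /* the creator holds the first reference */
	ctx->inuse = 0;
}

static void
toy_attach(toy_ctx_t *ctx) {
	atomic_fetch_add(&ctx->references, 1);
}

static void
toy_detach(toy_ctx_t *ctx) {
	uint_fast32_t prev = atomic_fetch_sub(&ctx->references, 1);
	assert(prev > 0); /* detaching more often than attaching is a bug */
}

/*
 * Assumed shutdown behaviour: drop the creator's reference and report
 * what is left. A real implementation would assert that this is zero.
 */
static unsigned
toy_shutdown(toy_ctx_t *ctx) {
	toy_detach(ctx);
	return (unsigned)atomic_load(&ctx->references);
}

/* A subsystem attaches, frees all of its memory, but never detaches. */
static unsigned
demo_leak(void) {
	toy_ctx_t ctx;
	toy_init(&ctx);
	toy_attach(&ctx);      /* subsystem takes a reference */
	ctx.inuse += 64;       /* it allocates ... */
	ctx.inuse -= 64;       /* ... and frees everything again */
	assert(ctx.inuse == 0); /* so an allocation dump would be empty */
	/* missing toy_detach(&ctx) here: this is the leak */
	return toy_shutdown(&ctx);
}

/* Same sequence with the matching detach in place. */
static unsigned
demo_clean(void) {
	toy_ctx_t ctx;
	toy_init(&ctx);
	toy_attach(&ctx);
	toy_detach(&ctx);
	return toy_shutdown(&ctx);
}
```

In this model demo_leak() leaves one reference behind while demo_clean() leaves none, which is the same shape as the failure above: every allocation has been returned, but some holder of the main context never detached (or had not yet detached when shutdown ran).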
I can only confirm this with 100% certainty for v9.18 right now, but we should definitely keep our minds open wrt other branches being affected ;-)