named won't shut down in RPZ stress test or respdiff test
RPZ stress tests on Fedora 36 amd64 (and sometimes arm64, but never FreeBSD 12 on arm64) on main fail because of a stuck named which won't shut down. In the mnowak/main-stress-test-port-process-termination branch, ABRT is sent when named does not shut down after a timeout (1 and 10 minutes were tested). This problem is not visible in the production CI yet, but will be when isc-private/bind-qa!43 is merged.
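For readers unfamiliar with the approach, the "send ABRT after a timeout" step could look roughly like the sketch below. This is not the code from the branch mentioned above. The helper name `stop_or_abort` is hypothetical, and it assumes the test harness holds the process as a Python `subprocess.Popen` object and requests shutdown with SIGTERM. The point of SIGABRT is that its default action terminates the process with a core dump, which can then be inspected in gdb.

```python
import signal
import subprocess


def stop_or_abort(proc: subprocess.Popen, timeout: float) -> bool:
    """Ask a process to shut down; abort it if it does not exit in time.

    Hypothetical helper, not taken from the BIND test suite.
    Returns True if the process exited within `timeout` seconds after
    SIGTERM, False if SIGABRT had to be sent.
    """
    # Request a graceful shutdown first.
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # The process is stuck: SIGABRT kills it and, if core dumps are
        # enabled on the system, leaves a core file for later analysis.
        proc.send_signal(signal.SIGABRT)
        proc.wait()
        return False
```

A caller would start the server with `subprocess.Popen(...)`, run the test, and then call `stop_or_abort(proc, timeout=60)`. A `False` return value marks the run as a failed shutdown.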
git bisect revealed that the problem starts with !6907 (merged), which is present in the upcoming 9.19.7 release.
Core was generated by `/builds/isc-projects/bind9/.local/usr/local/sbin/named -d 99 -f -c ./named.conf'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007faf418c678e in epoll_wait (epfd=5, events=0x7ffdde0b1a40, maxevents=1024, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30 return SYSCALL_CANCEL (epoll_wait, epfd, events, maxevents, timeout);
[Current thread is 1 (Thread 0x7faf40f63200 (LWP 18535))]
#0 0x00007faf418c678e in epoll_wait (epfd=5, events=0x7ffdde0b1a40, maxevents=1024, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1 0x00007faf41ca2220 in uv__io_poll (loop=0x7faf40ac4020, timeout=-1) at /usr/src/libuv-v1.44.1/src/unix/epoll.c:236
#2 0x00007faf41c86c2d in uv_run (loop=0x7faf40ac4020, mode=UV_RUN_DEFAULT) at /usr/src/libuv-v1.44.1/src/unix/core.c:391
#3 0x00007faf424b9c51 in loop_run (loop=0x7faf40ac4000) at loop.c:270
#4 loop_thread (arg=0x7faf40ac4000) at loop.c:297
#5 0x00007faf424bae41 in isc_loopmgr_run (loopmgr=0x7faf40a886c0) at loop.c:477
#6 0x00000000004240e9 in main (argc=<optimized out>, argv=<optimized out>) at main.c:1545
thread apply all bt full: backtrace.txt
Job artifacts have been preserved in https://gitlab.isc.org/isc-projects/bind9/-/jobs/2899157.