Hang on shutdown in the "tcp" system test
The "tcp" system test fails fairly often because it puts quite a bit of
strain on the test host, so this failure is not necessarily a hang,
but the backtrace looks consistent with one:
D:tcp:#0 0x00007f1ce8f765be in pthread_barrier_wait () from /lib64/libpthread.so.0
D:tcp:[Current thread is 1 (Thread 0x7f1cecac22c0 (LWP 22507))]
D:tcp:#0 0x00007f1ce8f765be in pthread_barrier_wait () from /lib64/libpthread.so.0
D:tcp:#1 0x00007f1cea0aca7d in uv_barrier_wait () from /lib64/libuv.so.1
D:tcp:#2 0x00007f1cec14c414 in isc__nm_async_tcpdnsstop (worker=worker@entry=0x7f1ce5276000, ev0=ev0@entry=0x7f1ce4b7d800) at netmgr/tcpdns.c:670
D:tcp:#3 0x00007f1cec146a92 in process_netievent (arg=0x7f1ce4b7d800) at netmgr/netmgr.c:463
D:tcp:#4 0x00007f1cec146db3 in isc__nm_process_ievent (worker=<optimized out>, event=<optimized out>) at netmgr/netmgr.c:567
D:tcp:#5 0x00007f1cec14b585 in stop_tcpdns_child (sock=sock@entry=0x7f1ce53eac00, tid=tid@entry=0) at netmgr/tcpdns.c:605
D:tcp:#6 0x00007f1cec14bed4 in isc__nm_tcpdns_stoplistening (sock=0x7f1ce53eac00) at netmgr/tcpdns.c:632
D:tcp:#7 0x00007f1cec143356 in isc_nm_stoplistening (sock=<optimized out>) at netmgr/netmgr.c:2091
D:tcp:#8 0x00007f1cebae0559 in ns_interface_shutdown (ifp=ifp@entry=0x7f1ce528a500) at interfacemgr.c:742
D:tcp:#9 0x00007f1cebae08d9 in purge_old_interfaces (mgr=mgr@entry=0x7f1ce53d3460) at interfacemgr.c:828
D:tcp:#10 0x00007f1cebae0b8b in ns_interfacemgr_shutdown (mgr=0x7f1ce53d3460) at interfacemgr.c:447
D:tcp:#11 0x000000000044288a in shutdown_server (task=<optimized out>, event=<optimized out>) at server.c:10124
D:tcp:#12 0x00007f1cec178944 in task_run (task=<optimized out>, task@entry=0x7f1ce5225e80) at task.c:470
D:tcp:#13 0x00007f1cec178a85 in task__run (arg=0x7f1ce5225e80) at task.c:287
D:tcp:#14 0x00007f1cec160bf4 in isc__job_cb (idle=0x7f1ce52236c8) at job.c:75
D:tcp:#15 0x00007f1cea0a6a49 in uv__run_idle () from /lib64/libuv.so.1
D:tcp:#16 0x00007f1cea09fbf8 in uv_run () from /lib64/libuv.so.1
D:tcp:#17 0x00007f1cec166d31 in loop_run (loop=0x7f1ce52a1e40) at loop.c:270
D:tcp:#18 loop_thread (arg=0x7f1ce52a1e40) at loop.c:297
D:tcp:#19 0x00007f1cec167dd3 in isc_loopmgr_run (loopmgr=0x7f1ce5223540) at loop.c:477
D:tcp:#20 0x00000000004238e0 in main (argc=<optimized out>, argv=<optimized out>) at main.c:1545
I do not recall seeing this particular backtrace before, so I figured it makes sense to at least retain the job artifacts and catalogue this problem as a GitLab issue.
Perhaps a total red herring, but isc__nm_async_tcpdnsstop() is also
present in the backtrace for one of the threads in a failed unit test
job reported elsewhere, which also happened on Oracle Linux 8.