Hangs on shutdown with netmgr-based rndc (reference counting glitch?)
The dnstap system test fails about 50% of the time on FreeBSD 12.1:
S:dnstap:2020-07-29T15:32:53+0200
T:dnstap:1:A
A:dnstap:System test dnstap
I:dnstap:PORTS:5421,5422,5423,5424,5425,5426,5427,5428,5429,5430
I:dnstap:starting servers
I:dnstap:checking that named-checkconf detects error in bad-fstrm-reopen-interval.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-buffer-hint-max.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-buffer-hint-min.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-flush-timeout-max.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-flush-timeout-min.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-input-queue-size-max.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-input-queue-size-min.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-input-queue-size-po2.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-output-notify-threshold.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-output-queue-size-max.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-output-queue-size-min.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-reopen-interval-max.conf
I:dnstap:checking that named-checkconf detects error in bad-fstrm-set-reopen-interval-min.conf
I:dnstap:checking that named-checkconf detects error in bad-missing-dnstap-output-view.conf
I:dnstap:checking that named-checkconf detects error in bad-missing-dnstap-output.conf
I:dnstap:checking that named-checkconf detects error in bad-size-version.conf
I:dnstap:checking that named-checkconf detects no error in good-dnstap-in-options.conf
I:dnstap:checking that named-checkconf detects no error in good-dnstap-in-view.conf
I:dnstap:checking that named-checkconf detects no error in good-fstrm-reopen-interval.conf
I:dnstap:checking that named-checkconf detects no error in good-fstrm-set-buffer-hint.conf
I:dnstap:checking that named-checkconf detects no error in good-fstrm-set-flush-timeout.conf
I:dnstap:checking that named-checkconf detects no error in good-fstrm-set-input-queue-size.conf
I:dnstap:checking that named-checkconf detects no error in good-fstrm-set-output-notify-threshold.conf
I:dnstap:checking that named-checkconf detects no error in good-fstrm-set-output-queue-model-mpsc.conf
I:dnstap:checking that named-checkconf detects no error in good-fstrm-set-output-queue-model-spsc.conf
I:dnstap:checking that named-checkconf detects no error in good-fstrm-set-output-queue-size.conf
I:dnstap:checking that named-checkconf detects no error in good-fstrm-set-reopen-interval.conf
I:dnstap:checking that named-checkconf detects no error in good-size-unlimited.conf
I:dnstap:checking that named-checkconf detects no error in good-size-version.conf
I:dnstap:wait for servers to finish loading
I:dnstap:checking initial message counts
I:dnstap:checking UDP message counts
I:dnstap:checking TCP message counts
I:dnstap:checking AUTH_QUERY message counts
I:dnstap:checking AUTH_RESPONSE message counts
I:dnstap:checking CLIENT_QUERY message counts
I:dnstap:checking CLIENT_RESPONSE message counts
I:dnstap:checking RESOLVER_QUERY message counts
I:dnstap:checking RESOLVER_RESPONSE message counts
I:dnstap:checking UPDATE_QUERY message counts
I:dnstap:checking UPDATE_RESPONSE message counts
I:dnstap:checking reopened message counts
dnstap-read: dns_dt_openfile: failure
dnstap-read: dns_dt_openfile: failure
dnstap-read: dns_dt_openfile: failure
dnstap-read: dns_dt_openfile: failure
dnstap-read: dns_dt_openfile: failure
dnstap-read: dns_dt_openfile: failure
dnstap-read: dns_dt_openfile: failure
dnstap-read: dns_dt_openfile: failure
dnstap-read: dns_dt_openfile: failure
dnstap-read: dns_dt_openfile: failure
I:dnstap:checking UDP message counts
I:dnstap:ns3 0 expected 2
I:dnstap:failed
I:dnstap:checking TCP message counts
I:dnstap:checking AUTH_QUERY message counts
I:dnstap:checking AUTH_RESPONSE message counts
I:dnstap:checking CLIENT_QUERY message counts
I:dnstap:ns3 0 expected 1
I:dnstap:failed
I:dnstap:checking CLIENT_RESPONSE message counts
I:dnstap:ns3 0 expected 1
I:dnstap:failed
I:dnstap:checking RESOLVER_QUERY message counts
I:dnstap:checking RESOLVER_RESPONSE message counts
I:dnstap:checking UPDATE_QUERY message counts
I:dnstap:checking UPDATE_RESPONSE message counts
I:dnstap:checking dnstap-read hex output
dnstap-read: dns_dt_openfile: failure
I:dnstap:failed
I:dnstap:checking unix socket message counts
I:dnstap:checking UDP message counts
I:dnstap:checking TCP message counts
I:dnstap:checking AUTH_QUERY message counts
I:dnstap:checking AUTH_RESPONSE message counts
I:dnstap:checking CLIENT_QUERY message counts
I:dnstap:checking CLIENT_RESPONSE message counts
I:dnstap:checking RESOLVER_QUERY message counts
I:dnstap:checking RESOLVER_RESPONSE message counts
I:dnstap:checking UPDATE_QUERY message counts
I:dnstap:checking UPDATE_RESPONSE message counts
I:dnstap:checking reopened unix socket message counts
I:dnstap:checking UDP message counts
I:dnstap:checking TCP message counts
I:dnstap:checking AUTH_QUERY message counts
I:dnstap:checking AUTH_RESPONSE message counts
I:dnstap:checking CLIENT_QUERY message counts
I:dnstap:checking CLIENT_RESPONSE message counts
I:dnstap:checking RESOLVER_QUERY message counts
I:dnstap:checking RESOLVER_RESPONSE message counts
I:dnstap:checking UPDATE_QUERY message counts
I:dnstap:checking UPDATE_RESPONSE message counts
I:dnstap:checking large packet printing
I:dnstap:checking 'rndc -roll <value>' (no versions)
I:dnstap:no response from ns3
I:dnstap:failed
I:dnstap:ns3 didn't die when sent a SIGTERM
rndc: connect failed: 10.53.0.3#5430: connection refused
I:dnstap:failed
I:dnstap:checking 'rndc -roll <value>' (versions)
I:dnstap:exit status: 5
I:dnstap:stopping servers
I:dnstap:ns3 died before a SIGTERM was sent
I:dnstap:stopping servers failed
I:dnstap:Core dump(s) found: dnstap/ns3/named.core
D:dnstap:backtrace from dnstap/ns3/named.core:
D:dnstap:--------------------------------------------------------------------------------
D:dnstap:Core was generated by `/usr/home/newman/bind9/bin/named/.libs/named -D dnstap-ns3 -X named.lock -m reco'.
D:dnstap:Program terminated with signal SIGABRT, Aborted.
D:dnstap:#0 0x0000000800e7793a in _nanosleep () from /lib/libc.so.7
D:dnstap:[Current thread is 1 (LWP 100491)]
D:dnstap:#0 0x0000000800e7793a in _nanosleep () from /lib/libc.so.7
D:dnstap:#1 0x0000000800d0f11c in ?? () from /lib/libthr.so.3
D:dnstap:#2 0x0000000800ecd356 in usleep () from /lib/libc.so.7
D:dnstap:#3 0x00000008002e65fa in isc_nm_destroy (mgr0=0x2740b0 <named_g_nm>) at netmgr/netmgr.c:422
D:dnstap:#4 0x000000000023884e in destroy_managers () at main.c:970
D:dnstap:#5 cleanup () at main.c:1304
D:dnstap:#6 main (argc=<optimized out>, argv=0x7fffffffda50) at main.c:1561
D:dnstap:--------------------------------------------------------------------------------
D:dnstap:full backtrace from dnstap/ns3/named.core saved in named.core-backtrace.txt
D:dnstap:core dump dnstap/ns3/named.core archived as dnstap/ns3/named.core.gz
R:dnstap:FAIL
E:dnstap:2020-07-29T15:35:09+0200
FAIL dnstap (exit status: 1)
(If the test framework cannot stop the server, it sends SIGABRT, hence the backtrace.)
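The backtrace shows the main thread sleeping inside isc_nm_destroy() during cleanup, which fits the reference-counting suspicion in the title. The following is a minimal sketch of that failure mode, not BIND's actual code: toy_mgr_t and all toy_mgr_* functions are hypothetical names invented for illustration, and the poll limit exists only so the sketch terminates. It assumes the destroy path waits until only the owner's reference remains, so one reference that is never released keeps shutdown waiting indefinitely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a network manager; NOT BIND's isc_nm_t. */
typedef struct {
	atomic_uint references; /* 1 means only the owner's reference is left */
} toy_mgr_t;

/* The owner holds the initial reference. */
static void
toy_mgr_init(toy_mgr_t *mgr) {
	atomic_init(&mgr->references, 1);
}

/* A user (for example, a control channel connection) takes a reference. */
static void
toy_mgr_attach(toy_mgr_t *mgr) {
	atomic_fetch_add(&mgr->references, 1);
}

/* A user releases its reference; forgetting this call is the "leak". */
static void
toy_mgr_detach(toy_mgr_t *mgr) {
	unsigned int prev = atomic_fetch_sub(&mgr->references, 1);
	assert(prev > 0);
}

/*
 * Poll until only the owner's reference remains, at most max_polls times.
 * Returns true if shutdown could complete, false if references were still
 * held when the poll limit was reached. Without the limit, a leaked
 * reference would keep this loop (and the process) alive forever.
 */
static bool
toy_mgr_destroy(toy_mgr_t *mgr, unsigned int max_polls) {
	for (unsigned int i = 0; i < max_polls; i++) {
		if (atomic_load(&mgr->references) <= 1) {
			return true;
		}
		/* Real code would sleep here between polls. */
	}
	return atomic_load(&mgr->references) <= 1;
}
```

Calling toy_mgr_attach() once without a matching toy_mgr_detach() makes toy_mgr_destroy() report failure however many polls are allowed; once the matching detach happens, it succeeds on the first poll. If the hang has this shape, the thing to look for is which attach in the netmgr-based control channel path lacks its detach.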
Attached: named.run from ns3
Git bisect says that 3551d3ff is the culprit:
3551d3ffd2afe6542e79a5571b326bd77fdd265e is the first bad commit
commit 3551d3ffd2afe6542e79a5571b326bd77fdd265e
Author: Evan Hunt <each@isc.org>
Date: Thu Apr 16 13:06:42 2020 -0700
convert rndc and control channel to use netmgr
Edited by Michał Kępień