Deadlock with concurrent `rndc addzone`/`rndc delzone` commands
Summary
A deadlock can happens between concurrent rndc delzone
and rndc addzone
commands (on different zones) making the named process and every following rndc
commands stuck.
BIND version used
bind9 debian buster-backports package 1:9.16.13-1~bpo10+1
BIND 9.16.13-Debian (Stable Release) <id:072e758>
running on Linux x86_64 5.11.8-200.fc33.x86_64 #1 SMP Mon Mar 22 01:35:01 UTC 2021
built by make with '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=/usr/include' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-silent-rules' '--libdir=/usr/lib/x86_64-linux-gnu' '--runstatedir=/run' '--disable-maintainer-mode' '--disable-dependency-tracking' '--libdir=/usr/lib/x86_64-linux-gnu' '--sysconfdir=/etc/bind' '--with-python=python3' '--localstatedir=/' '--enable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-gost=no' '--with-openssl=/usr' '--with-gssapi=/usr' '--with-libidn2' '--with-json-c' '--with-lmdb=/usr' '--with-gnu-ld' '--with-maxminddb' '--with-atf=no' '--enable-ipv6' '--enable-rrl' '--enable-filter-aaaa' '--disable-native-pkcs11' '--enable-dnstap' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fdebug-prefix-map=/build/bind9-xJGMKz/bind9-9.16.13=. -fstack-protector-strong -Wformat -Werror=format-security -fno-strict-aliasing -fno-delete-null-pointer-checks -DNO_VERSION_DATE -DDIG_SIGCHASE' 'LDFLAGS=-Wl,-z,relro -Wl,-z,now' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2'
compiled by GCC 8.3.0
compiled with OpenSSL version: OpenSSL 1.1.1d 10 Sep 2019
linked to OpenSSL version: OpenSSL 1.1.1d 10 Sep 2019
compiled with libuv version: 1.24.1
linked to libuv version: 1.24.1
compiled with libxml2 version: 2.9.4
linked to libxml2 version: 20904
compiled with json-c version: 0.12.1
linked to json-c version: 0.12.1
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
linked to maxminddb version: 1.3.2
compiled with protobuf-c version: 1.3.1
linked to protobuf-c version: 1.3.1
threads support is enabled
default paths:
named configuration: /etc/bind/named.conf
rndc configuration: /etc/bind/rndc.conf
DNSSEC root key: /etc/bind/bind.keys
nsupdate session key: //run/named/session.key
named PID file: //run/named/named.pid
named lock file: //run/named/named.lock
geoip-directory: /usr/share/GeoIP
Has been also reproduced on a 9.11.25 build, but not on the debian buster version (1:9.11.5.P4+dfsg-5.1+deb10u3)
Steps to reproduce
The issue can be triggered by running many rndc addzone
& rndc delzone
in parallel.
A repository with a POC is available here: https://github.com/bengentil/bind-deadlock
What is the current bug behavior?
When the bug happens the named process is deadlocked and all rndc
commands are stuck.
What is the expected correct behavior?
The expected behavior is to have all rndc
commands to be executed to completion without any deadlock.
Relevant configuration files
I don't have any specific configuration in place to reproduce this issue, as far as I know this issue is not avoidable by configuration.
Relevant logs and/or screenshots
An extract of a gdb backtrace of the deadlocked process (only the 2 relevant threads):
Thread 16 (Thread 0x7f549b6bf700 (LWP 58)):
#0 0x00007f54a5a9e00c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1 0x00007f54a5f560a9 in isc_task_beginexclusive (task0=<optimized out>) at task.c:1662
task = <optimized out>
manager = 0x7f54a02ba010
#2 0x000055a0f169bf99 in do_addzone (zoneconf=<optimized out>, text=0x7f549b6bebc8, redirect=false, zoneobj=0x7f5495d48a38, name=0x7f549b6be830, view=0x7f5490bc1ea0, cfg=0x7f5495d4e048, server=0x7f54a02c0010) at ./server.c:13398
tresult = <optimized out>
zone = 0x0
dbi = 2428247712
result = 0
txn = 0x0
result = <optimized out>
tresult = <optimized out>
zone = <optimized out>
txn = <optimized out>
dbi = <optimized out>
dbp = <optimized out>
#3 named_server_changezone (server=0x7f54a02c0010, command=<optimized out>, text=text@entry=0x7f549b6bebc8) at ./server.c:13800
result = <optimized out>
addzone = true
redirect = false
cfg = 0x7f5495d4e048
zoneconf = 0x7f5495d48b00
zoneobj = 0x7f5495d48a38
zonename = 0x7f5458008430 "test1.local"
view = 0x7f5490bc1ea0
buf = {magic = 1114990113, base = 0x7f5458008430, length = 11, used = 11, current = 11, active = 0, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, mctx = 0x0, autore = false}
fname = {name = {magic = 1145983854, ndata = 0x7f549b6be940 "\005test1\005local", length = 13, labels = 3, attributes = 1, offsets = 0x7f549b6be880 "", buffer = 0x7f549b6be900, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, list = {head = 0x0, tail = 0x0}}, offsets = "\000\006\f\000\000\000\000\000\240\307Z\245T\177\000\000\000\200\255\373\320\224\004\255\200\314\005XT\177\000\000\200\314\005XT\177\000\000\200\314\005XT\177\000\000\200\314\005XT\177\000\000\000E\027^\345\027Y\223\310\353k\233T\177\000\000\350\203,\240T\177\000\000\000\352k\233T\177\000\000\320\351k\233T\177\000\000@\366+\240T\177\000\000\277\306X\245T\177\000\000 \000\000\000\060\000\000\000\320\351k\233T\177\000", buffer = {magic = 1114990113, base = 0x7f549b6be940, length = 255, used = 13, current = 0, active = 0, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, mctx = 0x0, autore = false}, data = "\005test1\005local\000\177\000\000\000\020\000\000\000\000\000\000\000E\027^\345\027Y\223\a\000\000\000\000\000\000\000\310\353k\233T\177\000\000\000\000\000\000\000\000\000\000\270\352k\233T\177\000\000p_\316\225T\177\000\000\300!\371\245T\177\000\000\000\000\000\000\000\000\000\000\242\347\362\245T\177\000\000\260\351k\233T\177\000\000\260\216\025\243T\177\000\000\060\000\000\000\060\000\000\000\220\352k\233T\177\000\000\320\351k\233T\177\000\000\000E\027^\345\027Y\223buffer-0x7f549\000\000\223\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000w_\316\225T\177\000\000\260E\005XT\177\000\000\002\000\000\000\000\000\000\000"...}
dnsname = 0x7f549b6be830
#4 0x000055a0f167d7d7 in named_control_docommand (message=<optimized out>, readonly=<optimized out>, text=text@entry=0x7f549b6bebc8) at control.c:209
data = <optimized out>
cmdline = 0x7f54580545b0 "addzone test1.local {type master; file \"/var/cache/bind/test1.local.db\"; auto-dnssec maintain; inline-signing yes; key-directory \"/var/cache/bind/keys\";};"
command = 0x7f5495ce5f70 "addzone"
result = 0
log_level = <optimized out>
src = {magic = 1114990113, base = 0x7f54580545b0, length = 154, used = 154, current = 8, active = 0, link = {prev = 0xffffffffffffffff, next = 0xffffffffffffffff}, mctx = 0x0, autore = false}
lex = 0x7f54a02c83e8
#5 0x000055a0f16809f3 in control_recvmessage (task=0x7f5495d25010, event=<optimized out>) at controlconf.c:476
conn = 0x7f5495d43e50
listener = 0x7f5495d4d120
key = 0x7f54a02bf640
request = 0x7f5458005240
response = 0x0
algorithm = 163
secret = {rstart = 0x7f54a315bab0 "\342\061\016`\001\201\065\067\242U\313\033|\321\334\350\303\335\213OX\252u\f\210\033v\271K\221x\342\020\273\025\243T\177", rend = 0x7f54a315bad0 "\020\273\025\243T\177"}
now = 1617996781
b = <optimized out>
r = {base = 0x7f54a1ec4f0d ";", length = 2716618509}
text = 0x7f544c046db0
result = <optimized out>
eresult = <optimized out>
_ctrl = 0x7f54580540e0
sent = 1617996781
exp = 1617996841
nonce = 459112386
data = 0x0
#6 0x00007f54a5f53e4b in dispatch (threadid=<optimized out>, manager=0x7f54a02ba010) at task.c:1153
dispatch_count = 24
done = false
finished = false
requeue = false
event = 0x7f5495d43ed0
task = 0x7f5495d25010
task = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
dispatch_count = <optimized out>
done = <optimized out>
requeue = <optimized out>
finished = <optimized out>
event = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
was_idle = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
empty = <optimized out>
i = <optimized out>
__atomic_store_ptr = <optimized out>
__atomic_store_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
#7 run (queuep=<optimized out>) at task.c:1345
tq = <optimized out>
manager = 0x7f54a02ba010
threadid = <optimized out>
#8 0x00007f54a5f5a282 in isc__trampoline_run (arg=0x55a0f2d9d970) at trampoline.c:191
trampoline = 0x55a0f2d9d970
result = <optimized out>
#9 0x00007f54a5a97fa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#10 0x00007f54a562d4cf in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
Thread 15 (Thread 0x7f549bec0700 (LWP 57)):
#0 0x00007f54a5aa129c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1 0x00007f54a5a9a87a in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#2 0x000055a0f168caef in rmzone (task=<optimized out>, event=<optimized out>) at ./server.c:13890
dz = 0x7f5495d34ff0
zone = 0x7f5464008d10
raw = 0x0
mayberaw = <optimized out>
zonename = "test2.local", '\000' <repeats 17 times>, "T\177", '\000' <repeats 26 times>, "\377\377\377\377\377\377\377\377RSND\377\377\377\377\000\000\000\000\000\000\000\000", '\377' <repeats 16 times>, '\000' <repeats 20 times>, "\377\377\377\377", '\000' <repeats 72 times>, "X\002\000\000\000\000\000\000"...
view = 0x7f5490bc1ea0
cfg = <optimized out>
dbp = 0x0
added = true
result = <optimized out>
txn = 0x0
dbi = 4294967295
#3 0x00007f54a5f53e4b in dispatch (threadid=<optimized out>, manager=0x7f54a02ba010) at task.c:1153
dispatch_count = 0
done = false
finished = false
requeue = false
event = 0x7f549559be48
task = 0x7f5495d25be0
task = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
dispatch_count = <optimized out>
done = <optimized out>
requeue = <optimized out>
finished = <optimized out>
event = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
was_idle = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
empty = <optimized out>
i = <optimized out>
__atomic_store_ptr = <optimized out>
__atomic_store_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
__atomic_load_ptr = <optimized out>
__atomic_load_tmp = <optimized out>
#4 run (queuep=<optimized out>) at task.c:1345
tq = <optimized out>
manager = 0x7f54a02ba010
threadid = <optimized out>
#5 0x00007f54a5f5a282 in isc__trampoline_run (arg=0x55a0f2d9d7e0) at trampoline.c:191
trampoline = 0x55a0f2d9d7e0
result = <optimized out>
#6 0x00007f54a5a97fa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#7 0x00007f54a562d4cf in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
The corresponding relevant logs:
Apr 9 19:33:01 f402d8b027a0 named[42]: received control channel command 'delzone test2.local'
Apr 9 19:33:01 f402d8b027a0 named[42]: zone test2.local scheduled for removal via delzone
Apr 9 19:33:01 f402d8b027a0 named[42]: received control channel command 'addzone test1.local {type master; file "/var/cache/bind/test1.local.db"; auto-dnssec maintain; inline-signing yes; key-directory "/var/cache/bind/keys";};'
Apr 9 19:33:01 f402d8b027a0 named[42]: deleting zone test2.local in view _default via delzone
We can see that rndc addzone test1.local [...]
has been executed during an already running rndc delzone test2.local
and both threads are now waiting.
The thread 15/LWP 57 (delzone) is waiting for a lock owned by thread 16/LWP 58 (addzone)
(gdb) thread 15
[Switching to thread 15 (Thread 0x7f549bec0700 (LWP 57))]
#0 0x00007f54a5aa129c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) frame 2
#2 0x000055a0f168caef in rmzone (task=<optimized out>, event=<optimized out>) at ./server.c:13890
13890 in ./server.c
(gdb) p *(&view->new_zone_lock->__data->__owner)
$9 = 58
isc_task_beginexclusive waits while this is true: manager->halted + 1 < manager->workers
(gdb) thread 16
[Switching to thread 16 (Thread 0x7f549b6bf700 (LWP 58))]
#0 0x00007f54a5a9e00c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) frame 1
#1 0x00007f54a5f560a9 in isc_task_beginexclusive (task0=<optimized out>) at task.c:1662
1662 task.c: No such file or directory.
(gdb) p manager->halted
$10 = 6
(gdb) p manager->workers
$11 = 8
Observations
The rmzone relevant code is:
-
server.c:13890
13855 rmzone(isc_task_t *task, isc_event_t *event) { [...] 13888 #ifdef HAVE_LMDB 13889 /* Make sure we can open the NZD database */ ->13890 LOCK(&view->new_zone_lock);
The addzone relevant code is:
-
13337 do_addzone(named_server_t *server, ns_cfgctx_t *cfg, dns_view_t *view, 13338 dns_name_t *name, cfg_obj_t *zoneconf, const cfg_obj_t *zoneobj, 13339 bool redirect, isc_buffer_t **text) { [...] 13350 LOCK(&view->new_zone_lock); [...] 13387 /* Make sure we can open the NZD database */ 13388 result = nzd_writable(view); 13389 if (result != ISC_R_SUCCESS) { 13390 TCHECK(putstr(text, "unable to open NZD database for '")); 13391 TCHECK(putstr(text, view->new_zone_db)); 13392 TCHECK(putstr(text, "'")); 13393 result = ISC_R_FAILURE; 13394 goto cleanup; 13395 } 13396 #endif /* HAVE_LMDB */ 13397 ->13398 result = isc_task_beginexclusive(server->task);
-
1634 isc_task_beginexclusive(isc_task_t *task0) { [...] 1660 while (manager->halted + 1 < manager->workers) { 1661 wake_all_queues(manager); ->1662 WAIT(&manager->halt_cond, &manager->halt_lock); 1663 }
My understanding is that the rmzone
is executed between the LOCK(&view->new_zone_lock)
and the isc_task_beginexclusive
in do_addzone
resulting in:
-
do_addzone
waits for other workers to stop (ie. waits forrmzone
to stop) -
rmzone
waits for theview->new_zone_lock
locked bydo_addzone
-
do_addzone
sees always 2 workers and theview->new_zone_lock
is never unlocked