9.16.15 authoritative-only server problem with managed-keys.bind and managed-keys.bind.jnl
After upgrading from BIND 9.11 to BIND 9.16.15 on an authoritative-only server, the managed-keys directory seems to be accumulating a growing number of randomly-named files:
-rw-r--r-- 1 exampgrp examp 388 May 4 19:00 db-3yL0fLNf
-rw-r--r-- 1 exampgrp examp 388 May 4 15:00 db-4adyLhrt
-rw-r--r-- 1 exampgrp examp 388 May 5 03:00 db-7cCsPU4J
-rw-r--r-- 1 exampgrp examp 388 May 4 23:00 db-b3WBFwr6
-rw-r--r-- 1 exampgrp examp 388 May 5 15:00 db-SrQjIXGa
-rw-r--r-- 1 exampgrp examp 388 May 5 11:00 db-TJOlU34F
-rw-r--r-- 1 exampgrp examp 388 May 5 07:00 db-Ye1zfBJ6
-rw-r--r-- 1 exampgrp examp 2184 May 4 15:00 jn-gxAa8SHH
-rw-r--r-- 1 exampgrp examp 2184 May 4 23:00 jn-icq7YhrW
-rw-r--r-- 1 exampgrp examp 2184 May 5 15:00 jn-k7K4TU56
-rw-r--r-- 1 exampgrp examp 2184 May 5 03:00 jn-UCuOnEb5
-rw-r--r-- 1 exampgrp examp 2184 May 5 07:00 jn-UOSzEplT
-rw-r--r-- 1 exampgrp examp 2184 May 4 19:00 jn-xzjQr64l
-rw-r--r-- 1 exampgrp examp 2184 May 5 11:00 jn-zS215QN1
-rw-r--r-- 1 exampgrp examp 388 May 5 16:20 managed-keys.bind
-rw-r--r-- 1 exampgrp examp 1697 May 5 16:20 managed-keys.bind.jnl
What is unusual about this server is that it cannot reach the Internet (it is used for provisioning only). It has no client-based recursive role, and has nothing in named.conf relating to dnssec-validation.
Between BIND 9.11 and BIND 9.16, the default for dnssec-validation changed from yes (which does nothing without a manually-configured trust anchor) to auto. The auto setting uses the built-in root trust anchor, which named then attempts to refresh.
Disabling dnssec-validation resolves the problem on this server, but it looks as if there is an unexpected edge case in named's behaviour nevertheless, to do with the zone database and associated jnl file that are being used to keep track of any root trust anchor changes.
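For anyone hitting the same thing, the workaround amounts to something like the following in named.conf. This is only a minimal sketch: it assumes the setting goes in the global options block (it can also be set per view) and that nothing else on the server depends on validation.

```
options {
    // Assumption: any existing options statements stay as they are;
    // only this line is added.
    // The server is authoritative-only and cannot reach the Internet,
    // so there is no root trust anchor for named to refresh.
    dnssec-validation no;
};
```

After changing this, named needs a reload or restart for the new setting to take effect.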
Here's what is logged when these temporary backup db and jn files are generated:
06-May-2021 09:40:08.691 general: error: dns_rdata_fromtext: managed-keys/managed-keys.bind:10: near eol: unexpected end of input
06-May-2021 09:40:08.691 zoneload: error: managed-keys-zone: loading from master file managed-keys/managed-keys.bind failed: unexpected end of input
06-May-2021 09:40:08.691 zoneload: error: managed-keys-zone: journal rollforward failed: journal out of sync with zone
06-May-2021 09:40:08.691 general: warning: managed-keys-zone: unable to load from 'managed-keys/managed-keys.bind.jnl'; renaming file to 'managed-keys/jn-chyMHlcj' for failure analysis and retransferring.
06-May-2021 09:40:08.691 general: warning: managed-keys-zone: unable to load from 'managed-keys/managed-keys.bind'; renaming file to 'managed-keys/db-uDFb1cWj' for failure analysis and retransferring.
Along with:
06-May-2021 09:40:08.691 dnssec: error: managed-keys-zone: failed to initialize managed-keys (out of range): DNSSEC validation is at risk
This happens every 4 hours, coinciding with a restart of named. (named is restarted much more often than that on this server, every 10 minutes, which is a quirk of the provisioning strategy.)
In addition, each time named is restarted (every 10 minutes or so), we see this logged, but without the temporary db and jn files being created:
05-May-2021 10:20:08.227 zoneload: error: managed-keys-zone: loading from master file managed-keys/managed-keys.bind failed: unexpected end of input
05-May-2021 10:20:08.228 zoneload: info: managed-keys-zone: loaded serial 4
Meanwhile, apart from differences in the SOA, this is what's in both managed-keys.bind and the various temporary copies of it:
$ORIGIN .
$TTL 0 ; 0 seconds
@ IN SOA . . (
17 ; serial
0 ; refresh (0 seconds)
0 ; retry (0 seconds)
0 ; expire (0 seconds)
0 ; minimum (0 seconds)
)
KEYDATA 20210506172018 19700101000000 19700101000000 0 0 0 (
) ; ZSK; alg = 0; key id = 0
; next refresh: Thu, 06 May 2021 17:20:18 GMT
; no trust
The configuration is broken anyway (really, don't enable DNSSEC validation on a server that cannot reach the Internet and that doesn't need to do recursion!), but I'm opening this issue in case anyone else encounters the same problem. There may also be some edge-case handling of .jnl file formats that needs to be sorted out?
Details and uploaded files available in Support Ticket #18475