RFC 5011: confusing use of add hold-down timer
The add hold-down timer in each trust anchor's managed-keys.bind
record is "overloaded" with multiple semantic meanings:
a) it is the point in time in the future when an untrusted key should become trusted,
b) it is also the point in time in the past since which a given key has been trusted,
c) it also determines whether a given key is an initializing key or not (if the add hold-down timer is set to 0, the key is treated as an initializing one).
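To make the overloading concrete, here is a minimal sketch in C of how a reader of the record has to interpret the field. The addhd_meaning enum and interpret_addhd() are hypothetical names invented for this illustration and are not taken from the BIND source; only the three interpretations come from the list above.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not BIND's actual types. */
typedef enum {
	ADDHD_INITIALIZING,  /* meaning c): the timer is 0 */
	ADDHD_TRUST_PENDING, /* meaning a): the timer is in the future */
	ADDHD_PAST           /* meaning b), or a) with a timer that just expired */
} addhd_meaning;

/*
 * Interpret a raw add hold-down value relative to "now" (both in
 * seconds since the epoch). Once the timer is in the past, the value
 * alone cannot tell "just became eligible for trust" apart from
 * "has been trusted for a while".
 */
static addhd_meaning
interpret_addhd(uint32_t addhd, uint32_t now) {
	if (addhd == 0) {
		return ADDHD_INITIALIZING;
	}
	if (addhd > now) {
		return ADDHD_TRUST_PENDING;
	}
	return ADDHD_PAST;
}
```

The last branch is where meanings a) and b) collapse into a single answer.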
This does not break RFC 5011, but mixing different semantic meanings in code causes at least three undesired side effects:
1. "Doubled refresh cycles" after loading a managed-keys.bind file created by a previous named instance.
   I think log excerpts best demonstrate this issue:
    $ cat /etc/named.conf
    options {
        directory "/tmp";
    };
    key rndc_key {
        algorithm hmac-md5;
        secret "1234abcd8765";
    };
    controls {
        inet ::1 port 9953 allow { ::1; } keys { rndc_key; };
    };
    $ named
    $ rndc managed-keys status
    view: _default
    next scheduled event: Tue, 23 Jun 2020 07:12:16 GMT
        name: .
        keyid: 20326
            algorithm: RSASHA256
            flags: SEP
            next refresh: Tue, 23 Jun 2020 07:12:16 GMT
            trusted since: Mon, 22 Jun 2020 07:12:16 GMT
    $ rndc stop
    $ named
    $ rndc managed-keys status
    view: _default
    next scheduled event: Tue, 23 Jun 2020 07:12:16 GMT
        name: .
        keyid: 20326
            algorithm: RSASHA256
            flags: SEP
            next refresh: Tue, 23 Jun 2020 07:12:25 GMT
            trusted since: Mon, 22 Jun 2020 07:12:16 GMT
   During the first named run, everything looks as expected. During the second run, however, here is what happens:
   1. load_secroots() schedules an immediate key refresh.
   2. When the key refresh is started, the set_refreshkeytimer() call in zone_refreshkeys() schedules the next key event to the key refresh time stored in managed-keys.bind by the previous named instance (this is fine).
   3. When the refresh is finished (i.e. a ./DNSKEY response is received), the refresh timer for the key is updated, but the timer set in step 2 does not get updated because... it is (understandably) set to a time earlier than the revised key refresh time.
   Effectively, this causes named to refresh each key twice per refresh period: once according to the previous instance's cycle, once according to the current instance's cycle.
   A keen reader would notice that the above only means that named will consider refreshing a given key twice during each refresh period, because key timers are examined before sending out a refresh query and only the keys really needing a refresh at that point are queried for. Well, yes, but this is where we reach the second issue.
2. All trusted keys are refreshed during all key events, regardless of their refresh timer.
   Due to semantic meanings a) and b) being conflated, the kd.addhd <= now check which is meant to trigger a key refresh when a given (untrusted) key is meant to become trusted always evaluates to true for all keys which are already trusted (because, by definition, their add hold-down timer must be in the past).
   Issues 1 & 2 combined cause named to send way more refresh queries than actually mandated by RFC 5011 (at least 2x as many as required with a single trust anchor).
3. Confusing log messages.
   Another issue stemming from semantic meanings a) and b) being conflated is that a different kd.addhd <= now check which is meant to log a message once a given (untrusted) key becomes trusted always evaluates to true for keys which are already trusted, causing the following message to be logged after every refresh of each key:
       managed-keys-zone: Key 20326 for zone . is now trusted (acceptance timer complete)
   This message is confusing because it suggests that the key's add hold-down timer has only fired just now, but in fact that key has likely been trusted before the refresh was even scheduled.
Here is a two-line summary of all three issues:
22-Jun-2020 09:43:16.871 managed-keys-zone: Key 20326 for zone . is now trusted (acceptance timer complete)
22-Jun-2020 09:43:26.896 managed-keys-zone: Key 20326 for zone . is now trusted (acceptance timer complete)
As an administrator, I find these log messages unexpected at best.
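The interaction between issues 1 and 2 can be reproduced with a toy model. The sketch below is not BIND code: toy_zone, toy_set_timer() and toy_count_queries() are hypothetical names, times are plain integers, and there is a single key. It assumes only what is described above: the zone timer is replaced when it has expired or when the new time is earlier, the stored refresh time is used to arm the timer when a key event starts, and a completed refresh moves the key's refresh time one period ahead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model; every name here is hypothetical, none of it is BIND's. */
typedef struct {
	uint32_t key_refresh; /* per-key "next refresh" time from the record */
	uint32_t next_event;  /* zone-wide timer for the next key event */
} toy_zone;

/* Arm the zone timer; an unexpired timer is only ever moved earlier. */
static void
toy_set_timer(toy_zone *z, uint32_t when, uint32_t now) {
	if (when < now) {
		when = now;
	}
	if (z->next_event <= now || when < z->next_event) {
		z->next_event = when;
	}
}

/*
 * Count refresh queries sent between "start" and "horizon".
 * stored_refresh: refresh time left in the record by a previous instance.
 * refresh_all:    model issue 2 (query trusted keys at every key event).
 * period must be greater than 0.
 */
static unsigned
toy_count_queries(uint32_t stored_refresh, uint32_t start, uint32_t period,
		  uint32_t horizon, bool refresh_all) {
	toy_zone z = { .key_refresh = stored_refresh, .next_event = 0 };
	unsigned queries = 0;
	uint32_t now = start;
	bool forced = true; /* step 1: loading forces an immediate refresh */

	while (now < horizon) {
		/* Step 2: arm the timer from the stored refresh time. */
		toy_set_timer(&z, z.key_refresh, now);
		if (forced || refresh_all || z.key_refresh <= now) {
			queries++;
			/* Step 3: the response moves the key's refresh time. */
			z.key_refresh = now + period;
			toy_set_timer(&z, z.key_refresh, now);
		}
		forced = false;
		now = z.next_event;
	}
	return queries;
}
```

With a period of 100, a stored refresh time of 100 and a restart at time 10, toy_count_queries(100, 10, 100, 1000, true) counts 19 queries, versus 10 when refresh_all is false, i.e. when only keys that are actually due get queried.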
I do not have any verified solutions to propose; one idea I have is to
implement a helper routine that would replace the keydata.addhd <= now
checks with something more nuanced that would also check the trust
status of a given key. I believe this could solve issues 2 & 3, which
would make issue 1 more benign.
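A rough sketch of what such a helper could look like follows. It is an unverified illustration, not a patch: the sketch_key struct, its trusted flag, and both function names are hypothetical, and how the trust status would actually be obtained inside named is left open. Treating an add hold-down value of 0 as "no timer to fire" is also an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of one key; not the layout used by BIND. */
typedef struct {
	uint32_t refresh; /* next scheduled refresh time */
	uint32_t addhd;   /* add hold-down timer (0: initializing key) */
	bool trusted;     /* assumed to be known from elsewhere */
} sketch_key;

/*
 * True only when an untrusted key's add hold-down timer has expired,
 * i.e. meaning a). Already trusted keys (meaning b) never match.
 */
static bool
sketch_key_becomes_trusted(const sketch_key *key, uint32_t now) {
	return !key->trusted && key->addhd != 0 && key->addhd <= now;
}

/*
 * A key is queried when its own refresh time is due or when it is
 * about to become trusted, instead of whenever addhd <= now.
 */
static bool
sketch_key_needs_refresh(const sketch_key *key, uint32_t now) {
	return key->refresh <= now || sketch_key_becomes_trusted(key, now);
}
```

Under these assumptions, a trusted key with a past add hold-down timer and a future refresh time would neither trigger a query nor produce the "is now trusted" log message.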
@each: thoughts?