RRSIG(SOA) RRset not at zone apex triggers an infinite, busy resigning loop
This issue was discovered accidentally while looking for suspiciously
long named logs generated by system tests.
As every valid zone must contain exactly one SOA RRset and it must be placed at the zone apex, RRSIG(SOA) RRsets owned by any name other than the zone apex are broken by definition and should be removed after (during?) zone load.
The way zone_resigninc() works is that it considers the resigning
process finished as soon as it encounters an SOA record that needs
to be resigned (because any previous signing processes are expected to
cause the SOA record to have the most recent signature). However, only
the type of the record is checked, not its owner name. Therefore,
not-at-zone-apex RRSIG(SOA) RRsets also (incorrectly) interrupt zone
resigning, but only for a short period of time: the signatures for these
RRsets are never refreshed, which means the first such RRset encountered
during the resigning process indefinitely remains the RRset with the
earliest expiry date. This in turn causes zone resigning to be
rescheduled very quickly, only to lead to the same outcome shortly
afterwards. This results in the server being stuck in an infinite
resigning loop.
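
The loop described above can be reproduced with a toy model. The sketch below is not BIND code: Sig, resign_pass(), drop_stray_soa_sigs(), the zone contents, and the VALIDITY constant are all hypothetical, and time is measured in arbitrary units. It only assumes what is stated above: signatures are visited in expiry order, the pass stops at the first record covering SOA without looking at the owner name, and the apex SOA is re-signed at the end of each pass.

```python
from dataclasses import dataclass

APEX = "example."   # hypothetical zone apex
VALIDITY = 100      # hypothetical signature lifetime, arbitrary time units


@dataclass
class Sig:
    owner: str      # owner name of the signed RRset
    covers: str     # type covered by the signature
    expiry: int     # time at which the signature must be refreshed


def resign_pass(sigs, now):
    """Run one resigning pass and return the time of the next one."""
    # Visit signatures by expiry; on ties, SOA sorts last (a simplification
    # standing in for "the SOA has the most recent signature").
    for sig in sorted(sigs, key=lambda s: (s.expiry, s.covers == "SOA")):
        if sig.covers == "SOA":
            break  # only the type is checked, not the owner name
        sig.expiry = now + VALIDITY
    # The apex SOA is re-signed at the end of every pass.
    for sig in sigs:
        if sig.covers == "SOA" and sig.owner == APEX:
            sig.expiry = now + VALIDITY
    return min(s.expiry for s in sigs)


def drop_stray_soa_sigs(sigs):
    """Model of the proposed cleanup: discard RRSIG(SOA) not at the apex."""
    return [s for s in sigs if s.covers != "SOA" or s.owner == APEX]


def make_zone():
    return [
        Sig(APEX, "SOA", 50),
        Sig(APEX, "NS", 40),
        Sig("www.example.", "A", 30),
        Sig("sub.example.", "SOA", 10),  # stray RRSIG(SOA), not at the apex
    ]


def simulate(sigs, passes=5):
    """Return the times at which successive resigning passes are scheduled."""
    now, times = 0, []
    for _ in range(passes):
        now = resign_pass(sigs, now)
        times.append(now)
    return times


print(simulate(make_zone()))                       # [10, 10, 10, 10, 10]
print(simulate(drop_stray_soa_sigs(make_zone())))  # [100, 200, 300, 400, 500]
```

In this model the stray record pins the next resign time at 10 forever, so every pass is immediately followed by another one that refreshes nothing. Once the stray record is dropped, the next resign time advances by one validity period per pass.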