"nsip-wait-recurse yes" may not be enforced in certain circumstances
This problem is similar to the one affecting nxdomain-redirect, described in #923: an NS RRset which should be subject to RPZ processing expires by the time the RPZ code attempts to fetch it from cache. An example occurrence of this issue can be found in https://gitlab.isc.org/isc-projects/bind9/-/jobs/188782: foo.child.example.tld/NS is cached at 22:46:41.994, but foo.child.example.tld/A processing resumes from recursion at 22:46:42.000 (a different Unix timestamp). Since foo.child.example.tld/NS is received with TTL 0, which resolver code then overrides with 1, it is already expired by the time rpz_rrset_find() is called. This makes the resolver behave as if nsip-wait-recurse were set to no even though it is set to yes.
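To make the timing concrete, here is a minimal sketch in C of how an entry can expire within a few milliseconds when expiry is tracked in whole seconds. This is a simplified model, not BIND's actual cache code: cache_entry_t, cache_store() and cache_entry_active() are hypothetical names, and the rule that an entry is usable only while its expiry time is strictly greater than the current second is an assumption consistent with the behavior described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified cache entry: expiry is kept in whole seconds. */
typedef struct {
	uint32_t expires; /* absolute expiry time, in seconds */
} cache_entry_t;

/*
 * Store an entry at time "now" (whole seconds). A TTL of 0 is raised
 * to 1, mimicking the override described in this issue.
 */
cache_entry_t
cache_store(uint32_t now, uint32_t ttl) {
	cache_entry_t entry;

	if (ttl == 0) {
		ttl = 1;
	}
	entry.expires = now + ttl;
	return entry;
}

/*
 * Assumption of this model: an entry is usable only while its expiry
 * time lies strictly after the current second.
 */
bool
cache_entry_active(const cache_entry_t *entry, uint32_t now) {
	return entry->expires > now;
}
```

With these definitions, cache_store(41, 0) yields an entry expiring at second 42, so cache_entry_active() returns true at second 41 and false at second 42, even though only about 6 ms separate 22:46:41.994 from 22:46:42.000.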
This problem would likely have been discovered much earlier were it not for the fact that the nsip-wait-recurse check in the rpzrecurse system test is overly lax: it assumes that as long as named instances with nsip-wait-recurse yes and nsip-wait-recurse no need a different number of seconds to return a response to the test query, everything works as intended. Meanwhile, I believe that if the named instance with nsip-wait-recurse yes takes any less than 5 seconds (the default dig timeout) to return a response, it means the issue was triggered (because ns5 is programmed to not respond to NS queries at all).
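The difference between the current check and the stricter one I have in mind can be sketched in C as follows. The function names, the DIG_TIMEOUT_SECS constant, and the idea of passing elapsed whole seconds as integers are all assumptions made for illustration; the real check lives in the rpzrecurse system test and is not reproduced here.

```c
#include <stdbool.h>

/* Default dig timeout mentioned above, in seconds. */
#define DIG_TIMEOUT_SECS 5

/*
 * Lax check (hypothetical model of the current behavior): passes as
 * long as the two instances needed a different number of seconds.
 */
bool
lax_check(int yes_secs, int no_secs) {
	return yes_secs != no_secs;
}

/*
 * Stricter check (hypothetical): with "nsip-wait-recurse yes", the
 * response must not arrive before the dig timeout, because the NS
 * query is never answered.
 */
bool
strict_check(int yes_secs) {
	return yes_secs >= DIG_TIMEOUT_SECS;
}
```

For example, if the nsip-wait-recurse yes instance answers after 1 second and the nsip-wait-recurse no instance answers immediately, lax_check(1, 0) passes while strict_check(1) fails, which is the kind of false negative described above.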
Here is a crude patch against current master that demonstrates both issues (not recursing with nsip-wait-recurse yes and the test yielding false negatives, so the patch will not cause a test failure!):
diff --git a/lib/ns/query.c b/lib/ns/query.c
index 0efcafd578..906e354f70 100644
--- a/lib/ns/query.c
+++ b/lib/ns/query.c
@@ -16,6 +16,7 @@
 #include <inttypes.h>
 #include <stdbool.h>
 #include <string.h>
+#include <unistd.h>
 
 #include <isc/hex.h>
 #include <isc/mem.h>
@@ -5427,6 +5428,14 @@ fetch_callback(isc_task_t *task, isc_event_t *event) {
 	LOCK(&client->query.fetchlock);
 	if (client->query.fetch != NULL) {
+		static bool slept = false;
+		dns_fixedname_t _fixed;
+		dns_name_t *_name = dns_fixedname_initname(&_fixed);
+		RUNTIME_CHECK(dns_name_fromstring(_name, "foo.child.example.tld", 0, NULL) == ISC_R_SUCCESS);
+		if (!slept && dns_name_equal(client->query.qname, _name) && client->query.qtype == 1) {
+			sleep(1);
+			slept = true;
+		}
 		/*
 		 * This is the fetch we've been waiting for.
 		 */