Issue #924
Created Mar 06, 2019 by Michał Kępień@michalOwner

"nsip-wait-recurse yes" may not be enforced in certain circumstances

This problem is similar to the one affecting nxdomain-redirect, described in #923: an NS RRset which should be subject to RPZ processing expires before the RPZ code attempts to fetch it from the cache. An example occurrence can be found in https://gitlab.isc.org/isc-projects/bind9/-/jobs/188782: foo.child.example.tld/NS is cached at 22:46:41.994, but foo.child.example.tld/A processing resumes from recursion at 22:46:42.000, i.e. only 6 ms later but at a different Unix timestamp. Since foo.child.example.tld/NS is received with TTL 0, which resolver code then overrides with 1, the RRset has already expired by the time rpz_rrset_find() is called. This makes the resolver behave as if nsip-wait-recurse were set to no even though it is set to yes.
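The timing described above can be modelled with a minimal sketch. It assumes a cache that tracks expiry in whole seconds and treats an entry as expired once the current second reaches the expiry second; rrset_expired() and its parameters are hypothetical names for illustration, not BIND's actual cache code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model (not BIND's cache code): the expiry time is kept in
 * whole seconds, so the sub-second part of the caching time is discarded.
 * An entry counts as expired once the current second reaches the expiry
 * second.
 */
static bool
rrset_expired(uint64_t cached_ms, uint32_t ttl_sec, uint64_t now_ms) {
    assert(now_ms >= cached_ms);

    uint64_t expires_sec = cached_ms / 1000 + ttl_sec;

    return (now_ms / 1000 >= expires_sec);
}
```

Taking the timestamps from the job above as milliseconds within the same minute, rrset_expired(41994, 1, 42000) is true although only 6 ms have passed, while rrset_expired(41000, 1, 41994) is false after 994 ms. Under this model, a TTL of 1 guarantees anywhere between almost nothing and one full second of cache lifetime.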

This problem would likely have been discovered much earlier were it not for the fact that the nsip-wait-recurse check in the rpzrecurse system test is overly lax: it assumes that as long as named instances with nsip-wait-recurse yes and nsip-wait-recurse no need a different number of seconds to return a response to the test query, everything works as intended. However, I believe that if the named instance with nsip-wait-recurse yes takes less than 5 seconds (the default dig timeout) to return a response, the issue was triggered, because ns5 is programmed not to respond to NS queries at all.
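The difference between the two checks can be sketched as follows. This only models the logic; the real check lives in the system test's shell script, and lax_check(), strict_check() and DIG_TIMEOUT_SEC are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Default dig timeout, as stated above. */
#define DIG_TIMEOUT_SEC 5

/*
 * The lax check: passes whenever the two instances need a different
 * number of seconds to answer the test query.
 */
static bool
lax_check(int yes_sec, int no_sec) {
    assert(yes_sec >= 0 && no_sec >= 0);
    return (yes_sec != no_sec);
}

/*
 * A stricter check: the instance with "nsip-wait-recurse yes" must wait
 * for at least the full dig timeout, because the NS query it recurses
 * for is never answered.
 */
static bool
strict_check(int yes_sec) {
    assert(yes_sec >= 0);
    return (yes_sec >= DIG_TIMEOUT_SEC);
}
```

With illustrative durations of 1 second for the yes instance and 0 seconds for the no instance, lax_check(1, 0) passes while strict_check(1) fails, which is the kind of false negative described above.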

Here is a crude patch against current master that demonstrates both issues: named not recursing with nsip-wait-recurse yes, and the test yielding false negatives (so the patch will not cause a test failure!):

diff --git a/lib/ns/query.c b/lib/ns/query.c
index 0efcafd578..906e354f70 100644
--- a/lib/ns/query.c
+++ b/lib/ns/query.c
@@ -16,6 +16,7 @@
 #include <inttypes.h>
 #include <stdbool.h>
 #include <string.h>
+#include <unistd.h>
 
 #include <isc/hex.h>
 #include <isc/mem.h>
@@ -5427,6 +5428,14 @@ fetch_callback(isc_task_t *task, isc_event_t *event) {
 
 	LOCK(&client->query.fetchlock);
 	if (client->query.fetch != NULL) {
+		static bool slept = false;
+		dns_fixedname_t _fixed;
+		dns_name_t *_name = dns_fixedname_initname(&_fixed);
+		RUNTIME_CHECK(dns_name_fromstring(_name, "foo.child.example.tld", 0, NULL) == ISC_R_SUCCESS);
+		if (!slept && dns_name_equal(client->query.qname, _name) && client->query.qtype == 1) {
+			sleep(1);
+			slept = true;
+		}
 		/*
 		 * This is the fetch we've been waiting for.
 		 */