Consider changing the "shut down hung fetch" log message
!5612 (merged) reintroduced the global recursive query handling timer, to enable netmgr-based dispatch code to deal with some pathological resolution scenarios relatively cleanly (i.e. without indefinitely holding on to resources). A new log message was introduced, too:

    static void
    fctx_expired(isc_task_t *task, isc_event_t *event) {
        fetchctx_t *fctx = event->ev_arg;

        REQUIRE(VALID_FCTX(fctx));

        UNUSED(task);

        isc_log_write(dns_lctx, DNS_LOGCATEGORY_RESOLVER,
                      DNS_LOGMODULE_RESOLVER, ISC_LOG_INFO,
                      "shut down hung fetch while resolving '%s'", fctx->info);
        fctx_shutdown(fctx);
        isc_event_free(&event);
    }

Empirical evidence from a small resolver with often-flaky connectivity
suggests that this log message may be confusing because no detection
of hung fetches actually takes place; rather, fctx_expired() is simply
the timer callback. In other words, the "shut down hung fetch" message
may be logged when, for example, query timeouts caused by intermittent
network issues prevent the resolution from succeeding before the timer
fires. The log message should therefore arguably be revised, as it
cannot be used to reliably distinguish between the resolver code itself
having issues with a given query and network connectivity failing for a
few seconds.
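
As a rough illustration of what a more neutral wording could look like, the
standalone sketch below formats a message that only states what is known when
the callback runs: the timer fired before resolution finished. Both the helper
name format_fetch_expired() and the message text are hypothetical and are not
taken from the BIND source; the snippet uses plain snprintf() instead of
isc_log_write() so that it compiles on its own.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of BIND: builds a log line that reports
 * the timer expiry itself without claiming that the fetch was "hung".
 * 'info' stands in for fctx->info (the name/type being resolved).
 * Returns the value of snprintf(), i.e. the length the full message
 * would have, which may exceed 'len' if the buffer is too small.
 */
int
format_fetch_expired(char *buf, size_t len, const char *info) {
    return snprintf(buf, len,
                    "fetch timer expired while resolving '%s'", info);
}
```

A message along these lines leaves the cause open (a slow or unreachable
upstream server, packet loss, or a resolver problem) instead of implying that
the resolver detected a hung fetch.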