  • Add stale-answer-client-timeout option · 3478794a
    Diego Fronza authored and Matthijs Mekking committed
    The general logic behind the addition of this new feature is as
    follows:
    
    When a client query arrives, the basic path (query.c / ns_query_recurse)
    was to create a fetch and wait for its completion in fetch_callback.
    
    With the introduction of stale-answer-client-timeout, a new event of
    type DNS_EVENT_TRYSTALE may also invoke fetch_callback whenever stale
    answers are enabled and the fetch takes longer than
    stale-answer-client-timeout to complete.
    
    When an event of type DNS_EVENT_TRYSTALE triggers fetch_callback, we
    must ensure that the following happens:
    
    1. Set up a new query context with the sole purpose of looking up
       stale RRset data only. For that purpose a new flag,
       'DNS_DBFIND_STALEONLY', was added for use in database lookups.
    
        . If a stale RRset is found, mark the original client query as
          answered (with a new query attribute named NS_QUERYATTR_ANSWERED),
          so when the fetch completion event is received later, we avoid
          answering the client twice.
    
        . If a stale RRset is not found, clean up and wait for the normal
          fetch completion event.
    
    2. In ns_query_done, we must change this part:
    	/*
    	 * If we're recursing then just return; the query will
    	 * resume when recursion ends.
    	 */
    	if (RECURSING(qctx->client)) {
    		return (qctx->result);
    	}
    
       To this:
    
    	if (RECURSING(qctx->client) && !QUERY_STALEONLY(qctx->client)) {
    		return (qctx->result);
    	}
    
       Otherwise we would not proceed to answer the client when a stale
       answer was found during the stale-only lookup.
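
    The two steps above can be modelled in a few lines. This is a
    simplified sketch, not the code from query.c: every sketch_* and
    SKETCH_* name is hypothetical, the cache lookup is reduced to a
    pointer that is NULL when no stale RRset exists, and how the
    stale-only marker is reset afterwards is left out because the text
    does not cover it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute bits; the real values in BIND may differ. */
#define SKETCH_ATTR_RECURSING 0x01u
#define SKETCH_ATTR_STALEONLY 0x02u
#define SKETCH_ATTR_ANSWERED  0x04u

typedef struct {
	unsigned int attributes;
	int answers_sent; /* how many responses went to the client */
} sketch_client_t;

/*
 * Model of the changed guard in ns_query_done: a recursing client
 * returns early unless this is a stale-only lookup.
 */
static bool
sketch_query_done_proceeds(const sketch_client_t *client) {
	bool recursing = (client->attributes & SKETCH_ATTR_RECURSING) != 0;
	bool staleonly = (client->attributes & SKETCH_ATTR_STALEONLY) != 0;
	return (!(recursing && !staleonly));
}

/*
 * Model of the TRYSTALE path. 'stale_rrset' stands in for the result
 * of a stale-only database lookup (NULL means nothing was found).
 * Returns true if a stale answer was sent.
 */
static bool
sketch_try_stale(sketch_client_t *client, const char *stale_rrset) {
	assert(client != NULL);
	client->attributes |= SKETCH_ATTR_STALEONLY;
	if (stale_rrset == NULL) {
		/* Nothing stale: wait for the normal completion event. */
		return (false);
	}
	if (!sketch_query_done_proceeds(client)) {
		return (false);
	}
	client->answers_sent++;
	/* Remember the answer so the later completion event skips it. */
	client->attributes |= SKETCH_ATTR_ANSWERED;
	return (true);
}
```

    With the old guard, a client that is still recursing would return
    before the answer is sent, which is the case the changed condition
    avoids.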
    
    When an event of type DNS_EVENT_FETCHDONE triggers fetch_callback, we
    proceed as before (resuming the query, updating stats, etc.), but a
    few exceptions had to be added, the two most important of which are:
    
    1. Before answering the client (ns_client_send), check that the query
       was not already answered.
    
    2. Before detaching a client, e.g.
       isc_nmhandle_detach(&client->reqhandle), ensure that this is the
       fetch completion event, and not the one triggered due to
       stale-answer-client-timeout, so a correct call would be:
       if (!QUERY_STALEONLY(client)) {
            isc_nmhandle_detach(&client->reqhandle);
       }
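
    A minimal model of these two exceptions, assuming a client that
    holds a single handle reference. The struct, its fields and the
    function names are hypothetical stand-ins for NS_QUERYATTR_ANSWERED,
    QUERY_STALEONLY(client), ns_client_send and isc_nmhandle_detach;
    none of them is BIND's real API.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
	bool answered;   /* models NS_QUERYATTR_ANSWERED */
	bool staleonly;  /* models QUERY_STALEONLY(client) */
	int sends;       /* responses actually sent to the client */
	int handle_refs; /* models the reference behind client->reqhandle */
} sketch_client_t;

/* Send a response unless the client was already answered. */
static void
sketch_send(sketch_client_t *client) {
	if (client->answered) {
		return;
	}
	client->sends++;
	client->answered = true;
}

/*
 * Drop the handle reference only on the fetch completion path; the
 * stale-only path must leave it in place for the later event.
 */
static void
sketch_release(sketch_client_t *client) {
	if (!client->staleonly) {
		assert(client->handle_refs > 0);
		client->handle_refs--;
	}
}
```

    Running the stale-only path first and the completion path second on
    the same client gives one response and one release in this model.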
    
    Other than these notes, comments were added to the code in an attempt
    to make these updates easier to follow.
    
    (cherry picked from commit 171a5b75)