1. 08 Apr, 2021 7 commits
  2. 07 Apr, 2021 13 commits
  3. 02 Apr, 2021 10 commits
    • Merge branch '2594-servestale-staleonly-recursion-race-v9_16' into 'v9_16' · cae34f75
      Matthijs Mekking authored
      Serve-stale "staleonly" recursion race condition
      
      See merge request !4860
    • If RPZ config'd, bail stale-answer-client-timeout · 194a72b3
      Matthijs Mekking authored
      When we are recursing, RPZ processing is not allowed. But when we are
      performing a lookup due to "stale-answer-client-timeout", we are still
      recursing. This effectively means that RPZ processing is disabled on
      such a lookup.
      
      In this case, bail out of the "stale-answer-client-timeout" lookup and
      wait for recursion to complete, as we can't apply the RPZ rewrite
      rules reliably.
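
The decision described above can be modeled in a few lines. This is a minimal Python sketch, not BIND's C code: `LookupState`, its fields, and `should_try_stale()` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LookupState:
    """Hypothetical stand-in for per-query state; not a BIND structure."""
    recursing: bool            # a fetch for this query is still in flight
    stale_timeout_fired: bool  # "stale-answer-client-timeout" has expired
    rpz_configured: bool       # response policy zones are configured


def should_try_stale(state: LookupState) -> bool:
    """Return True if a stale lookup may answer the client early."""
    if not state.stale_timeout_fired:
        return False
    if state.recursing and state.rpz_configured:
        # RPZ processing is not allowed while recursing, so the rewrite
        # rules could not be applied: bail and wait for recursion.
        return False
    return True
```

With RPZ configured the function always defers to recursion, which matches the commit's intent that a rewritten answer is never skipped.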
      
      (cherry picked from commit 3d3a6415)
    • Rename "staleonly" · 29bcd113
      Matthijs Mekking authored
      The dboption DNS_DBFIND_STALEONLY caused confusion because it implies
      we are looking for stale data **only** and ignore any active RRsets in
      the cache. Rename it to DNS_DBFIND_STALETIMEOUT, which makes it clearer
      that the option relates to a lookup due to "stale-answer-client-timeout".
      
      Replace other usages of "staleonly" with "lookup due to...".
      Also rename related function and variable names.
      
      (cherry picked from commit 839df941)
    • Restore the RECURSIONOK attribute after staleonly · 34dd6521
      Matthijs Mekking authored
      When doing a staleonly lookup we don't want to fall back to recursion.
      After all, there are obviously problems with recursion, otherwise we
      wouldn't do a staleonly lookup.
      
      When resuming from recursion however, we should restore the
      RECURSIONOK flag, allowing future required lookups for this client
      to recurse.
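
The flag handling described above can be sketched as a save-and-restore on a bit mask. This is a Python model under stated assumptions: the bit values, the `Query` class, and both helper functions are hypothetical, and whether BIND restores a saved value or sets the flag unconditionally is not stated in the commit message.

```python
RECURSIONOK = 0x0001  # illustrative bit value, not BIND's definition
OTHER_ATTR = 0x0010   # unrelated attribute, used to show it is untouched


class Query:
    """Hypothetical stand-in for the client's query state."""

    def __init__(self, attributes: int) -> None:
        self.attributes = attributes


def begin_stale_lookup(query: Query) -> int:
    """Clear RECURSIONOK for the stale lookup; return the saved bit."""
    saved = query.attributes & RECURSIONOK
    query.attributes &= ~RECURSIONOK
    return saved


def resume_from_recursion(query: Query, saved: int) -> None:
    """Restore RECURSIONOK so later lookups for this client may recurse."""
    query.attributes |= saved
```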
      
      (cherry picked from commit 3f81d79f)
    • Remove result exception on staleonly lookup · 114dc788
      Matthijs Mekking authored
      When implementing "stale-answer-client-timeout", we decided that
      we should only return positive answers prematurely to clients. A
      negative response is not useful, and in that case it is better to
      wait for the recursion to complete.
      
      To do so, we check the result and if it is not ISC_R_SUCCESS, we
      decide that it is not good enough. However, there are more return
      codes that could lead to a positive answer (e.g. CNAME chains).
      
      This commit removes the exception and now uses the same logic that
      other stale lookups use to determine if we found a useful stale
      answer (stale_found == true).
      
      This means we can simplify two test cases in the serve-stale system
      test: nodata.example is no longer treated differently than data.example.
      
      (cherry picked from commit aaed7f9d)
    • Add notes and changes for [#2594] · 4b253330
      Matthijs Mekking authored
      Pretty newsworthy.
      
      (cherry picked from commit e44bcc6f)
    • Remove INSIST on NS_QUERYATTR_ANSWERED · 06823aa2
      Matthijs Mekking authored
      The NS_QUERYATTR_ANSWERED attribute is to prevent sending a response
      twice. Without the attribute, this may happen if a staleonly lookup
      found a useful answer and sends a response to the client, and later
      recursion ends and also tries to send a response.
      
      The attribute was also used to mask adding a duplicate RRset. This is
      considered harmful. When we have created a response to the client with
      a stale only lookup (regardless of whether we actually sent the
      response), we should clear the rdatasets that were added during that
      lookup.
      
      Mark such rdatasets with a new attribute,
      DNS_RDATASETATTR_STALE_ADDED. Set a query attribute
      NS_QUERYATTR_STALEOK if we may have added rdatasets during a stale
      only lookup. Before creating a response on a normal lookup, check if
      we can expect rdatasets to have been added during a staleonly lookup.
      If so, clear the rdatasets from the message with the attribute
      DNS_RDATASETATTR_STALE_ADDED set.
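
The mark-and-clear scheme described above can be modeled as follows. This is a Python sketch, not the C implementation: the attribute names come from the commit message, but the bit values, the `Rdataset`, `Message`, and `Query` classes, and the helper functions are assumptions made for illustration.

```python
from dataclasses import dataclass, field

DNS_RDATASETATTR_STALE_ADDED = 0x01  # illustrative bit value
NS_QUERYATTR_STALEOK = 0x01          # illustrative bit value


@dataclass
class Rdataset:
    name: str
    rrtype: str
    attributes: int = 0


@dataclass
class Message:
    answer: list = field(default_factory=list)


@dataclass
class Query:
    attributes: int = 0


def add_during_stale_lookup(query: Query, message: Message, rds: Rdataset) -> None:
    """Add an rdataset found by a stale lookup and mark where it came from."""
    rds.attributes |= DNS_RDATASETATTR_STALE_ADDED
    query.attributes |= NS_QUERYATTR_STALEOK
    message.answer.append(rds)


def prepare_normal_response(query: Query, message: Message) -> None:
    """Before a normal lookup builds its response, drop stale-added rdatasets."""
    if query.attributes & NS_QUERYATTR_STALEOK:
        message.answer = [
            rds for rds in message.answer
            if not rds.attributes & DNS_RDATASETATTR_STALE_ADDED
        ]
        query.attributes &= ~NS_QUERYATTR_STALEOK
```

Clearing only the marked rdatasets means the later response from recursion starts from a clean section instead of relying on a duplicate being silently masked.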
      
      (cherry picked from commit 3d5429f6)
    • Simplify when to detach the client · 33d61b96
      Matthijs Mekking authored
      With stale-answer-client-timeout, we may send a response to the client,
      but we may want to hold on to the network manager handle, because
      recursion is going on in the background, or we need to refresh a
      stale RRset.
      
      Simplify the setting of 'nodetach':
      * During a staleonly lookup we should not detach the nmhandle, so just
        set 'nodetach' prior to 'query_lookup()'.
      * During a staleonly "stalefirst" lookup set 'nodetach' to true
        if we are going to refresh the RRset.
      
      There is no longer any need to clear 'nodetach' if we go
      through the "dbfind_stale", "stale_refresh_window", or "stale_only"
      paths.
      
      (cherry picked from commit 48b0dc15)
    • Refactor stale lookups, ignore active RRsets · b1496d19
      Matthijs Mekking authored
      When doing a staleonly lookup, ignore active RRsets from cache. If we
      don't, we may add a duplicate RRset to the message, and hit an
      assertion failure in query.c because adding the duplicate RRset to the
      ANSWER section failed.
      
      This can happen due to a race condition. When a client query is received,
      the recursion is started. When 'stale-answer-client-timeout' triggers
      around the same time the recursion completes, the following sequence
      of events may happen:
      1. Queue the "try stale" fetch_callback() event to the client task.
      2. Add the RRsets from the authoritative response to the cache.
      3. Queue the "fetch complete" fetch_callback() event to the client task.
      4. Execute the "try stale" fetch_callback(), which retrieves the
         just-inserted RRset from the database.
      5. In "ns_query_done()" we are still recursing, but the "staleonly"
         query attribute has already been cleared. In other words, the
         query will resume when recursion ends (it already has ended but is
         still on the task queue).
      6. Execute the "fetch complete" fetch_callback(). It finds the answer
         from recursion in the cache again and tries to add the duplicate to
         the answer section.
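
The sequence above can be replayed with a toy event queue. This is a Python simulation under simplifying assumptions, not BIND code: `run_race()`, the `ignore_active` switch, the cache layout, and the sample name and address are all hypothetical, and a `RuntimeError` stands in for the assertion failure in query.c.

```python
from collections import deque


def run_race(ignore_active: bool) -> list:
    """Replay the six steps; return the ANSWER section or raise on a duplicate."""
    cache = {}        # name -> (rrset, is_stale)
    answer = []       # ANSWER section of the message being built
    events = deque()  # the client task's event queue

    events.append("try stale")                    # step 1
    cache["data.example"] = ("192.0.2.1", False)  # step 2: fresh, not stale
    events.append("fetch complete")               # step 3

    while events:                                 # steps 4 to 6
        event = events.popleft()
        rrset, is_stale = cache["data.example"]
        if event == "try stale" and ignore_active and not is_stale:
            # Fixed behavior: this lookup only considers stale RRsets.
            continue
        if rrset in answer:
            raise RuntimeError("duplicate RRset in ANSWER section")
        answer.append(rrset)
    return answer
```

Calling `run_race(False)` reproduces the duplicate, while `run_race(True)` lets only the "fetch complete" event add the RRset.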
      
      This commit changes the logic for finding stale answers in the cache,
      such that on "stale_only" lookups actually only stale RRsets are
      considered. It refactors the code so that code paths for "dbfind_stale",
      "stale_refresh_window", and "stale_only" are more clear.
      
      First we call some generic code that applies in all three cases:
      format the domain name for logging purposes, increment the
      trystale stats, and check if we actually found stale data that we can
      use.
      
      The "dbfind_stale" lookup will return SERVFAIL if we didn't find a
      usable answer, otherwise we will continue with the lookup
      (query_gotanswer()). This is no different from before the introduction
      of "stale-answer-client-timeout" and "stale-refresh-time".
      
      The "stale_refresh_window" lookup is similar to the "dbfind_stale"
      lookup: return SERVFAIL if we didn't find a usable answer, otherwise
      continue with the lookup (query_gotanswer()).
      
      Finally the "stale_only" lookup.
      
      If the "stale_only" lookup was triggered because of an actual client
      timeout (stale-answer-client-timeout > 0), and if the database lookup
      returned a usable stale RRset, trigger a response to the client.
      Otherwise return and wait until the recursion completes (or the
      resolver query times out).
      
      If the "stale_only" lookup is a "stale-answer-client-timeout 0" lookup,
      we prefer stale data over a lookup. In this case if there was no stale
      data, or the data was not a positive answer, retry the lookup with the
      stale options cleared, a.k.a. a normal lookup. Otherwise, continue
      with the lookup (query_gotanswer()) and refresh the stale RRset. This
      will trigger a response to the client, but will not detach the handle
      because a fetch will be created to refresh the RRset.
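
The three code paths described above can be summarized as one decision function. This is a Python sketch of the control flow only: the `Action` enum and `stale_action()` are hypothetical, and `stale_found` is assumed to already mean "usable, positive stale data was found".

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    SERVFAIL = "return SERVFAIL"
    CONTINUE = "continue with query_gotanswer()"
    RESPOND_NOW = "send the stale answer to the client"
    WAIT = "wait for recursion to complete"
    RETRY_NORMAL = "retry the lookup with the stale options cleared"
    CONTINUE_AND_REFRESH = "continue with query_gotanswer() and refresh"


def stale_action(path: str, stale_found: bool,
                 client_timeout: Optional[int] = None) -> Action:
    """Pick the next step for a stale lookup on the given code path."""
    if path in ("dbfind_stale", "stale_refresh_window"):
        return Action.CONTINUE if stale_found else Action.SERVFAIL
    if path == "stale_only":
        if client_timeout is None:
            raise ValueError("stale_only needs the client timeout value")
        if client_timeout > 0:
            # An actual client timeout fired while recursion is running.
            return Action.RESPOND_NOW if stale_found else Action.WAIT
        # stale-answer-client-timeout 0: stale data is tried first.
        return Action.CONTINUE_AND_REFRESH if stale_found else Action.RETRY_NORMAL
    raise ValueError(f"unknown path: {path}")
```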
      
      (cherry picked from commit 92f7a678)
    • Keep track of allow client detach · fcf8fb4f
      Matthijs Mekking authored
      The stale-answer-client-timeout feature introduced a dependency on
      when a client may be detached from the handle. The dboption
      DNS_DBFIND_STALEONLY was reused to track this attribute. This overloads
      the meaning of this database option, and actually introduced a bug
      because the option was checked in other places. In particular, in
      'ns_query_done()' there is a check for 'RECURSING(qctx->client) &&
      (!QUERY_STALEONLY(&qctx->client->query) || ...' and the condition is
      satisfied because recursion has not completed yet and
      DNS_DBFIND_STALEONLY is already cleared by that time (in
      query_lookup()), because we found a useful answer and we should detach
      the client from the handle after sending the response.
      
      Add a new boolean to the client structure to keep track of whether
      detaching the client from the handle is allowed. It is only disallowed
      if we are in a staleonly lookup and we didn't find a useful answer.
      
      (cherry picked from commit fee16424)
  4. 01 Apr, 2021 7 commits
    • Merge branch '2607-remove-custom-spnego-v9_16' into 'v9_16' · bcae8ec0
      Ondřej Surý authored
      Remove custom ISC SPNEGO implementation (v9.16)
      
      See merge request !4855
    • Add CHANGES and release note for GL #2607 · 99132eda
      Mark Andrews authored
    • Move the dummy shims to single ifndef GSSAPI block · 565a6a56
      Ondřej Surý authored
      Previously, every function had its own #ifdef GSSAPI #else #endif block
      that defined a shim function in case GSSAPI was not being used.  Now the
      dummy shim functions have been split out into a single #else #endif block
      at the end of the file.
      
      This makes gssapictx.c similar to the 9.17.x code, making the backports
      and reviews easier.
    • Add Heimdal compatibility support · 3fd30e16
      Mark Andrews authored
      The Heimdal Kerberos library handles the OID sets in a different manner.
      Unify the handling of the OID sets between MIT and Heimdal
      implementations by dynamically creating the OID sets instead of using
      a static predefined set.  This is how upstream recommends handling the
      OID sets.
    • Request krb5 CFLAGS and LIBS from $KRB5_CONFIG · 6b0b0c6a
      Mark Andrews authored
      The GSSAPI now needs both gssapi and krb5 libraries, so we need to
      request both CFLAGS and LIBS from the configure script.
    • Remove custom ISC SPNEGO implementation · a875dcc6
      Mark Andrews authored
      The custom ISC SPNEGO mechanism implementation is no longer needed
      because all major Kerberos 5/GSSAPI implementations (mit-krb5, heimdal
      and Windows) have supported the SPNEGO mechanism since 2006.
      
      This commit removes the custom ISC SPNEGO implementation, and removes
      the option from both the autoconf and win32 Configure scripts.  Unknown
      options are ignored, so this doesn't require any special handling.
    • Handle expected signals in tsiggss authsock.pl script · 216a9718
      Mark Andrews authored
      When the authsock.pl script was terminated by a signal, it left the
      pidfile behind.  This commit adds a signal handler that cleans up
      the pidfile on signals that are expected.
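
The cleanup pattern can be illustrated as follows. This is a Python sketch of the general technique, not the Perl code in authsock.pl: the pidfile name, the helper functions, and the choice of SIGINT and SIGTERM as the "expected" signals are assumptions.

```python
import os
import signal
import sys

PIDFILE = "authsock.pid"  # assumed file name, for illustration only


def write_pidfile(path: str = PIDFILE) -> None:
    """Record the current process ID in the pidfile."""
    with open(path, "w") as f:
        f.write(f"{os.getpid()}\n")


def remove_pidfile(path: str = PIDFILE) -> None:
    """Remove the pidfile; a missing file is not an error."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass


def make_handler(path: str = PIDFILE):
    """Build a signal handler that removes the pidfile, then exits."""
    def handler(signum, frame):
        remove_pidfile(path)
        sys.exit(0)
    return handler


def install_cleanup(path: str = PIDFILE,
                    signals=(signal.SIGINT, signal.SIGTERM)) -> None:
    """Register the cleanup handler for each expected signal."""
    handler = make_handler(path)
    for sig in signals:
        signal.signal(sig, handler)
```

A script would call `write_pidfile()` and `install_cleanup()` once at startup, from the main thread, before entering its main loop.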
  5. 31 Mar, 2021 2 commits
  6. 26 Mar, 2021 1 commit