1. 23 Nov, 2020 4 commits
    • Evan Hunt's avatar
      Merge branch '2288-dig-interrupt-crash' into 'main' · dbb4c3a0
      Evan Hunt authored
      Resolve "'dig' crashes when interrupted while waiting for a TCP connection"
      
      Closes #2288
      
      See merge request !4397
      dbb4c3a0
    • Evan Hunt's avatar
      dig could crash on interrupt · 17145e4e
      Evan Hunt authored
      dig could crash if it was shut down by an interrupt while a connection
      was pending.
      17145e4e
    • Michał Kępień's avatar
      Merge branch 'michal/enable-stress-tests-to-be-run-on-demand' into 'main' · a4487688
      Michał Kępień authored
      Enable "stress" tests to be run on demand
      
      See merge request !4313
      a4487688
    • Michał Kępień's avatar
      Enable "stress" tests to be run on demand · f2309422
      Michał Kępień authored
      The "stress" test can be run in different ways, depending on:
      
        - the tested scenario (authoritative, recursive),
        - the operating system used (Linux, FreeBSD),
        - the architecture used (amd64, arm64).
      
      Currently, all supported "stress" test variants are automatically
      launched for all scheduled pipelines and for pipelines started for tags;
      there is no possibility of running these tests on demand, which could be
      useful in certain circumstances.
      
      Employ the "only:variables" key to enable fine-grained control over the
      list of "stress" test jobs to be run for a given pipeline.  Three CI
      variables are used to specify the list of "stress" test jobs to create:
      
        - BIND_STRESS_TEST_MODE: specifies the test mode to use; must be
          explicitly set in order for any "stress" test job to be created;
          allowed values are: "authoritative", "recursive",
      
        - BIND_STRESS_TEST_OS: specifies the operating system to run the test
          on; allowed values are: "linux", "freebsd"; defaults to "linux", may
          be overridden at pipeline creation time,
      
        - BIND_STRESS_TEST_ARCH: specifies the architecture to run the test
          on; allowed values are: "amd64", "arm64"; defaults to "amd64", may
          be overridden at pipeline creation time.
      
      Since case-insensitive regular expressions are used for determining
      which jobs to run, every variable described above may contain multiple
      values.  For example, setting the BIND_STRESS_TEST_MODE variable to
      "authoritative,recursive" will cause the "stress" test to be run in both
      supported scenarios (either on the default OS/architecture combination,
      i.e. Linux/amd64, or, if the relevant variables are explicitly
      specified, the requested OS/architecture combinations).
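
      The matching rule described above can be illustrated outside of CI.
      The following is only a sketch in C using POSIX regular expressions; it
      is not how GitLab evaluates "only:variables", and the function name
      variable_selects() is hypothetical.  It shows why a value such as
      "authoritative,recursive" selects both scenarios: each job's pattern
      only has to match somewhere in the variable, regardless of case.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true if 'pattern' matches anywhere in
 * 'value', ignoring case.  A NULL value (variable not set) never matches,
 * mirroring the rule that the mode must be set explicitly.
 */
static bool
variable_selects(const char *value, const char *pattern) {
	regex_t re;
	bool matched;

	if (value == NULL) {
		return (false);
	}
	if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0) {
		return (false);
	}
	/* Unanchored search: a comma-separated list matches each entry. */
	matched = (regexec(&re, value, 0, NULL, 0) == 0);
	regfree(&re);
	return (matched);
}
```

      With this helper, variable_selects("authoritative,recursive",
      "recursive") and variable_selects("Recursive", "recursive") both hold,
      while variable_selects("authoritative", "recursive") does not.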
      f2309422
  2. 19 Nov, 2020 4 commits
  3. 18 Nov, 2020 2 commits
  4. 11 Nov, 2020 21 commits
    • Ondřej Surý's avatar
      Merge branch '2255-dig-crashed-in-tcp_connected-on-openbsd' into 'main' · ff2bc789
      Ondřej Surý authored
      Turn all the callback to be always asynchronous
      
      Closes #2255
      
      See merge request !4386
      ff2bc789
    • Ondřej Surý's avatar
      Turn all the callback to be always asynchronous · a49d8856
      Ondřej Surý authored
      When calling the high-level netmgr functions, the callback was
      sometimes called synchronously, if the failure was caught directly, or
      asynchronously, if it happened later.  The synchronous call to the
      callback could create deadlocks, as the caller would not expect the
      failure callback to be executed directly.
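
      A minimal sketch of the idea, not the netmgr implementation: every
      result, including an immediate failure, is queued and delivered later
      from the event loop, so the callback never runs inside the caller's
      stack frame.  All names below (start_connect, run_pending,
      record_result, the fixed-size queue) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*result_cb)(int result, void *arg);

struct pending {
	result_cb cb;
	int result;
	void *arg;
};

#define MAX_PENDING 16
static struct pending queue[MAX_PENDING];
static size_t queue_len = 0;

/* Defer a callback instead of invoking it directly. */
static bool
enqueue_callback(result_cb cb, int result, void *arg) {
	if (queue_len == MAX_PENDING) {
		return (false);
	}
	queue[queue_len++] = (struct pending){ cb, result, arg };
	return (true);
}

/*
 * Hypothetical connect call.  Even when the failure is detected right
 * away, the callback is only queued, so the caller may safely hold its
 * own locks while calling this function.
 */
static bool
start_connect(bool fail_immediately, result_cb cb, void *arg) {
	return (enqueue_callback(cb, fail_immediately ? -1 : 0, arg));
}

/* Event-loop step: deliver every queued callback; returns how many ran. */
static size_t
run_pending(void) {
	size_t ran = 0;

	for (size_t i = 0; i < queue_len; i++) {
		queue[i].cb(queue[i].result, queue[i].arg);
		ran++;
	}
	queue_len = 0;
	return (ran);
}

/* Example callback: store the result in the int that 'arg' points to. */
static void
record_result(int result, void *arg) {
	*(int *)arg = result;
}
```

      After start_connect(true, record_result, &seen) returns, 'seen' is
      still untouched; it only changes once run_pending() is called.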
      a49d8856
    • Diego dos Santos Fronza's avatar
      Merge branch '2066-fix-serve-stale' into 'main' · fece7a48
      Diego dos Santos Fronza authored
      Resolve "Fix serve-stale so that it is usable when needed"
      
      Closes #2066
      
      See merge request !4273
      fece7a48
    • Diego dos Santos Fronza's avatar
      Update ARM and other documents · 1ba2215c
      Diego dos Santos Fronza authored
      1ba2215c
    • Diego dos Santos Fronza's avatar
      b4c99753
    • Diego dos Santos Fronza's avatar
    • Diego dos Santos Fronza's avatar
    • Diego dos Santos Fronza's avatar
      Check 'stale-refresh-time' when sharing cache between views · 581e2a8f
      Diego dos Santos Fronza authored
      This commit ensures that, along with previous restrictions, a cache is
      shareable between views only if their 'stale-refresh-time' values are
      equal.
      581e2a8f
    • Matthijs Mekking's avatar
      Add two more system tests for stale-refresh-time · e99671e8
      Matthijs Mekking authored
      Add one test that checks the behavior when serve-stale is enabled
      via configuration (as opposed to enabled via rndc).
      
      Add one test that checks the behavior when stale-refresh-time is
      disabled (set to 0).
      e99671e8
    • Matthijs Mekking's avatar
      Change serve-stale test stale-answer-ttl · dee778de
      Matthijs Mekking authored
      Using a 'stale-answer-ttl' with the same value as the authoritative
      TTL makes it hard to differentiate between a response from the
      stale cache and a response from the authoritative server.
      
      Change the stale-answer-ttl from 2 to 4, so that it differs from the
      authoritative ttl.
      dee778de
    • Diego dos Santos Fronza's avatar
      Wait for multiple parallel dig commands to fully finish · cc70ea86
      Diego dos Santos Fronza authored
      The strategy of running many dig commands in parallel and
      waiting for the respective output files to be non-empty
      resulted in random, hard-to-reproduce test failures: a
      subsequent read of a file could fail because the file's
      content had not been fully flushed.
      
      Instead of checking if output files are non empty, we now wait
      for the dig processes to finish.
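
      The system test script itself is not shown here.  As a rough
      illustration of the same principle in C, the sketch below starts
      several child processes and blocks until each one has terminated,
      rather than polling for output.  The function name
      run_parallel_and_wait() is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start 'count' copies of the command in argv in parallel, then wait for
 * every child to terminate.  Returns the number of children that exited
 * with status 0.  Output written by a child is complete once it has been
 * reaped, so it is safe to read only after this function returns.
 */
static int
run_parallel_and_wait(char *const argv[], int count) {
	int started = 0;
	int succeeded = 0;

	for (int i = 0; i < count; i++) {
		pid_t pid = fork();
		if (pid == 0) {
			execvp(argv[0], argv);
			_exit(127); /* only reached if exec failed */
		}
		if (pid > 0) {
			started++;
		}
	}

	for (int i = 0; i < started; i++) {
		int status;
		if (wait(&status) > 0 && WIFEXITED(status) &&
		    WEXITSTATUS(status) == 0)
		{
			succeeded++;
		}
	}
	return (succeeded);
}
```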
      cc70ea86
    • Diego dos Santos Fronza's avatar
      Added system test for stale-refresh-time · a3dbc5fb
      Diego dos Santos Fronza authored
      This test works as follows:
      - Query for data.example rrset.
      - Sleep until its TTL expires (2 secs).
      - Disable authoritative server.
      - Query for data.example again.
      - Since the server is down, the answer comes from the stale cache,
        which has a configured stale-answer-ttl of 3 seconds.
      - Enable authoritative server.
      - Query for data.example again.
      - Since the last query before activating the authoritative server
        failed, and since 'stale-refresh-time' seconds haven't elapsed yet,
        the answer should come from the stale cache and not from the
        authoritative server.
      a3dbc5fb
    • Diego dos Santos Fronza's avatar
      Adjusted ancient rrset system test · fc074f15
      Diego dos Santos Fronza authored
      Before the stale-refresh-time feature, the system test for ancient rrset
      was somewhat based on the average time the previous tests and queries
      were taking, thus not very precise.
      
      After the addition of stale-refresh-time, the system test for ancient
      rrsets started to fail, because the queries for stale records (low
      max-stale-ttl) no longer took the time to do a full resolution: the
      answers now came from the cache (because the rrsets were stale and
      within the stale-refresh-time window after the previous resolution
      failure).
      
      To handle this, the correct time to wait before an rrset becomes
      ancient is calculated from the max-stale-ttl configuration plus the TTL
      set in the rrset used in the tests (ans2/ans.pl).
      
      Then, before sending queries for ancient rrsets, we check whether we
      need to sleep long enough to ensure those rrsets will be marked as
      ancient.
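
      The calculation above amounts to simple arithmetic.  A sketch in C,
      with hypothetical names and no claim to match the test script: the
      rrset turns ancient once its TTL plus max-stale-ttl have passed, so the
      test only has to sleep for whatever part of that interval is left.

```c
/*
 * Hypothetical helper: seconds still to wait before an rrset cached
 * 'elapsed' seconds ago becomes ancient.  Returns 0 if enough time has
 * already passed.
 */
static long
seconds_until_ancient(long rrset_ttl, long max_stale_ttl, long elapsed) {
	long remaining = rrset_ttl + max_stale_ttl - elapsed;

	return (remaining > 0 ? remaining : 0);
}
```

      For instance, with a TTL of 2, a max-stale-ttl of 5 and 4 seconds
      already elapsed, 3 more seconds of sleep are needed (the values are
      illustrative only).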
      fc074f15
    • Diego dos Santos Fronza's avatar
      Warn if 'stale-refresh-time' < 30 (default) · 5e47a13f
      Diego dos Santos Fronza authored
      RFC 8767 recommends that refresh attempts be made no more
      frequently than every 30 seconds.
      
      Added a check to named-checkconf, which will warn if values below the
      default are found in the configuration.
      
      BIND will also log the warning during loading of configuration in the
      same fashion.
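
      A sketch of such a check, not the named-checkconf code itself.  The
      names are hypothetical, and exempting 0 (which disables the feature)
      from the warning is an assumption of this sketch rather than something
      stated above.

```c
#include <stdbool.h>
#include <stdint.h>

#define STALE_REFRESH_TIME_DEFAULT 30 /* seconds, per RFC 8767 */

/*
 * Hypothetical helper: return true if the configured value deserves a
 * warning.  Assumption: 0 means "disabled" and is not warned about.
 */
static bool
stale_refresh_time_needs_warning(uint32_t value) {
	return (value != 0 && value < STALE_REFRESH_TIME_DEFAULT);
}
```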
      5e47a13f
    • Diego dos Santos Fronza's avatar
      Add stale-refresh-time option · 4827ad0e
      Diego dos Santos Fronza authored
      Before this update, BIND would attempt to do a full recursive resolution
      process for each query received if the requested rrset had its TTL
      expired. Only if the resolution failed for any reason would BIND
      check for a stale rrset in cache (if 'stale-cache-enable' and
      'stale-answer-enable' are on).
      
      The problem with this approach is that if an authoritative server is
      unreachable or is failing to respond, it is very unlikely that the
      problem will be fixed within the next few seconds.
      
      A better approach to improve performance in those cases is to mark the
      moment at which a resolution failed and, if new queries arrive for that
      same rrset, try to respond directly from the stale cache, and do that
      for a window of time configured via 'stale-refresh-time'.
      
      Only when this interval expires do we try to do a normal refresh of
      the rrset.
      
      The logic behind this commit is as follows:
      
      - In query.c / query_gotanswer(), if the test of 'result' variable falls
        to the default case, an error is assumed to have happened, and a call
        to 'query_usestale()' is made to check if serving of stale rrset is
        enabled in configuration.
      
      - If serving of stale answers is enabled, a flag will be turned on in
        the query context to look for stale records:
        query.c:6839
        qctx->client->query.dboptions |= DNS_DBFIND_STALEOK;
      
      - A call to query_lookup() will be made again; inside it, a call to
        'dns_db_findext()' is made, which in turn will invoke rbtdb.c /
        cache_find().
      
      - In rbtdb.c / cache_find(), the important part of this change is the
        call to 'check_stale_header()', a function that yields true
        if we should skip the stale entry, or false if we should consider it.
      
      - In check_stale_header() we now check if the DNS_DBFIND_STALEOK option
        is set; if that is the case, we know that this new search for stale
        records was made due to a failure in a normal resolution, so we keep
        track of the time at which the failure occurred in rbtdb.c:4559:
        header->last_refresh_fail_ts = search->now;
      
      - In check_stale_header(), if DNS_DBFIND_STALEOK is not set, then we
        know this is a normal lookup; if the record is stale and the query
        time falls within the stale-refresh-time window that starts at the
        last failure time, then we return false so cache_find() knows it can
        consider this stale rrset entry to return as a response.
      
      The last additions are two new methods to the database interface:
      - setservestale_refresh
      - getservestale_refresh
      
      Those were added so rbtdb can be aware of the value set in the
      configuration option, since at that level we have no access to the
      view object.
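
      The decision described for check_stale_header() can be condensed into
      a small sketch.  This is not the rbtdb.c code: the struct, the flag
      constant and the function name are stand-ins, the header is assumed to
      be already known to be stale, and treating a zero timestamp as "no
      failure recorded" and a zero window as "disabled" are assumptions of
      the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define FIND_STALEOK 0x01u /* stand-in for DNS_DBFIND_STALEOK */

struct header_sketch {
	uint32_t last_refresh_fail_ts; /* 0: no failure recorded (assumed) */
};

/*
 * Returns true if the stale entry should be skipped, false if it may be
 * considered as an answer.
 */
static bool
skip_stale_header(struct header_sketch *header, unsigned int options,
		  uint32_t now, uint32_t stale_refresh_time) {
	if ((options & FIND_STALEOK) != 0) {
		/*
		 * Second lookup after a failed resolution: remember when
		 * the failure happened and allow the stale entry.
		 */
		header->last_refresh_fail_ts = now;
		return (false);
	}

	/*
	 * Normal lookup: serve the stale entry only while we are still
	 * inside the window that started at the last failure.
	 */
	if (stale_refresh_time != 0 && header->last_refresh_fail_ts != 0 &&
	    now < header->last_refresh_fail_ts + stale_refresh_time)
	{
		return (false);
	}

	return (true);
}
```

      With a 30-second window and a failure recorded at time 100, a normal
      lookup at time 110 may use the stale entry, while one at time 130 skips
      it and triggers a regular refresh.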
      4827ad0e
    • Michal Nowak's avatar
      Merge branch '1913-remove-unused-leftovers' into 'main' · 04d9ac63
      Michal Nowak authored
      Resolve "Remove unused leftovers"
      
      Closes #1913
      
      See merge request !4366
      04d9ac63
    • Michal Nowak's avatar
      Add CHANGES entry · 096b0b21
      Michal Nowak authored
      096b0b21
    • Michal Nowak's avatar
      Add unused headers check to CI · a0d359bb
      Michal Nowak authored
      a0d359bb
    • Michal Nowak's avatar
      Drop unused headers · 90880522
      Michal Nowak authored
      90880522
    • Michal Nowak's avatar
      Merge branch 'mnowak/drop-OPENSSL_LIB' into 'main' · 221d5049
      Michal Nowak authored
      Drop @OPENSSL_LIB@ in bigkey
      
      See merge request !4383
      221d5049
    • Michal Nowak's avatar
      Drop @OPENSSL_LIB@ in bigkey · 24d5052e
      Michal Nowak authored
      @OPENSSL_LIB@ was brought back with the
      2f9f6f1f revert.
      24d5052e
  5. 10 Nov, 2020 9 commits
    • Mark Andrews's avatar
      Merge branch... · b88a0d7c
      Mark Andrews authored
      Merge branch '2211-tsan-error-previous_closest_nsec-dns_rbt_findnode-vs-subtractrdataset' into 'main'
      
      Resolve "tsan error previous_closest_nsec(dns_rbt_findnode) vs subtractrdataset"
      
      Closes #2211
      
      See merge request !4259
      b88a0d7c
    • Mark Andrews's avatar
      Address TSAN error between dns_rbt_findnode() and subtractrdataset(). · 244f84a8
      Mark Andrews authored
      Having dns_rbt_findnode() in previous_closest_nsec() check
      node->data is an optimisation that triggers a TSAN error with
      subtractrdataset().  find_closest_nsec() still needs to check whether
      the NSEC record is active and look for an earlier NSEC record
      if it isn't.  Set DNS_RBTFIND_EMPTYDATA so node->data isn't referenced
      without the node lock being held.
      
          WARNING: ThreadSanitizer: data race
          Read of size 8 at 0x000000000001 by thread T1 (mutexes: read M1, read M2):
          #0 dns_rbt_findnode lib/dns/rbt.c:1708
          #1 previous_closest_nsec lib/dns/rbtdb.c:3760
          #2 find_closest_nsec lib/dns/rbtdb.c:3942
          #3 zone_find lib/dns/rbtdb.c:4091
          #4 dns_db_findext lib/dns/db.c:536
          #5 query_lookup lib/ns/query.c:5582
          #6 ns__query_start lib/ns/query.c:5505
          #7 query_setup lib/ns/query.c:5229
          #8 ns_query_start lib/ns/query.c:11380
          #9 ns__client_request lib/ns/client.c:2166
          #10 processbuffer netmgr/tcpdns.c:230
          #11 dnslisten_readcb netmgr/tcpdns.c:309
          #12 read_cb netmgr/tcp.c:832
          #13 <null> <null>
          #14 <null> <null>
      
          Previous write of size 8 at 0x000000000001 by thread T2 (mutexes: write M3):
          #0 subtractrdataset lib/dns/rbtdb.c:7133
          #1 dns_db_subtractrdataset lib/dns/db.c:742
          #2 diff_apply lib/dns/diff.c:368
          #3 dns_diff_apply lib/dns/diff.c:459
          #4 do_one_tuple lib/dns/update.c:247
          #5 update_one_rr lib/dns/update.c:275
          #6 delete_if_action lib/dns/update.c:689
          #7 foreach_rr lib/dns/update.c:471
          #8 delete_if lib/dns/update.c:716
          #9 dns_update_signaturesinc lib/dns/update.c:1948
          #10 receive_secure_serial lib/dns/zone.c:15637
          #11 dispatch lib/isc/task.c:1152
          #12 run lib/isc/task.c:1344
          #13 <null> <null>
      
          Location is heap block of size 130 at 0x000000000028 allocated by thread T3:
          #0 malloc <null>
          #1 default_memalloc lib/isc/mem.c:713
          #2 mem_get lib/isc/mem.c:622
          #3 mem_allocateunlocked lib/isc/mem.c:1268
          #4 isc___mem_allocate lib/isc/mem.c:1288
          #5 isc__mem_allocate lib/isc/mem.c:2453
          #6 isc___mem_get lib/isc/mem.c:1037
          #7 isc__mem_get lib/isc/mem.c:2432
          #8 create_node lib/dns/rbt.c:2239
          #9 dns_rbt_addnode lib/dns/rbt.c:1202
          #10 dns_rbtdb_create lib/dns/rbtdb.c:8668
          #11 dns_db_create lib/dns/db.c:118
          #12 receive_secure_db lib/dns/zone.c:16154
          #13 dispatch lib/isc/task.c:1152
          #14 run lib/isc/task.c:1344
          #15 <null> <null>
      
          Mutex M1 (0x000000000040) created at:
          #0 pthread_rwlock_init <null>
          #1 isc_rwlock_init lib/isc/rwlock.c:39
          #2 dns_rbtdb_create lib/dns/rbtdb.c:8527
          #3 dns_db_create lib/dns/db.c:118
          #4 receive_secure_db lib/dns/zone.c:16154
          #5 dispatch lib/isc/task.c:1152
          #6 run lib/isc/task.c:1344
          #7 <null> <null>
      
          Mutex M2 (0x000000000044) created at:
          #0 pthread_rwlock_init <null>
          #1 isc_rwlock_init lib/isc/rwlock.c:39
          #2 dns_rbtdb_create lib/dns/rbtdb.c:8600
          #3 dns_db_create lib/dns/db.c:118
          #4 receive_secure_db lib/dns/zone.c:16154
          #5 dispatch lib/isc/task.c:1152
          #6 run lib/isc/task.c:1344
          #7 <null> <null>
      
          Mutex M3 (0x000000000046) created at:
          #0 pthread_rwlock_init <null>
          #1 isc_rwlock_init lib/isc/rwlock.c:39
          #2 dns_rbtdb_create lib/dns/rbtdb.c:8600
          #3 dns_db_create lib/dns/db.c:118
          #4 receive_secure_db lib/dns/zone.c:16154
          #5 dispatch lib/isc/task.c:1152
          #6 run lib/isc/task.c:1344
          #7 <null> <null>
      
          Thread T1 (running) created by main thread at:
          #0 pthread_create <null>
          #1 isc_thread_create pthreads/thread.c:73
          #2 isc_nm_start netmgr/netmgr.c:232
          #3 create_managers bin/named/main.c:909
          #4 setup bin/named/main.c:1223
          #5 main bin/named/main.c:1523
      
          Thread T2 (running) created by main thread at:
          #0 pthread_create <null>
          #1 isc_thread_create pthreads/thread.c:73
          #2 isc_taskmgr_create lib/isc/task.c:1434
          #3 create_managers bin/named/main.c:915
          #4 setup bin/named/main.c:1223
          #5 main bin/named/main.c:1523
      
          Thread T3 (running) created by main thread at:
          #0 pthread_create <null>
          #1 isc_thread_create pthreads/thread.c:73
          #2 isc_taskmgr_create lib/isc/task.c:1434
          #3 create_managers bin/named/main.c:915
          #4 setup bin/named/main.c:1223
          #5 main bin/named/main.c:1523
      
          SUMMARY: ThreadSanitizer: data race lib/dns/rbt.c:1708 in dns_rbt_findnode
      244f84a8
    • Michal Nowak's avatar
      Merge branch 'mnowak/revert-4350' into 'main' · 79d5a67e
      Michal Nowak authored
      Revert "Drop bigkey"
      
      See merge request !4369
      79d5a67e
    • Michal Nowak's avatar
      Revert "Drop bigkey" · 2f9f6f1f
      Michal Nowak authored
      This reverts commit ef670335.
      
      It is believed that the bigkey test is still useful.
      2f9f6f1f
    • Matthijs Mekking's avatar
      Merge branch 'matthijs-query-header-cleanup' into 'main' · d8caccb9
      Matthijs Mekking authored
      Cleanup query.h duplicate definitions
      
      See merge request !4381
      d8caccb9
    • Matthijs Mekking's avatar
      b7856d26
    • Ondřej Surý's avatar
      Merge branch '1840-netmgr-tls-layer' into 'main' · 152f49b6
      Ondřej Surý authored
      Server-side TLS support in netmgr
      
      Closes #1840
      
      See merge request !3532
      152f49b6
    • Witold Krecicki's avatar
      CHANGES note · bc19dc84
      Witold Krecicki authored
      bc19dc84
    • Ondřej Surý's avatar
      netmgr: Add additional safeguards to netmgr/tls.c · fa424225
      Ondřej Surý authored
      This commit adds a couple of additional safeguards against running
      sends/reads on inactive sockets.  The changes were modeled after the
      changes we made to netmgr/tcpdns.c.
      fa424225