  1. 21 Jul, 2020 6 commits
    • Add CHANGES and release note for #1775 · 2b4f0f03
      Ondřej Surý authored
    • Change the dns_name hashing to use 32-bit values · a9182c89
      Ondřej Surý authored
      Change the dns_name_hash() and dns_name_fullhash() functions to use
      isc_hash32(), as the hashtable size in rbt is limited to the range
      0..UINT32_MAX.
    • Add isc_hash32() and rename isc_hash_function() to isc_hash64() · f59fd49f
      Ondřej Surý authored
      As the names suggest, the original function (now isc_hash64()) returns
      64-bit hash values and isc_hash32() returns 32-bit values.
    • Add HalfSipHash 2-4 reference implementation · 344d66aa
      Ondřej Surý authored
      The HalfSipHash implementation has 32-bit keys and returns a 32-bit
      value.
    • Remove OpenSSL based SipHash 2-4 implementation · 21d751df
      Ondřej Surý authored
      Creation of EVP_MD_CTX and EVP_PKEY is quite expensive, so until we fix the code
      to reuse the OpenSSL contexts and keys we'll use our own implementation of
      siphash instead of trying to integrate with OpenSSL.
    • Fix the rbt hashtable and grow it when setting max-cache-size · e24bc324
      Ondřej Surý authored
      There were several problems with the rbt hashtable implementation:
      
      1. Our internal hashing function returns a uint64_t value, but it was
         silently truncated to unsigned int in the dns_name_hash() and
         dns_name_fullhash() functions.  As the higher bits of SipHash 2-4
         are more random, we need to use the upper half of the return value.
      
      2. The hashtable implementation in rbt.c was using modulo to pick the
         slot number for the hash table.  This has several problems because
         modulo is: a) slow, b) oblivious to patterns in the input data.  This
         could lead to a very uneven distribution of the hashed data in the
         hashtable.  Combined with the singly linked lists we use, it could
         really bog down the lookup and removal of the nodes from the rbt
         tree[a].  Fibonacci hashing is a much better fit for the hashtable
         function here.  For a longer description, read "Fibonacci Hashing: The
         Optimization that the World Forgot"[b] or just look at the Linux
         kernel.  Also this will make Diego very happy :).
      
      3. The hashtable would rehash every time the number of nodes in the rbt
         tree exceeded 3 * (hashtable size).  The overcommit makes the
         uneven distribution in the hashtable even worse, but the main problem
         lies in the rehashing - every time the database grows beyond the
         limit, each subsequent rehashing will be much slower.  The mitigation
         here is letting the rbt know how big the cache can grow and
         pre-allocating the hashtable to be big enough to never actually need
         to rehash.  This will consume more memory at the start, but since the
         size of the hashtable is capped to `1 << 32` (i.e. about 4 billion
         entries), it will only consume a maximum of 32GB of memory for the
         hashtable in the worst case (and max-cache-size would need to be set
         to more than 4TB).  Calling dns_db_adjusthashsize() will also cap the
         maximum size of the hashtable to the pre-computed number of bits, so
         it won't try to consume more gigabytes of memory than available for
         the database.
      
         FIXME: What is the average size of the rbt node that gets hashed?  I
         chose the pagesize (4k) as initial value to precompute the size of
         the hashtable, but the value is based on feeling and not any real
         data.
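
The pre-sizing described in item 3 can be sketched roughly as follows.  This
is only an illustration, not the code from rbt.c: the function names, the
minimum of 4 bits, and the 8-byte slot size (64-bit pointers) are assumptions
made for this example; the 4096-byte node estimate and the 32-bit cap come
from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define ASSUMED_NODE_SIZE UINT64_C(4096) /* the page-size guess from the FIXME */
#define MIN_HASH_BITS	  4u		 /* hypothetical lower bound */
#define MAX_HASH_BITS	  32u		 /* cap stated in the commit message */
#define ASSUMED_SLOT_SIZE UINT64_C(8)	 /* assumes one 64-bit pointer per slot */

/*
 * Pick the smallest number of bits such that (1 << bits) slots cover the
 * estimated number of nodes that fit into max_cache_size, capped at
 * MAX_HASH_BITS.
 */
static unsigned int
hash_bits_for_cache(uint64_t max_cache_size) {
	uint64_t nodes = max_cache_size / ASSUMED_NODE_SIZE;
	unsigned int bits = MIN_HASH_BITS;

	while (bits < MAX_HASH_BITS && (UINT64_C(1) << bits) < nodes) {
		bits++;
	}
	return bits;
}

/* Memory needed for the slot array alone, in bytes. */
static uint64_t
hash_table_bytes(unsigned int bits) {
	assert(bits <= MAX_HASH_BITS);
	return (UINT64_C(1) << bits) * ASSUMED_SLOT_SIZE;
}
```

Under these assumptions, a 1 GiB cache yields 18 bits (262144 slots, 2 MiB of
slot pointers), and the 32-bit cap corresponds to the 32GB worst case
mentioned above.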
      
      For future work, there are more places where we use the hash value
      modulo some small number, and those would benefit from Fibonacci
      hashing to get a better distribution.
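
To illustrate items 1 and 2, here is a minimal sketch of Fibonacci hashing
next to plain modulo.  It is not the rbt.c implementation: the function names
are made up for this example, and the multiplier is the commonly used 32-bit
golden-ratio constant (2^32 divided by the golden ratio), which may differ
from the constant BIND or the Linux kernel actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32 / golden ratio, the classic multiplicative hashing constant. */
#define GOLDEN_RATIO_32 UINT32_C(0x9E3779B9)

/* Item 1: keep the upper half of a 64-bit hash instead of truncating it. */
static uint32_t
upper_half(uint64_t hash64) {
	return (uint32_t)(hash64 >> 32);
}

/*
 * Item 2: Fibonacci hashing.  Multiply (wrapping at 32 bits) and keep the
 * top 'bits' bits, giving a slot in a table of (1 << bits) entries.
 */
static uint32_t
fib_slot(uint32_t hash, unsigned int bits) {
	assert(bits >= 1 && bits <= 32);
	return (uint32_t)(hash * GOLDEN_RATIO_32) >> (32 - bits);
}

/* The modulo approach, for comparison. */
static uint32_t
mod_slot(uint32_t hash, uint32_t size) {
	assert(size > 0);
	return hash % size;
}
```

With a 16-slot table, the inputs 16, 32 and 48 all land in slot 0 under
modulo, while `fib_slot(x, 4)` places them in slots 14, 12 and 10.  This is
the "oblivious to patterns in the input data" problem in miniature.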
      
      Notes:
      a. A doubly linked list should be used here to speed up the removal of
         the entries from the hashtable.
      b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/
  2. 20 Jul, 2020 1 commit
  3. 17 Jul, 2020 3 commits
    • Check tests for core files regardless of test status · 1b13123c
      Michal Nowak authored
      Failed tests should also be checked for core files and similar
      artifacts, and have a backtrace generated.
    • Rationalize backtrace logging · 05c13e50
      Michal Nowak authored
      The GDB backtrace generated via "thread apply all bt full" is too long
      for standard output, so let's save it to a .txt file among the other
      log files.
    • Ensure various test issues are treated as failures · b232e858
      Michal Nowak authored
      Make sure bin/tests/system/run.sh returns a non-zero exit code if any of
      the following happens:
      
        - the test being run produces a core dump,
        - assertion failures are found in the test's logs,
        - ThreadSanitizer reports are found after the test completes,
        - the servers started by the test fail to shut down cleanly.
      
      This change is necessary to always fail a test in such cases (before the
      migration to Automake, test failures were determined based on the
      presence of "R:<test-name>:FAIL" lines in the test suite output and thus
      it was not necessary for bin/tests/system/run.sh to return a non-zero
      exit code).
  4. 16 Jul, 2020 5 commits
  5. 15 Jul, 2020 15 commits
  6. 14 Jul, 2020 10 commits
    • Merge branch '2006-coverity-checked-return-keymgr' into 'main' · f8ef2c04
      Matthijs Mekking authored
      Fix Coverity keymgr reports
      
      Closes #2006
      
      See merge request !3808
    • Check return value of dst_key_getbool() · e645d2ef
      Matthijs Mekking authored
      Fix Coverity CHECKED_RETURN reports for dst_key_getbool().  In most
      cases we do not really care about its return value, but it is prudent
      to check it.
      
      In one case, where a dst_key_getbool() error should be treated
      identically to success, cast the return value to void and add a relevant
      comment.
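
The two patterns described here look roughly like this.  The sketch uses
hypothetical stand-ins (result_t, get_flag() and both callers) rather than
the real dst_key_getbool() API, so treat it as an illustration of the idiom
only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for isc_result_t. */
typedef enum { R_SUCCESS = 0, R_NOTFOUND = 1 } result_t;

/* Hypothetical getter: fails when no value is stored. */
static result_t
get_flag(const bool *stored, bool *valuep) {
	assert(valuep != NULL);
	if (stored == NULL) {
		return R_NOTFOUND;
	}
	*valuep = *stored;
	return R_SUCCESS;
}

/* Usual case: check the result before trusting the output value. */
static bool
flag_is_set(const bool *stored) {
	bool value = false;
	result_t result = get_flag(stored, &value);

	if (result != R_SUCCESS) {
		return false;
	}
	return value;
}

/* Special case: an error is treated the same as success. */
static bool
flag_or_default(const bool *stored) {
	bool value = false;

	/* Ignored on purpose: on failure 'value' keeps its default. */
	(void)get_flag(stored, &value);
	return value;
}
```

The explicit (void) cast plus comment tells both Coverity and the next reader
that ignoring the result is intentional.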
    • Merge branch 'michal/use-image-key-in-qemu-based-ci-job-templates' into 'main' · df72c522
      Michał Kępień authored
      Use "image" key in QEMU-based CI job templates
      
      See merge request !3855
    • Use "image" key in QEMU-based CI job templates · 72201bad
      Michał Kępień authored
      Our GitLab Runner Custom executor scripts now use the "image" key
      instead of the job name for determining the QCOW2 image to use for a
      given CI job.  Update .gitlab-ci.yml to reflect that change.
    • Merge branch 'u/fanf2/fix-signing' into 'main' · c53bfb30
      Mark Andrews authored
      Fix re-signing when `sig-validity-interval` has two arguments
      
      See merge request !3735
    • Add release note for [GL !3735] · 3ff60b88
      Mark Andrews authored
    • Add CHANGES note for [GL !3735] · f4fbca6e
      Mark Andrews authored
    • Add regression test for [GL !3735] · 11ecf790
      Mark Andrews authored
      Check that the resign interval is actually in days rather than hours
      by checking that the RRSIGs are all within the allowed day range.
    • Fix re-signing when `sig-validity-interval` has two arguments · 030674b2
      Tony Finch authored
      Since October 2019 I have had complaints from `dnssec-cds` reporting
      that the signatures on some of my test zones had expired. These were
      zones signed by BIND 9.15 or 9.17, with a DNSKEY TTL of 24h and
      `sig-validity-interval 10 8`.
      
      This is the same setup we have used for our production zones since
      2015, which is intended to re-sign the zones every 2 days, keeping
      at least 8 days signature validity. The SOA expire interval is 7
      days, so even in the presence of zone transfer problems, no-one
      should ever see expired signatures. (These timers are a bit too
      tight to be completely correct, because I should have increased
      the expiry timers when I increased the DNSKEY TTLs from 1h to 24h.
      But that should only matter when zone transfers are broken, which
      was not the case for the error reports that led to this patch.)
      
      For example, this morning my test zone contained:
      
              dev.dns.cam.ac.uk. 86400 IN RRSIG DNSKEY 13 5 86400 (
                                      20200701221418 20200621213022 ...)
      
      But one of my resolvers had cached:
      
              dev.dns.cam.ac.uk. 21424 IN RRSIG DNSKEY 13 5 86400 (
                                      20200622063022 20200612061136 ...)
      
      This TTL was captured at 20200622105807 so the resolver cached the
      RRset 64976 seconds previously (18h02m56s), at 20200621165511
      only about 12h before expiry.
      
      The other symptom of this error was incorrect `resign` times in
      the output from `rndc zonestatus`.
      
      For example, I have configured a test zone
      
              zone fast.dotat.at {
                      file "../u/z/fast.dotat.at";
                      type primary;
                      auto-dnssec maintain;
                      sig-validity-interval 500 499;
              };
      
      The zone is reset to a minimal zone containing only SOA and NS
      records, and when `named` starts it loads and signs the zone. After
      that, `rndc zonestatus` reports:
      
              next resign node: fast.dotat.at/NS
              next resign time: Fri, 28 May 2021 12:48:47 GMT
      
      The resign time should be within the next 24h, but instead it is
      near the signature expiry time, which the RRSIG(NS) says is
      20210618074847. (Note 499 hours is a bit more than 20 days.)
      May/June 2021 is less than 500 days from now because expiry time
      jitter is applied to the NS records.
      
      Using this test I bisected this bug to 09990672 which contained a
      mistake leading to the resigning interval always being calculated in
      hours, when days are expected.
      
      This bug only occurs for configurations that use the two-argument form
      of `sig-validity-interval`.
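
The arithmetic behind both symptoms can be sketched as below.  This is not
the patched code: the function names are invented for this example, and the
unit rule (second argument in days when the validity interval exceeds 7 days,
otherwise in hours) is my reading of the documented `sig-validity-interval`
behaviour.

```c
#include <assert.h>
#include <stdint.h>

#define SECONDS_PER_HOUR UINT32_C(3600)
#define SECONDS_PER_DAY	 UINT32_C(86400)

/* Expected conversion of the second argument (assumed unit rule). */
static uint32_t
resign_seconds(uint32_t validity_days, uint32_t resign_arg) {
	assert(validity_days > 0);
	if (validity_days > 7) {
		return resign_arg * SECONDS_PER_DAY;
	}
	return resign_arg * SECONDS_PER_HOUR;
}

/* The behaviour described in the bug report: always hours. */
static uint32_t
resign_seconds_always_hours(uint32_t resign_arg) {
	return resign_arg * SECONDS_PER_HOUR;
}

/* Seconds after signing at which re-signing becomes due (no jitter). */
static uint32_t
resign_due_after(uint32_t validity_days, uint32_t resign_secs) {
	uint32_t validity_secs = validity_days * SECONDS_PER_DAY;

	assert(resign_secs <= validity_secs);
	return validity_secs - resign_secs;
}
```

For `sig-validity-interval 10 8` this gives a re-sign after 2 days as
intended, versus 9 days 16 hours (only 8 hours before expiry) with the bug.
For `500 499` it gives 1 day as intended, versus roughly 479 days with the
bug, matching the `rndc zonestatus` output above where the resign time sits
499 hours before the signature expiry.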
    • Merge branch... · 2ac2d832
      Mark Andrews authored
      Merge branch '1994-netscope-c-23-50-error-unused-parameter-addr-when-have_if_nametoindex-undefined-on-illumos' into 'main'
      
      Resolve "netscope.c:23:50: error: unused parameter 'addr' when HAVE_IF_NAMETOINDEX undefined on illumos"
      
      Closes #1994
      
      See merge request !3829