Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in / Register
BIND
BIND
  • Project overview
    • Project overview
    • Details
    • Activity
    • Releases
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Issues 582
    • Issues 582
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
  • Merge Requests 110
    • Merge Requests 110
  • CI / CD
    • CI / CD
    • Pipelines
    • Jobs
    • Schedules
  • Operations
    • Operations
    • Incidents
    • Environments
  • Packages & Registries
    • Packages & Registries
    • Container Registry
  • Analytics
    • Analytics
    • CI / CD
    • Repository
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Collapse sidebar
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
  • ISC Open Source Projects
  • BINDBIND
  • Merge Requests
  • !3865

Merged
Opened Jul 16, 2020 by Ondřej Surý@ondrejOwner

Fix the rbt hashtable and grow it when setting max-cache-size

  • Overview 16
  • Commits 6
  • Pipelines 13
  • Changes 25

There were several problems with rbt hashtable implementation:

  1. Our internal hashing function returns uint64_t value, but it was silently truncated to unsigned int in dns_name_hash() and dns_name_fullhash() functions. As the SipHash 2-4 higher bits are more random, we need to use the upper half of the return value.

  2. The hashtable implementation in rbt.c was using modulo to pick the slot number for the hash table. This has several problems because modulo is: a) slow, b) oblivious to patterns in the input data. This could lead to very uneven distribution of the hashed data in the hashtable. Combined with the single-linked lists we use, it could really hog-down the lookup and removal of the nodes from the rbt tree[a]. The Fibonacci Hashing is much better fit for the hashtable function here. For longer description, read "Fibonacci Hashing: The Optimization that the World Forgot"[b] or just look at the Linux kernel. Also this will make Diego very happy :).

  3. The hashtable would rehash every time the number of nodes in the rbt tree would exceed 3 * (hashtable size). The overcommit will make the uneven distribution in the hashtable even worse, but the main problem lies in the rehashing - every time the database grows beyond the limit, each subsequent rehashing will be much slower. The mitigation here is letting the rbt know how big the cache can grown and pre-allocate the hashtable to be big enough to actually never need to rehash. This will consume more memory at the start, but since the size of the hashtable is capped to 1 << 31 (e.g. 2 mio entries), it will only consume maximum of 16GB of memory for hashtable in the worst case. Calling the dns_db_adjusthashsize() will also cap the maximum size of the hashtable to the pre-computed number of bits, so it won't try to consume more gigabytes of memory than available for the database.

For future work, there are more places where we use result of the hash value modulo some small number and that would benefit from Fibonacci Hashing to get better distribution.

Notes: a. A doubly linked list should be used here to speedup the removal of the entries from the hashtable. b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/

Closes #1775 (closed)

Edited Jul 21, 2020 by Ondřej Surý
Assignee
Assign to
Reviewer
Request review from
August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4)
Milestone
August 2020 (9.11.22, 9.11.22-S1, 9.16.6, 9.17.4) (Past due)
Assign milestone
Time tracking
Reference: isc-projects/bind9!3865
Source branch: 1775-resizing-growing-of-cache-hash-tables-causes-delays-in-processing-of-client-queries

Revert this merge request

This will create a new commit in order to revert the existing changes.

Switch branch
Cancel
A new branch will be created in your fork and a new merge request will be started.

Cherry-pick this merge request

Switch branch
Cancel
A new branch will be created in your fork and a new merge request will be started.