Reduce adb names locking contention
With our new userspace tracing probes, we were able to pinpoint the source of the contention:
fn | type | count | min | max | sum | avg |
---|---|---|---|---|---|---|
dns_adb_createfind | mutex | 1597343 | 5772 | 210015 | 10540878621 | 6599 |
dns_adb_agesrtt | mutex | 806318 | 5772 | 236964 | 5284236828 | 6553 |
dns__rbtdb_detachnode | rwlock | 928583 | 4290 | 310557 | 4803076356 | 5172 |
dns_adb_destroyfind | mutex | 556873 | 5811 | 192972 | 3703797591 | 6651 |
dns_adb_createfind | rwlock | 556873 | 4329 | 172419 | 2802134985 | 5031 |
dns__rbtdb_addrdataset | rwlock | 462176 | 4290 | 274560 | 2295845409 | 4967 |
cache_find | rwlock | 439323 | 4290 | 280293 | 2276739816 | 5182 |
isc__mempool_destroy | mutex | 271276 | 5811 | 204321 | 1926270294 | 7100 |
isc__mempool_create | mutex | 271276 | 5850 | 141609 | 1885239369 | 6949 |
fctx_cancelqueries | mutex | 189589 | 5850 | 144027 | 1301757054 | 6866 |
dns__rbtdb_nodefullname | rwlock | 229742 | 4368 | 187512 | 1157715000 | 5039 |
dns_adb_getcookie | mutex | 132632 | 5928 | 139035 | 1137403254 | 8575 |
find_deepest_zonecut | rwlock | 224027 | 4290 | 156936 | 1105641693 | 4935 |
dns__rbtdb_findnodeintree | rwlock | 181563 | 4368 | 168246 | 948466155 | 5223 |
reactivate_node | rwlock | 181581 | 4329 | 163917 | 928872165 | 5115 |
dns_ntatable_covered | rwlock | 158872 | 4407 | 176280 | 857866503 | 5399 |
fctx__done.constprop.0 | mutex | 109350 | 5889 | 122031 | 750861813 | 6866 |
dns_adb_setudpsize | mutex | 68673 | 6318 | 222495 | 682022757 | 9931 |
dns_adb_adjustsrtt | mutex | 80787 | 5928 | 165360 | 606351837 | 7505 |
fctx_cancelquery | mutex | 80787 | 6006 | 141921 | 573609972 | 7100 |
fctx_query | mutex | 80787 | 5967 | 137943 | 552959550 | 6844 |
rctx_done | mutex | 80778 | 5889 | 118326 | 549924336 | 6807 |
resquery_destroy | mutex | 80787 | 5850 | 129090 | 541517847 | 6703 |
activeempty | rwlock | 107788 | 4251 | 139464 | 521984073 | 4842 |
dns_resolver_destroyfetch | mutex | 54899 | 6006 | 126360 | 447498324 | 8151 |
validated | mutex | 58506 | 5850 | 127608 | 442221507 | 7558 |
fctx_start | mutex | 54666 | 5928 | 135954 | 429194220 | 7851 |
dns_resolver_createfetch | mutex | 53692 | 6006 | 140088 | 390963261 | 7281 |
resquery_response | mutex | 51786 | 6123 | 155922 | 382768581 | 7391 |
dns_aclelement_match | rwlock | 62649 | 4329 | 120900 | 371844018 | 5935 |
get_attached_fctx | mutex | 53692 | 5967 | 120627 | 367318809 | 6841 |
fetch_callback | mutex | 49034 | 5772 | 142896 | 364519662 | 7434 |
get_attached_and_locked_entry | mutex | 53556 | 5928 | 117000 | 361141638 | 6743 |
zone_find | rwlock | 71262 | 4251 | 182871 | 347663121 | 4878 |
dns__rbtdb_currentversion | rwlock | 69986 | 4368 | 111696 | 346155108 | 4946 |
cache_findzonecut | rwlock | 64631 | 4329 | 226278 | 335597886 | 5192 |
clean_namehooks | mutex | 51702 | 5811 | 124995 | 332844369 | 6437 |
release_fctx | rwlock | 53459 | 4446 | 112476 | 278282199 | 5205 |
get_attached_and_locked_entry | rwlock | 53556 | 4368 | 98592 | 271329786 | 5066 |
get_attached_fctx | rwlock | 53692 | 4329 | 709020 | 266725095 | 4967 |
delete_callback | rwlock | 44974 | 4368 | 49569 | 212821830 | 4732 |
ns_query_cancel | mutex | 21337 | 6045 | 118404 | 196894893 | 9227 |
prune_tree | rwlock | 22934 | 4290 | 94029 | 172068039 | 7502 |
purge_stale_entries | mutex | 21807 | 5928 | 130143 | 147269265 | 6753 |
mutex_lock | mutex | 11905 | 6006 | 1802736 | 145038075 | 12182 |
dns_adb_changeflags | mutex | 20232 | 5889 | 110916 | 138579753 | 6849 |
ns_client_recursing | mutex | 15949 | 6357 | 110526 | 135737472 | 8510 |
dns_adb_setcookie | mutex | 16890 | 6084 | 139503 | 126144447 | 7468 |
rdataset_getownercase | rwlock | 23833 | 4329 | 122538 | 125803704 | 5278 |
cds_wfcq_dequeue_blocking | mutex | 15988 | 6240 | 187980 | 122401383 | 7655 |
shutdown_names | mutex | 18901 | 5967 | 49569 | 121754802 | 6441 |
clean_finds_at_name | mutex | 14878 | 5850 | 116922 | 100255584 | 6738 |
find_coveringnsec | rwlock | 18681 | 4485 | 110487 | 100217130 | 5364 |
resume_qmin | mutex | 12414 | 6006 | 122928 | 95990232 | 7732 |
fctx_finddone | mutex | 12404 | 5889 | 100425 | 91508859 | 7377 |
zone_shutdown | rwlock | 104 | 5382 | 6337149 | 90942969 | 874451 |
dns_adb_ednsto | mutex | 9616 | 6435 | 74568 | 90021906 | 9361 |
je_malloc_mutex_lock_slow | mutex | 32 | 18564 | 6550284 | 78068913 | 2439653 |
dns_zonemgr_releasezone | rwlock | 208 | 4836 | 4666077 | 66756261 | 320943 |
ns_client_qnamereplace | mutex | 7387 | 6162 | 89661 | 56466618 | 7644 |
isc_log_doit | mutex | 3463 | 5889 | 87477 | 23883873 | 6896 |
isc_log_doit | rwlock | 3463 | 4524 | 70863 | 20813169 | 6010 |
dns_adb_getudpsize | mutex | 2118 | 6786 | 91767 | 20512557 | 9684 |
destroy | mutex | 106 | 6942 | 1579539 | 16286439 | 153645 |
zone_maintenance | mutex | 732 | 5928 | 133653 | 13287963 | 18152 |
rdataset_settrust | rwlock | 1176 | 4446 | 48048 | 6342882 | 5393 |
zone__settimer | mutex | 317 | 6045 | 394290 | 5682846 | 17926 |
zone_postload | rwlock | 207 | 4680 | 67431 | 4023786 | 19438 |
dns_adb_plainresponse | mutex | 380 | 6708 | 53859 | 4017819 | 10573 |
(The table continues with less significant entries...)
Filtering the table down to the adb mutexes shows that the cumulative time (the sum column) we spend in them is huge:
fn | type | count | min | max | sum | avg |
---|---|---|---|---|---|---|
dns_adb_createfind | mutex | 1597343 | 5772 | 210015 | 10540878621 | 6599 |
dns_adb_agesrtt | mutex | 806318 | 5772 | 236964 | 5284236828 | 6553 |
dns_adb_destroyfind | mutex | 556873 | 5811 | 192972 | 3703797591 | 6651 |
dns_adb_getcookie | mutex | 132632 | 5928 | 139035 | 1137403254 | 8575 |
dns_adb_setudpsize | mutex | 68673 | 6318 | 222495 | 682022757 | 9931 |
dns_adb_adjustsrtt | mutex | 80787 | 5928 | 165360 | 606351837 | 7505 |
dns_adb_changeflags | mutex | 20232 | 5889 | 110916 | 138579753 | 6849 |
dns_adb_setcookie | mutex | 16890 | 6084 | 139503 | 126144447 | 7468 |
dns_adb_ednsto | mutex | 9616 | 6435 | 74568 | 90021906 | 9361 |
dns_adb_getudpsize | mutex | 2118 | 6786 | 91767 | 20512557 | 9684 |
dns_adb_plainresponse | mutex | 380 | 6708 | 53859 | 4017819 | 10573 |
This is definitely worth addressing.
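
To put a single number on the adb rows, the count and sum columns of the adb-only table can be totaled. The following is a minimal sketch, not part of the original tooling: the row values are copied from the table above, the names `ADB_MUTEX_ROWS` and `summarize` are made up for illustration, and because the unit of the sum column is not stated here, the values are treated as plain integers.

```python
# Rows copied from the adb-only table above: (fn, count, sum).
# The unit of the "sum" column is not stated in the source, so the
# values are treated as opaque integers.
ADB_MUTEX_ROWS = [
    ("dns_adb_createfind", 1597343, 10540878621),
    ("dns_adb_agesrtt", 806318, 5284236828),
    ("dns_adb_destroyfind", 556873, 3703797591),
    ("dns_adb_getcookie", 132632, 1137403254),
    ("dns_adb_setudpsize", 68673, 682022757),
    ("dns_adb_adjustsrtt", 80787, 606351837),
    ("dns_adb_changeflags", 20232, 138579753),
    ("dns_adb_setcookie", 16890, 126144447),
    ("dns_adb_ednsto", 9616, 90021906),
    ("dns_adb_getudpsize", 2118, 20512557),
    ("dns_adb_plainresponse", 380, 4017819),
]


def summarize(rows):
    """Return (total_count, total_sum) over all (fn, count, sum) rows."""
    total_count = sum(count for _, count, _ in rows)
    total_sum = sum(total for _, _, total in rows)
    return total_count, total_sum


total_count, total_sum = summarize(ADB_MUTEX_ROWS)
print(f"adb mutex count total: {total_count}")
print(f"adb mutex sum total:   {total_sum}")
print(f"overall average:       {total_sum / total_count:.0f}")
```

The totals cover only the adb mutex rows listed above. Since the full table is truncated, this sketch does not attempt to compute the adb share of all lock time.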