Add the reader-writer synchronization with modified C-RW-WP
This changes the internal isc_rwlock implementation to:
Irina Calciu, Dave Dice, Yossi Lev, Victor Luchangco, Virendra J. Marathe, and Nir Shavit. 2013. NUMA-aware reader-writer locks. SIGPLAN Not. 48, 8 (August 2013), 157–166. DOI:https://doi.org/10.1145/2517327.24425
(The full article available from: http://mcg.cs.tau.ac.il/papers/ppopp2013-rwlocks.pdf)
The implementation is based on the The Writer-Preference Lock (C-RW-WP) variant (see the 3.4 section of the paper for the rationale).
The implemented algorithm has been modified for simplicity and for usage patterns in rbtdb.c.
The changes compared to the original algorithm:
We haven't implemented the cohort locks because that would require a knowledge of NUMA nodes, instead a simple atomic_bool is used as synchronization point for writer lock.
The per-thread reader counters are not being used - this would require the internal thread id (isc_tid_v) to be always initialized, even in the utilities; the change has a slight performance penalty, so we might revisit this change in the future. However, this change also saves a lot of memory, because cache-line aligned counters were used, so on 32-core machine, the rwlock would be 4096+ bytes big.
The readers use a writer_barrier that will raise after a while when readers lock can't be acquired to prevent readers starvation.
Separate ingress and egress readers counters queues to reduce both inter and intra-thread contention.
Closes #1609 (closed)