    Fixed potential-lock-inversion · e7b36924
    Diego Fronza authored
    This commit slightly simplifies lock management in the dns_resolver_prime()
    and prime_done() functions by turning the resolver's "priming" attribute
    into an atomic_bool and by leaving only one object dependent on the
    "primelock" lock, namely the "primefetch" attribute.
    
    Making the "priming" attribute an atomic type saves us from having to
    take a lock just to test whether priming is on or off for the given resolver
    context object within the "dns_resolver_prime" function.
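
The difference between the two ways of testing the flag can be sketched as follows. This is a minimal illustration using C11 atomics and POSIX mutexes; the struct and function names (`locked_res_t`, `atomic_res_t`, `is_priming_locked`, `is_priming_atomic`) are hypothetical and are not the resolver's real types.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical "before" layout: a plain flag that must be read under a mutex. */
typedef struct locked_res {
	pthread_mutex_t lock;
	bool priming;
} locked_res_t;

/* Hypothetical "after" layout: the flag is atomic and needs no mutex. */
typedef struct atomic_res {
	atomic_bool priming;
} atomic_res_t;

/* Reading the plain flag requires a lock/unlock pair around the read. */
static bool is_priming_locked(locked_res_t *res) {
	pthread_mutex_lock(&res->lock);
	bool priming = res->priming;
	pthread_mutex_unlock(&res->lock);
	return priming;
}

/* Reading the atomic flag is a single atomic load. */
static bool is_priming_atomic(atomic_res_t *res) {
	return atomic_load(&res->priming);
}
```

Both helpers return the same value for the same flag state; the atomic version simply avoids the mutex round trip.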
    
    The "primelock" lock is still necessary, since dns_resolver_prime()
    internally calls dns_resolver_createfetch(). Whenever that call succeeds, it
    registers an event in the task manager whose handler, the "prime_done"
    function, may run on another thread. "prime_done" is responsible for
    disposing of the "primefetch" attribute in the resolver object and for
    resetting the "priming" attribute to false.
    
    It is important that the invariant "priming == false implies
    primefetch == NULL" always holds, so that any thread calling
    "dns_resolver_prime" knows for sure that if the "priming" attribute is false,
    the "primefetch" attribute is also NULL. A new fetch context can then be
    created and assigned to the "primefetch" attribute under the protection of
    the lock.
    
    To honor the explanation above, dns_resolver_prime is implemented as follows:
    	1. Atomically check the "priming" attribute of the given resolver context.
    	2. If "priming" is false, assume that "primefetch" is NULL (this is
    	   ensured by the "prime_done" implementation), acquire the "primelock"
    	   lock, create a new fetch context, and update the "primefetch" pointer
    	   to point to the newly allocated fetch context.
    	3. If "priming" is true, assume that the job is already in progress;
    	   no locks are acquired and there is nothing else to do.
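
The three steps above can be sketched roughly as follows. This is an illustration only, not the actual resolver code: `resolver_t`, `fetch_t`, `resolver_init` and `resolver_prime` are hypothetical stand-ins, and `malloc` stands in for creating the fetch context. The sketch also assumes that the atomic check is a compare-and-exchange that flips "priming" to true in the same step; the description above only says that the attribute is checked atomically.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the fetch context and resolver objects. */
typedef struct fetch {
	int id;
} fetch_t;

typedef struct resolver {
	atomic_bool priming;       /* tested without taking any lock */
	pthread_mutex_t primelock; /* protects primefetch only */
	fetch_t *primefetch;
} resolver_t;

static void resolver_init(resolver_t *res) {
	atomic_init(&res->priming, false);
	pthread_mutex_init(&res->primelock, NULL);
	res->primefetch = NULL;
}

/* Returns true if this call started priming, false otherwise. */
static bool resolver_prime(resolver_t *res) {
	bool expected = false;

	/* Step 1 (assumed form): test "priming" and claim it in one atomic step. */
	if (!atomic_compare_exchange_strong(&res->priming, &expected, true)) {
		/* Step 3: priming already in progress, no lock taken. */
		return false;
	}

	/* Step 2: priming was false, so primefetch is expected to be NULL. */
	pthread_mutex_lock(&res->primelock);
	assert(res->primefetch == NULL);
	res->primefetch = malloc(sizeof(*res->primefetch)); /* stand-in for fetch creation */
	bool created = (res->primefetch != NULL);
	if (created) {
		res->primefetch->id = 1;
	}
	pthread_mutex_unlock(&res->primelock);

	if (!created) {
		/* Creation failed: give the flag back so a later call can retry. */
		atomic_store(&res->priming, false);
	}
	return created;
}
```

In this sketch a second call made while priming is still in progress returns immediately without touching "primelock".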
    
    To keep the previous invariant consistent, "prime_done" is implemented as follows:
    	1. Acquire the "primelock" lock.
    	2. Keep a reference to the current "primefetch" object.
    	3. Reset the "primefetch" attribute to NULL.
    	4. Release the "primelock" lock.
    	5. Atomically update the "priming" attribute to false.
    	6. Destroy the "primefetch" object by using the temporary reference.
    
    This ensures that if "priming" is false, "primefetch" was already reset to NULL.
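
The ordering of these steps can be sketched as follows. As before, this is a hedged illustration rather than the real function: the types are hypothetical stand-ins, `free` stands in for destroying the fetch, and the entry assertion is an assumption of the sketch (that the handler only runs while priming is in progress).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the fetch context and resolver objects. */
typedef struct fetch {
	int id;
} fetch_t;

typedef struct resolver {
	atomic_bool priming;       /* tested without taking any lock */
	pthread_mutex_t primelock; /* protects primefetch only */
	fetch_t *primefetch;
} resolver_t;

static void prime_done(resolver_t *res) {
	fetch_t *fetch;

	/* Sketch-level sanity check: we only get here while priming is on. */
	assert(atomic_load(&res->priming));

	/* Steps 1-4: detach primefetch while holding primelock. */
	pthread_mutex_lock(&res->primelock);
	fetch = res->primefetch;
	res->primefetch = NULL;
	pthread_mutex_unlock(&res->primelock);

	/*
	 * Step 5: only now clear the flag. Any thread that later observes
	 * priming == false therefore also finds primefetch == NULL.
	 */
	atomic_store(&res->priming, false);

	/* Step 6: destroy the fetch through the temporary reference. */
	free(fetch);
}
```

Clearing "priming" only after "primefetch" has been reset is what preserves the invariant; swapping steps 3 and 5 would open a window in which "priming" is false while "primefetch" is still set.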
    
    Leaving the "priming" attribute unprotected by a lock makes no difference,
    since the visible state of this variable depends in any case on the order in
    which the functions "dns_resolver_prime" and "prime_done" are called.
    
    As an example, suppose that instead of using an atomic for the "priming"
    attribute we employed a lock to protect it.
    Now suppose that the "prime_done" function is called by Thread A, which is then
    preempted before acquiring the lock and so has not yet reset "priming" to false.
    In parallel, suppose that Thread B is scheduled and calls
    "dns_resolver_prime()". It acquires the lock and sees that "priming" is true,
    so it considers this resolver object to be already priming and does no
    further work.
    Conversely, if the lock were acquired in the opposite order, Thread B would see
    that "priming" is false (since prime_done acquired the lock first and set
    "priming" to false) and would initiate a priming fetch for this resolver.
    
    An atomic variable does not change this behavior: the outcome still depends
    on the order of the function calls, and the only difference is that no lock
    is needed.
    
    There should be no side effects resulting from this change. The previous
    implementation used the resolver's more general "lock" mutex, which is used
    in far more contexts. Within "dns_resolver_prime" and "prime_done", however,
    it only protected the "primefetch" and "priming" attributes, which are not
    used in any of the other critical sections protected by that lock, so those
    sections have no dependency on these variables.