
Fix handling of mismatched responses past timeout

When a UDP dispatch receives a mismatched response, it checks whether there is still enough time to wait for the correct one to arrive before the timeout fires. If there is not, the result code is set to ISC_R_TIMEDOUT, but that value is never used because 'response' is set to NULL a few lines earlier. As a result, the higher-level read callback (resquery_response() in the case of resolver code) is not called.

Shortly afterwards, a few levels up the call chain, isc__nm_udp_read_cb() calls isc__nmsocket_timer_stop() on the dispatch socket, effectively disabling read timeout handling. Since reading is not restarted in this case either (e.g. by calling dispatch_getnext() from udp_recv()), the higher-level query structure remains referenced indefinitely: the dispatch socket it uses will neither be read from nor closed due to a timeout. This in turn causes fetch contexts to linger indefinitely, which may, for example, prevent certain cache nodes (those containing rdatasets used by the fetch context, like fctx->nameservers) from being cleaned.

Fix by making sure the higher-level callback does get invoked with the ISC_R_TIMEDOUT result code when udp_recv() determines there is no more time left to receive the correct UDP response before the timeout fires. This allows the higher-level callback to clean things up, preventing the reference leak described above.
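
The control flow described above can be illustrated with a minimal, self-contained sketch. This is not the actual BIND 9 code: entry_t, result_t, record_cb() and on_datagram() are hypothetical stand-ins for the dispatch entry, the ISC result codes, the higher-level callback and udp_recv() respectively, and timekeeping is reduced to two integer fields. It only shows the intended decision: on a mismatched response, either keep reading or hand a timeout result to the callback, never neither.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT BIND's real types or result codes. */
typedef enum { R_SUCCESS, R_TIMEDOUT } result_t;

typedef struct entry {
	unsigned int id;   /* message ID the query is waiting for */
	int timeout_ms;    /* total time budget for the query */
	int elapsed_ms;    /* time already spent waiting */
	void (*cb)(struct entry *, result_t); /* higher-level callback */
	bool reading;      /* is a read (with its timer) still pending? */
	int cb_calls;      /* bookkeeping for the example only */
	result_t last_result;
} entry_t;

/* Example callback: just records what it was told. */
static void
record_cb(entry_t *e, result_t r) {
	e->cb_calls++;
	e->last_result = r;
}

/* Simplified model of the receive path for one incoming datagram. */
static void
on_datagram(entry_t *e, unsigned int received_id) {
	assert(e != NULL && e->cb != NULL);

	/* Models the lower layer stopping the read timer once data arrives. */
	e->reading = false;

	if (received_id == e->id) {
		e->cb(e, R_SUCCESS);
		return;
	}

	/* Mismatched response: is there still time to wait for the right one? */
	if (e->elapsed_ms < e->timeout_ms) {
		e->reading = true; /* restart reading, timer re-armed */
		return;
	}

	/*
	 * No time left. The essential point of the fix: report the timeout
	 * to the callback so it can release its references, instead of
	 * returning silently with neither a read nor a timer pending.
	 */
	e->cb(e, R_TIMEDOUT);
}
```

In this model, dropping the final e->cb(e, R_TIMEDOUT) call reproduces the leak: the entry ends up with reading == false and no callback invocation, so nothing ever cleans it up.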

Closes #3002
