Fix rare control channel socket reference leak

Commit 9ee60e7a enabled netmgr shutdown to cause read callbacks for active control channel sockets to be invoked with the ISC_R_SHUTTINGDOWN result code. However, control channel code only recognizes ISC_R_CANCELED as an indicator of an in-progress netmgr shutdown (which was correct before the above commit). This discrepancy enables the following scenario to happen in rare cases:

  1. A control channel request is received and responded to. libuv manages to write the response to the TCP socket, but the completion callback (control_senddone()) is yet to be invoked.

  2. Server shutdown is initiated. All TCP sockets are shut down, which, among other things, causes control_recvmessage() to be invoked with the ISC_R_SHUTTINGDOWN result code. As the result code is not ISC_R_CANCELED, control_recvmessage() does not set listener->controls->shuttingdown to 'true'.

  3. control_senddone() is called with the ISC_R_SUCCESS result code. As listener->controls->shuttingdown is not 'true' and the result code is not ISC_R_CANCELED, reading is resumed on the control channel socket. However, this read can never be completed because the read callback on that socket was cleared when the TCP socket was shut down. This causes a reference on the socket's handle to be held indefinitely, leading to a hang upon shutdown.
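
The three steps above can be replayed in a small standalone model. This is only an illustrative sketch, not the actual named code: every toy_* name, the toy_conn_t struct, and the stuck_refs counter are hypothetical stand-ins, and the real callbacks have different signatures and do considerably more work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the result codes named above; the values are arbitrary. */
typedef enum { TOY_SUCCESS, TOY_CANCELED, TOY_SHUTTINGDOWN } toy_result_t;

/* Hypothetical, heavily simplified state of one control connection. */
typedef struct {
	bool shuttingdown; /* models listener->controls->shuttingdown */
	bool socket_down;  /* TCP socket shut down, read callback cleared */
	int stuck_refs;    /* handle references that can never be released */
} toy_conn_t;

/* Models the read callback as described: only CANCELED marks shutdown. */
void
toy_recvmessage(toy_conn_t *conn, toy_result_t result) {
	assert(conn != NULL);
	if (result == TOY_CANCELED) {
		conn->shuttingdown = true;
	}
}

/* Models the send completion callback: resume reading unless shutting down. */
void
toy_senddone(toy_conn_t *conn, toy_result_t result) {
	assert(conn != NULL);
	if (conn->shuttingdown || result == TOY_CANCELED) {
		return;
	}
	/*
	 * Resuming a read takes a reference on the handle. On a socket
	 * that is already shut down, no read callback will release it.
	 */
	if (conn->socket_down) {
		conn->stuck_refs++;
	}
}

/* Replays steps 1-3 and returns the number of stuck references. */
int
toy_replay_scenario(void) {
	toy_conn_t conn = { false, false, 0 };

	/* Step 1: response written; send completion is still pending. */

	/* Step 2: shutdown closes the socket and fires the read callback. */
	conn.socket_down = true;
	toy_recvmessage(&conn, TOY_SHUTTINGDOWN);

	/* Step 3: the delayed send completion arrives with success. */
	toy_senddone(&conn, TOY_SUCCESS);

	return conn.stuck_refs;
}
```

In this model, toy_replay_scenario() returns 1: the shutdown flag is never set in step 2, so step 3 resumes a read that nothing will ever complete.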

Ensure listener->controls->shuttingdown is also set to 'true' when control_recvmessage() is invoked with the ISC_R_SHUTTINGDOWN result code. This ensures the send completion callback does not resume reading after the control channel socket is shut down.
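
A minimal sketch of the adjusted check, assuming a simplified state struct and hypothetical sk_* names (none of these are the real identifiers or signatures): the read callback treats both result codes as a shutdown indicator, so the send completion callback declines to resume reading.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the result codes named above; the values are arbitrary. */
typedef enum { SK_SUCCESS, SK_CANCELED, SK_SHUTTINGDOWN } sk_result_t;

/* Hypothetical stand-in for the state behind listener->controls. */
typedef struct {
	bool shuttingdown;
} sk_controls_t;

/* Hypothetical helper: results that signal an in-progress netmgr shutdown. */
bool
sk_is_shutdown_result(sk_result_t result) {
	return result == SK_CANCELED || result == SK_SHUTTINGDOWN;
}

/* Read callback sketch: record the shutdown for either result code. */
void
sk_recvmessage(sk_controls_t *controls, sk_result_t result) {
	assert(controls != NULL);
	if (sk_is_shutdown_result(result)) {
		controls->shuttingdown = true;
	}
}

/* Send completion sketch: returns true if reading would be resumed. */
bool
sk_senddone_resumes_read(const sk_controls_t *controls, sk_result_t result) {
	assert(controls != NULL);
	return !controls->shuttingdown && result != SK_CANCELED;
}
```

With this check, calling sk_recvmessage() with SK_SHUTTINGDOWN sets the flag, and a later sk_senddone_resumes_read() with SK_SUCCESS returns false, so no read is started on the closed socket.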

Closes #3068 (closed)