-
Notifications
You must be signed in to change notification settings - Fork 296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Calling Stop() on Initiator blocks on Dial method #653
Comments
abronan
added a commit
to abronan/quickfix
that referenced
this issue
Jul 15, 2024
This commit fixes an issue when calling Start() and then Stop() on the initiator while the connection is likely to fail and timeout. Sending a SIGTERM will fail since Dial will attempt to connect until it times out and returns on the 'waitForReconnectInterval' call. We mitigate this problem by using a proxy.ContextDialer and allowing to pass a context with cancellation method to the dialer.DialContext method on 'handleConnection'. We need to start a routine listening for the stopChan in order to call cancel() explicitly and thus exit the DialContext method. Note: there are scenarios where cancel() will be called twice, this choice was made in order to avoid a larger refactor of the reconnect logic, but since the call to cancel() is idempotent, this doesn't lead to any adverse effect. fixes quickfixgo#653 Signed-off-by: Alexandre Beslic <[email protected]>
abronan
added a commit
to abronan/quickfix
that referenced
this issue
Jul 18, 2024
This commit fixes an issue when calling Start() and then Stop() on the initiator while the connection is likely to fail and timeout. Calling initiator.Stop() will block since Dial will attempt to connect until it times out and returns on the 'waitForReconnectInterval' call. We mitigate this problem by using a proxy.ContextDialer and allowing to pass a context with cancellation method to the dialer.DialContext method on 'handleConnection'. We need to start a routine listening for the stopChan in order to call cancel() explicitly and thus exit the DialContext method. Note: there are scenarios where cancel() will be called twice, this choice was made in order to avoid a larger refactor of the reconnect logic, but since the call to cancel() is idempotent, this doesn't lead to any adverse effect. fixes quickfixgo#653 Signed-off-by: Alexandre Beslic <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
When calling
Start()
and thenStop()
on the initiator with a connection that is failing or timing out, sending a SIGTERM prevents the program from exiting because of the call to Dial inhandleConnection
.Reducing the
SocketTimeout
to a lower value mitigates this issue, since the call toDial
will exit early and then exit when checking on thestopChan
inwaitForReconnectInterval
. However this is not a clean solution and prevents flexibility from consumers of the library to be able to stop the connect process immediately.The reason why is that the
handleConnection
method is taking in aproxy.Dialer
which doesn't contain adialer.DialContext
method required to cancel the connection mid-way with a cancellation signal.As a solution, we can cast the
proxy.Dialer
to aproxy.ContextDialer
and let the call toStop()
properly exit the connection process with the use of a context onhandleConnection
.The text was updated successfully, but these errors were encountered: