fix(tunnel): add timeout to keepAliveLoop to detect dead connections after sleep/wake #581
Open
ncodespot wants to merge 2 commits into jpillora:master
…after sleep/wake

SendRequest blocks indefinitely on a dead TCP connection (e.g. after OS sleep/wake, where no RST/FIN is received). Run it in a goroutine and use select with time.After(KeepAlive) as a deadline. On timeout, close the SSH connection so that blocked OpenChannel calls are also unblocked and the client can reconnect.

Also add unit tests for keepAliveLoop and an E2E test with a freezable TCP proxy that simulates the sleep/wake scenario end-to-end.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Problem
After OS sleep/wake, port forwarding becomes unresponsive even with --keepalive set.

Root cause: SendRequest("ping") in keepAliveLoop blocks indefinitely on a dead TCP connection. When the system wakes from sleep, the TCP connection is stale but no RST/FIN has been received, so the OS has not yet detected the failure. Both the keepalive ping and concurrent OpenChannel calls block until the TCP retransmit timeout fires (potentially 15+ minutes).

Fix
Run SendRequest in a goroutine and race it against time.After(KeepAlive). If no pong is received within the keepalive interval, treat the connection as dead and call sshConn.Close(). This also unblocks any goroutines stuck in OpenChannel on the same connection, triggering reconnection.

With --keepalive 15s, worst-case detection is now ~30s (15s sleep + 15s ping timeout) instead of indefinite.

Tests
- Unit tests (share/tunnel/tunnel_keepalive_test.go): mock ssh.Conn whose SendRequest blocks forever; asserts Close() is called within 2×keepalive.
- E2E test (test/e2e/keepalive_test.go): freezable TCP proxy between client and server simulates sleep/wake (silent packet drop, no RST); verifies the tunnel recovers automatically.