From: Siddharth Agarwal Date: Sat, 12 Apr 2014 06:42:00 +0000 (-0700) Subject: Don't wait for EOF in _handle_upload_pack_tail with side-band-64k X-Git-Tag: dulwich-0.9.6~9^2~1 X-Git-Url: http://git.samba.org/?a=commitdiff_plain;h=730463ca4fca0a2598b037d402d727349865f003;p=jelmer%2Fdulwich.git Don't wait for EOF in _handle_upload_pack_tail with side-band-64k As of Sat, 12 Apr 2014 08:46:15 +0000, cloning, fetching or receiving a pack from a GitHub repository with dulwich results in a hang. For example, the command: bin/dulwich clone git@github.com:jelmer/dulwich.git produces the following output: Reusing existing pack: 12675, done. Counting objects: 1408, done. Compressing objects: 100% (891/891), done. and then hangs indefinitely. This commit and the next one aim to fix that. The indefinite hang is because _handle_upload_pack_tail waits for an EOF from the server that never arrives. With canonical Git, both the client and the server disconnect after the fetch is completed. With canonical Git servers, Dulwich works fine since the server will disconnect. However, GitHub seems to be running a non-canonical implementation of Git, which waits for the client to disconnect. That works fine with canonical Git clients but not with Dulwich. Is this a bug in the server or the client? This is arguable, but it's reasonable to assume that no matter how the server acts, the client should disconnect without waiting for an EOF. Why? Quoting Git's Documentation/technical/pack-protocol.txt, After reference and capabilities discovery, the client can decide to terminate the connection by sending a flush-pkt, telling the server it can now gracefully terminate, and disconnect, when it does not need any pack data. This can happen with the ls-remote command, and also can happen when the client already is up-to-date. This indicates that in general for stateful transports (i.e. not HTTP), the client is responsible the lifecycle of the connection. The protocol documentation doesn't explicitly specify what happens after the client fetches a pack, but I think it's logical to extrapolate similar behaviour. This patch causes the Dulwich client to disconnect right after the fetch is complete, which menas that the above clone command completes correctly. Clones from canonical Git servers continue to work as well. It is pretty hard to write a test for this patch because the canonical Git server doesn't behave this way and we don't have access to the GitHub server. This was originally reported as a bug in hg-git, which uses Dulwich: see https://bitbucket.org/durin42/hg-git/issue/104/hangs-after-total. Thanks to John S Long and Marco Molteni for debugging this issue over there. --- diff --git a/dulwich/client.py b/dulwich/client.py index 0e4e9abd..bd75c0ad 100644 --- a/dulwich/client.py +++ b/dulwich/client.py @@ -400,10 +400,6 @@ class GitClient(object): # Just ignore progress data progress = lambda x: None self._read_side_band64k_data(proto, {1: pack_data, 2: progress}) - # wait for EOF before returning - data = proto.read() - if data: - raise Exception('Unexpected response %r' % data) else: while True: data = proto.read(rbufsize)