From mboxrd@z Thu Jan 1 00:00:00 1970
From: ezyang@MIT.EDU (Edward Z. Yang)
Date: Wed, 3 Jun 2009 17:48:56 -0400 (EDT)
Subject: [sup-talk] Sup is hanging
In-Reply-To: <1244064889-sup-9506@entry>
References: <1244050695-sup-5990@javelin> <1244053108-sup-4982@javelin> <1244064889-sup-9506@entry>
Message-ID:

On Wed, 3 Jun 2009, William Morgan wrote:
> Reformatted excerpts from Edward Z. Yang's message of 2009-06-03:
>> The current working theory (based on strace'ing and lsof) is that Sup
>> is hanging on a select() call that doesn't have any timeout. The
>> reason why this is hanging is because between opening the connection
>> and closing it, Sup does some ridiculously slow message parsing code
>> (that really thrashes the CPU) and by the time it's done the server
>> has closed the connection (but we don't know about it).
>
> That may all be true, but I'd be surprised if a) Sup didn't know that
> the server had closed the connection, and

It doesn't have anything to do with Sup: the system select() call is
hanging; Sup (and Ruby for that matter) is out of the equation. This
might be IMAP server weirdness, but that shouldn't cause Sup to freeze.

> b) that this would cause the
> whole thing to hang. That's what I find surprising, since Sup is multithreaded.

I suppose all of the threads are blocking on the select.

Sup hangs without -n; it always hangs right after it's "done fetching
IMAP headers", which is the last message we get. The behavior seems to
be:

1. Log reports "done fetching IMAP headers".

2. Sup begins malloc'ing like crazy. sup-sync is still able to catch
   signals, and if you kill it you find that it's somewhere in
   lib/sup/message.rb, message_to_chunks. sup doesn't respond to Ctrl+C.

3. After some minutes, CPU usage dies off, but Sup doesn't unfreeze.
   strace indicates that sup/sup-sync is waiting on a select(), and that
   the descriptors are TCP connections to the IMAP server.

Cheers,
Edward
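
For illustration only, here is a minimal sketch of what a timeout changes about a select() call like the one described above. It is written in Python rather than Ruby and is not Sup's code. The helper name wait_readable, the 0.2 second timeout, and the socketpair standing in for the TCP connection to the IMAP server are all assumptions made up for the example.

```python
import select
import socket


def wait_readable(sock, timeout_s):
    """Return True if sock becomes readable within timeout_s seconds.

    Hypothetical helper: passing a timeout means select() returns with
    empty lists when nothing happens, instead of blocking indefinitely.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    return bool(readable)


# socketpair() stands in for the client <-> IMAP server connection.
client, server = socket.socketpair()

# Case 1: the peer is silent. With no timeout, select() would never return here.
idle = wait_readable(client, 0.2)

# Case 2: the peer closes its end and the close is visible locally.
# The socket becomes readable and recv() returns b"" (end of stream).
server.close()
closed_ready = wait_readable(client, 0.2)
data = client.recv(1024) if closed_ready else None

client.close()
```

In the first case the call comes back after the timeout with nothing readable, so the caller gets a chance to give up or reconnect. Ruby's IO.select also accepts a timeout as its fourth argument, so a similar bound could in principle be applied on the Ruby side; whether that is the right fix for Sup is not something this sketch settles.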