From mboxrd@z Thu Jan 1 00:00:00 1970
From: rhomunuq+ml_sup@gmail.com (Iain)
Date: Wed, 15 Apr 2009 18:29:15 +0100
Subject: [sup-talk] Lost Maildir Messages
In-Reply-To:
References: <1237224356-sup-1941@entry> <1237383823-sup-8242@entry> <1237490703-sup-4095@entry>
Message-ID: <49E6196B.504@gmail.com>

> Hm, that would be bad. The right way to debug this is to wait for it to
> happen again (!) and examine the contents of the poll-mode and log-mode
> buffers, which should describe what Sup thinks it was doing at the time.

I've noticed what seems to be the same problem. I've been using getmail
(no procmail or fetchmail) to grab messages into Maildir folders.
Occasionally, messages don't show up in Sup. They appear only when I do
a manual sup-sync --changed.

I discovered by accident (by impatiently trying to read an email that I
knew getmail was in the middle of delivering) that I can reliably
replicate the problem, or at least a similar one. It seems to occur when
Sup polls the Maildir at the same time as getmail is retrieving the
message.

I can replicate it by pressing P repeatedly in Sup to poll manually
while getmail is in the process of retrieving. (It takes about 4 seconds
for getmail to go from initialising a retrieval to dropping the mail in
the Maildir, so there is a big window in which to press P.) Every time,
Sup doesn't pick up the new messages.

The log-mode buffer doesn't show anything relevant in my
artificially-produced replication of the problem, because it doesn't log
anything when you poll manually.

The poll-mode buffer for the source in question displays the exact same
message on every poll, whether before, during, or after the getmail run:

  Loading from maildir:///home/user/Mail/address at example.com/...
  Found message at 12398138490002177 with labels {mylabel}

The timestamp never changes.

If I had to guess, I'd say that the shortcut Sup takes (looking at the
mtime of the directory to decide whether it needs to scan for new
Maildir messages) has a race condition. A weird race condition, though,
because I've verified (ls -l --time-style=+%s) that the "new" and "tmp"
folders do have their timestamps updated when the mail drops in.

This might explain the infrequent lost messages: when not hitting P
repeatedly, the race is unlikely to occur, since it only happens when a
Sup poll falls within the few-second window during which getmail is
working on a new message to deliver.

~Iain
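
To make the guess above concrete, here is a minimal Python sketch of one
way an mtime shortcut can miss a delivery. It is not Sup's code (Sup is
written in Ruby, and I have not checked how its Maildir source caches
mtimes). The class MtimeCachingPoller and the function demo are
hypothetical names, and the sketch assumes the cached mtime is compared
at whole-second resolution. The delivery is simulated by creating a file
and then forcing the directory's mtime with os.utime so the timing is
deterministic.

```python
import os
import tempfile


class MtimeCachingPoller:
    """Hypothetical poller: rescans a directory only when its mtime changes."""

    def __init__(self, directory):
        self.directory = directory
        self.last_mtime = None  # whole-second mtime seen at the last scan
        self.seen = set()       # file names already reported

    def poll(self):
        # Assumption: the mtime is compared at one-second resolution.
        mtime = int(os.stat(self.directory).st_mtime)
        if mtime == self.last_mtime:
            return []  # shortcut: directory looks unchanged, skip the scan
        self.last_mtime = mtime
        new = sorted(set(os.listdir(self.directory)) - self.seen)
        self.seen.update(new)
        return new


def demo():
    with tempfile.TemporaryDirectory() as new_dir:
        poller = MtimeCachingPoller(new_dir)

        # Poll an empty directory; the poller caches mtime 1000.
        os.utime(new_dir, (1000, 1000))
        first = poller.poll()

        # A message lands in the same second as that poll. The directory
        # mtime is forced back to 1000 to simulate this.
        open(os.path.join(new_dir, "msg1"), "w").close()
        os.utime(new_dir, (1000, 1000))
        second = poller.poll()  # shortcut fires, message is missed

        # Only a later change to the directory mtime reveals the message.
        os.utime(new_dir, (1001, 1001))
        third = poller.poll()

        return first, second, third
```

In this sketch the message is only delayed until the directory's mtime
changes again, so it may not be the whole story behind messages that
stay lost until sup-sync --changed. Comparing against the time of the
last scan minus a small safety margin, rather than for equality, is one
common way to close this kind of window.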