From mboxrd@z Thu Jan 1 00:00:00 1970
From: wmorgan-sup@masanjin.net (William Morgan)
Date: Fri, 03 Jul 2009 11:37:46 -0700
Subject: [sup-talk] sup
In-Reply-To: <20090703173353.GA4117@gmx.de>
References: <20090703173353.GA4117@gmx.de>
Message-ID: <1246645330-sup-1433@entry>

[cc'ing sup-talk]

Hi Marc,

Reformatted excerpts from Marc Weber's message of 2009-07-03:
> I finally managed to package sup on nix. (nixos.org) and I'm ready to
> give sup a try.

Great!

> a) if ferret does save the whole messages you don't really need the
> original sources (maildir, mbox files), right?

That's basically correct, but:

a) Ferret doesn't store the entire message as represented in the
   mailstore (e.g. it only keeps a subset of the headers).
b) Ferret has had corruption problems in the past, which I believe have
   now been fixed, but I wouldn't bet my mailstore on it.
c) It's easy to turn that feature off, i.e., Ferret can index email for
   search without directly storing the email.

The only reason I have Ferret store the email is because it makes
changing the message labels quicker. If the Ferret index is too large,
you can disable this feature at the expense of some speed.

> b) [redacted] on irc told me that ferret has been abandoned in favour of
> Xapian ? So are there any plans in replacing Ferret by Xapian as well?

It hasn't happened quite yet (the Xapian backend is very recent and
experimental), but that's the goal. I'm currently thinking about having
the next release of Sup use Xapian as the default index, but still
support Ferret, and the release after that not support Ferret at all.
But we shall see.

I'm also working on a sup server, which will act as yet another index
type.

> c) I've got about 280MB of mails from the haskelcafe mailinglist or
> such. I missed telling sup to mark those old mails as read as well
> as adding a "haskell-cafe" tag. It's not a big problem because all
> mails contain [haskell-cafe] in the subject. However finding all
> threads (using !!) takes ages (more than 20min or so?) so I didn't
> wait until it finished. Opening the same maildir in mutt takes less
> than a minute.

The Xapian backend will speed this up dramatically, because the thread
structures are cached, and that's what's expensive in this case. But
the best solution is to use sup-tweak-labels, which is an offline tool
built for exactly this purpose.

> I've seen that ferret does save mails but displays threads.
> Would it make sense to index the threads instead of mails?
> Then displaying them and querying them could be even faster?

See above.

Welcome to Sup!
--
William