Archive of RubyForge sup-talk mailing list
* [sup-talk] sup
       [not found] <20090703173353.GA4117@gmx.de>
@ 2009-07-03 18:37 ` William Morgan
  0 siblings, 0 replies; only message in thread
From: William Morgan @ 2009-07-03 18:37 UTC


[cc'ing sup-talk]

Hi Marc,

Reformatted excerpts from Marc Weber's message of 2009-07-03:
> I finally managed to package sup on nix (nixos.org) and I'm ready to
> give sup a try.

Great!

> a) If Ferret saves the whole messages, you don't really need the
>     original sources (maildir, mbox files), right?

That's basically correct, but:

a) Ferret doesn't store the entire message as represented in the
   mailstore (e.g. it only keeps a subset of the headers).
b) Ferret has had corruption problems in the past, which I believe have
   now been fixed, but I wouldn't bet my mailstore on it.
c) It's easy to turn that feature off, i.e., Ferret can index email for
   search without directly storing the email. The only reason I have
   Ferret store the email is that it makes changing the message
   labels quicker. If the Ferret index is too large, you can disable
   this feature at the expense of some speed.
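
To illustrate the trade-off in (c), here is a minimal Python sketch. It is not Ferret's API or Sup's code: `TinyIndex`, `relabel`, and `load` are hypothetical names, and the sketch assumes that changing labels means deleting and re-adding the whole index entry. Under that assumption, an index that stores message text can relabel without touching the mailstore, while an index-only configuration has to re-read the source.

```python
from collections import defaultdict

class TinyIndex:
    """Toy inverted index (hypothetical; not Ferret's API)."""

    def __init__(self, store_messages, load_from_source):
        self.store_messages = store_messages      # keep message text in the index?
        self.load_from_source = load_from_source  # fallback: read the mailstore
        self.postings = defaultdict(set)          # term -> set of message ids
        self.stored = {}                          # message id -> stored text
        self.labels = {}                          # message id -> set of labels

    def add(self, msg_id, text, labels):
        for term in text.lower().split():
            self.postings[term].add(msg_id)
        self.labels[msg_id] = set(labels)
        if self.store_messages:
            self.stored[msg_id] = text

    def delete(self, msg_id):
        for ids in self.postings.values():
            ids.discard(msg_id)
        self.labels.pop(msg_id, None)
        self.stored.pop(msg_id, None)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), ()))

    def relabel(self, msg_id, labels):
        # Assumption: no in-place update, so the entry is deleted and re-added.
        if self.store_messages:
            text = self.stored[msg_id]             # cheap: already in the index
        else:
            text = self.load_from_source(msg_id)   # slower: re-read the source
        self.delete(msg_id)
        self.add(msg_id, text, labels)

mailstore = {"m1": "[haskell-cafe] monads explained"}
source_reads = []

def load(msg_id):
    source_reads.append(msg_id)  # record every trip back to the mailstore
    return mailstore[msg_id]

with_store = TinyIndex(True, load)
without_store = TinyIndex(False, load)
for idx in (with_store, without_store):
    idx.add("m1", mailstore["m1"], {"unread"})
    idx.relabel("m1", {"haskell-cafe"})
```

Both indexes still answer searches after relabeling; only the one without stored text had to go back to the mailstore, which is the speed cost mentioned above.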

> b) [redacted] on IRC told me that Ferret has been abandoned in favour of
>   Xapian? So are there any plans to replace Ferret with Xapian as well?

It hasn't happened quite yet (the Xapian backend is very recent and
experimental), but that's the goal. I'm currently thinking about having
the next release of Sup use Xapian as the default index, but still
support Ferret, and the release after that not support Ferret at all.
But we shall see. I'm also working on a sup server, which will act as
yet another index type.

> c) I've got about 280MB of mail from the haskell-cafe mailing list or
>   such. I forgot to tell sup to mark those old mails as read and to
>   add a "haskell-cafe" tag. It's not a big problem because all
>   mails contain [haskell-cafe] in the subject. However, finding all
>   threads (using !!) takes ages (more than 20 min or so?), so I didn't
>   wait until it finished.  Opening the same maildir in mutt takes less
>   than a minute.

The Xapian backend will speed this up dramatically, because the thread
structures are cached, and building them is the expensive part in this
case. But the best solution is to use sup-tweak-labels, which is an
offline tool built for exactly this purpose.
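
Conceptually, an offline relabeling pass of this kind can work message by message and never needs to build threads, which is why it avoids the slow path. The Python sketch below shows only that idea. It is not sup-tweak-labels' implementation or command-line interface; `tweak_labels` and the message dictionaries are hypothetical.

```python
def tweak_labels(messages, subject_marker, add=(), remove=()):
    """Add/remove labels on every message whose subject contains the marker.

    Hypothetical helper: works per message, with no thread reconstruction.
    Returns the number of messages whose labels actually changed.
    """
    changed = 0
    for msg in messages:
        if subject_marker not in msg["subject"]:
            continue
        new_labels = (msg["labels"] | set(add)) - set(remove)
        if new_labels != msg["labels"]:
            msg["labels"] = new_labels
            changed += 1
    return changed

messages = [
    {"subject": "[haskell-cafe] monads", "labels": {"unread", "inbox"}},
    {"subject": "lunch?", "labels": {"unread", "inbox"}},
]

# Tag the list traffic and mark it as read in one pass.
changed = tweak_labels(messages, "[haskell-cafe]",
                       add={"haskell-cafe"}, remove={"unread"})
```

Only the first message is touched; the unrelated one keeps its `unread` label.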

> I've seen that ferret does save mails but displays threads.
> Would it make sense to index the threads instead of mails?
> Then displaying them and querying them could be even faster?

See above.

Welcome to Sup!
-- 
William <wmorgan-sup at masanjin.net>

