Archive of RubyForge sup-talk mailing list
From: rlane@club.cc.cmu.edu (Rich Lane)
Subject: [sup-talk] xapian question
Date: Tue, 28 Jul 2009 11:57:56 -0400
Message-ID: <1248795865-sup-6634@pion.club.cc.cmu.edu>
In-Reply-To: <1248716325-sup-7534@masanjin.net>

Excerpts from William Morgan's message of Mon Jul 27 13:45:32 -0400 2009:
> Hey, I finally get to ask a question!
> 
> One of the mildly irritating things about Ferret was that it was
> impossible to update the labels of a message without updating the entire
> entry, i.e. including the body. So updating the labels of a message and
> saving that to disk required either re-loading the body from the source,
> or keeping the body explicitly in the index so that it could be loaded
> without going back to the source.
> 
> The latter approach is used by the current Ferret index implementation,
> since it's significantly faster (especially for slow sources like IMAP
> servers), but at the cost of a lot of disk space.
> 
> My understanding of Xapian is that this is also the case, since fields
> are essentially represented as prefixed terms, and so you're basically
> updating a big blob, but I wanted to confirm this. I ask because the
> entries.db file is very big. :)

Xapian actually provides add_term and remove_term for documents. I'd
definitely like to use these for label updates, but we need a way to
tell if only the labels have changed in sync_message. Or, we update the
index in Message#add_label/etc and get rid of the need to save buffers.
That might not be an option for the Ferret index, though.
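
A rough sketch of the label-only update idea, not Sup's actual code:
diff the old and new label sets and touch only the label terms that
changed. FakeDocument is a hypothetical stand-in that mimics just
add_term and remove_term so the example runs without the Xapian
bindings; the "L" prefix and the update_label_terms helper are
assumptions made for illustration.

```ruby
require 'set'

# Hypothetical stand-in for a Xapian document. It only records terms, so the
# sketch runs without the Xapian bindings installed.
class FakeDocument
  attr_reader :terms

  def initialize(terms = [])
    @terms = Set.new(terms)
  end

  def add_term(term)
    @terms << term
  end

  def remove_term(term)
    @terms.delete(term)
  end
end

# Assumed prefix for label terms; the real index may use a different one.
LABEL_PREFIX = "L"

# Apply only the label changes to doc, leaving every other term untouched.
# Returns the [added, removed] label lists.
def update_label_terms(doc, old_labels, new_labels)
  old_set = Set.new(old_labels.map(&:to_s))
  new_set = Set.new(new_labels.map(&:to_s))
  added   = (new_set - old_set).to_a.sort
  removed = (old_set - new_set).to_a.sort
  added.each   { |label| doc.add_term(LABEL_PREFIX + label) }
  removed.each { |label| doc.remove_term(LABEL_PREFIX + label) }
  [added, removed]
end

# The body terms ("hello", "world") are never rewritten; only labels change.
doc = FakeDocument.new(%w[Linbox Lunread hello world])
added, removed = update_label_terms(doc, [:inbox, :unread], [:inbox, :starred])
```

With the real bindings the modified document would presumably still have
to be written back to the database (for example with replace_document),
and sync_message would need the old label set to compute the diff.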

We don't store the body in entries.db, just enough info for
thread-index-mode. It's only about 800 bytes/message for me, but I don't
have snippets enabled so yours would be larger.



Thread overview: 5+ messages
2009-07-27 17:45 William Morgan
2009-07-28 15:57 ` Rich Lane [this message]
2009-07-28 19:05   ` William Morgan
2009-08-01  6:28     ` Rich Lane
2009-08-03 18:03       ` William Morgan
