From: wmorgan-sup@masanjin.net (William Morgan)
Subject: [sup-talk] [PATCH 0/18] Xapian-based index
Date: Wed, 24 Jun 2009 09:30:39 -0700
Message-ID: <1245854803-sup-4481@entry>
In-Reply-To: <1245531017-9907-1-git-send-email-rlane@club.cc.cmu.edu>
Hi Rich,
Reformatted excerpts from Rich Lane's message of 2009-06-20:
> This patch series refactors the Index class to remove Ferret-isms and
> support multiple index implementations. The included XapianIndex is a
> bit faster at indexing messages and significantly faster when
> searching because it precomputes thread membership. It also works on
> Ruby 1.9.1.
This is great. Really, really great. You've refactored a crufty
interface that's been growing untamed over the past three years, you've
gotten us away from the unmaintained scariness that is Ferret, you've
fixed the largest source of interface slowness (thread recomputation),
and you've enabled us to move to the beautiful, speedy, encoding-aware
world of Ruby 1.9. Thank you for satisfying all of my Sup-related
desires in one fell swoop. From my lofty throne, I commend thee.
Once the bugs are ironed out, I would like to make this the default
index format and eventually deprecate Ferret.
In the meantime, I've placed your patches on a branch called xapian. If
anyone wants to play with this, here's what you do:
1. Install the Ruby Xapian library and the Ruby gdbm library, if you
   don't have them. These are packaged by your distro, and are not gems.
2. git fetch
3. git checkout -b xapian origin/xapian
4. cp ~/.sup/sources.yaml /tmp # just in case
5. sup-dump > dumpfile
6. SUP_INDEX=xapian sup-sync --all --all-sources --restore dumpfile
7. SUP_INDEX=xapian bin/sup -o
8. Oooh, fast.
This should not disturb your Ferret index, so you can switch back and
forth between the two. (Message state, of course, is not shared.)
However, adding new messages to one index will prevent them from being
automatically added to the other, so I recommend running in Xapian mode
with -o and not pressing 'P'.
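For anyone curious how the SUP_INDEX switch in steps 6 and 7 could work, here is a minimal sketch. It is not the code from the patch series: the helper name choose_index, the :index config key, and the precedence (environment variable first, then config entry, then Ferret) are all assumptions made for illustration.

```ruby
# Hypothetical helper, not taken from the patches: pick an index backend name.
# Assumed precedence: SUP_INDEX environment variable, then a config entry,
# then Ferret as the fallback.
KNOWN_INDEXES = %w[ferret xapian].freeze

def choose_index(env = ENV, config = {})
  name = env["SUP_INDEX"] || config[:index] || "ferret"
  unless KNOWN_INDEXES.include?(name)
    raise ArgumentError, "unknown index type: #{name.inspect}"
  end
  name
end

puts choose_index({ "SUP_INDEX" => "xapian" })  # environment variable set
puts choose_index({}, { index: "xapian" })      # config entry only
puts choose_index({}, {})                       # neither set
```

Passing the environment and config in as arguments keeps the sketch testable without touching the real ENV.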
> It's missing a couple of features, notably threading by subject.
FWIW, I've been thinking about deprecating that particular feature for
quite some time.
> I'm sure there are many more bugs left, so I'd appreciate any testing
> or review you all can provide.
sup-sync crashes for me fairly systematically with this error:
  ./lib/sup/xapian_index.rb:404:in `sortable_serialise': Expected argument 0 of type double, but got Fixnum 51767811298 (TypeError)
  in SWIG method 'Xapian::sortable_serialise'
      from ./lib/sup/xapian_index.rb:404:in `index_message'
      from ./lib/sup/xapian_index.rb:111:in `sync_message'
      from /usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
      from ./lib/sup/xapian_index.rb:324:in `synchronize'
      from ./lib/sup/xapian_index.rb:110:in `sync_message'
      from ./lib/sup/util.rb:519:in `send'
      from ./lib/sup/util.rb:519:in `method_missing'
      from ./lib/sup/poll.rb:157:in `add_messages_from'
      from ./lib/sup/source.rb:100:in `each'
      from ./lib/sup/util.rb:558:in `send'
      from ./lib/sup/util.rb:558:in `__pass'
      from ./lib/sup/util.rb:545:in `method_missing'
      from ./lib/sup/poll.rb:141:in `add_messages_from'
      from ./lib/sup/util.rb:519:in `send'
      from ./lib/sup/util.rb:519:in `method_missing'
      from bin/sup-sync:140
      from bin/sup-sync:135:in `each'
      from bin/sup-sync:135
I haven't spent any time tracking it down. Other than that, so far so
good.
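Going only by the message in the trace (the binding wanted a double and got a Fixnum), one possible workaround is to coerce the value to a Float before the call. The sketch below is untested against the real library: strict_serialise is a hypothetical stand-in that only mimics the type check, so it runs without Xapian installed, and serialise_number is likewise a made-up name. Whether coercion is the right fix, or whether the value itself is the real problem, is not something the trace settles.

```ruby
# Hypothetical stand-in for the SWIG-wrapped Xapian.sortable_serialise.
# It only mimics the strict type check seen in the trace; the packed
# bytes are NOT Xapian's real encoding.
def strict_serialise(value)
  unless value.is_a?(Float)
    raise TypeError,
          "Expected argument 0 of type double, but got #{value.class} #{value}"
  end
  [value].pack("G") # big-endian double, placeholder encoding only
end

# Possible workaround (an assumption, not the confirmed fix):
# coerce to Float before handing the value to the binding.
def serialise_number(value)
  strict_serialise(Float(value))
end

bad_value = 51767811298 # the integer reported in the trace

begin
  strict_serialise(bad_value)
rescue TypeError => e
  puts "without coercion: #{e.message}"
end

puts "with coercion: #{serialise_number(bad_value).bytesize} bytes"
```

In the real code the equivalent change would presumably be a .to_f on whatever is passed at xapian_index.rb:404, but that line is not shown here.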
--
William <wmorgan-sup at masanjin.net>