* [sup-talk] Xapian: Term too long
@ 2009-10-12 22:34 Tero Tilus
2009-10-15 12:59 ` William Morgan
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Tero Tilus @ 2009-10-12 22:34 UTC (permalink / raw)
To: sup-talk
sup-sync blows up like this:
/home/terotil/src/sup/lib/sup/xapian_index.rb:446:in `replace_document': InvalidArgumentError: Term too long (> 245): Lfwd: =?iso-8859-1?q?tekij=e4n_oikeudet=5d?= (ArgumentError)
x-enigmail-version: 0.92.0.0
content-type: multipart/mixed;
boundary="------------010606010007070802040301"
x-virus-scanned: amavisd-new at cc.jyu.fi
x-spam-status: no, hits=-2.373 required=5 tests=[awl=0.226, bayes_00=-2.599
from /home/terotil/src/sup/lib/sup/xapian_index.rb:446:in `sync_message'
from /usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
from /home/terotil/src/sup/lib/sup/xapian_index.rb:363:in `synchronize'
from /home/terotil/src/sup/lib/sup/xapian_index.rb:440:in `sync_message'
from /home/terotil/src/sup/lib/sup/xapian_index.rb:92:in `add_message'
from /home/terotil/src/sup/bin/sup-sync:211
...
The relevant part of the problematic mail looks like this:
User-Agent: Debian Thunderbird 1.0.6 (X11/20050802)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: mutikainen@iki.fi
Subject: [Fwd: =?ISO-8859-1?Q?tekij=E4n_oikeudet=5D?=
X-Enigmail-Version: 0.92.0.0
Content-Type: multipart/mixed;
boundary="------------010606010007070802040301"
X-Virus-Scanned: amavisd-new at cc.jyu.fi
X-Spam-Status: No, hits=-2.373 required=5 tests=[AWL=0.226, BAYES_00=-2.599]
X-Spam-Level:
X-Sorted: Whitelist
Content-Length: 11892
This is how I worked around it for now:
diff --git a/lib/sup/xapian_index.rb b/lib/sup/xapian_index.rb
index ad45b0e..d3b3e25 100644
--- a/lib/sup/xapian_index.rb
+++ b/lib/sup/xapian_index.rb
@@ -443,7 +443,11 @@ EOS
warn "docid underflow, dropping #{m.id.inspect}"
return
end
- @xapian.replace_document docid, doc
+ begin
+ @xapian.replace_document docid, doc
+ rescue StandardError => err
+ warn "Failed to add message #{m.id.inspect} to Xapian index: #{err}"
+ end
end
m.labels.each { |l| LabelManager << l }
It looks like lib/sup/xapian_index.rb tries to override
Xapian::Document#add_term with a version that drops overly long
terms. However, you can't override a method just by including a
module: methods defined in the including class take precedence over
methods from the included module.
terotil@sotka:~$ irb
> class Foo; def bar; :bar; end; end
=> nil
> module Baz; def bar; :baz; end; end
=> nil
> class Foo; include Baz; end
=> Foo
> Foo.new.bar
=> :bar
> Foo.ancestors
=> [Foo, Baz, Object, Kernel] # Foo before Baz, methods in Foo take priority
It is still Foo#bar being called, not Baz#bar. To override the method
you need to reopen Xapian::Document and use alias method chaining. Or
you could do tricks like
http://coderrr.wordpress.com/2008/10/29/secure-alias-method-chaining/
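
For illustration, here is a minimal sketch of the alias method chaining
mentioned above. FakeDocument is a hypothetical stand-in for
Xapian::Document (the real bindings are not loaded here), and the 245
limit is an assumption taken from the error message, not from sup's
source.

```ruby
# Hypothetical stand-in for Xapian::Document, so the sketch runs without
# the Xapian bindings installed.
class FakeDocument
  attr_reader :terms

  def initialize
    @terms = []
  end

  def add_term(term)
    @terms << term
  end
end

# Assumed limit, taken from the "Term too long (> 245)" error message.
MAX_TERM_LENGTH = 245

# Reopen the class, keep the original method under a new name, then
# redefine add_term so it delegates to the original only for short terms.
class FakeDocument
  alias old_add_term add_term

  def add_term(term)
    if term.length <= MAX_TERM_LENGTH
      old_add_term term
    else
      warn "dropping excessively long term (#{term.length} chars)"
    end
  end
end

doc = FakeDocument.new
doc.add_term "short"      # stored
doc.add_term "x" * 300    # dropped with a warning on stderr
```

Unlike the include approach, the redefinition lives in the class itself,
so it is the first add_term found during method lookup.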
--
Tero Tilus ## 050 3635 235 ## http://tero.tilus.net/
_______________________________________________
sup-talk mailing list
sup-talk@rubyforge.org
http://rubyforge.org/mailman/listinfo/sup-talk
* Re: [sup-talk] Xapian: Term too long
2009-10-12 22:34 [sup-talk] Xapian: Term too long Tero Tilus
@ 2009-10-15 12:59 ` William Morgan
2009-10-20 5:34 ` [sup-talk] [PATCH] xapian: replace DocumentMethods module with plain monkeypatching Rich Lane
2009-10-20 6:14 ` Rich Lane
2 siblings, 0 replies; 6+ messages in thread
From: William Morgan @ 2009-10-15 12:59 UTC (permalink / raw)
To: sup-talk
Reformatted excerpts from Tero Tilus's message of 2009-10-12:
> Looks like lib/sup/xapian_index.rb tries to override
> Xapian::Document#add_term with a version which is wired to ditch too
> long terms. Only that you can't override methods just by including a
> module. Methods of the including class override methods in included
> module.
Very good point. Thanks!
--
William <wmorgan-sup@masanjin.net>
* [sup-talk] [PATCH] xapian: replace DocumentMethods module with plain monkeypatching
2009-10-12 22:34 [sup-talk] Xapian: Term too long Tero Tilus
2009-10-15 12:59 ` William Morgan
@ 2009-10-20 5:34 ` Rich Lane
2009-10-20 6:13 ` Rich Lane
2009-10-20 6:14 ` Rich Lane
2 siblings, 1 reply; 6+ messages in thread
From: Rich Lane @ 2009-10-20 5:34 UTC (permalink / raw)
To: sup-talk
---
lib/sup/xapian_index.rb | 25 +++++++++++++++++++++++++
1 files changed, 25 insertions(+), 0 deletions(-)
diff --git a/lib/sup/xapian_index.rb b/lib/sup/xapian_index.rb
index e1cfe65..c373c17 100644
--- a/lib/sup/xapian_index.rb
+++ b/lib/sup/xapian_index.rb
@@ -560,7 +560,32 @@ EOS
raise "Invalid term type #{type}"
end
end
+end
end
+class Xapian::Document
+ def entry
+ Marshal.load data
+ end
+
+ def entry=(x)
+ self.data = Marshal.dump x
+ end
+
+ def index_text text, prefix, weight=1
+ term_generator = Xapian::TermGenerator.new
+ term_generator.stemmer = Xapian::Stem.new(Redwood::XapianIndex::STEM_LANGUAGE)
+ term_generator.document = self
+ term_generator.index_text text, weight, prefix
+ end
+
+ alias old_add_term add_term
+ def add_term term
+ if term.length <= Redwood::XapianIndex::MAX_TERM_LENGTH
+ old_add_term term
+ else
+ warn "dropping excessively long term #{term}"
+ end
+ end
end
--
1.6.4.2
* Re: [sup-talk] [PATCH] xapian: replace DocumentMethods module with plain monkeypatching
2009-10-20 5:34 ` [sup-talk] [PATCH] xapian: replace DocumentMethods module with plain monkeypatching Rich Lane
@ 2009-10-20 6:13 ` Rich Lane
0 siblings, 0 replies; 6+ messages in thread
From: Rich Lane @ 2009-10-20 6:13 UTC (permalink / raw)
To: sup-talk
Disregard this one. (I thought master had already gotten my
update-message-state patch)
* [sup-talk] [PATCH] xapian: replace DocumentMethods module with plain monkeypatching
2009-10-12 22:34 [sup-talk] Xapian: Term too long Tero Tilus
2009-10-15 12:59 ` William Morgan
2009-10-20 5:34 ` [sup-talk] [PATCH] xapian: replace DocumentMethods module with plain monkeypatching Rich Lane
@ 2009-10-20 6:14 ` Rich Lane
2009-11-02 19:28 ` William Morgan
2 siblings, 1 reply; 6+ messages in thread
From: Rich Lane @ 2009-10-20 6:14 UTC (permalink / raw)
To: sup-talk
---
lib/sup/xapian_index.rb | 47 ++++++++++++++++++++++-------------------------
1 files changed, 22 insertions(+), 25 deletions(-)
diff --git a/lib/sup/xapian_index.rb b/lib/sup/xapian_index.rb
index ad45b0e..34d67d5 100644
--- a/lib/sup/xapian_index.rb
+++ b/lib/sup/xapian_index.rb
@@ -565,35 +565,32 @@ EOS
raise "Invalid term type #{type}"
end
end
+end
- module DocumentMethods
- def entry
- Marshal.load data
- end
-
- def entry=(x)
- self.data = Marshal.dump x
- end
+end
- def index_text text, prefix, weight=1
- term_generator = Xapian::TermGenerator.new
- term_generator.stemmer = Xapian::Stem.new(STEM_LANGUAGE)
- term_generator.document = self
- term_generator.index_text text, weight, prefix
- end
+class Xapian::Document
+ def entry
+ Marshal.load data
+ end
- def add_term term
- if term.length <= MAX_TERM_LENGTH
- super term
- else
- warn "dropping excessively long term #{term}"
- end
- end
+ def entry=(x)
+ self.data = Marshal.dump x
end
-end
-end
+ def index_text text, prefix, weight=1
+ term_generator = Xapian::TermGenerator.new
+ term_generator.stemmer = Xapian::Stem.new(Redwood::XapianIndex::STEM_LANGUAGE)
+ term_generator.document = self
+ term_generator.index_text text, weight, prefix
+ end
-class Xapian::Document
- include Redwood::XapianIndex::DocumentMethods
+ alias old_add_term add_term
+ def add_term term
+ if term.length <= Redwood::XapianIndex::MAX_TERM_LENGTH
+ old_add_term term
+ else
+ warn "dropping excessively long term #{term}"
+ end
+ end
end
--
1.6.4.2
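
A side note on the length check in the patch above, offered as a sketch
rather than as part of the patch: on Ruby 1.8 (the version in the
backtrace) String#length counts bytes, but on Ruby 1.9 and later it
counts characters. As far as I know Xapian's term limit is measured in
bytes, so on newer Rubies a byte-based check may be the safer one. The
names MAX_TERM_BYTES and term_fits? are hypothetical, and the 245 value
is assumed from the error message.

```ruby
# Assumed limit, taken from the "Term too long (> 245)" error message.
MAX_TERM_BYTES = 245

# Hypothetical helper: compare the encoded size, not the character count.
def term_fits?(term)
  term.bytesize <= MAX_TERM_BYTES
end

# 200 characters, each 2 bytes in UTF-8 (\u escapes yield a UTF-8 string).
term = "\u00e4" * 200

chars = term.length     # character count on Ruby 1.9+
bytes = term.bytesize   # encoded size in bytes

puts "chars=#{chars} bytes=#{bytes} fits=#{term_fits?(term)}"
```

A term like this would pass a character-based check against 245 while
still exceeding a 245-byte limit.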
* Re: [sup-talk] [PATCH] xapian: replace DocumentMethods module with plain monkeypatching
2009-10-20 6:14 ` Rich Lane
@ 2009-11-02 19:28 ` William Morgan
0 siblings, 0 replies; 6+ messages in thread
From: William Morgan @ 2009-11-02 19:28 UTC (permalink / raw)
To: sup-talk
Branch xapian-bugfix, merged into next. Thanks!
--
William <wmorgan-sup@masanjin.net>