Archive of RubyForge sup-talk mailing list
* [sup-talk] Bug: Converting the index to xapian fails with a message with very old date
@ 2009-09-20 19:46 Michael Stapelberg
  2009-09-26 13:56 ` William Morgan
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Stapelberg @ 2009-09-20 19:46 UTC


Hi,

One of my sources contains a spam mail with the following Date header:

Date: Mon, 1 Jan 1601 00:48:33 +0500

This makes sup/xapian/ruby (one of them) go crazy and abort with an exception:
## read 5856m (about 72%) @ 10.0m/s. 0:09:47 elapsed, about 0:03:44 remaining
/usr/lib/ruby/1.8/sup/xapian_index.rb:532:in `_dump': year too big to marshal (ArgumentError)
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:532:in `dump'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:532:in `index_message'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:124:in `sync_message'
        from /usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:398:in `synchronize'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:123:in `sync_message'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:94:in `add_message'
        from /home/michael/software/sup-mainline/bin/sup-sync:211
        from /usr/lib/ruby/1.8/sup/poll.rb:151:in `each_message_from'
        from /usr/lib/ruby/1.8/sup/imap.rb:197:in `each'
        from /usr/lib/ruby/1.8/sup/imap.rb:185:in `upto'
        from /usr/lib/ruby/1.8/sup/imap.rb:185:in `each'
        from /usr/lib/ruby/1.8/sup/util.rb:560:in `send'
        from /usr/lib/ruby/1.8/sup/util.rb:560:in `__pass'
        from /usr/lib/ruby/1.8/sup/util.rb:547:in `method_missing'
        from /usr/lib/ruby/1.8/sup/poll.rb:139:in `each_message_from'
        from /usr/lib/ruby/1.8/sup/util.rb:520:in `send'
        from /usr/lib/ruby/1.8/sup/util.rb:520:in `method_missing'
        from /home/michael/software/sup-mainline/bin/sup-sync:146
        from /home/michael/software/sup-mainline/bin/sup-sync:141:in `each'
        from /home/michael/software/sup-mainline/bin/sup-sync:141

I used revision 68bf6a277c5fdefb3b9d6a4b5d4dfbce3f9f9ccf of sup.

Best regards,
Michael



* [sup-talk] Bug: Converting the index to xapian fails with a message with very old date
  2009-09-20 19:46 [sup-talk] Bug: Converting the index to xapian fails with a message with very old date Michael Stapelberg
@ 2009-09-26 13:56 ` William Morgan
  2009-09-26 21:38   ` Michael Stapelberg
  0 siblings, 1 reply; 5+ messages in thread
From: William Morgan @ 2009-09-26 13:56 UTC


Reformatted excerpts from Michael Stapelberg's message of 2009-09-20:
> This makes sup/xapian/ruby (one of them) go crazy and abort with an exception:
> ## read 5856m (about 72%) @ 10.0m/s. 0:09:47 elapsed, about 0:03:44 remaining
> /usr/lib/ruby/1.8/sup/xapian_index.rb:532:in `_dump': year too big to marshal
> (ArgumentError)

Interesting. The Xapian index has had some trouble with crazy dates in
the past, but that should be fixed. Can you apply the following debug
patch and send the output for that message?

Thanks!

diff --git a/lib/sup/xapian_index.rb b/lib/sup/xapian_index.rb
index ab25ea0..b94c8b0 100644
--- a/lib/sup/xapian_index.rb
+++ b/lib/sup/xapian_index.rb
@@ -509,6 +509,9 @@ EOS
       Xapian.sortable_serialise 0
     end
 
+    puts "> truncated date is #{truncated_date.inspect}"
+    puts "> date_value is #{date_value.inspect}"
+
     docid = nil
     unless doc = find_doc(m.id)
       doc = Xapian::Document.new

-- 
William <wmorgan-sup at masanjin.net>



* [sup-talk] Bug: Converting the index to xapian fails with a message with very old date
  2009-09-26 13:56 ` William Morgan
@ 2009-09-26 21:38   ` Michael Stapelberg
  2009-10-01 13:49     ` William Morgan
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Stapelberg @ 2009-09-26 21:38 UTC


Hi,

Excerpts from William Morgan's message of Sat Sep 26 15:56:04 +0200 2009:
> Interesting. The Xapian index has had some trouble with crazy dates in
> the past, but that should be fixed. Can you apply the following debug
> patch and send the output for that message?
Yes, here we go:

truncated date is Thu Jan 01 01:00:00 +0100 1970
date_value is "\200"
/usr/lib/ruby/1.8/sup/xapian_index.rb:536:in `_dump': year too big to marshal (ArgumentError)
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:536:in `dump'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:536:in `index_message'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:124:in `sync_message'
        from /usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:398:in `synchronize'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:123:in `sync_message'
        from /usr/lib/ruby/1.8/sup/xapian_index.rb:94:in `add_message'
        from /home/michael/software/sup-mainline/bin/sup-sync:211
        from /usr/lib/ruby/1.8/sup/poll.rb:151:in `each_message_from'
        from /usr/lib/ruby/1.8/sup/imap.rb:197:in `each'
        from /usr/lib/ruby/1.8/sup/imap.rb:185:in `upto'
        from /usr/lib/ruby/1.8/sup/imap.rb:185:in `each'
        from /usr/lib/ruby/1.8/sup/util.rb:560:in `send'
        from /usr/lib/ruby/1.8/sup/util.rb:560:in `__pass'
        from /usr/lib/ruby/1.8/sup/util.rb:547:in `method_missing'
        from /usr/lib/ruby/1.8/sup/poll.rb:139:in `each_message_from'
        from /usr/lib/ruby/1.8/sup/util.rb:520:in `send'
        from /usr/lib/ruby/1.8/sup/util.rb:520:in `method_missing'
        from /home/michael/software/sup-mainline/bin/sup-sync:146
        from /home/michael/software/sup-mainline/bin/sup-sync:141:in `each'
        from /home/michael/software/sup-mainline/bin/sup-sync:141

Best regards,
Michael



* [sup-talk] Bug: Converting the index to xapian fails with a message with very old date
  2009-09-26 21:38   ` Michael Stapelberg
@ 2009-10-01 13:49     ` William Morgan
  2009-10-02 20:35       ` Rich Lane
  0 siblings, 1 reply; 5+ messages in thread
From: William Morgan @ 2009-10-01 13:49 UTC


Reformatted excerpts from Michael Stapelberg's message of 2009-09-26:
> truncated date is Thu Jan 01 01:00:00 +0100 1970
> date_value is "\200"
> /usr/lib/ruby/1.8/sup/xapian_index.rb:536:in `_dump': year too big to marshal
> (ArgumentError)
>         from /usr/lib/ruby/1.8/sup/xapian_index.rb:536:in `dump'
>         from /usr/lib/ruby/1.8/sup/xapian_index.rb:536:in `index_message'
>         from /usr/lib/ruby/1.8/sup/xapian_index.rb:124:in `sync_message'

At this point I'm hoping that Rich will chime in. :)
-- 
William <wmorgan-sup at masanjin.net>



* [sup-talk] Bug: Converting the index to xapian fails with a message with very old date
  2009-10-01 13:49     ` William Morgan
@ 2009-10-02 20:35       ` Rich Lane
  0 siblings, 0 replies; 5+ messages in thread
From: Rich Lane @ 2009-10-02 20:35 UTC


Excerpts from William Morgan's message of Thu Oct 01 09:49:54 -0400 2009:
> Reformatted excerpts from Michael Stapelberg's message of 2009-09-26:
> > truncated date is Thu Jan 01 01:00:00 +0100 1970
> > date_value is "\200"
> > /usr/lib/ruby/1.8/sup/xapian_index.rb:536:in `_dump': year too big to marshal
> > (ArgumentError)
> >         from /usr/lib/ruby/1.8/sup/xapian_index.rb:536:in `dump'
> >         from /usr/lib/ruby/1.8/sup/xapian_index.rb:536:in `index_message'
> >         from /usr/lib/ruby/1.8/sup/xapian_index.rb:124:in `sync_message'
> 
> At this point I'm hoping that Rich will chime in. :)

This strikes me as a bug in Marshal.dump - it should be able to handle
arbitrary dates. I've never messed with custom marshalling functions
before, but monkey patching one in could be a workaround. Or we could
just truncate the dates before putting them in the index entry.
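
The second option, truncating dates before they go into the index entry,
might look roughly like the sketch below. This is not the actual sup code:
clamp_date, MIN_DATE and MAX_DATE are hypothetical names, and the bounds
(the Unix epoch and the 32-bit time_t limit) are assumptions picked for
illustration. Recent Ruby releases should marshal a year like 1601 without
raising, so the sketch only demonstrates the clamping step itself; it
needs Ruby 2.4 or later for Comparable#clamp.

```ruby
# Hypothetical bounds for dates stored in an index entry (assumed values,
# not taken from sup): the Unix epoch and the 32-bit time_t maximum.
MIN_DATE = Time.at(0).utc
MAX_DATE = Time.at(2**31 - 1).utc

# Hypothetical helper: force a date into [MIN_DATE, MAX_DATE] so that the
# value handed to Marshal.dump is always within a safe range.
def clamp_date(date)
  # Anything that is not a Time (nil, unparseable header) becomes MIN_DATE.
  return MIN_DATE unless date.is_a?(Time)
  date.clamp(MIN_DATE, MAX_DATE)
end

# The Date header from the spam mail reported in this thread.
raw_date = Time.new(1601, 1, 1, 0, 48, 33, "+05:00")

# Build a simplified stand-in for an index entry and marshal it.
entry = { :date => clamp_date(raw_date) }
restored = Marshal.load(Marshal.dump(entry))

puts restored[:date].year
```

Dates inside the range pass through unchanged, so only pathological
headers like the one above would be affected.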

