Archive of RubyForge sup-talk mailing list
From: wmorgan-sup@masanjin.net (William Morgan)
Subject: [sup-talk] [PATCH] First draft of attachment processing for more gmail style searches
Date: Sun, 02 Mar 2008 10:08:58 -0800
Message-ID: <1204479552-sup-4100@south>
In-Reply-To: <1204232994-sup-628@tomsk>

Reformatted excerpts from Marcus Williams's message of 2008-02-28:
> The only thing I'm a little wary of is the join() I do of the
> attachment filenames for the index (like labels). This means that
> ferret doesnt actually know the difference between two files called
> file1 and file2 and a single file called "file1 file2". Not sure it
> matters that much for this usage though.

The answer here is to escape the spaces and to use a Ferret custom
analyzer for this field in the index, one that will split only on
non-escaped spaces.

Something like this (needs testing):

  irb(main):055:0> a = Ferret::Analysis::RegExpAnalyzer.new /([^\s\\]|(\\\s))+/, false
  => #<Ferret::Analysis::RegExpAnalyzer:0xb79740fc>
  irb(main):056:0> t = a.token_stream :potato, "one\\ two three\\ four"
  => #<Ferret::Analysis::TokenStream:0xb79705d8>
  irb(main):057:0> t.next
  => token["one\ two":0:8:1]
  irb(main):058:0> t.next
  => token["three\ four":9:20:1]

Then assign that analyzer to the :attachments field in index.rb circa
line 37, just like I do for :subject and :body.

You'll have to make sure to do the escaping properly both on user input
at query time, and at storage time to the index.
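
A minimal sketch of that escaping in plain Ruby, usable at both storage
time and query time. The helper names and the ATTACHMENT_TOKEN constant
are hypothetical and not part of sup. The pattern mirrors the analyzer
regexp above, so a filename containing a literal backslash is assumed
not to occur and is not handled here.

```ruby
# Hypothetical helpers; the names are illustrative and not part of sup.

# Same token pattern as the analyzer regexp above: runs of characters that
# are neither whitespace nor backslash, or a backslash followed by whitespace.
ATTACHMENT_TOKEN = /(?:[^\s\\]|\\\s)+/

# Escape whitespace in one filename so it survives as a single token.
# Apply this to user input at query time as well.
def escape_attachment_name(name)
  name.gsub(/\s/) { |ws| "\\#{ws}" }
end

# Build the space-joined string stored in the :attachments field.
def join_attachment_names(names)
  names.map { |n| escape_attachment_name(n) }.join(" ")
end

# Recover the original filenames: split on non-escaped whitespace,
# then drop the escaping backslashes.
def split_attachment_names(field)
  field.scan(ATTACHMENT_TOKEN).map { |tok| tok.gsub(/\\(\s)/, '\1') }
end
```

With these, the two files file1 and file2 are stored as "file1 file2",
while the single file "file1 file2" is stored as "file1\ file2", so the
two cases no longer collide.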

> Also I dont repopulate the attachments attribute on the message object
> and I couldnt figure out quite how you do it for labels (through the
> initialise?). 

Not quite sure what you mean here, but the answer might be: index.rb
line 371 is where we build a Message object from an index entry, and
you'll need to pass in an :attachments attribute (and handle it within
Message#initialize).
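
Roughly, and only as an illustration: SketchMessage below is a stand-in
for sup's Message class, the entry hash is a stand-in for a document
loaded from the index, and the :label and :attachments keys are
assumptions about how the fields are stored. The real code in index.rb
and message.rb will differ.

```ruby
# Illustration only: SketchMessage stands in for sup's Message class.
class SketchMessage
  attr_reader :labels, :attachments

  def initialize(opts = {})
    @labels = opts[:labels] || []
    # Default to an empty list so entries indexed before the :attachments
    # field existed still load.
    @attachments = opts[:attachments] || []
  end
end

# Split a stored field on non-escaped whitespace, then drop the escapes.
def unescape_field(field)
  (field || "").scan(/(?:[^\s\\]|\\\s)+/).map { |tok| tok.gsub(/\\(\s)/, '\1') }
end

# Build a message object from a (hypothetical) index entry hash.
def build_message(entry)
  SketchMessage.new(
    :labels => (entry[:label] || "").split(" "),
    :attachments => unescape_field(entry[:attachments])
  )
end
```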

-- 
William <wmorgan-sup at masanjin.net>



Thread overview: 9+ messages
2008-02-25 20:50 Marcus Williams
2008-02-28 17:40 ` William Morgan
2008-02-28 21:15   ` Marcus Williams
2008-03-02 18:08     ` William Morgan [this message]
2008-03-05 10:01       ` Marcus Williams
2008-03-08 22:02         ` William Morgan
2008-03-23 21:13           ` Marcus Williams
2008-04-02 20:51             ` William Morgan
2008-04-02 21:16               ` vasudeva
