From: wmorgan-sup@masanjin.net (William Morgan)
Subject: [sup-talk] [PATCH] First draft of attachment processing for more gmail style searches
Date: Sun, 02 Mar 2008 10:08:58 -0800
Message-ID: <1204479552-sup-4100@south>
In-Reply-To: <1204232994-sup-628@tomsk>
Reformatted excerpts from Marcus Williams's message of 2008-02-28:
> The only thing I'm a little wary of is the join() I do of the
> attachment filenames for the index (like labels). This means that
> ferret doesn't actually know the difference between two files called
> file1 and file2 and a single file called "file1 file2". Not sure it
> matters that much for this usage though.
The answer here is to escape the spaces and to use a Ferret custom
analyzer for this field in the index, one that will split only on
non-escaped spaces.
Something like this (needs testing):
irb(main):055:0> a = Ferret::Analysis::RegExpAnalyzer.new /([^\s\\]|(\\\s))+/, false
=> #<Ferret::Analysis::RegExpAnalyzer:0xb79740fc>
irb(main):056:0> t = a.token_stream :potato, "one\\ two three\\ four"
=> #<Ferret::Analysis::TokenStream:0xb79705d8>
irb(main):057:0> t.next
=> token["one\ two":0:8:1]
irb(main):058:0> t.next
=> token["three\ four":9:20:1]
Then assign that analyzer to the :attachments field in index.rb circa
line 37, just like I do for :subject and :body.
You'll have to make sure to do the escaping properly in both places: on
user input at query time, and when storing the filenames in the index.
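A rough sketch of what the escaping could look like, in plain Ruby so it can be tried without Ferret. The helper names (escape_filename, unescape_filename, join_filenames, split_filenames) are made up for illustration and are not part of sup or Ferret; the pattern is the one from the irb session above, written with non-capturing groups so that String#scan returns whole tokens. It only escapes whitespace, so filenames that themselves contain backslashes would need extra handling.

```ruby
# Hypothetical helpers, not part of sup or Ferret.

# Same pattern as the one given to RegExpAnalyzer above: a token is a run of
# characters that are neither whitespace nor backslash, or escaped whitespace.
ATTACHMENT_TOKEN = /(?:[^\s\\]|\\\s)+/

# Backslash-escape every whitespace character in a single filename.
def escape_filename(name)
  name.gsub(/\s/) { |ws| "\\" + ws }
end

# Undo escape_filename.
def unescape_filename(token)
  token.gsub(/\\(\s)/) { $1 }
end

# Storage time: build the string that would go into the :attachments field.
def join_filenames(names)
  names.map { |n| escape_filename(n) }.join(" ")
end

# Split a stored field (or an escaped query string) on non-escaped spaces only.
def split_filenames(field)
  field.scan(ATTACHMENT_TOKEN).map { |t| unescape_filename(t) }
end
```

With these, join_filenames(["file1", "file2"]) and join_filenames(["file1 file2"]) produce different strings, which is the distinction the plain join() loses.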
> Also I don't repopulate the attachments attribute on the message object
> and I couldn't figure out quite how you do it for labels (through the
> initialise?).
Not quite sure what you mean here, but the answer might be: index.rb
line 371 is where we build a Message object from an index entry, and
you'll need to pass in an :attachments attribute (and handle it within
Message#initialize).
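Roughly what I have in mind, as a self-contained sketch rather than the actual sup code. SketchMessage and attachments_from_entry are hypothetical names, the option keys are assumptions, and a plain Hash stands in for the index entry; the real Message#initialize takes more options than shown here.

```ruby
# Hypothetical stand-in for sup's Message class; only the parts relevant
# to attachments are shown, and the option names are assumptions.
class SketchMessage
  attr_reader :labels, :attachments

  def initialize(opts = {})
    @labels = opts[:labels] || []
    # Filenames restored from the index entry, or empty if none were stored.
    @attachments = opts[:attachments] || []
  end
end

# Hypothetical index-side helper: turn the stored :attachments field back
# into a list of filenames, splitting only on non-escaped whitespace and
# then removing the escaping backslashes.
def attachments_from_entry(entry)
  field = entry[:attachments] || ""
  field.scan(/(?:[^\s\\]|\\\s)+/).map { |t| t.gsub(/\\(\s)/) { $1 } }
end

# A plain Hash standing in for an index entry (assumed field layout).
entry = { :label => "inbox unread",
          :attachments => "report\\ final.pdf notes.txt" }

m = SketchMessage.new :labels => entry[:label].split(" "),
                      :attachments => attachments_from_entry(entry)
```

The point is just that the attachments get the same treatment as labels: read the stored field, turn it back into a list, and hand it to the constructor.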
--
William <wmorgan-sup at masanjin.net>