Archive of RubyForge sup-devel mailing list
From: Rich Lane <rlane@club.cc.cmu.edu>
To: Mark Alexander <marka@pobox.com>
Cc: sup-devel <sup-devel@rubyforge.org>
Subject: Re: [sup-devel] new branch: maildir
Date: Thu, 25 Mar 2010 12:11:57 -0400
Message-ID: <1269532152-sup-1158@zyrg.net>
In-Reply-To: <1269516077-sup-4573@r61>

Excerpts from Mark Alexander's message of 2010-03-25 07:24:59 -0400:
> Excerpts from Rich Lane's message of Thu Mar 25 03:12:57 -0400 2010:
> > This branch makes some drastic changes to how mbox and maildir sources
> > work.
> 
> Thanks for attacking this problem!
> 
> I just took a quick look at the diffs, and I have some concern
> about this line in maildir.rb:
> 
>   Dir[File.join(subdir, '*')].map do |fn|
> 
> I'm worried about the memory usage with some of my maildirs that have
> tens of thousands of files.  Would it be more memory-efficient to
> use Dir.open and Dir.each?  You'd have to filter out "." and "..",
> of course.

Hence the "XXX use less memory" :). I've been doing my testing on a 30k
maildir which works fine. My sup scalability target is a million
messages and memory becomes a concern there. A maildir filename is about
30 characters plus any Ruby overhead.
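
To illustrate the Dir.open/Dir.each idea from the quoted message, here is a
minimal sketch that streams entry names one at a time instead of building an
array. The helper name each_maildir_filename is hypothetical and is not taken
from the maildir branch. Note two differences from the Dir[...] glob: it
yields bare names rather than joined paths, and it only skips "." and "..",
so other dotfiles are yielded too.

```ruby
# Hypothetical helper (an assumption, not code from the branch): yield each
# entry name in a maildir subdirectory without collecting them into an array.
def each_maildir_filename(subdir)
  Dir.foreach(subdir) do |fn|
    # Dir.foreach includes the directory itself and its parent; skip both.
    next if fn == '.' || fn == '..'
    yield fn
  end
end
```

Whether this actually saves memory depends on what the caller does with each
name; collecting the yielded names into an array brings back the original cost.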

The primitives we have are:
- Iterate through filenames in a directory in arbitrary (?) order.
- Check the existence of a single file in a directory.
- Iterate through filenames with a given prefix stored in the index, in
  lexicographical order.
Any more?

For now I took the easiest route, which loads both the filesystem and
indexed filenames into arrays and diffs them. Iterating over the index
and checking the file's existence won't detect new messages. Iterating
over the filesystem and checking for existence in the index won't detect
deleted messages. A solution would be to do both, but that seems
expensive. It would be good if we could optimize for the case where most
of the maildir messages have already been indexed.
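
As a rough sketch of the two options above (not the code in the branch): the
first method diffs two in-memory arrays, and the second does both passes using
only iteration and existence checks. All names here are hypothetical;
in_index and on_disk are stand-ins for whatever existence checks the index and
filesystem provide.

```ruby
# Easiest route: hold both filename lists in memory and diff them.
def diff_in_memory(fs_names, indexed_names)
  { added: fs_names - indexed_names, deleted: indexed_names - fs_names }
end

# "Do both": one pass over the filesystem to find new messages and one pass
# over the index to find deleted ones. each_fs and each_indexed are any
# enumerables of filenames; in_index and on_disk are callables (assumptions)
# that return true when the filename exists on that side.
def diff_two_pass(each_fs, each_indexed, in_index, on_disk)
  added   = each_fs.reject { |fn| in_index.call(fn) }
  deleted = each_indexed.reject { |fn| on_disk.call(fn) }
  { added: added, deleted: deleted }
end
```

The two-pass version only keeps the differences in memory, but it performs one
existence check per file and one per indexed message, which is the expense
mentioned above.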
_______________________________________________
Sup-devel mailing list
Sup-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/sup-devel


Thread overview: 4+ messages
2010-03-25  7:12 Rich Lane
2010-03-25 11:24 ` Mark Alexander
2010-03-25 13:30   ` Ben Walton
2010-03-25 16:11   ` Rich Lane [this message]
