From mboxrd@z Thu Jan 1 00:00:00 1970
From: bgamari.foss@gmail.com (Ben Gamari)
Date: Thu, 01 Oct 2009 13:44:19 -0400
Subject: [sup-talk] Crash while scrolling
In-Reply-To: <1254404181-sup-8448@masanjin.net>
References: <20090911165830.GA11260@ben-laptop>
 <1252773189-sup-246@masanjin.net>
 <20090916172340.GA20566@ben-laptop>
 <1253975267-sup-8308@masanjin.net>
 <1254160696-sup-3522@ben-laptop>
 <1254404181-sup-8448@masanjin.net>
Message-ID: <1254418089-sup-5800@ben-laptop>

Excerpts from William Morgan's message of Thu Oct 01 09:43:48 -0400 2009:
> Reformatted excerpts from Ben Gamari's message of 2009-09-28:
> > This actually brings up a larger question. How difficult would it be
> > to relax sup's assumption that sources are add-only?
>
> It's not difficult per se, it just requires scanning over the entire
> source, which is slow. Removing this assumption would be tantamount to
> running sup-sync -c every time you start up sup.
>
> Here's the idea: scanning over a mailstore is slow. Much of this
> slowness is due to Ruby. So let's rewrite this code in C. Then we would
> have something as fast as, say, Mutt. But Mutt bogs down on my mbox file
> because it's too big. So my *only* reasonable choice with a large
> mailstore is Sup and the assumption that the source is add only.

It seems that C would definitely be a good start (or perhaps C++ would
be a better idea as that is the language in which Xapian is written).

However, I think one of the real issues is the exclusive nature of index
access. In fact, this is one of my primary gripes with the sup workflow.
After processing a large number of messages, the write-out time can be
quite substantial upon killing the buffer. This can be a noticeable
interruption to workflow. It seems to me that index access should be
asynchronous at least. If this were the case, then we could get support
for mutable sources for free, as we could synchronize against sources
without interrupting workflow (although keeping the view in sync with
the backend would be a bit tricky).

As an aside, it would be quite nice if one could run multiple
simultaneous instances of sup. It seems that if one only held write
access to the index during writes (is this the case presently?), there
should be nothing preventing this from being possible.

Correct me if I'm wrong in any part of my above assessment.

Hope things are going well,

- Ben
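
To make the asynchronous-access idea above a little more concrete, here is
a rough sketch of what I have in mind. It is Python rather than Ruby, and
everything in it is hypothetical: AsyncIndexWriter, update, flush and the
plain dict standing in for the index are names made up for illustration,
not anything in sup or Xapian. The point is only that the UI thread queues
its changes and returns immediately, while a background thread takes the
write lock just long enough to apply each one.

```python
import queue
import threading


class AsyncIndexWriter:
    """Queue index updates and apply them on a background thread.

    The dict `index` is a stand-in for the real index; nothing here
    reflects the actual API of sup or Xapian.
    """

    def __init__(self):
        self.index = {}                      # hypothetical: message id -> set of labels
        self.write_lock = threading.Lock()   # held only while one update is applied
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def update(self, msg_id, labels):
        """Called from the UI thread; enqueues the change and returns at once."""
        self._pending.put((msg_id, set(labels)))

    def _run(self):
        while True:
            item = self._pending.get()
            try:
                if item is None:             # sentinel pushed by close()
                    return
                msg_id, labels = item
                with self.write_lock:        # write access is held only here
                    self.index[msg_id] = labels
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until every queued update has been applied."""
        self._pending.join()

    def close(self):
        """Apply remaining updates, then stop the background thread."""
        self._pending.put(None)
        self._worker.join()
```

With something like this, killing a buffer would only need to call update
for each changed message, and the expensive write-out would happen behind
the user's back. It does nothing about the harder part mentioned above,
namely keeping the view in sync with the backend, and a real multi-instance
setup would need a lock shared between processes rather than a
threading.Lock.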