* [sup-talk] [PATCH] index log
From: Rich Lane @ 2009-08-17 6:38 UTC (permalink / raw)
Add a YAML logfile that records changes to the index and modify sup-dump to use
this rather than the normal database. The log is index format/version agnostic
so that users can switch between incompatible Sup versions without running
sup-dump first.
This should also make automated backups easier.
---
bin/sup-dump | 19 +++++++++++++------
lib/sup/ferret_index.rb | 7 +++++++
lib/sup/index.rb | 22 ++++++++++++++++++++++
lib/sup/xapian_index.rb | 7 ++++++-
lib/sup/yaml_log.rb | 25 +++++++++++++++++++++++++
5 files changed, 73 insertions(+), 7 deletions(-)
create mode 100644 lib/sup/yaml_log.rb
diff --git a/bin/sup-dump b/bin/sup-dump
index ba36b21..531a30a 100755
--- a/bin/sup-dump
+++ b/bin/sup-dump
@@ -2,7 +2,8 @@
require 'rubygems'
require 'trollop'
-require "sup"
+require 'sup' # Redwood::VERSION, Redwood::BASE_DIR
+require "sup/yaml_log"
$opts = Trollop::options do
version "sup-dump (sup #{Redwood::VERSION})"
@@ -21,10 +22,16 @@ No options.
EOS
end
-index = Redwood::Index.new
-Redwood::SourceManager.new
-index.load
+labels = {}
-index.each_message :load_spam => true, :load_deleted => true, :load_killed => true do |m|
- puts "#{m.id} (#{m.labels * ' '})"
+Redwood::log "processing index log"
+index_log = YamlLogReader.new File.join(Redwood::BASE_DIR, 'index_log.yaml')
+index_log.each do |h|
+ case h['type']
+ when 'add_message', 'update_message_state'
+ labels[h['id']] = h['labels']
+ end
end
+
+Redwood::log "dumping labels"
+labels.each { |msgid, msg_labels| puts "#{msgid} (#{msg_labels * ' '})" }
diff --git a/lib/sup/ferret_index.rb b/lib/sup/ferret_index.rb
index 98ea9b5..2cb9759 100644
--- a/lib/sup/ferret_index.rb
+++ b/lib/sup/ferret_index.rb
@@ -57,6 +57,7 @@ EOS
def sync_message m, opts={}
entry = @index[m.id]
+ existed = !entry.nil?
raise "no source info for message #{m.id}" unless m.source && m.source_info
@@ -131,6 +132,12 @@ EOS
}
@index_mutex.synchronize do
+ if existed
+ @log.update_message_state m.id, m.labels
+ else
+ @log.add_message m.id, m.labels
+ end
+
@index.delete m.id
@index.add_document d
end
diff --git a/lib/sup/index.rb b/lib/sup/index.rb
index 54ec843..7360cf5 100644
--- a/lib/sup/index.rb
+++ b/lib/sup/index.rb
@@ -1,6 +1,7 @@
## Index interface, subclassed by Ferret indexer.
require 'fileutils'
+require 'sup/yaml_log'
begin
require 'chronic'
@@ -65,6 +66,7 @@ class BaseIndex
def load
SourceManager.load_sources
+ @log = IndexLogWriter.new File.join(@dir, 'index_log.yaml')
load_index
end
@@ -176,6 +178,26 @@ class BaseIndex
def parse_query s
unimplemented
end
+
+ private
+
+ class IndexLogWriter < YamlLogWriter
+ def update_message_state id, labels
+ write_entry 'update_message_state', 'id' => id, 'labels' => labels.map { |x| x.to_s }
+ end
+
+ def add_message id, labels
+ write_entry 'add_message', 'id' => id, 'labels' => labels.map { |x| x.to_s }
+ end
+
+ def remove_message id
+ write_entry 'remove_message', 'id' => id
+ end
+
+ def write_entry type, hash
+ self << hash.merge('type' => type, 'time' => Time.now)
+ end
+ end
end
index_name = ENV['SUP_INDEX'] || $config[:index] || DEFAULT_INDEX
diff --git a/lib/sup/xapian_index.rb b/lib/sup/xapian_index.rb
index 18b5050..c4dbc5f 100644
--- a/lib/sup/xapian_index.rb
+++ b/lib/sup/xapian_index.rb
@@ -61,7 +61,10 @@ class XapianIndex < BaseIndex
end
def delete id
- synchronize { @xapian.delete_document mkterm(:msgid, id) }
+ synchronize do
+ @log.remove_message id
+ @xapian.delete_document mkterm(:msgid, id)
+ end
end
def build_message id
@@ -510,10 +513,12 @@ class XapianIndex < BaseIndex
Redwood::log "warning: docid underflow, dropping #{m.id.inspect}"
return
end
+ @log.add_message m.id, m.labels
else
doc.clear_terms
doc.clear_values
docid = doc.docid
+ @log.update_message_state m.id, m.labels
end
@term_generator.document = doc
diff --git a/lib/sup/yaml_log.rb b/lib/sup/yaml_log.rb
new file mode 100644
index 0000000..325cca9
--- /dev/null
+++ b/lib/sup/yaml_log.rb
@@ -0,0 +1,25 @@
+class YamlLogReader
+ include Enumerable
+
+ def initialize filename
+ @io = File.open(filename, 'r') # read-only; 'r+' would also require write access
+ end
+
+ def each &b
+ @io.rewind
+ YAML.each_document @io, &b
+ end
+end
+
+class YamlLogWriter
+ def initialize filename
+ @io = File.open(filename, 'a')
+ end
+
+ def <<(o)
+ YAML.dump o, @io
+
+ ## This only flushes to the OS. We may want to fsync occasionally too.
+ @io.flush
+ end
+end
--
1.6.4
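
The log round-trip described in the patch (append one YAML document per index event, then replay the documents in order to rebuild the labels) can be sketched on its own as below. This is only a minimal illustration, not the patch's code: `append_entry` and `replay_labels` are hypothetical helper names, `YAML.load_stream` stands in for the `YAML.each_document` call used above, the timestamp is stored as an integer rather than a Time object, and handling `remove_message` during replay is an assumption (the sup-dump hunk above only looks at `add_message` and `update_message_state`).

```ruby
require 'yaml'
require 'tempfile'

# Hypothetical helper: append one index event to the log as its own YAML document.
def append_entry(io, type, fields)
  YAML.dump(fields.merge('type' => type, 'time' => Time.now.to_i), io)
  io.flush # flushes to the OS only; no fsync here
end

# Hypothetical helper: replay the log in order and return the final id => labels map.
def replay_labels(path)
  labels = {}
  File.open(path, 'r') do |io|
    YAML.load_stream(io) do |h|
      case h['type']
      when 'add_message', 'update_message_state'
        labels[h['id']] = h['labels']
      when 'remove_message'
        labels.delete h['id'] # assumption: removed messages are not dumped
      end
    end
  end
  labels
end

# Usage: write a few events to a temporary log, then dump in sup-dump style.
Tempfile.create('index_log') do |f|
  append_entry f, 'add_message', 'id' => 'a@example.org', 'labels' => %w(inbox unread)
  append_entry f, 'add_message', 'id' => 'b@example.org', 'labels' => %w(inbox)
  append_entry f, 'update_message_state', 'id' => 'a@example.org', 'labels' => %w(inbox)
  append_entry f, 'remove_message', 'id' => 'b@example.org'
  replay_labels(f.path).each { |msgid, ls| puts "#{msgid} (#{ls * ' '})" }
end
```

Because later entries simply overwrite earlier ones for the same message id, the replay does not depend on the index format, which is the property the commit message relies on.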
* [sup-talk] [PATCH] index log
From: William Morgan @ 2009-08-22 13:46 UTC (permalink / raw)
Reformatted excerpts from Rich Lane's message of 2009-08-16:
> Add a YAML logfile that records changes to the index and modify
> sup-dump to use this rather than the normal database.
I like this. I'm going to wait to apply it until the api refactoring
stuff is merged down to master though. Should be soon.
--
William <wmorgan-sup at masanjin.net>
* [sup-talk] [PATCH] index log
From: Nicolas Pouillard @ 2009-08-24 12:20 UTC (permalink / raw)
Excerpts from William Morgan's message of Sat Aug 22 15:46:27 +0200 2009:
> Reformatted excerpts from Rich Lane's message of 2009-08-16:
> > Add a YAML logfile that records changes to the index and modify
> > sup-dump to use this rather than the normal database.
>
> I like this. I'm going to wait to apply it until the api refactoring
> stuff is merged down to master though. Should be soon.
I'm wondering whether a simpler format would be better. I've patched my
copy of sup to feed a file called ~/.sup/labels-mapping.log with
lines like these:
000e0cd20f80143822047118693d at google.com (unread inbox -> )
20090813213654.GA30223 at community.haskell.org (unread inbox patch -> patch)
1250148617-sup-6053 at oz.taruti.net (unread inbox sup -> sup)
1250281208-sup-4199 at masanjin.net (unread inbox sup -> sup)
They are in the style of sup-dump output and are pretty easy to process with
any tool.
That's not to say I don't like YAML; I'm a pretty big fan of it, but here it
seems like overkill.
Best regards,
--
Nicolas Pouillard
http://nicolaspouillard.fr
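
To illustrate how little tooling the line-oriented format above needs, here is a rough parsing sketch. It assumes each line has the shape `<msgid> (<old labels> -> <new labels>)`, that message ids contain a literal `@` and no spaces (the " at " in the archived samples is address obfuscation), and that labels never contain `->`. The names `LINE_RE`, `parse_mapping_line` and `final_labels` are hypothetical and not part of sup.

```ruby
# Assumed line format: "<msgid> (<old labels> -> <new labels>)"
LINE_RE = /\A(\S+) \((.*?)\s*->\s*(.*?)\)\z/

# Returns [msgid, old_labels, new_labels], or nil if the line does not match.
def parse_mapping_line(line)
  m = LINE_RE.match(line.chomp)
  return nil unless m
  [m[1], m[2].split, m[3].split]
end

# Replays the lines in order; the last entry for each message id wins.
def final_labels(lines)
  lines.each_with_object({}) do |line, acc|
    id, _old, new_labels = parse_mapping_line(line)
    acc[id] = new_labels if id
  end
end

lines = [
  "1250148617-sup-6053@oz.taruti.net (unread inbox sup -> sup)",
  "000e0cd20f80143822047118693d@google.com (unread inbox -> )",
]
final_labels(lines).each { |msgid, ls| puts "#{msgid} (#{ls * ' '})" }
```

Keeping the old label set on each line means the file records transitions rather than just final states, at the cost of some redundancy.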
* [sup-talk] [PATCH] index log
From: Rich Lane @ 2009-08-31 4:16 UTC (permalink / raw)
Excerpts from Nicolas Pouillard's message of Mon Aug 24 08:20:20 -0400 2009:
> Excerpts from William Morgan's message of Sat Aug 22 15:46:27 +0200 2009:
> > Reformatted excerpts from Rich Lane's message of 2009-08-16:
> > > Add a YAML logfile that records changes to the index and modify
> > > sup-dump to use this rather than the normal database.
> >
> > I like this. I'm going to wait to apply it until the api refactoring
> > stuff is merged down to master though. Should be soon.
>
> I'm wondering whether a simpler format would be better. I've patched my
> copy of sup to feed a file called ~/.sup/labels-mapping.log with
> lines like these:
>
> 000e0cd20f80143822047118693d at google.com (unread inbox -> )
> 20090813213654.GA30223 at community.haskell.org (unread inbox patch -> patch)
> 1250148617-sup-6053 at oz.taruti.net (unread inbox sup -> sup)
> 1250281208-sup-4199 at masanjin.net (unread inbox sup -> sup)
>
> They are in the style of sup-dump output and are pretty easy to process with
> any tool.
>
> That's not to say I don't like YAML; I'm a pretty big fan of it, but here it
> seems like overkill.
>
> Best regards,
>
I agree that YAML is overkill for what we're currently storing in the
log. The intention was to make the format as future-proof as possible,
but after some time thinking about it I haven't come up with more fields
to add (not that there couldn't be any, but I think it's unlikely). I'll
submit a simpler patch.
What do people think about replacing the current undo system with one
based on the label log? This would only be possible once we have
immediate label changes. I think it would simplify a lot of code.
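
As a rough picture of what a log-based undo could look like, the sketch below records the old and new labels for each change and reverts the most recent one. Everything here is hypothetical: `LabelLog` is not an existing sup class, the log is kept in memory rather than on disk, and a plain Hash stands in for the index, under the assumption that label changes are applied immediately.

```ruby
# Hypothetical in-memory label log that supports undo.
class LabelLog
  Entry = Struct.new(:id, :old_labels, :new_labels)

  def initialize
    @entries = []
  end

  # Record a label change and apply it to the store right away.
  def change(store, id, new_labels)
    @entries << Entry.new(id, store.fetch(id, []), new_labels)
    store[id] = new_labels
  end

  # Revert the most recent change; returns the reverted entry, or nil if there is none.
  def undo(store)
    entry = @entries.pop
    return nil unless entry
    store[entry.id] = entry.old_labels
    entry
  end
end

store = { 'a@example.org' => %w(inbox unread) }
log = LabelLog.new
log.change store, 'a@example.org', %w(inbox)
log.undo store
puts store['a@example.org'] * ' '
```

Since every entry already carries the previous label set, undo needs no per-action closures; it only has to pop the last entry and write the old labels back.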
* [sup-talk] [PATCH] index log
From: Ben Walton @ 2009-08-31 11:42 UTC (permalink / raw)
Excerpts from Rich Lane's message of Mon Aug 31 00:16:03 -0400 2009:
> What do people think about replacing the current undo system with one
> based on the label log? This would only be possible once we have
> immediate label changes. I think it would simplify a lot of code.
+1 for this. I find that more often than not, undo doesn't work as
expected anyway. It's been suggested that this is a threading bug,
which is quite likely...
-Ben
--
Ben Walton
Systems Programmer - CHASS
University of Toronto
C:416.407.5610 | W:416.978.4302
GPG Key Id: 8E89F6D2; Key Server: pgp.mit.edu
Contact me to arrange for a CAcert assurance meeting.