Archive of RubyForge sup-devel mailing list
* [sup-devel] alternative for sup-sync option --restored?
@ 2010-08-06  9:13 Sascha Silbe
  2010-08-22 13:40 ` Sascha Silbe
  2011-01-19 13:52 ` [sup-devel] [PATCH] add sup-import-dump: import message state in sup-dump format Sascha Silbe
  0 siblings, 2 replies; 6+ messages in thread
From: Sascha Silbe @ 2010-08-06  9:13 UTC (permalink / raw)
  To: sup-devel



Hi!

I read and write mail on both my desktop (which has the entire archive, about 16 GB) and my laptop (recent mail and a partial archive, about 1-2 GB).
Up to now I have used sup-dump and sup-sync --restored to synchronize the two (every time I switched between them, since it's a full dump as opposed to a log file or diff).
After merging the latest changes (including a full dump + restore for the index upgrade, which took about 33 hours), the --restored option to sup-sync is gone (and without it sup-sync doesn't do anything useful). How do I go about "merging" the changes made on the other system now?

Sascha

--
http://sascha.silbe.org/
http://www.infra-silbe.de/


_______________________________________________
Sup-devel mailing list
Sup-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/sup-devel


* Re: [sup-devel] alternative for sup-sync option --restored?
  2010-08-06  9:13 [sup-devel] alternative for sup-sync option --restored? Sascha Silbe
@ 2010-08-22 13:40 ` Sascha Silbe
  2011-01-19 13:52 ` [sup-devel] [PATCH] add sup-import-dump: import message state in sup-dump format Sascha Silbe
  1 sibling, 0 replies; 6+ messages in thread
From: Sascha Silbe @ 2010-08-22 13:40 UTC (permalink / raw)
  To: sup-devel



Excerpts from Sascha Silbe's message of Fri Aug 06 11:13:42 +0200 2010:

> I read and write mail on both my desktop (which has the entire archive, about 16 GB) and my laptop (recent mail and a partial archive, about 1-2 GB).
> Up to now I have used sup-dump and sup-sync --restored to synchronize the two (every time I switched between them, since it's a full dump as opposed to a log file or diff).
> After merging the latest changes (including a full dump + restore for the index upgrade, which took about 33 hours), the --restored option to sup-sync is gone (and without it sup-sync doesn't do anything useful). How do I go about "merging" the changes made on the other system now?

Any suggestion? This is preventing me from updating to git master and rebasing my outstanding patches.

Sascha

--
http://sascha.silbe.org/
http://www.infra-silbe.de/


* [sup-devel] [PATCH] add sup-import-dump: import message state in sup-dump format
  2010-08-06  9:13 [sup-devel] alternative for sup-sync option --restored? Sascha Silbe
  2010-08-22 13:40 ` Sascha Silbe
@ 2011-01-19 13:52 ` Sascha Silbe
  2011-01-20  4:13   ` Rich Lane
  2011-01-20 15:28   ` Nicolas Pouillard
  1 sibling, 2 replies; 6+ messages in thread
From: Sascha Silbe @ 2011-01-19 13:52 UTC (permalink / raw)
  To: sup-devel

sup-import-dump imports message state as exported by sup-dump into the index.
It is a direct replacement for the sup-sync --restored functionality that got
lost when merging the maildir branch.
Unlike sup-sync it operates on the index only, so it's fast enough for
periodically importing full dumps to keep multiple sup installations
synchronised.
It should also be easy enough to add support for a "diff" style format that
would allow replaying "logs" if sup were enhanced to write those in the
future.

To give some rough numbers:

Dump file contains 78104 lines, index about 600k entries. 410 entries from the
dump file don't match the index and cause index updates. Transaction mode is
used for all runs.
Cold cache, dry run: 138s real time, 53s user+system
Hot cache, dry run: 42s real time, 40s user+system
Hot cache, changes written to disk: 55s real time, 44s user+system
Hot cache, no updates: 43s real time, 41s user+system

Signed-off-by: Sascha Silbe <sascha-pgp@silbe.org>
---
 bin/sup-import-dump |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/sup/index.rb    |   15 ++++++++
 2 files changed, 114 insertions(+), 0 deletions(-)
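
For readers unfamiliar with the dump format, here is a minimal sketch of the
line format the script accepts and of the "does this entry differ from the
index" check. It is not part of the patch. The plain Hash named `current` is a
stand-in for the index, the helper names are made up for illustration, and
`split.map(&:to_sym).to_set` is only an assumed equivalent of sup's
`to_set_of_symbols`.

```ruby
require 'set'

# Same line pattern as the one used in bin/sup-import-dump.
DUMP_LINE = /^(\S+) \((.*?)\)$/

# Parse one dump line into [message id, Set of label symbols].
# (Hypothetical helper, not part of sup.)
def parse_dump_line(line)
  line =~ DUMP_LINE or raise "Can't read dump line: #{line.inspect}"
  mid, labels = $1, $2
  [mid, labels.split.map(&:to_sym).to_set]
end

# Return the entries whose labels differ from the ones currently known.
# `current` is a plain Hash standing in for the index; ids that are
# missing from it are skipped, as with --ignore-missing.
def changed_entries(lines, current)
  lines.map { |l| parse_dump_line(l) }
       .select { |mid, labels| current.key?(mid) && current[mid] != labels }
end

dump = [
  "a@example.org (inbox unread)\n",
  "b@example.org ()\n",
  "c@example.org (starred)\n",
]
current = {
  "a@example.org" => Set[:inbox, :unread],
  "b@example.org" => Set[:inbox],
}

changed_entries(dump, current).each do |mid, labels|
  puts "#{mid} would change to '#{labels.to_a.join(' ')}'"
end
```

Only b@example.org is reported here: a@example.org already matches, and
c@example.org is not in the stand-in index.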

diff --git a/bin/sup-import-dump b/bin/sup-import-dump
new file mode 100644
index 0000000..91a1721
--- /dev/null
+++ b/bin/sup-import-dump
@@ -0,0 +1,99 @@
+#!/usr/bin/env ruby
+
+require 'uri'
+require 'rubygems'
+require 'trollop'
+require "sup"; Redwood::check_library_version_against "git"
+
+PROGRESS_UPDATE_INTERVAL = 15 # seconds
+
+class AbortExecution < SystemExit
+end
+
+opts = Trollop::options do
+  version "sup-import-dump (sup #{Redwood::VERSION})"
+  banner <<EOS
+Imports message state previously exported by sup-dump into the index.
+sup-import-dump operates on the index only, so the messages must have already
+been added using sup-sync. If you need to recreate the index, see sup-sync
+--restore <filename> instead.
+
+Messages not mentioned in the dump file will not be modified.
+
+Usage:
+  sup-import-dump [options] <dump file>
+
+Options:
+EOS
+  opt :verbose, "Print message ids as they're processed."
+  opt :ignore_missing, "Silently skip over messages that are not in the index."
+  opt :warn_missing, "Warn about messages that are not in the index, but continue."
+  opt :abort_missing, "Abort on encountering messages that are not in the index. (default)"
+  opt :atomic, "Use transaction to apply all changes atomically."
+  opt :dry_run, "Don't actually modify the index. Probably only useful with --verbose.", :short => "-n"
+  opt :version, "Show version information", :short => :none
+
+  conflicts :ignore_missing, :warn_missing, :abort_missing
+end
+Trollop::die "No dump file given" if ARGV.empty?
+Trollop::die "Extra arguments given" if ARGV.length > 1
+dump_name = ARGV.shift
+missing_action = [:ignore_missing, :warn_missing, :abort_missing].find { |x| opts[x] } || :abort_missing
+
+Redwood::start
+index = Redwood::Index.init
+
+index.lock_interactively or exit
+begin
+  num_read = 0
+  num_changed = 0
+  index.load
+  index.begin_transaction if opts[:atomic]
+
+  IO.foreach dump_name do |l|
+    l =~ /^(\S+) \((.*?)\)$/ or raise "Can't read dump line: #{l.inspect}"
+    mid, labels = $1, $2
+    num_read += 1
+
+    unless index.contains_id? mid
+      if missing_action == :abort_missing
+        $stderr.puts "Message #{mid} not found in index, aborting."
+        raise AbortExecution, 10
+      elsif missing_action == :warn_missing
+        $stderr.puts "Message #{mid} not found in index, skipping."
+      end
+
+      next
+    end
+
+    m = index.build_message mid
+    new_labels = labels.to_set_of_symbols
+
+    if m.labels == new_labels
+      puts "#{mid} unchanged" if opts[:verbose]
+      next
+    end
+
+    puts "Changing flags for #{mid} from '#{m.labels.to_a * ' '}' to '#{new_labels.to_a * ' '}'" if opts[:verbose]
+    num_changed += 1
+
+    next if opts[:dry_run]
+
+    m.labels = new_labels
+    index.update_message_state m
+  end
+
+  index.commit_transaction if opts[:atomic]
+  puts "Updated #{num_changed} of #{num_read} messages."
+rescue AbortExecution
+  index.cancel_transaction if opts[:atomic]
+  raise
+rescue Exception => e
+  index.cancel_transaction if opts[:atomic]
+  File.open("sup-exception-log.txt", "w") { |f| f.puts e.backtrace }
+  raise
+ensure
+  index.save_index unless opts[:atomic]
+  Redwood::finish
+  index.unlock
+end
diff --git a/lib/sup/index.rb b/lib/sup/index.rb
index b90c2b1..bcc449b 100644
--- a/lib/sup/index.rb
+++ b/lib/sup/index.rb
@@ -260,6 +260,21 @@ EOS
     end
   end
 
+  # wrap all future changes inside a transaction so they're done atomically
+  def begin_transaction
+    synchronize { @xapian.begin_transaction }
+  end
+
+  # complete the transaction and write all previous changes to disk
+  def commit_transaction
+    synchronize { @xapian.commit_transaction }
+  end
+
+  # abort the transaction and revert all changes made since begin_transaction
+  def cancel_transaction
+    synchronize { @xapian.cancel_transaction }
+  end
+
   ## xapian-compact takes too long, so this is a no-op
   ## until we think of something better
   def optimize
-- 
1.7.2.3
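
The begin/commit/cancel wrappers added to lib/sup/index.rb follow the usual
all-or-nothing pattern. The sketch below illustrates that pattern only; it
does not touch Xapian. `FakeIndex` and `apply_atomically` are hypothetical
names, and the snapshot-based rollback is an assumption chosen to keep the
example self-contained.

```ruby
# Hypothetical stand-in for the index: keeps labels in a Hash and
# restores a snapshot when a transaction is cancelled.
class FakeIndex
  attr_reader :labels

  def initialize(labels)
    @labels = labels
    @snapshot = nil
  end

  def begin_transaction
    @snapshot = @labels.dup
  end

  def commit_transaction
    @snapshot = nil
  end

  def cancel_transaction
    @labels = @snapshot
    @snapshot = nil
  end

  def update(mid, new_labels)
    raise KeyError, "unknown message #{mid}" unless @labels.key?(mid)
    @labels[mid] = new_labels
  end
end

# Apply every change or none of them; returns true on success.
def apply_atomically(index, changes)
  index.begin_transaction
  changes.each { |mid, new_labels| index.update(mid, new_labels) }
  index.commit_transaction
  true
rescue KeyError
  index.cancel_transaction
  false
end

index = FakeIndex.new("a" => [:inbox], "b" => [:unread])
ok = apply_atomically(index, [["a", [:starred]], ["missing", [:inbox]]])
puts "applied: #{ok}, a is still #{index.labels['a'].inspect}"
```

The second change refers to an unknown message, so the first change is rolled
back as well, which mirrors what --atomic combined with --abort-missing is
meant to achieve.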


* Re: [sup-devel] [PATCH] add sup-import-dump: import message state in sup-dump format
  2011-01-19 13:52 ` [sup-devel] [PATCH] add sup-import-dump: import message state in sup-dump format Sascha Silbe
@ 2011-01-20  4:13   ` Rich Lane
  2011-01-20 15:28   ` Nicolas Pouillard
  1 sibling, 0 replies; 6+ messages in thread
From: Rich Lane @ 2011-01-20  4:13 UTC (permalink / raw)
  To: Sascha Silbe; +Cc: sup-devel

Applied to master.

* Re: [sup-devel] [PATCH] add sup-import-dump: import message state in sup-dump format
  2011-01-19 13:52 ` [sup-devel] [PATCH] add sup-import-dump: import message state in sup-dump format Sascha Silbe
  2011-01-20  4:13   ` Rich Lane
@ 2011-01-20 15:28   ` Nicolas Pouillard
  2011-01-21 12:36     ` Sascha Silbe
  1 sibling, 1 reply; 6+ messages in thread
From: Nicolas Pouillard @ 2011-01-20 15:28 UTC (permalink / raw)
  To: Sascha Silbe, Sup developer discussion, sup-devel

On Wed, 19 Jan 2011 14:52:15 +0100, Sascha Silbe <sascha-pgp@silbe.org> wrote:
> sup-import-dump imports message state as exported by sup-dump into the index.
> It is a direct replacement for the sup-sync --restored functionality that got
> lost when merging the maildir branch.
> Unlike sup-sync it operates on the index only, so it's fast enough for
> periodically importing full dumps to keep multiple sup installations
> synchronised.

So it does not clear the index, right? It only assigns the set of labels to the
given message. Just what I need to upgrade my sup index with my notmuch index.

> It should also be easy enough to add support for a "diff" style format that
> would allow replaying "logs" if sup were enhanced to write those in the
> future.

Please do.

I hacked such a logging feature [1] in the past (back when sup used ferret
and message loss happened), then I merged the previous index dump with the
log and compared this to the actual dump (I wrote some tools to play with
this log format).
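
The merge-and-compare step can be sketched roughly as follows. This is not the
tooling from [1]. As an assumption, the log is taken to use the same
"message-id (labels)" lines as sup-dump, one line per label change in
chronological order, and the method names are hypothetical.

```ruby
require 'set'

# Parse "message-id (label label ...)" lines into a Hash of id => Set.
# Later lines for the same id override earlier ones, so a log can be
# replayed on top of an existing state passed as `into`.
def parse_entries(lines, into = {})
  lines.each_with_object(into) do |line, state|
    line =~ /^(\S+) \((.*?)\)$/ or raise "Can't read line: #{line.inspect}"
    mid, labels = $1, $2
    state[mid] = labels.split.map(&:to_sym).to_set
  end
end

# Replay the log over the old dump and list the ids whose labels
# differ from what the actual dump says.
def mismatches(old_dump, log, actual_dump)
  expected = parse_entries(log, parse_entries(old_dump))
  actual = parse_entries(actual_dump)
  (expected.keys | actual.keys).reject { |mid| expected[mid] == actual[mid] }
end

old_dump = ["a@example.org (inbox unread)", "b@example.org (inbox)"]
log      = ["a@example.org (inbox)", "c@example.org (starred)"]
actual   = ["a@example.org (inbox)", "b@example.org (inbox)",
            "c@example.org (starred inbox)"]

puts mismatches(old_dump, log, actual).inspect
```

In this made-up data only c@example.org differs, because the actual dump
carries an extra label that the log never recorded.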

Cheers,

[1]: http://gitorious.org/~ertai/sup/clone-by-ertai/commits/to-submit-log-labels-mapping

-- 
Nicolas Pouillard
http://nicolaspouillard.fr

* Re: [sup-devel] [PATCH] add sup-import-dump: import message state in sup-dump format
  2011-01-20 15:28   ` Nicolas Pouillard
@ 2011-01-21 12:36     ` Sascha Silbe
  0 siblings, 0 replies; 6+ messages in thread
From: Sascha Silbe @ 2011-01-21 12:36 UTC (permalink / raw)
  To: Nicolas Pouillard; +Cc: Sup developer discussion



Excerpts from Nicolas Pouillard's message of Thu Jan 20 16:28:17 +0100 2011:

> So it does not clear the index, right? It only assigns the set of labels to the
> given message. Just what I need to upgrade my sup index with my notmuch index.

Yes, that's exactly the purpose (except that I'm using it to synchronise
two sup installations, not sup and notmuch).

> > It should also be easy enough to add support for a "diff" style format that
> > would allow replaying "logs" if sup were enhanced to write those in the
> > future.

> Please do.

Eventually I will, but I don't know when I'll get around to doing it.

> I hacked such a logging feature [1] in the past [...]

Nice! Looks like the only other places (besides sync_message) that write
to the index are delete and delete_message (the latter is part of a
not-yet-upstreamed patch of mine). Looks like the hardest part will
actually be adding the configuration option for the file name (with
support for a date template). ;)
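
A date template for the file name could be as simple as a strftime pattern.
The sketch below is only one possible shape for it; the `:label_log_file` key
and the `log_file_name` helper are hypothetical and do not exist in sup.

```ruby
# Expand a strftime-style template into a concrete log file name.
# (Hypothetical helper; the time defaults to "now".)
def log_file_name(template, time = Time.now)
  time.strftime(template)
end

# Hypothetical configuration entry holding the template.
config = { :label_log_file => "label-log-%Y-%m.txt" }

puts log_file_name(config[:label_log_file], Time.utc(2011, 1, 21))
puts log_file_name("labels-%Y-%m-%d.log", Time.utc(2011, 1, 21))
```

With a monthly pattern such as the first one, a new log file would start
automatically at each month boundary without any extra rotation logic.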


> [1]: http://gitorious.org/~ertai/sup/clone-by-ertai/commits/to-submit-log-labels-mapping

Sascha

-- 
http://sascha.silbe.org/
http://www.infra-silbe.de/

