From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (c4C8F5BC1.dhcp.as2116.net. [193.91.143.76])
        by mx.google.com with ESMTPSA id ay4sm3130294lbb.2.2013.08.15.09.40.45
        for (version=TLSv1.2 cipher=RC4-SHA bits=128/128);
        Thu, 15 Aug 2013 09:40:46 -0700 (PDT)
Date: Thu, 15 Aug 2013 18:39:45 +0200
From: Gaute Hope
To: sup-devel
Message-ID: <1376584400-sup-6242@qwerzila>
In-Reply-To: <1376582732-sup-7192@sam.mediasupervision.de>
References: <1376582732-sup-7192@sam.mediasupervision.de>
Subject: Re: [sup-devel] sup 0.14: Encoding::UndefinedConversionError from thread: load threads for thread-index-mode
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
User-Agent: Sup/git

Excerpts from Gregor Hoffleit's message of 2013-08-15 18:21:43 +0200:
> Sup 0.14 fails for me just after the start with the following exception:
>
> --- Encoding::UndefinedConversionError from thread: load threads for thread-index-mode
> "\xE2" from ASCII-8BIT to UTF-8
> /var/lib/gems/1.9.1/gems/sup-0.14.0/lib/sup/util.rb:259:in `width'
> /var/lib/gems/1.9.1/gems/sup-0.14.0/lib/sup/util.rb:259:in `display_length'

Hi Gregor,

did you have a lot of messages in your index added by Sup 0.13 or
earlier? Do you get the same error if you move the old xapian folder out
of the way and try to re-index your messages? It is possible to restore
labels using sup-dump and sup-sync --restore.
Otherwise, I used to hackishly fix this error, when it occurred at some
other point, with the following change:

diff --git a/lib/sup/util.rb b/lib/sup/util.rb
index 5cff6fa..4579a38 100644
--- a/lib/sup/util.rb
+++ b/lib/sup/util.rb
@@ -256,7 +256,7 @@ end
 
 class String
   def display_length
-    @display_length ||= Unicode.width(self, false)
+    @display_length ||= Unicode.width(self.fix_encoding, false)
   end
 
   def slice_by_display_length len

The problem is that the string being formatted (or whose width is being
computed) is in some encoding (probably already UTF-8), but Ruby or
NCurses or RMail or whatever has lost the encoding label, so the bytes,
although they are encoded text, are treated as binary (ASCII-8BIT).
There is generally no way to figure out what the original encoding was,
but we can guess that it is UTF-8 and fix it.

Regards, Gaute