* [sup-devel] Bug: UTF-8 error when sending messages
@ 2011-01-29 3:11 Adeel Ahmad Khan
2011-01-29 18:17 ` Tero Tilus
2011-01-30 15:58 ` Gaute Hope
0 siblings, 2 replies; 5+ messages in thread
From: Adeel Ahmad Khan @ 2011-01-29 3:11 UTC (permalink / raw)
To: sup-devel
When sending a message containing certain characters, like guillemets
<http://en.wikipedia.org/wiki/Guillemets>, I am experiencing the following error.
--- ArgumentError from thread: main
invalid byte sequence in UTF-8
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:497:in `block in mentions_attachments?'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:497:in `each'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:497:in `any?'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:497:in `mentions_attachments?'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:339:in `send_message'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/mode.rb:59:in `handle_input'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/buffer.rb:277:in `handle_input'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/bin/sup:260:in `<module:Redwood>'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/bin/sup:69:in `<top (required)>'
/usr/bin/sup:19:in `load'
/usr/bin/sup:19:in `<main>'
After setting :confirm_no_attachments and :confirm_top_posting to false, I get
the following error instead.
--- ArgumentError from thread: main
invalid byte sequence in UTF-8
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:389:in `build_message'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:354:in `send_message'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/mode.rb:59:in `handle_input'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/buffer.rb:277:in `handle_input'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/bin/sup:260:in `<module:Redwood>'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/bin/sup:69:in `<top (required)>'
/usr/bin/sup:19:in `load'
/usr/bin/sup:19:in `<main>'
I am using a nearly fresh installation of Sup 0.12.1 with Ruby 1.9.2p136. I
have LOCALE="en_US.UTF-8".
Adeel
_______________________________________________
Sup-devel mailing list
Sup-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/sup-devel
* Re: [sup-devel] Bug: UTF-8 error when sending messages
2011-01-29 3:11 [sup-devel] Bug: UTF-8 error when sending messages Adeel Ahmad Khan
@ 2011-01-29 18:17 ` Tero Tilus
2011-01-30 15:58 ` Gaute Hope
1 sibling, 0 replies; 5+ messages in thread
From: Tero Tilus @ 2011-01-29 18:17 UTC (permalink / raw)
To: Sup developers
Adeel Ahmad Khan, 2011-01-29 05:11:
> invalid byte sequence in UTF-8
...
> I am using a nearly fresh installation of Sup 0.12.1 with Ruby
> 1.9.2p136. I have LOCALE="en_US.UTF-8".
Both errors came from regex matches against the message body. Somehow
your editor doesn't know your locale or isn't obeying it. As a
result, non-UTF-8 data gets saved to disk and Sup gets confused.
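
For illustration, this failure can be reproduced outside Sup. The following is a minimal sketch, not Sup's code. It assumes the editor saved the guillemets as single ISO-8859-1 bytes (0xAB and 0xBB) and that a current Ruby (2.0 or later, for `String#b`) is used rather than the 1.9.2 from the report. `match_result` is a hypothetical helper name; the regex is the one used in `mentions_attachments?`.

```ruby
# Bytes as an editor writing ISO-8859-1 might save "«attached»" (assumption).
LATIN1_BYTES = "\xABattached\xBB".b

# Hypothetical helper: returns true/false for a match, or the error message
# if Ruby refuses to run the regex against the string.
def match_result(line)
  (line =~ /\battach(ment|ed|ing|)\b/i) ? true : false
rescue ArgumentError => e
  e.message
end

# Label the raw bytes as UTF-8, which is what the error message reports.
broken = LATIN1_BYTES.dup.force_encoding("UTF-8")
puts broken.valid_encoding?   # false: 0xAB cannot start a UTF-8 sequence
puts match_result(broken)     # the ArgumentError message

# The same text stored as real UTF-8 matches without error.
good = "\u00ABattached\u00BB"
puts good.valid_encoding?     # true
puts match_result(good)       # true
```

Checking `valid_encoding?` on the saved file's contents is a quick way to tell whether the editor wrote UTF-8 at all.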
--
Tero Tilus ## 050 3635 235 ## http://tero.tilus.net/
* Re: [sup-devel] Bug: UTF-8 error when sending messages
2011-01-29 3:11 [sup-devel] Bug: UTF-8 error when sending messages Adeel Ahmad Khan
2011-01-29 18:17 ` Tero Tilus
@ 2011-01-30 15:58 ` Gaute Hope
1 sibling, 0 replies; 5+ messages in thread
From: Gaute Hope @ 2011-01-30 15:58 UTC (permalink / raw)
To: sup-devel
Excerpts from Adeel Ahmad Khan's message of 2011-01-29 04:11:40 +0100:
> When sending a message containing certain characters, like guillemets
> <http://en.wikipedia.org/wiki/Guillemets>, I am experiencing the following error.
>
> --- ArgumentError from thread: main
> invalid byte sequence in UTF-8
> /usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:497:in `block in mentions_attachments?'
Could you check whether this patch fixes it (or just edit the lines manually)?
- Gaute
From 67a8777875091da6ae57c762f18254509f418a46 Mon Sep 17 00:00:00 2001
From: Gaute Hope <eg@gaute.vetsj.com>
Date: Sun, 30 Jan 2011 16:57:15 +0100
Subject: [PATCH] Attempt to handle encoding errors when searching for attachment string
---
lib/sup/modes/edit-message-mode.rb | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/lib/sup/modes/edit-message-mode.rb b/lib/sup/modes/edit-message-mode.rb
index 734a879..8517011 100644
--- a/lib/sup/modes/edit-message-mode.rb
+++ b/lib/sup/modes/edit-message-mode.rb
@@ -494,7 +494,10 @@ private
     if HookManager.enabled? "mentions-attachments"
       HookManager.run "mentions-attachments", :header => @header, :body => @body
     else
-      @body.any? { |l| l =~ /^[^>]/ && l =~ /\battach(ment|ed|ing|)\b/i }
+      @body.any? { |l|
+        l.force_encoding 'UTF-8' if l.methods.include?(:encoding)
+        l =~ /^[^>]/ && l =~ /\battach(ment|ed|ing|)\b/i
+      }
     end
   end
--
1.7.3.5
* Re: [sup-devel] Bug: UTF-8 error when sending messages
2011-01-31 18:24 ` Adeel Ahmad Khan
@ 2011-02-01 13:34 ` Gaute Hope
0 siblings, 0 replies; 5+ messages in thread
From: Gaute Hope @ 2011-02-01 13:34 UTC (permalink / raw)
To: sup-devel
Excerpts from Adeel Ahmad Khan's message of 2011-01-31 19:24:42 +0100:
>
> Tero Tilus <tero@tilus.net>:
> > Adeel Ahmad Khan, 2011-01-29 05:11:
> > > invalid byte sequence in UTF-8
> > ...
> > > I am using a nearly fresh installation of Sup 0.12.1 with Ruby
> > > 1.9.2p136. I have LOCALE="en_US.UTF-8".
Which editor are you using? Could you attach some buggy text (make sure
it is transferred in binary, i.e. upload an archive somewhere like
http://paste.xinu.at/)?
> > Both errors came from regex matches against the message body. Somehow
> > your editor doesn't know your locale or isn't obeying it. As a
> > result, non-UTF-8 data gets saved to disk and Sup gets confused.
>
> You were right. It turned out to be an issue with my editor.
I'm not so sure. I think it is a problem with Ruby (1.9) degrading the
text it reads to US-ASCII (8bit). When you later try to do UTF-8 stuff
with it (append, or regexp with UTF-8) it fails, perhaps even when
trying to access chars in the string (the body) that are not really
US-ASCII [2].
What my patch does is tell Ruby that the string is a UTF-8 string, no
matter what it deduced from reading the file in the first place, or
what might have happened throughout Sup's processing of the text.
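
To make the relabelling concrete: `String#force_encoding` changes only the encoding tag, never the bytes. It therefore helps when the bytes are already valid UTF-8 but carry the wrong label, and cannot help when the bytes are in some other encoding. A small sketch follows; the sample byte strings are assumptions, not data from this thread, and `String#b` needs Ruby 2.0 or later.

```ruby
# Valid UTF-8 bytes for "«attached»", but tagged as binary (ASCII-8BIT).
mislabelled = "\xC2\xABattached\xC2\xBB".b
relabelled  = mislabelled.dup.force_encoding("UTF-8")

puts relabelled.bytes == mislabelled.bytes  # true: bytes are untouched
puts relabelled.valid_encoding?             # true: the label now fits the bytes
puts relabelled.length                      # 10 characters (stored in 12 bytes)

# ISO-8859-1 bytes for the same text: relabelling cannot make them valid.
latin1 = "\xABattached\xBB".b
forced = latin1.dup.force_encoding("UTF-8")
puts forced.valid_encoding?                 # false

# Transcoding (declare the real source encoding, then convert) does work.
transcoded = latin1.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts transcoded.valid_encoding?             # true
puts transcoded == relabelled               # true
```

So a forced label is a workaround for mislabelled text only; text that was saved in another encoding has to be transcoded.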
Try this patch, it forces encoding on the entire body: http://ix.io/1rO
(these patches are workarounds; not to be applied to source-tree)
The same happened with labels.txt (or with contacts). Whenever a label
string could be degraded to US-ASCII, Ruby did so (how should it know it
was UTF-8 anyway?); then any attempt to append to, match against, or
otherwise combine the US-ASCII string with UTF-8 input failed.
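
The append failure described above can be sketched as follows. As far as I know, Ruby raises `Encoding::CompatibilityError` only when both operands contain non-ASCII bytes and carry different encodings; a purely ASCII string combines with UTF-8 text without trouble. The label values are made up for the example, and `try_append` is a hypothetical helper.

```ruby
# Hypothetical helper: returns the encoding name of a + b, or the name of
# the error class if Ruby refuses to concatenate the two strings.
def try_append(a, b)
  (a + b).encoding.name
rescue Encoding::CompatibilityError => e
  e.class.name
end

utf8_input = "\u00E9t\u00E9"   # "été", tagged UTF-8

# ASCII-only label tagged US-ASCII: compatible with UTF-8 input.
ascii_label = "inbox".dup.force_encoding("US-ASCII")
puts try_append(ascii_label, utf8_input)    # UTF-8

# Label with non-ASCII bytes but a binary tag: incompatible.
binary_label = "caf\xC3\xA9".b
puts try_append(binary_label, utf8_input)   # Encoding::CompatibilityError

# Relabelling the same bytes as UTF-8 makes the append work again.
fixed_label = binary_label.dup.force_encoding("UTF-8")
puts try_append(fixed_label, utf8_input)    # UTF-8
```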
What I think must be done (as an alternative to supporting different
encodings all the way) is to _always_ read all files as UTF-8 [1] (or
transcode them to UTF-8) and, perhaps the most difficult part, to _keep_
the strings UTF-8 throughout all of Sup's processing.
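
A rough sketch of what "always read as UTF-8, or transcode" could look like. This is not Sup's code: `read_utf8` and `read_as_utf8` are hypothetical helpers, the file contents are invented, and `String#scrub` requires Ruby 2.1 or later (it is not available in the 1.9.2 discussed here).

```ruby
require "tempfile"

# Hypothetical helper: read a file as UTF-8 regardless of the locale and
# replace any invalid bytes, so later regex matches cannot raise.
def read_utf8(path)
  text = File.open(path, "r:UTF-8", &:read)
  text.valid_encoding? ? text : text.scrub("?")
end

# Hypothetical helper for files known to be in another encoding:
# transcode to UTF-8 while reading ("external:internal" in the mode string).
def read_as_utf8(path, external)
  File.open(path, "r:#{external}:UTF-8", &:read)
end

Tempfile.create("body") do |f|
  f.binmode
  f.write("\xABattached\xBB\n".b)   # ISO-8859-1 bytes for "«attached»"
  f.flush

  scrubbed   = read_utf8(f.path)
  transcoded = read_as_utf8(f.path, "ISO-8859-1")

  puts scrubbed.encoding                            # UTF-8
  puts scrubbed.valid_encoding?                     # true
  puts transcoded == "\u00ABattached\u00BB\n"       # true
end
```

Scrubbing loses the original characters, so transcoding is preferable whenever the source encoding is actually known.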
[1] http://blog.grayproductions.net/articles/ruby_19s_three_default_encodings
[2] http://www.ruby-forum.com/topic/194493
Best regards,
Gaute
* Re: [sup-devel] Bug: UTF-8 error when sending messages
[not found] <mailman.75.1296431981.25874.sup-devel@rubyforge.org>
@ 2011-01-31 18:24 ` Adeel Ahmad Khan
2011-02-01 13:34 ` Gaute Hope
0 siblings, 1 reply; 5+ messages in thread
From: Adeel Ahmad Khan @ 2011-01-31 18:24 UTC (permalink / raw)
To: sup-devel
Tero Tilus <tero@tilus.net>:
> Adeel Ahmad Khan, 2011-01-29 05:11:
> > invalid byte sequence in UTF-8
> ...
> > I am using a nearly fresh installation of Sup 0.12.1 with Ruby
> > 1.9.2p136. I have LOCALE="en_US.UTF-8".
>
> Both errors came from regex matches against the message body. Somehow
> your editor doesn't know your locale or isn't obeying it. As a
> result, non-UTF-8 data gets saved to disk and Sup gets confused.
You were right. It turned out to be an issue with my editor.
Gaute Hope <eg@gaute.vetsj.com>:
> Excerpts from Adeel Ahmad Khan's message of 2011-01-29 04:11:40 +0100:
> > --- ArgumentError from thread: main
> > invalid byte sequence in UTF-8
> > /usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:497:in `block in mentions_attachments?'
>
> Could you check whether this patch fixes it (or just edit the lines manually)?
The patch resolved that error, but I get a different one now:
--- ArgumentError from thread: main
invalid byte sequence in UTF-8
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:389:in `build_message'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/modes/edit-message-mode.rb:354:in `send_message'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/mode.rb:59:in `handle_input'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/lib/sup/buffer.rb:277:in `handle_input'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/bin/sup:260:in `<module:Redwood>'
/usr/lib/ruby/gems/1.9.1/gems/sup-0.12.1/bin/sup:69:in `<top (required)>'
/usr/bin/sup:19:in `load'
/usr/bin/sup:19:in `<main>'
It looks like another regex match.
m.body += "\n" unless m.body =~ /\n\Z/
In any case it seems to be my problem, not sup's.
Adeel