* [sup-talk] display_length issue with special-characters on non-UTF8 terminal
@ 2009-06-09 10:00 Tarko Tikan
2009-06-12 19:18 ` William Morgan
0 siblings, 1 reply; 6+ messages in thread
From: Tarko Tikan @ 2009-06-09 10:00 UTC
hey,
When String.display_length was introduced in a recent update, it broke the length calculation for non-UTF8 strings that contain special characters. The wrong length results in a corrupted display (line ends chopped off).
The terminal is iso-8859-15, and sup detects it correctly.
I've tracked it down to the /./u regexp. Here are some examples:
irb(main):001:0> "asd".scan(/./u)
=> ["a", "s", "d"]
irb(main):002:0> "asdõüäö".scan(/./u)
=> ["a", "s", "d", "\365\374\344\366"]
irb(main):017:0> "asd???".scan(/./u)
=> ["a", "s", "d"]
irb(main):008:0* "asd".scan(/./)
=> ["a", "s", "d"]
irb(main):009:0> "asdõüäö".scan(/./)
=> ["a", "s", "d", "\365", "\374", "\344", "\366"]
Expecting UTF8 gives unexpected results :) Also, the old behaviour of String.length gives correct results with these test cases.
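For readers on a newer Ruby, the same mismatch can be reproduced without a Latin-9 terminal. This is only a sketch, assuming Ruby 2.0 or later (force_encoding, valid_encoding? and String#b do not exist in the 1.8 interpreter used above). The helper name char_count_as and the constant SAMPLE_BYTES are made up for illustration; the bytes are the ones shown in the irb output (\365 \374 \344 \366).

```ruby
# Hypothetical helper: interpret the same raw bytes under a given encoding
# and count characters. Returns nil when the bytes are not well-formed in
# that encoding.
def char_count_as(bytes, encoding_name)
  s = bytes.dup.force_encoding(encoding_name)
  s.valid_encoding? ? s.length : nil
end

# "asd" followed by the four bytes from the irb session (octal 365 374 344 366).
SAMPLE_BYTES = "asd\xF5\xFC\xE4\xF6".b

p char_count_as(SAMPLE_BYTES, "ISO-8859-15")  # counted as single-byte characters
p char_count_as(SAMPLE_BYTES, "UTF-8")        # same bytes treated as UTF-8
```

The point is that the byte sequence is perfectly fine as ISO-8859-15 but is not well-formed UTF-8, so any UTF-8-aware character count over it is unreliable.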
--
tarko
From: William Morgan @ 2009-06-12 19:18 UTC
Reformatted excerpts from Tarko Tikan's message of 2009-06-09:
> When String.display_length was introduced in recent update, it broke
> the length for non-UTF8 strings that contain the special characters.
> Wrong length results corrupted display (line ends chopped off).
That's a good point. I got a little utf8-centric with those changes.
(I'm assuming that your terminal encoding is not UTF-8.)
Does this patch fix the issue? If so, I will release an 0.8.1.
--- cut here ---
diff --git a/lib/sup.rb b/lib/sup.rb
index 4f59eaa..20835ae 100644
--- a/lib/sup.rb
+++ b/lib/sup.rb
@@ -244,7 +244,7 @@ end
Redwood::log "using character set encoding #{$encoding.inspect}"
else
Redwood::log "warning: can't find character set by using locale, defaulting
- $encoding = "utf-8"
+ $encoding = "UTF-8"
end
## now everything else (which can feel free to call Redwood::log at load time)
diff --git a/lib/sup/util.rb b/lib/sup/util.rb
index 8a3004f..d5310bc 100644
--- a/lib/sup/util.rb
+++ b/lib/sup/util.rb
@@ -172,7 +172,13 @@ class Object
 end
 class String
-  def display_length; scan(/./u).size end
+  def display_length
+    if $encoding == "UTF-8"
+      scan(/./u).size
+    else
+      size
+    end
+  end
   def camel_to_hyphy
     self.gsub(/([a-z])([A-Z0-9])/, '\1-\2').downcase
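The logic of the patched method can be tried outside sup with a small standalone sketch. It is not the code from the patch: display_length_sketch is a hypothetical name, the encoding argument stands in for sup's global $encoding, and it assumes Ruby 1.9 or later, where bytesize plays the role that String#size had in 1.8 (a byte count).

```ruby
# Hypothetical standalone version of the patched method; `encoding` stands in
# for sup's global $encoding.
def display_length_sketch(str, encoding)
  if encoding == "UTF-8"
    # Count UTF-8 characters, as scan(/./u).size does in the patch.
    str.dup.force_encoding("UTF-8").scan(/./u).size
  else
    # Ruby 1.8's String#size counted bytes; bytesize is the modern equivalent.
    str.bytesize
  end
end

puts display_length_sketch("asd\u00F5\u00FC", "UTF-8")
puts display_length_sketch("asd\xF5\xFC".b, "ISO-8859-15")
```

Falling back to the byte count is only right for single-byte encodings such as iso-8859-15; a multibyte non-UTF-8 terminal encoding would need more than this.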
--
William <wmorgan-sup at masanjin.net>
From: Tarko Tikan @ 2009-06-13 11:13 UTC
> (I'm assuming that your terminal encoding is not UTF-8.)
No, it's not.
> Does this patch fix the issue? If so, I will release an 0.8.1.
Yes it does. To me, this approach felt "hackish" so I didn't come up with a patch :) But I still don't have a better idea of how to fix it, so it'll have to stay like this.
> + if $encoding == "UTF-8"
> + scan(/./u).size
> + else
> + size
> + end
It would probably be correct to use:
    if $encoding == "UTF-8"
      scan(/./u).size
    else
      length
    end
That's because scan returns an array (hence the use of size); without scan you are just invoking it on a string, where it's correct to use length (for some reason size works too; backward compatibility?).
--
tarko
From: William Morgan @ 2009-06-15 14:10 UTC
Reformatted excerpts from Tarko Tikan's message of 2009-06-13:
> Yes it does. To me, this approach felt "hackish" so I didn't come up
> with a patch :) But I still don't have better idea how to fix it so
> it'll have to stay like this.
It's hackish because Ruby 1.8 has shitty multibyte support. The only
reason it works at all is because byte length is character length (at
least most of the time) in your encoding.
There is a multibyte gem out there that I'm keeping an eye on. Also Ruby
1.9.1 allegedly fixes this problem.
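To illustrate the byte-length versus character-length point, here is a small sketch assuming Ruby 1.9 or later, where strings carry their encoding and String#length counts characters while String#bytesize counts bytes. The helper name chars_and_bytes and the sample string are made up for illustration.

```ruby
# Hypothetical helper: returns [characters, bytes] for a string in its own
# encoding (Ruby 1.9+).
def chars_and_bytes(str)
  [str.length, str.bytesize]
end

utf8   = "asd\u00F5\u00FC\u00E4\u00F6"  # UTF-8: each accented letter is two bytes
latin9 = utf8.encode("ISO-8859-15")     # single-byte encoding: one byte per letter

p chars_and_bytes(utf8)
p chars_and_bytes(latin9)
```

In the single-byte encoding the two numbers coincide, which is why a plain byte count is good enough there; in UTF-8 they diverge as soon as a non-ASCII character appears.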
> Thats because scan returns a array (hence using the size), without
> scan you are just invoking on string and it's correct to use length
> (for some reason size works too, backward compatibility?)
Size and length are synonyms for both arrays and strings. I used size
there for symmetry.
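A quick check of that equivalence, as a sketch (size_matches_length? is a hypothetical helper, not something from sup):

```ruby
# Hypothetical helper: true when size and length agree for the given object.
def size_matches_length?(obj)
  obj.size == obj.length
end

p size_matches_length?("asd")            # String
p size_matches_length?("asd".scan(/./))  # Array returned by scan
```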
--
William <wmorgan-sup at masanjin.net>
From: William Morgan @ 2009-06-17 16:04 UTC
Reformatted excerpts from Tarko Tikan's message of 2009-06-13:
> william wrote:
> > Does this patch fix the issue? If so, I will release an 0.8.1.
>
> Yes it does. patch :) But I still don't have better idea how to fix
> it so it'll have to stay like this.
I have released an 0.8.1 which has this patch in it.
--
William <wmorgan-sup at masanjin.net>
From: Nicolas Pouillard @ 2009-06-17 18:34 UTC
Excerpts from William Morgan's message of Wed Jun 17 18:04:34 +0200 2009:
> Reformatted excerpts from Tarko Tikan's message of 2009-06-13:
> > william wrote:
> > > Does this patch fix the issue? If so, I will release an 0.8.1.
> >
> > Yes it does. patch :) But I still don't have better idea how to fix
> > it so it'll have to stay like this.
>
> I have released an 0.8.1 which has this patch in it.
I still have issues with display_length. I use UTF-8 and urxvt,
and some characters disappear when a line contains special characters.
For instance in thread-view-mode if a line contains a special character
then the last character is dropped.
I've "fixed" the issue by reverting a display_length call to a size call
as in the attached patch.
diff --git a/lib/sup/buffer.rb b/lib/sup/buffer.rb
index 8eedf96..795b4c9 100644
--- a/lib/sup/buffer.rb
+++ b/lib/sup/buffer.rb
@@ -114,7 +114,7 @@ class Buffer
stringl += 1 while stringl < s.length && s[0 ... stringl].display_length < maxl
@w.mvaddstr y, x, s[0 ... stringl]
unless opts[:no_fill]
- l = s.display_length
+ l = s.size
unless l >= maxl
@w.mvaddstr(y, x + l, " " * (maxl - l))
end
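The clip-then-pad step in this hunk measures the string in more than one way (s.length, display_length, and now size). One way to keep both steps in the same unit is sketched below. This is not the buffer.rb code and not a proposed fix: fit_to_columns is a hypothetical helper, it assumes Ruby 1.9 or later (where each_char and length both work in characters), and it assumes every character occupies exactly one terminal column, which does not hold for wide characters.

```ruby
# Hypothetical helper: clip `s` to at most `maxl` characters, then pad with
# spaces so the result always fills `maxl` columns.
# Assumes every character occupies exactly one terminal column.
def fit_to_columns(s, maxl)
  visible = s.each_char.first(maxl).join
  visible + " " * (maxl - visible.length)
end

puts "[" + fit_to_columns("h\u00E9llo", 8) + "]"
puts "[" + fit_to_columns("h\u00E9llo world", 5) + "]"
```

Because both the clipping and the padding are computed from the same character count, a multibyte character neither shortens the visible text nor shifts the fill.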
--
Nicolas Pouillard
http://nicolaspouillard.fr