From: Rich Lane <rlane@club.cc.cmu.edu>
To: Matti Eiden <snaipperi@gmail.com>
Cc: sup-devel <sup-devel@rubyforge.org>
Subject: Re: [sup-devel] Arch utf8 vs UTF-8 fix and wide character support
Date: Fri, 07 May 2010 12:46:20 -0400
Message-ID: <1273250528-sup-9662@zyrg.net>
In-Reply-To: <p2s6242182a1005061102ree7fa042jb595f0a2e7b443cc@mail.gmail.com>

Excerpts from Matti Eiden's message of 2010-05-06 14:02:46 -0400:
> Hey folks,
>
> I've been experimenting with sup for the past few days, and of course,
> I love it. Firstly I had some trouble with getting unicode display
> going. This problem was already described in an old post on this
> mailing list:
>
> http://rubyforge.org/pipermail/sup-devel/2010-March/000522.html
>
> So Arch Linux reports the encoding as utf8, but Iconv requires it to be
> UTF-8. I would say this is a bug in Arch Linux for not following
> standards, but anyway, I fixed it with this small modification to
> sup.rb:
>
> ## determine encoding and character set
> $encoding = Locale.current.charset
> $encoding = "UTF-8" if $encoding == "utf8"

I've applied this fix, thanks.

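For anyone who wants to handle a few more spellings than the exact string "utf8", here is a minimal sketch of the same idea. The helper name normalize_charset is hypothetical and is not what was committed to sup; it only assumes that the locale may report "utf8" or "utf-8" in any letter case.

```ruby
# Hypothetical helper (not sup's actual code): canonicalise the charset
# name reported by the locale before handing it to a converter.
def normalize_charset(name)
  cleaned = name.to_s.strip
  # Some systems report "utf8" or "utf-8"; Iconv wants "UTF-8".
  return "UTF-8" if cleaned =~ /\Autf-?8\z/i
  cleaned
end

puts normalize_charset("utf8")        # UTF-8
puts normalize_charset("ISO-8859-1")  # ISO-8859-1
```

Any name that does not look like a UTF-8 spelling is passed through unchanged, so other locales keep behaving as before.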
> Then about wide character support. And I mean really wide. Like CJK
> characters. Scandics (ä,ö,å) and other European accent characters work
> nicely, as we all who are concerned probably know. These characters
> have a byte length of 2 and unicode length of 1.
>
> However, take an example of the following two-character Korean word
> (byte length of such single character is 3 instead of 2!)
>
> http://www.kotiposti.net/eiden/soulbound/hellovim.png (looking good in vim)
> http://www.kotiposti.net/eiden/soulbound/hellosup.png (sup lost 2
> characters (or bytes) from the line that has the Korean word)
>
> It seems that for every Korean character with a byte length of 3, one
> byte is lost from the end of the line. In the above example, two bytes
> are missing in sup, as there are two Korean characters on the same
> line.
>
> If the line consists of a single Korean character, nothing appears in
> sup (last byte out of three is missing?).
> If the line consists of two Korean characters, the last character is
> missing (last two bytes out of six are missing?).
> etc.
>
> Some sort of miscalculation somewhere is causing this, perhaps
> assuming that unicode characters always have a byte length of 2? Can
> anybody with Ruby skills take a look at this?

It's actually the multiple screen cells that cause problems, not the
multiple bytes [1]. Sup currently assumes every character is 1 cell wide.
The right fix is probably a C extension that uses wcswidth.

[1] http://mid.gmane.org/1264629880-sup-9232%40zyrg.net
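
To make the bytes / characters / cells distinction concrete, here is a rough pure-Ruby sketch. It assumes Ruby 1.9 or later (encoding-aware strings). The WIDE_RANGES table and the display_width helper are hypothetical and deliberately incomplete; a real fix would more likely wrap wcswidth(3), as suggested above. The Korean word used is just an illustration, not necessarily the one in the screenshots.

```ruby
# Rough, incomplete list of code point ranges that terminals usually
# draw two cells wide. An approximation for illustration only.
WIDE_RANGES = [
  0x1100..0x115F,  # Hangul Jamo initial consonants
  0x2E80..0xA4CF,  # CJK radicals through Yi (coarse range)
  0xAC00..0xD7A3,  # Hangul syllables
  0xF900..0xFAFF,  # CJK compatibility ideographs
  0xFE30..0xFE4F,  # CJK compatibility forms
  0xFF00..0xFF60,  # Fullwidth forms
  0xFFE0..0xFFE6   # Fullwidth signs
].freeze

# Hypothetical helper: number of screen cells a string occupies,
# counting 2 for wide characters and 1 for everything else.
def display_width(str)
  str.each_char.inject(0) do |cells, ch|
    cells + (WIDE_RANGES.any? { |r| r.cover?(ch.ord) } ? 2 : 1)
  end
end

word = "\uD55C\uAE00"  # two Hangul syllables
puts "bytes=#{word.bytesize} chars=#{word.length} cells=#{display_width(word)}"
# bytes=6 chars=2 cells=4

accented = "\u00E4\u00F6"  # a-umlaut, o-umlaut
puts "bytes=#{accented.bytesize} chars=#{accented.length} cells=#{display_width(accented)}"
# bytes=4 chars=2 cells=2
```

The accented characters take 2 bytes each but only 1 cell, which is why they already display fine. The Hangul syllables take 2 cells each, so any code that budgets 1 cell per character runs out of room at the end of the line.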