error displaying a 2 byte character Ê
SamLT
sam at sltosis.org
Wed Dec 14 23:52:30 CET 2011
On Wed, Dec 14, 2011 at 10:59:22PM +0100, Martin Pohlack wrote:
> On 14.12.2011 22:48, SamLT wrote:
> > On Wed, Dec 14, 2011 at 02:07:11PM +0100, Mikael Magnusson wrote:
> >> On 14 December 2011 08:57, SamLT <sam at sltosis.org> wrote:
> >>>
> >>>
> >>> Hello,
> >>>
> >>> It's been a while since I first noticed that, but I'm not sure really
> >>> where to report. You guys sure can help me:)
> >>>
> >>> When I view or type this character 'Ê' (\uc38a) in a terminal(urxvt
> >>> mainly but also in xterm for the record), I see the character 'Ë'
> >>> (\uc38b).
> >>>
> >>> I don't know where to report this.
> >>>
> >>> Any idea?
> >>
> >> I don't know if it's too soon after waking up, but the codepoints in
> >> parenthesis are 쎊 and 쎋, looks like korean characters?
> >
> >
> > hum, weird:
> > | $ echo -n Ê | hexdump
> > | 0000000 8ac3
> > | 0000002
> > | $ echo -n Ë | hexdump
> > | 0000000 8bc3
> > | 0000002
> >
> > Those are, among other things I suppose, french characters.
>
> You might be confusing UTF-8 encoding with code point (simulated by
> UTF16, little endian here)?
>
> $ echo -n Ê | recode UTF8..UTF16LE | hexdump
> 0000000 00ca
> 0000002
> $ echo -n Ë | recode UTF8..UTF16LE | hexdump
> 0000000 00cb
> 0000002
Wow, I completely forgot about that while trying to focus on the actual
problem! Thanks! It's been a long time since I last played with bits ;)
For those interested, this is what the recode command Martin gave above
does:
c3 8a -> 1100.0011 1000.1010
Now (see [1]), a two-byte UTF-8 sequence has the form 110xxxxx 10xxxxxx,
so the actual bits of the encoded character are the ones not dashed out:
| 1100.0011 1000.1010
| ---0.0011 --00.1010
which gives:
| 0000.0000 1100.1010 -> 00 ca
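
For anyone who wants to check the arithmetic, here is a minimal sketch
of the same decoding in plain POSIX shell. The variable names (bytes,
b1, b2, cp) are only illustrative, and it assumes od and tr are
available. It dumps the bytes one at a time, so they appear in file
order (c3 first) rather than swapped into 16-bit words as in the
hexdump output quoted above, then applies the two masks by hand:

```shell
# UTF-8 encoding of 'Ê' written as octal escapes (0xc3 0x8a),
# dumped byte by byte so the order matches the file order.
bytes=$(printf '\303\212' | od -An -tx1 | tr -d ' \n')
echo "$bytes"            # prints c38a

# Keep the low 5 bits of the lead byte (110xxxxx) and the
# low 6 bits of the continuation byte (10xxxxxx), then join them.
b1=0xc3
b2=0x8a
cp=$(( ((b1 & 0x1f) << 6) | (b2 & 0x3f) ))
printf 'U+%04X\n' "$cp"  # prints U+00CA
```

The same two masks with b2=0x8b give U+00CB, i.e. 'Ë'.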
Thanks again for this clarification.
Now the actual problem remains: when I view or type the character 'Ê'
(\u00ca), I see the character 'Ë' (\u00cb). This is probably a font
problem. But how can I tell, and where am I supposed to report this?
TIA
sam
[1] -> http://en.wikipedia.org/wiki/UTF-8#Description
>
> HTH,
> Martin