Unicode related bug fix (Updated)

Yarin yarin at warpmail.net
Tue Mar 27 09:23:13 CEST 2012


> Hmm, if that were the case, mbstowcs would use more than one wide
> character per input octet - under what conditions would this
> legally happen?

I don't know whether or not that would conform to specifications.
I only know that the author expected it to null-terminate the string,
and it doesn't on OpenBSD, which introduces an obvious bug.

> (does openbsd support __STDC_ISO_10646__? urxvt requires this).

Yes, but the concern is proper termination, not encoding.

Also, the replacement code had a typo. Here it is fixed:

size_t rl = mbstowcs (r, str, len);
if (rl == (size_t)-1) *r = 0;   /* invalid sequence: yield an empty string */
else r[rl] = 0;                 /* rl <= len, so this stays in bounds */

> that would also corrupt data in that case - if the buffer is not
> big enough, data is lost.

I'm not sure if you're referring to my typo.
But if not, then no. len+1 characters are actually allocated,
so the call to mbstowcs() will never touch the last allocated
character (and the return value reflects this, see the man page).
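
To illustrate the allocation argument, here is a minimal sketch in C.
The helper name mbs_to_wcs_terminated is hypothetical and is not taken
from the urxvt source; it only assumes, as described above, that len+1
wide characters are allocated and that mbstowcs() is asked to store at
most len of them. The C standard allows mbstowcs() to leave the
destination unterminated when it stores exactly that many wide
characters, so the terminator is written explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not urxvt's actual code): convert a multibyte
   string to a freshly allocated wide string that is always terminated.
   The caller frees the result. Returns NULL if allocation fails. */
static wchar_t *
mbs_to_wcs_terminated (const char *str)
{
  assert (str != NULL);

  size_t len = strlen (str);

  /* One slot per input byte plus one reserved for the terminator. */
  wchar_t *r = malloc ((len + 1) * sizeof (wchar_t));
  if (!r)
    return NULL;

  /* mbstowcs stores at most len wide characters, so r[len] is never
     written by the conversion itself. */
  size_t rl = mbstowcs (r, str, len);

  if (rl == (size_t)-1)
    *r = 0;       /* invalid multibyte sequence: yield an empty string */
  else
    r[rl] = 0;    /* rl <= len, so this index is inside the buffer */

  return r;
}
```

Because the return value never exceeds the count passed in, r[rl] is
always a valid index, and no converted data is overwritten.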

> yes, but it sounds as if the bug is in mbstowcs itself.

I think it's just a difference in implementation.
If you choose to _not_ support OpenBSD, then I can understand
why you wouldn't care to patch this. But the patch
does make it more portable.

----- Original message -----
From: Marc Lehmann <schmorp at schmorp.de>
To: Yarin <yarin at warpmail.net>
Cc: rxvt-unicode at lists.schmorp.de
Subject: Re: Unicode related bug fix application
Date: Tue, 27 Mar 2012 04:10:17 +0200

On Mon, Mar 26, 2012 at 05:42:56PM -0500, Yarin <yarin at warpmail.net> wrote:
> Upon investigation I found that it was a string _not_ being properly null
> terminated (in spite of the man page suggesting that it might be done
> automatically). The fix for this is quite simple...

Hmm, if that were the case, mbstowcs would use more than one wide
character per input octet - under what conditions would this legally
happen?

(does openbsd support __STDC_ISO_10646__? urxvt requires this).

> need to be replaced with

that would also corrupt data in that case - if the buffer is not big
enough, data is lost.

> While I understand this bug may or may not be present on a system depending
> on its implementation of mbstowcs(), it is still a very legitimate bug.

yes, but it sounds as if the bug is in mbstowcs itself.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schmorp at schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\


