run_cmd && utf8 streams

Zsbán Ambrus ambrus at math.bme.hu
Fri Jul 6 14:17:38 CEST 2012


On 7/6/12, Marc Lehmann <schmorp at schmorp.de> wrote:
> On Fri, Jul 06, 2012 at 12:54:28PM +0200, Zsbán Ambrus <ambrus at math.bme.hu>
> wrote:
>> Here's some example code for how to use Encode this way at the end of
>
> That example doesn't decode incrementally, it simply gives up at the first
> error and assumes that a partial character is the same as an encoding
> error (which isn't allowed in utf-8).

It does not assume that.  Note the length checks I've left in.  I
believe these guarantee that any encoding error is detected at most a
couple of input bytes after it occurs.  I did acknowledge that errors
might be detected later than strictly necessary, but I think I can
live with that.
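
To make the length-check idea concrete, here is a minimal sketch.  It
is written in Python rather than with Perl's Encode, so nothing in it
is the Encode API; the names feed and MAX_PARTIAL are made up for the
illustration.  It assumes only that a truncated utf-8 character can be
at most 3 bytes long, so a longer undecodable tail must be a real
encoding error.

```python
MAX_PARTIAL = 3  # assumption: a truncated UTF-8 character is at most 3 bytes


def feed(buf: bytearray, chunk: bytes) -> str:
    """Append chunk to buf, return the text decoded so far,
    and leave any undecodable tail in buf for the next call."""
    buf += chunk
    try:
        text = bytes(buf).decode("utf-8")
        consumed = len(buf)
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that failed to decode;
        # everything before it is valid and can be returned now.
        consumed = err.start
        text = bytes(buf[:consumed]).decode("utf-8")
    del buf[:consumed]
    # The length check: a leftover longer than any partial character
    # cannot be completed by later input, so it is an encoding error.
    if len(buf) > MAX_PARTIAL:
        raise ValueError("invalid utf-8 in stream")
    return text
```

With this scheme a split character is simply held back until its
remaining bytes arrive, while an invalid byte is reported only once
enough further input has accumulated behind it, which is the delayed
detection I mentioned.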

>> There are some caveats.  Encode still might not be able to
>
> It definitely cannot - it has no state to store the shift state, and no way
> to detect code shifts.

It could leave appropriate shift bytes in the byte string buffer,
just like how it leaves partial utf-8 characters there.

Ambrus


