From mark at rekudos.net Mon May 5 11:14:14 2025
From: mark at rekudos.net (Mark Lawrence)
Date: Mon, 5 May 2025 09:14:14 +0000
Subject: AnyEvent::Log UTF8 strings to terminal?
Message-ID:

This is sort of what I am trying to do:

    use strict;
    use warnings;
    use utf8;
    use AE;

    BEGIN {
        $ENV{PERL_ANYEVENT_LOG} = 'filter=trace:log=file=/dev/pts/5';
    }

    AE::log error => '?';

And this is the error I'm getting:

    2025-05-05 11:10:54.000000 +0200 info AnyEvent::IO: Autoloaded IO model 'Perl', using it.
    Wide character in syswrite at /usr/lib/x86_64-linux-gnu/perl5/5.36/AnyEvent/IO/Perl.pm line 62.

Is there a) a better way to send log output to a terminal and/or b) some way to binmode the aio filehandle?

--
Mark Lawrence

From felipe at felipegasper.com Mon May 5 14:08:06 2025
From: felipe at felipegasper.com (Felipe Gasper)
Date: Mon, 5 May 2025 08:08:06 -0400
Subject: AnyEvent::Log UTF8 strings to terminal?
In-Reply-To:
References:
Message-ID: <55317D1D-98C5-44AD-A12F-0B636A9992AF@felipegasper.com>

Omitting `use utf8` is one way to solve your problem.

FWIW I maintain that that pragma causes more trouble than it solves. Here's a presentation I did that touches on this: https://www.youtube.com/watch?v=yH5IyYyvWHU&pp=ygUSZmVsaXBlIGdhc3BlciBwZXJs

-FG

> On May 5, 2025, at 5:14 AM, Mark Lawrence wrote:
>
> This is sort of what I am trying to do:
>
>     use strict;
>     use warnings;
>     use utf8;
>     use AE;
>
>     BEGIN {
>         $ENV{PERL_ANYEVENT_LOG} = 'filter=trace:log=file=/dev/pts/5';
>     }
>
>     AE::log error => '?';
>
> And this is the error I'm getting:
>
>     2025-05-05 11:10:54.000000 +0200 info AnyEvent::IO: Autoloaded IO model 'Perl', using it.
>     Wide character in syswrite at /usr/lib/x86_64-linux-gnu/perl5/5.36/AnyEvent/IO/Perl.pm line 62.
>
> Is there a) a better way to send log output to a terminal and/or b) some way to binmode the aio filehandle?
>
> --
> Mark Lawrence
>
> _______________________________________________
> anyevent mailing list
> anyevent at lists.schmorp.de
> http://lists.schmorp.de/mailman/listinfo/anyevent

From mark at rekudos.net Mon May 5 22:03:51 2025
From: mark at rekudos.net (Mark Lawrence)
Date: Mon, 5 May 2025 20:03:51 +0000
Subject: AnyEvent::Log UTF8 strings to terminal?
In-Reply-To: <55317D1D-98C5-44AD-A12F-0B636A9992AF@felipegasper.com>
References: <55317D1D-98C5-44AD-A12F-0B636A9992AF@felipegasper.com>
Message-ID:

On Mon May 05, 2025 at 08:08:06AM -0400, Felipe Gasper wrote:
>FWIW I maintain that that pragma causes more trouble than it solves.
>Here's a presentation I did that touches on this:
>https://www.youtube.com/watch?v=yH5IyYyvWHU&pp=ygUSZmVsaXBlIGdhc3BlciBwZXJs

Thanks for the informative presentation. I learned a lot, but not quite enough it seems :-(

>Omitting `use utf8` is one way to solve your problem.

Even with the video, I don't think your suggestion has quite enough context for me to understand how it helps.

Isn't the standard philosophy, that as a Perl User I should deal in (unicode) codepoints? And that I encode to/from UTF-8 at the boundaries? So by default I set filehandle encoding to UTF-8. If I now define UTF-8 strings under "no utf8" I get mojibake:

    no utf8;
    binmode STDOUT, ':encoding(UTF-8)';
    say '??'; # ????

So of course "use utf8" solves this problem under this philosophy. Unless, as in the AE::log case, there is a filehandle outside my control, which doesn't encode.

Your "no utf8" suggestion feels like the opposite. Are you saying use UTF-8 strings everywhere and do not encode filehandles? Or are you saying I need to keep track UTF-8 strings decode them just before output? Then lots of other things break...

What am I missing here? :-)

--
Mark Lawrence

From felipe at felipegasper.com Mon May 5 22:28:17 2025
From: felipe at felipegasper.com (Felipe Gasper)
Date: Mon, 5 May 2025 16:28:17 -0400
Subject: AnyEvent::Log UTF8 strings to terminal?
In-Reply-To:
References: <55317D1D-98C5-44AD-A12F-0B636A9992AF@felipegasper.com>
Message-ID: <7144ABBB-A8F9-42F4-B00F-4D1D6E30D43F@felipegasper.com>

> On May 5, 2025, at 4:03 PM, Mark Lawrence wrote:
>
> On Mon May 05, 2025 at 08:08:06AM -0400, Felipe Gasper wrote:
>> FWIW I maintain that that pragma causes more trouble than it solves. Here's a presentation I did that touches on this: https://www.youtube.com/watch?v=yH5IyYyvWHU&pp=ygUSZmVsaXBlIGdhc3BlciBwZXJs
>
> Thanks for the informative presentation. I learned a lot, but not quite enough it seems :-(
>
>> Omitting `use utf8` is one way to solve your problem.
>
> Even with the video, I don't think your suggestion has quite enough context for me to understand how it helps.
>
> Isn't the standard philosophy, that as a Perl User I should deal in (unicode) codepoints? And that I encode to/from UTF-8 at the boundaries?
> So by default I set filehandle encoding to UTF-8. If I now define UTF-8 strings under "no utf8" I get mojibake:
>
>     no utf8;
>     binmode STDOUT, ':encoding(UTF-8)';
>     say '??'; # ????
>
> So of course "use utf8" solves this problem under this philosophy. Unless, as in the AE::log case, there is a filehandle outside my control, which doesn't encode.
>
> Your "no utf8" suggestion feels like the opposite. Are you saying use UTF-8 strings everywhere and do not encode filehandles? Or are you saying I need to keep track UTF-8 strings decode them just before output? Then lots of other things break...
>
> What am I missing here? :-)

Your confusion is totally understandable.

The ideal would, yes, be for Perl code to decode all inputs, manipulate, then encode all outputs. Perl, though, makes that *really* hard to do consistently. There's a graphic in my presentation that shows all the input & output paths for a Perl program and then shows how "use utf8" adds a decode to just one of those input paths (string literals), leaving all the others (e.g., %ENV) unaddressed.

The only way to be "consistent by default"
in Perl is, yes, to leave all inputs & outputs as plain bytes/UTF-8, *not* Unicode. Then you decode/encode in just the spots where you need to manipulate text. This is what I personally recommend, even at variance with the language's documentation.

That said, if all your stuff does "use utf8", and it works for you, then maybe stay the course.

To keep "use utf8" and all else in place, you can work around the specific issue with AE::log by doing:

---
AE::log error => do { no utf8; '?' }
---

... or by just encoding the text to UTF-8 as usual.

HTH,

-FG
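
A minimal sketch of the "just encode the text" option, assuming the core Encode module and a source file saved as UTF-8. The helper name to_log_bytes is hypothetical and not part of AnyEvent, the smiley literal is only a stand-in for whatever non-ASCII text is being logged, and the AE::log call runs only if AnyEvent happens to be installed.

```perl
use strict;
use warnings;
use utf8;                    # literals in this file are decoded to characters
use Encode qw(encode);

# Hypothetical helper (not part of AnyEvent): encode a character string
# to UTF-8 bytes so a handle without an encoding layer can write it.
sub to_log_bytes {
    my ($text) = @_;
    return encode('UTF-8', $text);
}

my $chars = '☺';                      # one character under "use utf8"
my $bytes = to_log_bytes($chars);     # UTF-8 encoded byte string
my $raw   = do { no utf8; '☺' };      # the same literal, left undecoded

# Print only ASCII so this line itself cannot trigger a wide-character warning.
printf "chars=%d bytes=%d same=%s\n",
    length($chars), length($bytes), ($bytes eq $raw ? 'yes' : 'no');

# Only attempt the real call when AnyEvent is available.
if (eval { require AE; 1 }) {
    AE::log(error => $bytes);
}
```

Both routes hand AE::log a plain byte string, which is why neither should trip the "Wide character in syswrite" warning. Encoding explicitly leaves the rest of a "use utf8" codebase untouched, at the cost of remembering to encode at every AE::log call site.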