google translate
Benjamin R. Haskell
rxvt-unicode at benizi.com
Sun Jun 27 21:03:12 CEST 2010
On Sun, 27 Jun 2010, Marc Lehmann wrote:
> On Sun, Jun 27, 2010 at 02:38:17AM -0400, "Benjamin R. Haskell" <rxvt-unicode at benizi.com> wrote:
> > Your commented-out code was close. I think you want locale_decode
> > instead of locale_encode.
>
> I don't know anything about google webservices (and this is unlikely
> to have anything to do with urxvt, which expects unicode), but I
> somehow don't believe that google delivers text in your local encoding
> (how would it know :).
Google knows all... :-)
You're right of course. I thought that locale_{encode,decode} might
have been using the sometimes-confusing directionality of
Encode::{encode,decode}. That is: locale_decode would decode a bunch of
bytes that contained UTF-8 (into a locale-appropriate encoding).
> My guess is that the webservice expects utf-8 and wants utf-8, so as
> rxvt delivers and expects text in unicode, you first have to
> encode/decode to/from utf-8, using e.g. the Encode module.
Not sure what distinction between unicode and utf-8 (a particular
Unicode encoding) you're drawing here, unless you're just saying that
rxvt expects strings to be perl-internal Unicode (UTF-8 flag on).
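To make the directionality concrete, here is a minimal sketch of the bytes/characters distinction. It is written in Python purely for illustration (the extension itself is Perl); Perl's Encode::decode likewise goes from bytes to characters and Encode::encode from characters to bytes. That the web service speaks UTF-8 is an assumption, not something confirmed here.

```python
# Text as the terminal extension would see it: a sequence of Unicode
# code points, independent of any byte encoding.
text = "Grüße"

# Encoding turns characters into bytes for the wire
# (assumed here to be UTF-8).
wire_bytes = text.encode("utf-8")

# Decoding turns bytes received from the wire back into characters.
round_trip = wire_bytes.decode("utf-8")

# The same text in a Latin-1 locale is a different byte sequence,
# which is why the locale encoding only matches UTF-8 by chance.
latin1_bytes = text.encode("latin-1")

print(len(text), len(wire_bytes), len(latin1_bytes))
```

The three lengths differ because the two non-ASCII characters take two bytes each in UTF-8 but one byte each in Latin-1.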
> But again, I don't know what encoding WebService::Google::Language
> expects, but it will almost certainly not match your local encoding
> except by chance.
The chance of UTF-8 being the local encoding is pretty good these days
apparently :-)
Updated version that uses Encode, allows for multiple language pairs,
and updates the selection when used from the selection popup is at:
http://benizi.com/urxvt/google-translate
The perl:google-translate command is modified:
  perl:google-translate:src:dst    (translates from src to dst)
  perl:google-translate:src:dst:1  (...also updates the selection)
  (If 'src' is empty, it's auto-detected)
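A rough sketch of how such a command string could be split into its parts, in Python for illustration only. The names parse_command and TranslateCommand are hypothetical, and returning None for a malformed command is an assumption; the actual extension is Perl and may behave differently.

```python
from typing import NamedTuple, Optional


class TranslateCommand(NamedTuple):
    src: str                # '' means auto-detect
    dst: str
    update_selection: bool


def parse_command(command: str) -> Optional[TranslateCommand]:
    """Split 'google-translate:src:dst[:1]' into its fields."""
    # Accept the string with or without the leading 'perl:' prefix.
    if command.startswith("perl:"):
        command = command[len("perl:"):]
    parts = command.split(":")
    # Assumption: anything without both src and dst fields is rejected.
    if parts[0] != "google-translate" or len(parts) < 3:
        return None
    src, dst = parts[1], parts[2]
    # A trailing '1' asks for the selection to be updated as well.
    update = len(parts) > 3 and parts[3] == "1"
    return TranslateCommand(src, dst, update)


print(parse_command("perl:google-translate:en:fr"))
print(parse_command("perl:google-translate::de:1"))
```

Language codes such as zh-CN contain a hyphen rather than a colon, so splitting on ':' leaves them intact.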
It still uses the gt_lang.src and gt_lang.dst resources, but as
comma-separated lists. Default src is '' (auto-detect), default dst is
equivalent to 'en,fr,it,de,es,pt,ru,hi,zh-CN,zh-TW,ja,ko,el,ar,iw', to
cover English and the most common translation languages (in the U.S.
English market): FIGS (France, Italy, Germany, Spain), BRIC (Brazil,
Russia, India, China), and CJK, plus Greek, Arabic, and Hebrew (to test
some other charsets).
It also adds the gt_lang.pairs resource, which is a comma-separated set
of src:dst pairs. (The .src and .dst resources get expanded via
Cartesian product, which can be unwieldy.) The gt_lang.label resource
provides the label text and defaults to 'Google translate'. If the
gt_lang.codes resource is non-empty, the language codes aren't converted
into names. (Disables the 'en' => 'English' mapping.)
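The expansion described above can be sketched as follows, again in Python for illustration only. The function names are hypothetical, and appending the explicit pairs after the Cartesian product (rather than replacing it) is an assumption about how the resources combine.

```python
from itertools import product


def split_list(value):
    """Split a comma-separated resource value into its entries.

    An empty value yields one empty entry, which on the src side
    stands for auto-detect.
    """
    return [item.strip() for item in value.split(",")]


def expand_pairs(src, dst, pairs=""):
    """Build the list of (src, dst) language pairs."""
    # Every src combined with every dst: the Cartesian product.
    combos = list(product(split_list(src), split_list(dst)))
    # Assumption: explicit 'src:dst' pairs are added on top of the product.
    for pair in split_list(pairs):
        if not pair:
            continue
        s, _, d = pair.partition(":")
        if (s, d) not in combos:
            combos.append((s, d))
    return combos


default_dst = "en,fr,it,de,es,pt,ru,hi,zh-CN,zh-TW,ja,ko,el,ar,iw"
print(len(expand_pairs("", default_dst)))
print(len(expand_pairs("en,de", default_dst)))
print(expand_pairs("", "en", pairs="ja:en,en:ja"))
```

Adding a second source language doubles the number of menu entries, which is the unwieldiness that an explicit pairs list avoids.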
Anything with wide/variable-width/RTL characters displays weirdly (as
might be expected), but seems to copy-paste via X selection just fine.
--
Best,
Ben