Responding to responses on my earlier posts on Google Translate

Avoiding combat in a compelling disagreement is no easy task. It is a delicate balance that I, unfortunately, struggle to strike. Mallesh Pai manages both well in his response to my two recent posts (1, 2). His principal disagreement is with my near dismissal of Google Translate (GT) as a product, in the course of reporting some bugs and omissions of information. Vadim, Arvind and Siddhartha (elsewhere) also offered insightful comments. Google would surely benefit from this team of social and computer scientists testing and discussing GT with little incentive.

I will begin with what I found problematic in Mallesh’s largely reasonable response. To test the efficacy of GT, he translated one piece from Hindi to English, an output he calls “less than good”, and another from French to English, where GT was “unsurprisingly [much better]”. I agree that the results are a darn good start. But in my original posts, my claims were explicitly about source AND target languages being other than English. If I apply my deduction that GT uses English as a mediating language to the results Mallesh shows, the ‘unsurprisingly better’ translation would ‘border on poor’ when it passes from French to Hindi (via English). The reverse, I am afraid, would be an even poorer re-translation of an already “less than good” translation. Still, as Arvind said, such experiments have to begin somewhere, and mediating via English (or Russian for Slavic languages, as per Vadim) is indeed low-hanging fruit that should be plucked. These technical glitches ‘alone’, I agree, are not “abject failures”.

However, GT is already positioned as much more than just an experiment. It is an institution and should be evaluated as such. In this frame of evaluation, my invective is quite justified, as I will soon show by situating this institution in two social settings.

The first is GT’s use by its non-paying customers, ordinary people who couldn’t care less whether its output is probabilistic or deterministic. Many of them don’t know English, and perhaps speak only one language. A uni-lingual Hindi speaker, referred to as ‘haurs’ in Bengali script (a typo for Harsh, meaning Joy), would be bewildered to see himself addressed as ‘Ghoda’ (the animal, horse). Remember, he doesn’t know any English or Bengali, so he cannot tell whether the translation acted up or someone actually called him a horse. There begins my problem.
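
The ‘haurs’ to ‘Ghoda’ mishap can be sketched as a toy two-hop lookup. This is only an illustration of mediation through English, not a description of how GT actually works: both word tables, the romanised spellings and the function name `translate_via_pivot` are hypothetical, made up for this example.

```python
# Toy pivot translation: Bengali -> English -> Hindi.
# Both lookup tables are hypothetical and exist only for illustration;
# a real system would use statistical models, not word lists.
BENGALI_TO_ENGLISH = {
    "haurs": "horse",  # typo for the name "Harsh", matched to a similar English word
    "harsh": "Harsh",  # the correctly spelled name survives as a name
}
ENGLISH_TO_HINDI = {
    "horse": "ghoda",
    "Harsh": "Harsh",
}


def translate_via_pivot(word, source_to_pivot, pivot_to_target):
    """Translate in two hops; unknown words are passed through unchanged."""
    pivot = source_to_pivot.get(word, word)
    return pivot_to_target.get(pivot, pivot)


print(translate_via_pivot("haurs", BENGALI_TO_ENGLISH, ENGLISH_TO_HINDI))
print(translate_via_pivot("harsh", BENGALI_TO_ENGLISH, ENGLISH_TO_HINDI))
```

In this sketch the mistake happens in the first hop, and the second hop faithfully carries it into Hindi. The reader of the final output sees only ‘ghoda’ and has no view of the English step where things went wrong.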

The possibility of such mistakes warrants that GT explicitly state, at the site of translation and in its product description, that it uses an intermediate language for translation between certain pairs. Such information is absent even on the GT blog. Why would they hide such a blatant aspect? Perhaps it would make a service they themselves describe as the output of a “very smart” program appear less so.

A more plausible explanation is that hiding this information is in Google’s economic self-interest. For this, let’s look at GT’s second set of “more valued” customers. Google’s principal revenue is from paid search, or keyword-based advertising (also called AdWords). Google lets advertisers buy keywords in languages they are not conversant in, for advertising in global markets, by generating copy (keywords) with GT (the main GT page links to a toolkit for businesses). The attractiveness of this proposition would go down many notches if advertisers were explicitly told that the translation between many language pairs is actually mediated through a third language such as English or Russian. Wouldn’t they be less likely to use a service that relies on further translation of an initially rough translation? And there Google risks losing millions of advertising dollars. (Eventually these are seen by human advertising agents, but initially an advertiser has to generate them through GT.)

A traditional analogy: how would a Russian author react if a translator, hired to translate from Russian to Hindi, instead translated an English translation into Hindi? Had he known that the translator would use an English version, he might have looked for an alternative provider in the first place.

The point of my earlier posts was not to paint a dystopian picture of the Internet or Artificial Intelligence. Instead, here’s an alternative interpretation of a “successful experiment” that, despite being very smart, has limitations which need to be stated upfront when it is rolled out as an institution.


4 thoughts on “Responding to responses on my earlier posts on Google Translate”

  1. Hi Harsh sir. Your thoughts on the translation tool are thoughtful. I see that translation is effective when the interpreter is capable of understanding the essence of the message, sensing the mood and then conveying it effectively. So instead of using English as an intermediate language, a very effective but very complex tool could be a ‘feel’ language, based on heuristic learning and mood sensing that works on phrases and not just on the single words used in the sentence.

    Since a lot of users use this tool for translating millions of message lines, it could be trained to perform this task better with usage over time, much like search has evolved. Just a thought. 🙂

  2. Google would probably assign someone the responsibility of studying these aspects and modifying the methodology to benefit the multilingual as well as uni-lingual people.
    These aspects should be brought to their attention as the observation is really strong.
