Fake social media profiles: My own experience with Facebook

Just read this blog post in The Economist on how many social media fans, followers, etc. are actually fake, machine-generated accounts. The post mentions that Mitt Romney gained 17% followers in a day and, interestingly, a friend tweeted about Mitt Romney losing 11 followers per minute.

Arguably, this machine generation of fake profiles is of some benefit to the organization (say, Romney), but I bet it also helps build up the numbers of the social media service.
A few weeks ago Facebook prompted me to friend one Harsh Taneja. It was indeed a fake profile of myself. The picture and the “about me” description, among other details, were scraped from this blog. Interestingly, this clone of mine and I already had 7 or 8 mutual friends. Curious, I decided to send a friend request to myself, and magically it was accepted within a few hours. I reported the profile, and I think it was taken down within a day (I can no longer find it by searching).

It made me think: it is indeed too easy to create fairly real-looking machine-generated profiles, given the amount of self-description we leave on the web that can be viewed publicly or semi-privately. And it is quite easy for these profiles to circulate unchecked, as people can miss seeing them entirely or forget to report them. In some other cases, they may be of people who are not on the service at all (say, people who use Blogger but not Twitter, and whose About Me pages on Blogger provide fodder to generate these fake Twitter profiles). Scary!

Finally, I decided to use this service to check my own Twitter followers and found that 5% of them were fake and another 13% inactive. Perhaps this explains why tweeting some random but generic word such as Forex or Television instantly adds a few followers.


Facebook ads’ impact is just more ‘measurable’ than that of TV ads, not necessarily more ‘effective’

They have been talking for a while about the death of the 30-second commercial, and social media were not the first technologies to inspire such claims. Yet the size of Facebook, with its tightly connected ‘anatomy’, has made these claims more vociferous. (The average degree of separation between any two people on Facebook is just 3.8, which means that, on average, a message can travel from anyone to anyone else in about four steps.)

A recent post exemplifying this refrain claims that Facebook promotion (paid ads plus all other kinds of messages) delivers 70 times more impressions (eyeballs that see the ad) for the same cost than an ad on prime-time television in the UK (on Coronation Street), and 180 times more impressions than the same campaigns in the US. They argue that a well-run Facebook campaign can generate 4 to 5 billion impressions, versus a TV campaign that reaches 100 million over the same duration. The claim is problematic because of the way impressions are counted for the two media.

The Facebook number is reached by counting the people touched directly (who friend or like the brand pages), their friends (who see it in their feeds, comments or messages) and friends of friends of those directly touched. In other words, up to 3 levels removed from the campaign. For TV they simply count the number of eyeballs that saw the commercial (the way marketers have computed ad effectiveness for TV ads for decades). There seems to be an implied assumption that prime-time television advertising only reaches those who see it directly. However, just like Facebook page activity, television advertising too inspires conversations and recommendations through word of mouth that can make it reach a significant multiple of that 100 million, especially over the course of a year. In fact, one such conversation or recommendation (including on Facebook) ‘may’ be worth a few hundred Facebook impressions, only a minuscule number of which may actually count. Obtaining a similarly accurate measurement of the extended impact of TV advertising is next to impossible, but this does not mean that there is no extended impact.

On first-degree impressions, TV campaigns clearly achieve much more than the biggest Facebook pages (even Coca-Cola has just 53 million likes). Next, it is worth pondering why these extended 4 to 5 billion impressions may not really count for a lot. Consider that for each person who likes a brand page, that activity appears in a laundry list of notifications for 16% of their 190 friends (currently 16% of FB users are able to see such posts, and the average number of friends on FB is 190). A minuscule number of these impressions are likely to be welcomed or even consumed. Those receiving them may just be users who haven’t optimized their news feeds enough to avoid such notifications. Finally, the post claims that Facebook itself did not have any data on the third level of sharing (to friends of friends of those directly touched). The rate there will presumably be much lower than the 16% on level 1: think of the chance that a friend of your friend is interested in that friend’s activity (when the friend is not a mutual friend) with a campaign or brand page. Since these data were not available, they used some triangulation to estimate multipliers, and mention neither the multipliers (number of people reached further by each person) nor the attenuation (what % of people’s friends will receive such a message).
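
To see how much hangs on that unpublished attenuation, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (16% visibility, 190 friends on average, 53 million likes for Coca-Cola). The function name and the friends-of-friends visibility rates are my own illustrative assumptions, not numbers from the post being discussed.

```python
AVG_FRIENDS = 190      # average number of friends on FB, as quoted above
VISIBILITY_L1 = 0.16   # share of a fan's friends who get to see the activity


def extended_impressions(fans, visibility_l2):
    """Rough impression count up to friends of friends.

    visibility_l2 is a hypothetical rate: no figure was published for the
    friends-of-friends level, so any value passed in is a guess.
    """
    level1 = fans * AVG_FRIENDS * VISIBILITY_L1
    level2 = level1 * AVG_FRIENDS * visibility_l2
    return fans + level1 + level2


coke_fans = 53_000_000
per_like = AVG_FRIENDS * VISIBILITY_L1  # friends reached by a single like
print("friends reached per like:", round(per_like, 1))

# Try a few guessed second-level rates to see how the total swings.
for rate in (0.16, 0.05, 0.01):
    total = extended_impressions(coke_fans, rate)
    print(f"second-level rate {rate:.0%}: {total / 1e9:.1f} billion impressions")
```

Under these assumptions the total for a page of Coca-Cola's size swings from roughly 5 billion to roughly 50 billion depending only on the guessed second-level rate, which is why leaving out the multipliers matters so much.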

Of course, other points worth considering are 1) why people like brand pages on Facebook (to get discounts!) and 2) how they learned about these brands (surely not exclusively through FB campaigns or noticing friends who liked FB campaigns, but perhaps through prime-time advertising and conversations with friends and family). These concerns make the value of these impressions questionable, even if their sheer volume is estimated properly. Of course, the same can be argued for TV advertising that people do not want to see but end up watching.

In sum, the key point here is this: just because Facebook, thanks to the traces users leave, allows one to measure how a campaign circulates beyond the first-degree impressions and TV does not, one cannot conclude that Facebook campaigns are that much more cost-effective than prime-time TV.

Words of the mouse can mislead: Google Translator exposed through more fundamental evidence

My previous post got advocates of artificial intelligence to accuse me of making arguments against Google Translator (GT) based on proper nouns or misspelled words. These, they contended, made GT confound translation with transliteration in that particular case. However, I have recently found more compelling evidence that GT indeed uses English as a mediating language even when it ‘offers’ to translate from any source language to any target language. In doing so, it ignores fundamental ways in which other languages differ from English.

Spanish, like Hindi, has distinct formal and informal second-person forms. So “Aap kaise ho?” (How are you, formal) becomes “¿Cómo está?” in Spanish, and “Tum kaise ho?” (How are you, informal) translates to “¿Cómo estás?”. Enter either of these in Hindi in GT and the Spanish output is the same: “¿Cómo estás?” (the informal form). Puzzled? Well, English (the language through which the translation is mediated) has one second-person singular form, “you”. So either of the Hindi expressions is first translated to “How are you?” and then further translated into Spanish.

Here I used no proper nouns, nor any word that was hard to spell or understand, but the first expression that one learns when starting with any new language (even before the alphabet or any vocab). And even there this probabilistic and/or intelligent algorithm fails to make a fundamental distinction that Spanish and Hindi have and English lacks.
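
The loss can be reproduced with a toy model. The Python sketch below uses tiny hand-made phrase tables of my own. It is not how GT is actually built, only an illustration of what any pivot through English does to the formal/informal distinction.

```python
# Hand-made phrase tables (illustrative assumptions, not GT data).
HINDI_TO_ENGLISH = {
    "Aap kaise ho?": "How are you?",  # formal
    "Tum kaise ho?": "How are you?",  # informal
}
ENGLISH_TO_SPANISH = {
    "How are you?": "¿Cómo estás?",   # English "you" carries no formality
}
HINDI_TO_SPANISH = {
    "Aap kaise ho?": "¿Cómo está?",   # formal
    "Tum kaise ho?": "¿Cómo estás?",  # informal
}


def translate_via_english(hindi):
    """Two-step translation: Hindi -> English -> Spanish."""
    return ENGLISH_TO_SPANISH[HINDI_TO_ENGLISH[hindi]]


def translate_direct(hindi):
    """One-step translation that never passes through English."""
    return HINDI_TO_SPANISH[hindi]


for phrase in HINDI_TO_ENGLISH:
    print(phrase, "| via English:", translate_via_english(phrase),
          "| direct:", translate_direct(phrase))
```

Both Hindi phrases collapse into the same English sentence, so the second step has nothing left to tell them apart. The direct table keeps the two forms distinct.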

Anyway, till Big Brother achieves greater perfection so that humans can only learn the newest newspeak  (ref: Orwell’s 1984) – the language of the mouse,  I urge you to continue enrolling in real language classes and turn to real people for humanistic tasks!

Postscript: I was told that my previous post was being circulated within Google and that they were using the evidence presented as a case study of sorts. I would have expected them to offer some kind of acknowledgement. But I see no signs that they even visited my website. Perhaps they (or some employee) have conveniently copied the text, and maybe someone is passing it off as his own discovery.

Artificial Intelligence May Only Take Us So Far: The Abject Failure of Google Translator

I like to experiment with language services (Hindi versions) of Google Products. Granted that I don’t need to do this, since everyone I communicate with over email is perhaps as much or more fluent in English than in Hindi. So call it inverse snobbery, too much free time or whatever you please.  The tale that follows reveals that my endeavors are not that fruitless after all.

A friend, to whom I often write emails in Hindi, decided to go one up on me. He replied to me in Bengali.  I decided out of curiosity to enter the text in Google translator and asked for a Hindi translation. The fun began.

‘হর্স’ (‘Harsh’), my name written in Bengali script, was translated by Google as ‘घोड़ा’, the Hindi word for ‘Horse’. By no means do the letters making up my name in Bengali combine to mean “Horse.” Puzzled, I decided to ask Google for a Bengali to English translation, and it had indeed translated হর্স as Horse. Then I realized that Bengali doesn’t have a hard “a” sound and uses “au” instead, and that it often pronounces a hard “sh” as “s”. So “Harsh” can sound like Horse. And Horse in Hindi is घोड़ा (the word for the animal), hence the output by Google. This reveals two major flaws in Google Translator.

First, that Google is really ‘fooling’ users when it offers translation from any source language to many other target languages. For instance in this case the translation was really being made from Bengali to English and then to Hindi. Similarly I checked Bengali to Spanish, the same word was translated as Caballo – the Spanish word for the animal horse. Perhaps Bengali to Spanish being mediated via English is still understandable but Bengali to Hindi via English is a very inefficient way of translating. It is almost like translating between Arabic and Urdu via English. More importantly the service conveys the impression that it directly translates from the source language to the target language.

The second flaw suggested by this incident is even more grave. If Google does not have the meaning of the word in the input language in its database (for instance, my name here in Bengali), it translates the ‘sound’ into English. Now if that sound happens to be spelled as a legitimate English word, as was ‘Horse’ in this case, it assigns the ‘meaning’ of the word in English to all subsequent translations. This, of course, completely distorts the original meaning.

In this case I was reasonably close to the three languages and could ascertain what was going on. That may not always be the case. I would rather go to a real person to put me wise than rely on artificial intelligence. Big Brother may desire to simplify our lives, but he is not so wise yet, after all.

Addendum: A conversation with someone who read this one. And an update to respond to all previous reactions

Rohan Murarka: How can you gauge it based on translation of proper nouns?
HarshT: Rohan, good question, but one I had anticipated all along. Of the two arguments I made, the first one (about translation being mediated through English) stands irrespective. The second one is perhaps a problem because the word is a proper noun (hence not in the extant database of the source language). But understanding it as a common noun (in the mediating language) for further translation is what changes the meaning completely. It is an error that can be easily fixed: they just need to flag it as a word they cannot translate and retain it. Assume I knew no Bengali or English here; then I would have had no way to decipher why someone had called me a ‘ghoda’. Instead the translation could have been ‘haurs’, with a quote or something around it to signify that the word was not ‘comprehended’ by the machine.
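
The fix suggested in the reply above could look something like this. It is a minimal Python sketch with a made-up one-entry dictionary and a hypothetical `translate_word` helper, and it says nothing about how GT works internally.

```python
# Hypothetical Bengali-to-Hindi dictionary with a single illustrative entry.
BENGALI_TO_HINDI = {"ঘোড়া": "घोड़ा"}  # the animal, horse


def translate_word(word, dictionary, romanized=None):
    """Translate a word, or keep it in quotes when it is unknown."""
    if word in dictionary:
        return dictionary[word]
    # Unknown word (for example a proper noun): retain it, flagged with
    # quotes, instead of guessing a meaning from the way it sounds.
    return f"'{romanized or word}'"


print(translate_word("ঘোড়া", BENGALI_TO_HINDI))                     # known word
print(translate_word("হর্স", BENGALI_TO_HINDI, romanized="haurs"))  # my name
```

The quotes tell the reader that the machine did not ‘comprehend’ the word, which is far less misleading than a confident but wrong meaning.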

Quick Updates on the NASA Diwali Image Phenomenon: Socially Constructed Truths and the Persistence of Popularity

October 26th (2:40 PM Chicago time, less than 48 hrs on): 66,248 shares (three times the 22,000 before)

October 25th (14 hrs on)

Remember, in my original post I had speculated that as India woke up, we would see an exponential rate of increase in the alternate truth. Yes: the 1000-odd shares on that thread have grown to 23,000 (in 12-13 hrs), giving a potential audience of 7-8 million from this source alone.

Interestingly, there were many counter-voices in some of the threads that circulated, who argued with similar logic as to why the image was not one of Diwali. Some of these even pointed to the original website of the image, confirming that it wasn’t a Diwali image. Yet the euphoria continues to persist.

To me it reflects that on Social Media, once certain facts get socially constructed, it is very hard to let alternate facts percolate. In other words, their popularity will continue to persist, even though they have been proven wrong! There seems little room for course correction.

Social Media and The Problem with ‘Socially Constructed Truths’ : Why all Believed that the NASA Picture Was Clicked on Diwali

India and her diaspora worldwide were excited about Divali (the festival of lights), and for good reason. A day before Divali, on October 25, 2011 at 3:30 pm CT, I noticed that a map had become a ‘Facebook hit’. Less than 5 hrs after the link had first appeared, it had already been shared by 1000-odd people (giving it a potential audience of at least 200,000). Remember, I am reporting figures for only the one thread that I could trace.
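
For the record, the potential-audience figure above is just shares multiplied by friends. Here is a small Python sketch of that arithmetic. The 200 friends per sharer is the number implied by my own estimate, and the `overlap` parameter is a hypothetical correction for friends that several sharers have in common.

```python
def potential_audience(shares, avg_friends=200, overlap=0.0):
    """Rough estimate of how many people could see the shares.

    avg_friends=200 is the figure implied by 1000 shares -> 200,000.
    overlap is a hypothetical fraction of friends counted more than once.
    """
    return int(shares * avg_friends * (1 - overlap))


print(potential_audience(1000))                # no overlap assumed
print(potential_audience(1000, overlap=0.25))  # a quarter of friends shared
```

Any real figure depends on how much the sharers' friend lists overlap, which I could not measure from a single thread.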

An idea took form that the picture showed the illumination of the country on Diwali night. Of course, most people related the distinct yellow, green and red dots to the colors of fireworks. When it popped up on my wall (as the 950th share), I immediately questioned how Colombo (Sri Lanka), Lahore (Pakistan) and Delhi could be celebrating Divali with the same ‘illumination’. (Some even speculated these were the remnants of Hindu legacies in these cities!) Also, why would people in certain areas burst green-colored fireworks when the rest of the country was bursting yellow?

That aside, the map was easy to explain if one understood some development parameters and grade-school geography of the region (electrical connectivity and population density, location of the cities, mountains).

So I decided to ask how the illumination could be different from any other night (when the power grid hadn’t failed). I commented with my reservations on the wall of the friend who had shared this with me. Seeing some merit in my reservations, he dug up the original source of the picture and indeed found that it had nothing to do with Divali. He found the source on FB itself: someone else like me (a friend of a friend of his) had commented with similar concerns on their common friend’s share of this post and had found this link.

But counter-currents like ours are few and far between. That they are the complete truth does not matter. As I write, the Indian diaspora on Facebook believes that the map is indeed a NASA picture clicked on a Divali evening. I am sure that in a few hours, when Indians wake up, this ‘socially constructed truth’ will spread further and more widely. (See my update tracking the numbers.)

To me this is the grim reality of online social media!

Happy Divali Regardless