Monthly Archive for November, 2007

How many people are there on Facebook?

Facebook’s new advertising features allow people to create targeted advertising campaigns. I took advantage of this feature to uncover some data about Facebook’s user base as I designed a mock campaign, because I’ve been curious to know where it’s strongest.

Although not all countries are listed below (e.g., I have friends in Russia and Serbia whose data I could not fetch), this does give a good indication of users by country. The subtotal of 50 million is about the number of users I’d expect to be on Facebook; the countries not included are obviously small and would make an immaterial difference. Fifty million users is within the ballpark of what sounds right (sorry, no link, but I read it somewhere), so the breakdown seems pretty complete.

I thought it might also be useful to add the data for under-18-year-olds, to show social networking is certainly an adult’s tool now and not just some teen fad.

Facebook users in: US, Canada, UK, Australia, China, Colombia, Dominican Republic, Egypt, France, Germany, India, Ireland, Israel, Italy, Japan, Lebanon, Malaysia, Mexico, Netherlands, New Zealand, Norway, Pakistan, Saudi Arabia, Singapore, South Africa, Korea (Republic of), Spain, Sweden, Switzerland, Turkey, United Arab Emirates

Update March 2008: I’ve done a follow-up post on the March 2008 numbers.

MediaPost article: baby steps

I’ve contributed to the Metrics Insider column on MediaPost. Have a read, and post a comment 🙂

Ouch – widgets bypassing Google’s wall

Feedjit
On the right of my blog as I write this, I have a widget: a short piece of JavaScript from the company Feedjit that I embed to show my readers how other people find my blog. Since its launch, the widget seems to have become very popular, with the company’s website claiming 60 million widgets.

I made a discovery today almost by accident: I accessed my blog on another computer. Or rather, I accessed my blog via Google’s cache, which has replicated my content for its search results, widgets and all. Now when you look at the Feedjit widget (image below left), the data is very different: it no longer shows visitors to my blog, but visitors to Google’s servers.

If you follow through to the detailed statistics you will even see what the most popular sites are that day, as well as the locations of the visitors. As this is data from the Google cache server, you are effectively getting an analysis of its visitors: who they are, what keywords they are searching for, and what they found. So because my blog is part of Google’s cache, I can effectively hack and sneak in the backdoor of Google’s data.

(Having a quick look, it seems this URL is the main Google cache address; however data will only get logged when someone looks at the cache.)

Feedjit google cache

Does it matter?
While this is a fun thing to look at and then move on, I think it raises some serious issues – multiple ones at that.

On widgets: With the proliferation of widgets on the web, have they potentially become the next big security risk on the web?

On privacy: It’s not that hard to identify the people making those searches. Search engines handing over data to the government has been a hot issue, with Google’s resistance becoming a much-hyped story as the company tried to prove it protected its users. With the growing cross-pollination of the web, exemplified by widgets, are we prepared for what it means to have open data (which is becoming inevitable)?

On metrics: Google has a complete download of my blog in its cache, but what I didn’t realise is that it is a copy of the full blog (with scripts like my web stats). When I look at my statistics, I see an awful lot of activity from computer bots, for example. Is this because every time Google, Yahoo or MSN analyse content that has been ripped off my site, I can actually see what they are doing behind their closed walls?
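To make the metrics point concrete, here is a minimal sketch of how a stats tool could separate hits fired from my own site from hits fired by a copy hosted elsewhere, such as a cache. It assumes the stats script reports the URL of the page it ran on; the function name, the hostnames and the sample URLs are all hypothetical, and this is not how Feedjit or any particular stats package actually works.

```python
from urllib.parse import urlparse


def served_from_own_site(page_url: str, own_host: str) -> bool:
    """Return True when the page that fired the stats script lives on own_host."""
    host = (urlparse(page_url).hostname or "").lower()
    own = own_host.lower()
    # Accept the exact host or any subdomain of it.
    return host == own or host.endswith("." + own)


# Hypothetical hits: one from the blog itself, one from a cached copy elsewhere.
hits = [
    "http://blog.example.com/2007/11/widgets/",
    "http://cache.example.net/search?q=cache:blog.example.com/2007/11/widgets/",
]
own_hits = [h for h in hits if served_from_own_site(h, "blog.example.com")]
```

Anything left out of `own_hits` is activity that happened on someone else's copy of the page, which is exactly the traffic I stumbled on.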

Those are questions with simple but also complicated answers. Either way, if it’s that easy to hack even Google, then God help us.

Pageviews are a misleading metric

Recently MySpace, the social networking site that once dominated but is now being overtaken by Facebook, sent me an e-mail informing me that a friend of mine had a birthday. What is unusual is that although I have received notifications of this type when logged into the site, I had never been e-mailed one.

Below is a copy of the e-mail; let’s see if you notice what I did:
birthdayreminder

It doesn’t tell me whose birthday it is. In fact, it is even ambiguous as to whether it was just the one person or not. Big deal? Not really. But it very clearly tells me something: MySpace is trying to increase its pageviews.

Social networking sites are very useful services to an individual; they enable a person to manage and monitor their personal networks. Not only am I in touch with so many people I lost contact with, but I am in the loop with their lives. I may not message them, but by passive observation, I know what everyone is up to. Things like what they’re studying, where they work, what countries they will be holidaying in, and useful things like when they have their birthday.

Social networking sites are not just a website, but an information service, to help you manage your life. However as useful as I find these services, the revenue model is largely dependent on advertising, with premium features a rare thing now. So when you rely on advertising, you are going to be looking at ways of boosting the key figures that determine that revenue stream.

Friendster’s surprising growth in May was due to some clever techniques of using e-mail to drive pageviews. And it worked. E-mail notifications, when done tactfully, can drive a huge amount of activity. Of the seemingly hundreds of web services I have joined, e-mail is at times the only way for me to remember I even subscribed to one once upon a time. Combine e-mail with information I want to be updated with, and you’ve got a great recipe for using e-mail as a tool to drive pageviews.

…And that is the problem. MySpace has very cleverly sent this e-mail to get me to log into my account. A marketing campaign like that will, at the very least, see a good day of pageview growth. But the reason I am logging in is just so I can see whose birthday it is. MySpace is now irrelevant to me: the pageviews attributed to me are not, in fact, those of an engaged user.

As a metric for measuring audience engagement, pageviews are prone to manipulation. On the face of it, an increase in pageviews makes a website appear more popular. But dig a little deeper and the correlation with what really matters (audience engagement) is weaker than it looks.
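A rough sketch of that gap, using made-up numbers: the `Visit` record, the `actions` field (standing in for things like messages sent or comments posted) and both sample days are hypothetical, chosen only to show how an e-mail push can lift pageviews while the share of engaged visits drops.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    user: str
    pageviews: int
    actions: int  # hypothetical engagement signal: messages sent, comments posted, etc.


def total_pageviews(visits):
    """Raw pageview count, the figure advertising revenue tends to follow."""
    return sum(v.pageviews for v in visits)


def engaged_share(visits):
    """Fraction of visits in which the user actually did something."""
    if not visits:
        return 0.0
    return sum(1 for v in visits if v.actions > 0) / len(visits)


# Made-up data: a normal day, then the same day plus two users who only
# logged in to find out whose birthday it was.
normal_day = [Visit("a", 5, 2), Visit("b", 3, 1)]
email_day = normal_day + [Visit("c", 2, 0), Visit("d", 2, 0)]

print(total_pageviews(normal_day), engaged_share(normal_day))
print(total_pageviews(email_day), engaged_share(email_day))
```

In this toy example pageviews rise from 8 to 12 while the engaged share falls from 100% to 50%, which is the kind of divergence the birthday e-mail produces.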

So everyone, repeat after me: Pageviews – we need to drop them as a concept if we are ever going to make progress.

Facebook’s privacy is smart on technology but stupid in thought

I’ve had to neglect this blog because I have been insanely busy with work and my studies, and will continue to do so for the rest of the year. But I thought I’d post a quick observation I made today, that I found interesting. Even more interesting, because I rarely notice details!

Whenever Facebook notifies you by e-mail – for example, when a friend messages you – it will actually show you their e-mail address. An example is in the screenshot below: I could click ‘reply’ and my response would go directly to their personal e-mail. (I’ve noticed, however, that this only occurs if you have already added the person as a friend.)

direct e-mail

This raises some interesting issues regarding privacy. The first being: why the heck is Facebook allowing this? Am I going to reply to my friends asking them what they said in the message?! Privacy is my right to determine who can see information about me, and when – and I don’t want my friends seeing my e-mail. I can think of an example where a friend collected my e-mail from my profile and added me to a forward list of chain e-mails. Unlike the postal system for snail mail, where people pay for sending me a message with a stamp, e-mail forces the recipient to pay for a message with their time. Before, I didn’t have a choice, but now, with new ways of communicating, I can control what gets sent to me.

This actually goes a bit deeper. I’ve seen fake profiles friend request me – I always deny people I don’t know, but I know that lots of my friends usually add people blindly (I remember asking a friend who a certain friend requester was, after noticing she was a mutual friend of his, to which he replied: “No idea, but she’s hot!”). This has now become a very easy way to obtain someone’s e-mail – certainly not as easy as harvesting e-mails from a public-facing website, but still another means. The concern, however, is not spam but identity threats.

A crucial thing to understand about privacy is the concept of identifiable data. Corporations can collect data about me to their heart’s content and I wouldn’t mind, but only on the basis that they can’t specifically identify me. An e-mail address is what I regard as identifiable information: the various web services I use hold different data about me, and all of it can be easily linked purely through my e-mail address.
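To show how little effort that linking takes, here is a minimal sketch that merges records from two unrelated services using nothing but the e-mail address as the key. The service names, fields and sample records are all hypothetical, and `link_by_email` is an illustrative helper, not any real tool.

```python
from collections import defaultdict


def link_by_email(*sources):
    """Merge records from several services into one profile per e-mail address."""
    merged = defaultdict(dict)
    for source_name, records in sources:
        for record in records:
            # Normalise the address so trivial differences don't hide a match.
            key = record["email"].strip().lower()
            for field, value in record.items():
                if field != "email":
                    merged[key][f"{source_name}.{field}"] = value
    return dict(merged)


# Hypothetical data held by two separate services about the same person.
social = ("social", [{"email": "Jo@example.com", "birthday": "1 May"}])
shop = ("shop", [{"email": "jo@example.com ", "city": "Sydney"}])

profiles = link_by_email(social, shop)
print(profiles)
```

Each service on its own knows only a fragment; keyed on the shared address, the fragments collapse into a single profile.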

I’ve previously said how social networking sites are a new type of communication that is far better than e-mail. E-mail is one of the world’s most powerful technologies but also one of the most dangerous. Whilst most would think that is because of e-mail overload and spam, what I really mean is how much damage a single e-mail address can do if used by someone trying to investigate you and your life.

As our digital world becomes more sophisticated (and scary), let’s be clear on some things. People no longer need e-mail to contact you; they can instead contact your ‘identity’, which is far superior (I discussed this in the posting I linked to just above). However, with this advancement also comes the opportunity to recognise what your e-mail address really is: a key piece of identifiable data that can link your multiple identities across the digital world into one mega profile.