Archive for November, 2007

How many people are there on Facebook?

Facebook's new advertising features allow people to create targeted advertising campaigns. I took advantage of this to uncover some data about Facebook's user base by designing a mock campaign, because I've been curious to know where it's strongest.

Although not all countries are listed below (e.g., I have friends in Russia and Serbia whose data I could not fetch), this does give a good indication of users by country. The subtotal of 50 million is about the number of users I'd expect to be on Facebook; the countries not included are obviously small and would make an immaterial difference. Fifty million users is within the ballpark of what sounds right (sorry, no link, but I read it somewhere), so the breakdown seems pretty complete.

I thought it might also be useful to add the data for under-18-year-olds, to show that social networking is certainly an adult's tool now and not just some teen fad.

Facebook users in US	Canada	UK	Australia	China	Colombia	Dominican Republic	Egypt	France	Germany	India	Ireland	Israel	Italy	Japan	Lebanon	Malaysia	Mexico	Netherlands	New Zealand	Norway	Pakistan	Saudi Arabia	Singapore	South Africa	Korea, Republic of	Spain	Sweden	Switzerland	Turkey	United Arab Emirates

Update March 2008: I've done a follow-up post with the March 2008 numbers.

Media post article: baby steps

I've contributed to the Metrics Insider column on MediaPost. Have a read, and post a comment :)

Ouch – widgets bypassing Google’s wall

On the right of my blog as I write this, I have a widget from the company Feedjit: a short piece of JavaScript that I embed to show my readers how other people find my blog. Since its launch the widget seems to have become very popular, with the company's website claiming 60 million widgets.

I made a discovery today almost by accident: I accessed my blog on another computer. Or rather, I accessed my blog via the cache of Google, which has replicated my content for its search results, widgets and all. Now when you look at the Feedjit widget (image below left), the data is very different: it no longer shows visitors to my blog, but visitors to Google's servers.

If you follow through to the detailed statistics, you will even see what the most popular sites are that day, as well as the locations of the visitors. As this is data from the Google cache server, you are effectively getting an analysis of its visitors: who they are, what keywords they are searching for, and what they found. So because my blog is part of Google's cache, I can effectively hack my way in and sneak through the back door to Google's data.

(Having a quick look, it seems this URL is the main Google cache address; however data will only get logged when someone looks at the cache.)
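Here is a rough sketch of why this might happen. It assumes the widget's backend groups hits by the hostname of whatever page the script happens to run on; I have no inside knowledge of how Feedjit actually works, so the class name, method names and URLs below are all made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


class HypotheticalWidgetTracker:
    """Toy stand-in for a widget backend; NOT Feedjit's real implementation."""

    def __init__(self):
        # hostname -> list of (page_url, referrer) hits
        self.hits_by_host = defaultdict(list)

    def record_hit(self, page_url, referrer):
        # Assumption: the embedded script reports the URL of the page it is
        # running on, and the backend groups hits by that page's hostname.
        host = urlparse(page_url).hostname
        self.hits_by_host[host].append((page_url, referrer))

    def recent_visitors(self, page_url):
        # The widget displays every hit recorded for the hosting hostname.
        host = urlparse(page_url).hostname
        return self.hits_by_host[host]


tracker = HypotheticalWidgetTracker()

# A reader views the blog directly.
tracker.record_hit(
    "http://myblog.example/post",
    "http://search.example/?q=facebook+users",
)

# Two readers view cached copies of *different* sites that embed the same widget.
tracker.record_hit(
    "http://cache.example/copy?u=myblog.example/post",
    "http://search.example/?q=widgets",
)
tracker.record_hit(
    "http://cache.example/copy?u=otherblog.example/",
    "http://search.example/?q=holiday+recipes",
)

blog_view = tracker.recent_visitors("http://myblog.example/post")
cache_view = tracker.recent_visitors("http://cache.example/copy?u=myblog.example/post")
```

Under this assumption, `blog_view` holds only the hit on my own site, while `cache_view` holds the hits for every cached page that carries the script, search referrers included, because they all share the cache's hostname.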

[Image: Feedjit Google cache]

Does it matter?
While this is a fun thing to look at and then move on, I think it raises some serious issues - multiple ones at that.

On widgets: With the proliferation of widgets on the web, have they potentially become the next big security risk?

On privacy: It's not that hard to identify the people making those searches. Search engines handing over data to the government has been a hot issue, and Google's resistance became a much-hyped story as the company tried to prove it protected its users. With the growing cross-pollination of the web, exemplified by widgets, are we prepared for what it means to have open data (which is becoming inevitable)?

On metrics: Google has a complete download of my blog in its cache, but what I didn't realise is that it is a copy of the full blog (including scripts like my web stats). When I look at my statistics, I see an awful lot of activity from bots, for example. Is this because every time Google, Yahoo or MSN analyses content that has been copied from my site, I can actually see what they are doing behind their closed walls?

Those are questions with simple but also complicated answers. Either way, if it's that easy to hack even Google, then God help us.