Frequent thinker, occasional writer, constant smart-arse

Tag: scoble

On the future of search

Robert Scoble has put together a video presentation on how Techmeme, Facebook and Mahalo will kill Google in four years’ time. His basic premise is that SEOs who game Google’s algorithm are as bad as spam (and there are some pissed SEO experts waking up today!). People like the ideas he introduces about social filtering, but on the whole they are a bit more skeptical of his world domination theory.

There are a few good posts, like Muhammad’s, on why the combo won’t prevail, but I think everyone is missing the real issue: the whole concept of relevant results.

Relevance is personal

When I search, I am looking for answers. Scoble uses the example of searching for HDTV and makes note of the top manufacturers as something he would expect at the top of the results. For him – that’s probably what he wants to see – but for me, I want to be reading about the technology behind it. What I am trying to illustrate here is that relevance is personal.

The argument for social filtering is that it makes results more relevant. For example, by having a bunch of my friends associated with me on my Facebook account, an inference engine can determine that if my friend A is also friends with person B, who is friends with person C – then something I like must also be something that person C likes. When it comes to search results, that sort of social/collaborative filtering doesn’t work, because relevance is complicated. The only value a social network can provide is whether the content is spam or not – a yes or no type of answer – and even that assumes someone in my network has come across the content. Just because my social network can (potentially) help filter out spam doesn’t make the search results higher quality. It just means fewer spam results. There is plenty of content that may be on-topic but may as well be classed as spam.

Google’s algorithm essentially works on the popularity of links, which is how it determines relevance. People can game this algorithm, because someone can make a website look popular and manipulate rankings through links from fake sites and other optimisations. But Google’s PageRank algorithm assumes that relevant results are, at their core, purely about popularity. The innovation the Google guys brought to the world of search is something to be applauded, but the extreme lack of innovation in this area since just shows how hard it is to come up with new ways of determining relevance. Popularity is a smart way of determining relevance (because most people would like what is popular) – but since it can be gamed, it no longer is.
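To make the gaming problem concrete, here is a rough sketch of link-popularity ranking in the spirit of PageRank. It is not Google’s actual algorithm: the page names, the damping factor of 0.85 and the twenty fake sites are all assumptions made up for illustration.

```python
def link_popularity(links, damping=0.85, iterations=50):
    """Toy power-iteration ranking over a dict {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / len(pages)
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: two real pages, a forum linking to both, one spam page.
honest_web = {
    "tech_explainer": ["manufacturer"],
    "manufacturer": ["tech_explainer"],
    "forum": ["tech_explainer", "manufacturer"],
    "spam_page": [],
}

# The same web after someone sets up twenty fake sites that all link to the spam page.
gamed_web = dict(honest_web)
for i in range(20):
    gamed_web[f"fake_site_{i}"] = ["spam_page"]

honest = link_popularity(honest_web)
gamed = link_popularity(gamed_web)

print("honest web, spam above explainer?", honest["spam_page"] > honest["tech_explainer"])
print("gamed web, spam above explainer?", gamed["spam_page"] > gamed["tech_explainer"])
```

In this toy web the spam page sits at the bottom until the fake sites appear, and then it outranks the genuine pages. Nothing about its content changed, only its apparent popularity.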

The semantic web

I still don’t quite understand why people don’t realise the potential of the semantic web, something I go on about over and over again (maybe not on this blog – maybe it’s time I did). But if anything is going to change search, it will be that – because the semantic web will structure data, moving away from the document approach that webpages represent and more towards a data approach that resembles a database table. It may not be able to make results more relevant to your personal interests, but it will better understand the sources of data that make up the search results, and can match them up to whatever constructs you present it.

Like Google’s PageRank, the semantic web will require humans to structure data, from which a machine will then make inferences – similar to how PageRank makes inferences based on what links people make. However, Scoble’s claim that humans can overtake a machine is silly – yes, humans have a much higher intellect and are better at filtering, but they in no way can match the speed and power of a machine. Once the semantic web gets into full gear a few years from now, humans will have trained the machine to think – and it can then do the filtering for us.

Human intelligence will be crucial for the future of search – but not in the way Mahalo does it, which is like manually categorising pieces of paper into a file cabinet, and which is not sustainable. It’s a bit like how the painters of the Sydney Harbour Bridge, when they finish painting it, have to start all over again because the other side is already starting to rust. Once we can teach a machine that, for example, a dog is an animal that has four legs and makes a sound like “woof” – the machine can then act on our behalf, like a trained animal, and go fetch what we want; how those paper documents are stored will then be irrelevant, and the machine can do the sorting for us.
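As a rough sketch of what “training the machine” could look like, the dog example can be written as subject/predicate/object facts, which is loosely how semantic web data is modelled. The predicate names and the extra “labrador” fact are assumptions added for illustration, and this is a toy, not a real semantic web stack.

```python
# Facts as (subject, predicate, object) triples.
facts = {
    ("dog", "is_a", "animal"),
    ("dog", "legs", "4"),
    ("dog", "sound", "woof"),
    ("labrador", "is_a", "dog"),  # hypothetical extra fact, not from the post
}


def ancestors(thing, facts):
    """Everything `thing` is, following is_a links transitively."""
    found = set()
    stack = [thing]
    while stack:
        current = stack.pop()
        for subj, pred, obj in facts:
            if subj == current and pred == "is_a" and obj not in found:
                found.add(obj)
                stack.append(obj)
    return found


def describe(thing, predicate, facts):
    """Values of a property stated for `thing` or for anything it is a kind of."""
    kinds = {thing} | ancestors(thing, facts)
    return {obj for subj, pred, obj in facts if subj in kinds and pred == predicate}


# Nobody told the machine what a labrador sounds like; it works it out.
print(ancestors("labrador", facts))
print(describe("labrador", "sound", facts))
```

A human states each fact once, and the machine fills in the rest: it was never told that a labrador is an animal or that it says “woof”, yet it can answer both questions.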

The Google killer of the future will be the people that can convert the knowledge on the world wide web into information readable by computers, to create this (weak) form of artificial intelligence. Now that’s where it gets interesting.

New measurement systems need a purpose

Chris recently proposed a new measurement system for attention, after yet another call to arms for a new way of measuring metrics. This is a hard issue to gnaw at, because it’s attempting to grapple with the emerging business models of a new economy, and we are still at the crossroads. Chris asked us on the APML workgroup what we thought of his proposal, which is interesting, but I thought it might be better to take a step back on this one and look at the bigger picture. Issues this big need to be conceptually clear before you can break into the details.

Television, radio, and newspapers are the cornerstone of what we regard as the mainstream media. For decades, they have ruled the media business – with their 30-second advertising spots, and “pageviews” (circulation). Before the information age, they were what the ‘attention economy’ was. None of those flamin’ blogs stealing our attention: content and advertising flowed through to us from one place.

The internet is enabling literally an entire new Age of humanity. A lot of the age-old business models have been replicated, because we don’t know any better, but people are abandoning them because they are realising they can now do so much more. So the key here is not to get too excited about what you can do – rather, we need to think about why we need to do it.

Let me explain – advertisers sold their product on a TV/radio commercial, and a newspaper page, because it guaranteed them that a certain number of people would see it. Advertisers advertise because they want to do one thing: to make money. It’s just how capitalism works – profit is god – so do what you can to make higher profit.

But back then, the traditional mainstream media was the only way they could reach audiences on an effective scale. However, advertising on the Sunday night movie is the equivalent of dropping a million pamphlets out of a plane, hoping that the five customers you know would buy your product end up catching one. Back then, no one complained – it was the best we could do. It sucked, but we didn’t know any better.

The internet changed that.

Advertisers can now target their advertising to a specific individual. They don’t care anymore about advertising on a mass scale; what they would rather do is advertise on a micro scale. Spending $20,000 on 10,000 people you know want to buy your product has a much better Return on Investment than $2,000,000 on 1,000,000 people – of which 10% don’t speak the language of your ad, 20% aren’t the target group for your ad, and 30% are probably offended by your ad and will ruin it for the 40% that you were targeting in the first place.
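A quick back-of-the-envelope check of those numbers, as a minimal sketch. It assumes that everyone in the targeted campaign actually wants the product and that only the 40% target group counts in the mass campaign; the function name is made up for illustration.

```python
def cost_per_wanted_viewer(spend, audience, wanted_percent):
    """Dollars spent for each person reached who is actually in the target group."""
    return spend / (audience * wanted_percent / 100)


# Targeted: $20,000 on 10,000 people, all assumed to want the product.
targeted = cost_per_wanted_viewer(20_000, 10_000, 100)

# Mass: $2,000,000 on 1,000,000 people, of whom only 40% were the target.
mass = cost_per_wanted_viewer(2_000_000, 1_000_000, 40)

print(f"targeted: ${targeted:.2f} per wanted viewer")
print(f"mass:     ${mass:.2f} per wanted viewer")
```

Both campaigns cost $2 per person reached, but the mass campaign works out to $5 for each person you actually wanted, and that is before counting the offended 30% who ruin it for the rest.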

Sound crazy? Well, Google making $10 billion doing just that is crazy.

So now that we have cleared that up – let’s get back to the issue. We now know one of the reasons why we need measurement: advertisers want to target their advertising better. Are there any other reasons? Sure – sometimes people want to measure what their audience reads for non-monetary reasons – they could just be trying to find out what their readers are interested in, so they can focus on that content. Statistics like that are not narcissism – it’s just being responsive to an audience. Or then again, it could be pure ego.

So when it comes to measuring content, there are two reasons why anyone cares: to make money, or to see how people react to your content. However, it’s the first reason that is causing us problems here. And that’s because how long someone spends on your content, or how many people view your content, is no longer as relevant as it was in the mass media days. What is relevant is WHO is reading your content.

I don’t think you can have a discussion about new ways of measuring the way content is consumed without separating those two different motives for measurement. I like Chris’s proposal – knowing how long someone spends reading my blog posting is something I would find interesting as a blogger. But that’s pure ego – I just want to know if I have a readership of deep thinkers or random Google visitors that were looking for a picture of short skirts. (As an aside – one of my pictures is the number one Google image result for “women in short skirts” – thank God it goes to my Flickr account now, the bandwidth that used to eat up was crazy!)

So before we come up with new measurement systems, let’s spend more time determining why we are measuring. Simply saying we are better at measuring what consumers are giving their attention to is only part of the problem. We need to first determine what value we obtain from measuring that attention in the first place.

Patents: more harm than good

When I was in Prague two years ago, I met a bloke from Bristol (UK) who very convincingly explained how patents, as a concept, are stupid. Because alcohol was involved, I can’t recall his actual argument, but it has since made me question: do you really need a patent to protect your business idea?

Narendra Rocherolle, an experienced entrepreneur, has written a good little article explaining when you should, and shouldn’t, spend money to protect your IP. Rocherolle offers a good analysis, but I am going to extend it by stating that a patent can be dangerous for your business, and not just because of the monetary cost. Radar Networks is my case study – a stealth-mode “Semantic web” company that has received a lot of press lately, because apparently they are doing something big but are not going to tell us until later this year.
