Tag Archive for 'spam'

On the future of search

Robert Scoble has put together a video presentation on how Techmeme, Facebook and Mahalo will kill Google in four years' time. His basic premise is that SEOs who game Google's algorithm are as bad as spam (and there are some pissed SEO experts waking up today!). People like the ideas he introduces about social filtering, but on the whole they are a bit more skeptical about his world domination theory.

There are a few good posts like Muhammad's on why the combo won't prevail, but on the whole, I think everyone is missing the real issue: the whole concept of relevant results.

Relevance is personal

When I search, I am looking for answers. Scoble uses the example of searching for HDTV and makes note of the top manufacturers as something he would expect at the top of the results. For him - that's probably what he wants to see - but for me, I want to be reading about the technology behind it. What I am trying to illustrate here is that relevance is personal.

The argument for social filtering is that it makes results more relevant. For example, with a bunch of my friends associated with me on my Facebook account, an inference engine can determine that if my friend A is also friends with person B, who is friends with person C, then something I like must also be something that person C likes. When it comes to search results, that sort of social/collaborative filtering doesn't work, because relevance is complicated. The only value a social network can provide is whether the content is spam or not - a yes or no type of answer - and even that assumes someone in my network has come across the content. Just because my social network can (potentially) help filter out spam doesn't make the search results higher quality. It just means fewer spam results. There is plenty of content that may be on-topic but may as well be classed as spam.

Google's algorithm essentially works on the popularity of links, which is how it determines relevance. People can game this algorithm, because someone can make a website look popular and manipulate its ranking through links from fake sites and other optimisations. But Google's PageRank algorithm assumes that relevant results are, at their core, purely about popularity. The innovation the Google guys brought to the world of search deserves to be applauded, but the extreme lack of innovation in this area since then just shows how hard it is to come up with new ways of determining relevance. Popularity is a smart way of determining relevance (because most people would like a popular result) - but since it can be gamed, it no longer is.
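
To make the gaming problem concrete, here is a minimal sketch of a link-popularity score in the spirit of PageRank. It is not Google's actual algorithm: the function name, the damping value of 0.85, the iteration count and the toy page names are all assumptions made for illustration.

```python
def link_popularity(links, damping=0.85, iterations=50):
    """Simplified PageRank-style score: a page scores highly when
    highly scored pages link to it. `links` maps page -> pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# A tiny hypothetical web: nobody links to the spam page.
honest_web = {
    "review_site": ["hdtv_guide"],
    "forum": ["hdtv_guide", "review_site"],
    "hdtv_guide": ["review_site"],
    "spam_page": [],
}

# The same web after a spammer sets up 20 fake sites that all link to it.
gamed_web = dict(honest_web)
for i in range(20):
    gamed_web[f"fake_site_{i}"] = ["spam_page"]

before = link_popularity(honest_web)
after = link_popularity(gamed_web)
print(before["spam_page"] < before["hdtv_guide"])  # honest web: spam ranks lower
print(after["spam_page"] > after["hdtv_guide"])    # gamed web: spam ranks higher
```

In this toy web the spam page starts at the bottom, and twenty fake sites are enough to push it above the genuinely linked guide, because the score only counts links and has no idea who made them.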

The semantic web

I still don't quite understand why people don't realise the potential of the semantic web, something I go on about over and over again (maybe not on this blog - maybe it's time I did). If anything is going to change search, it will be that, because the semantic web will structure data - moving away from the document approach that webpages represent and more towards a data approach that resembles a database table. It may not be able to make results more relevant to your personal interests, but it will better understand the sources of data that make up the search results, and can match them to whatever constructs you present to it.

Like Google's PageRank, the semantic web will require humans to structure data, from which a machine will then make inferences - similar to how PageRank makes inferences based on the links people make. However, Scoble's claim that humans can overtake a machine is silly - yes, humans have a much higher intellect and are better at filtering, but they can in no way match the speed and power of a machine. Once the semantic web gets into full gear a few years from now, humans will have trained the machine to think - and it can then do the filtering for us.

Human intelligence will be crucial for the future of search - but not in the way Mahalo does it, which is like manually categorising pieces of paper into a filing cabinet and is not sustainable. It's a bit like how, when the painters of the Sydney Harbour Bridge finish painting it, they have to start all over again because the other side is already starting to rust. Once we can teach a machine that, for example, a dog is an animal that has four legs and makes a sound like "woof", the machine can then act on our behalf, like a trained animal, and go fetch what we want; how those paper documents are stored will be irrelevant, because the machine can do the sorting for us.
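
The dog example can be sketched in a few lines. This is only an illustration of the idea, not a real semantic web toolkit: the triple layout, the predicate names (`is_a`, `has_legs`, `makes_sound`), the extra "labrador" fact and both helper functions are assumptions invented for this sketch.

```python
# Facts that humans supply, stored as (subject, predicate, object) triples.
facts = {
    ("dog", "is_a", "animal"),
    ("dog", "has_legs", 4),
    ("dog", "makes_sound", "woof"),
    ("labrador", "is_a", "dog"),
}


def categories(thing, facts):
    """Every category `thing` belongs to, following is_a links transitively."""
    found = set()
    frontier = [thing]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in facts:
            if subject == current and predicate == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def lookup(thing, predicate, facts):
    """Return a property stated on `thing`, or inherited from one of its
    categories (searched breadth-first). Returns None if nothing is known."""
    queue = [thing]
    seen = set()
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        for subject, pred, obj in facts:
            if subject == current and pred == predicate:
                return obj
        # Nothing stated here, so try the categories this thing belongs to.
        queue.extend(obj for subject, pred, obj in facts
                     if subject == current and pred == "is_a")
    return None


print(categories("labrador", facts))
print(lookup("labrador", "makes_sound", facts))
```

Nobody told the machine that a labrador is an animal or that it says "woof"; it works both out from the one fact that a labrador is a dog. That is the kind of fetching the structured data makes possible.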

The Google killer of the future will be the people who can convert the knowledge on the world wide web into information readable by computers, to create this (weak) form of artificial intelligence. Now that's where it gets interesting.

Half the problem has been solved with time spent

On Thursday, I attended the internal launch of the Australian Entertainment & Media Outlook for 2007-2011. It was an hour packed with interesting analysis, trends, and statistics across a dozen industry segments. You can leave a comment on my blog if you are interested in purchasing the report and I'll see if I can arrange it for you.

One valuable thing briefly mentioned was the irony of online advertising.

Social networks as the new e-mail

The other day, I received my first spam message within Facebook, which I thought was reminiscent of the Nigerian scam:

Please if you are reliable and Interested in been a commissioned rep with our company we will be glad but you have to be a Trustworthy person. We have sold out to major galleries and private collectors from few parts of the world. We have been facing serious difficulties when it comes to the payment method, i.e The international money transfer tax for legal entities (companies) in Latvia is 25%, whereas for the individual it is only 7%.There is no sense for us to work this way, while tax for international money transfer made by a private individual is 7% .That's why we need you! Branches have been set up in few countries,and the head branch in UK.we are working on setting up a branch in the states, so for now i need a representative in Canada, America,Asia,New Zealand,and Europe who will be handling the payment aspect. so all you need do is cash the Payment,deduct your percentage and wire the rest back.

JOB DESCRIPTION? 1. Receive payment from Clients 2. Cash Payments at your Bank 3. Deduct 10% which will be your percentage/pay on Payment processed. 4. Forward balance after deduction of percentage/pay to any of the offices you will be contacted to send payment to(Payment is to be forwarded either by Money Gram or Western Union Money Transfer).

But unlike spam I would get in my e-mail inbox, I could actually check the profile of the user that sent the message to me. It was empty and a dud - which is how I could assess it was spam. Spam through a closed social networking site like Facebook has very different implications to e-mail spam: it's accountable.

With e-mail spam, you don't know who is sending it. Sometimes, e-mail spammers can make it look like it comes from a company you trust (like your bank). This also happens to some extent on MySpace, where spammers do up their profile and deceptively make it look like a real profile when it isn't (i.e., a pretty girl with her interests filled out - but as soon as you click somewhere, it takes you to a porn referral site). Facebook is different, because people can't modify their profiles (yet) like they can on MySpace, so the person sending the message is a lot more accountable to their true identity. You can judge how real they are by the number of friends they have, the information in their profile, and postings on their profile from other people.

Profile comments are the key aspect - no comments suggests a fake account - because you can't fake friends posting real discussions. A spammer would need to create a few dozen profiles and replicate threads of discussion across those profiles to make someone's profile look "real": that's a lot of effort that a computer robot can't do on its own.
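
The checks described above can be written down as a rough heuristic. This is a sketch only: the `Profile` fields, the thresholds and the "two out of three signals" rule are assumptions made up for illustration, and nothing here reflects how Facebook itself detects fake accounts.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    friend_count: int = 0
    filled_fields: int = 0  # how many profile details are filled in
    wall_post_authors: list = field(default_factory=list)  # who wrote each wall comment


def looks_real(profile, min_friends=10, min_fields=3, min_commenters=3):
    """Rough guess at whether a profile belongs to a real person.
    All thresholds are arbitrary illustrative values."""
    commenters = len(set(profile.wall_post_authors))
    if commenters == 0:
        # No comments from other people is treated as the strongest fake signal.
        return False
    signals = [
        profile.friend_count >= min_friends,
        profile.filled_fields >= min_fields,
        commenters >= min_commenters,
    ]
    # Require at least two of the three signals to agree.
    return sum(signals) >= 2


spammer = Profile()  # empty profile, like the one that messaged me
friend = Profile(friend_count=150, filled_fields=6,
                 wall_post_authors=["anna", "ben", "anna", "carl", "dina"])

print(looks_real(spammer))  # False
print(looks_real(friend))   # True
```

Counting distinct commenters rather than total comments is deliberate: one fake account posting fifty comments is cheap, while fifty accounts that each look real are not.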

A new way of communicating

Aside from this, there is something more interesting: I rarely use e-mail to communicate with friends anymore. Messages and comments/wall posts are now how people communicate. In the old days, people would forward a funny video - now they "post a bulletin". People post "notes" and tag their friends if they are mentioned in the note - a bit like writing a story, and alerting those who are involved to have a look. It's the equivalent of sending an e-mail to a group of people - but leaving it somewhere where all your other friends can have a read as well if they want. That is huge - this open style of communication is something e-mail never did.

I've previously written about how the "post a comment" feature is one of the most powerful features of social networking sites. When I say these sites are the new e-mail, it's not just messages that are the means of communicating - it's actually mostly through these profile comments that people have their discussions. The interesting thing about this new way of communicating is that two people can be having a discussion while all their friends monitor the conversation. For example, I made a tongue-in-cheek comment about a Ukrainian friend of mine on her Facebook profile wall, and another mutual (Ukrainian) friend saw the comment and joined in defending Ukrainians!

Social networking sites work because they create a community feel, where people interact within a tribe or small village in which everyone knows each other, and they communicate in what is like an open forum. If there's one thing I am sure of, it's that these sites are no longer fads: they are a positive evolution of the Internet as a communications medium. It appears that solutions to e-mail spam based on clever algorithms that filter messages aren't the way forward; the solution is to be found in new ways of communicating, and that is what social networking sites do really well.