Frequent thinker, occasional writer, constant smart-arse


Don’t get the Semantic Web? You will after this

Prior to 2006, I had sort of heard of the Semantic Web. To be honest, I didn’t know much – it was just another buzzword. I’d been hearing about Microformats for years, and about cool but useless initiatives like XFN. However, to me it was simply another web thing being thrown around.

Then in August 2006, I came across Adrian Holovaty’s article where he argues journalism needs to move from a story-centric world to a data-centric world. And that’s when it dawned on me: the Semantic web is some serious business.

I have since done a lot of reading, listening, and thinking. I don’t profess to be a Semantic Web expert – but I know more than the average person as I have (painfully) put myself through videos and audios of academic types who confuse the crap out of me. I’ve also read through a myriad of academic papers from the W3C, which are like the times when you read a novel and keep re-reading the same page and still can’t remember what you just read.

Hell – I still don’t get things. But I get the vision, so that’s what I am going to share with you now. Hopefully, my understanding will benefit the clueless and the skeptical alike, because it’s a powerful vision which is entirely possible.

1) The current web is great for humans; useless for machines
When you search for ambiguous terms, at best, search engines can algorithmically predict some sort of answer that partially answers your query. Sometimes not. But the complexity of language is not something engineers can engineer their way around. After all, without the ambiguity of natural languages, poetry could not exist.

Fine.

What did you think when you read that? As in: “I’ve had it – fine!” which is like another way of saying ok or agreeing with something. Perhaps you thought about that parking ticket I just got – illegal parking gets you fined. Maybe you thought I am applauding myself by saying that was one fine piece of wordcraftship I just wrote, or said in another context, like a fine wine.

Language is ambiguous, and depending on the context with other words, we can determine what the meaning of the word is. Search start-up company Powerset, which is hoping to kill Google and rule the world, is employing exactly this technique to improve search: intelligent processing of words depending on context. So by me putting in “it’s a fine”, it understands the context that it’s a parking ticket, because you wouldn’t say “it’s a” in front of ‘fine’ when you use it to agree with something (the ‘ok’ meaning above).

But let’s use another example: “Hilton Paris” in Google – the world’s most ‘advanced’ search engine. Obviously, as a human reading that query, you understand from the context of those words that I would like to find information about the Hilton in Paris. Well, maybe.

Let’s see what Google comes up with: Of the ten search results (as of when I wrote this blog posting), one was a news item on the celebrity; six were on the celebrity describing her in some shape or form, and three results were on the actual Hotel. Google, at 30/70 – is a little unsure.

Why is Paris Hilton, that blonde haired thingy of a celebrity, coming up in the search results?

Technologies like Powerset apparently produce a better result because they understand the order of the words and the context of the search query. But the problem with these searches isn’t just the interpretation of what the searcher wants – it is also the ability to understand the actual search results. Powerset can only interpret so much of the gazillions of words out there. There is the whole problem of the source data, not just the query. Don’t get what I mean? Keep reading. But for now, learn this lesson:

Computers have no idea about the data they are reading. In fact, Google pumping out those search results is based on people linking. Google is a machine, and reads 1s and 0s – machine language. It doesn’t get human language.

2) The Semantic web is about making what humans read, machine readable
Tim Berners-Lee, the guy who invented the World Wide Web and the visionary behind the Semantic Web, prefers to call it the ‘data web’. The current web is a web of documents; by adding extra data (metadata) to that content, machines will be able to understand it. Metadata is data about data.

A practical outcome of having a semantic web, is that Google would know that when it pulls up a web page regardless of the context of the words – it will understand what the content is. Think of every word on the web, being linked to a master dictionary.

The benefit of the semantic web is not for humans – at least immediately. The Semantic Web is actually pretty boring with what it does – what is exciting, is what it will enable. Keep reading.

3) The Semantic web is for machines to interpret, not people
A lot of the skeptics of the semantic web don’t see the value of it. Who cares about adding all this extra metadata? I mean heck – Google still was able to get the website I needed – the Hilton in Paris. Sure, the other 70% of the results on that page were irrelevant, but I’m happy.

I once came across a Google employee and he asked “what’s the point of a semantic web; don’t we already have enough metadata?” To some extent, he’s right – there are some websites out there that have metadata. But the point of the semantic web is that machines, once they read the information, can start thinking like a human would and connect it to other information. There needs to be across-the-board metadata.

For example, my friend Michael was recently looking to buy a car. A painful process, because there are so many variables. So many different models, different makes, different dealers, different packages. We have websites, with cars for sale neatly categorised into profile pages saying what model it is, what colour it is, and how much. (Which may I add, are hosted on multiple car sites with different types of profiles). A human painfully reads through these profiles, and computes as fast as a human can. But a machine can’t read these profiles.

Instead of wasting his (and my) weekends driving around Sydney to find his car, a machine could find it for him. So, Mike would enter his profile – what he requires in a car, what his credit limit is, what his prior history with cars is – everything that would affect his judgement of a car. And then, the computer can query every online website with cars to match the criteria. Because the computer can interpret these websites across the board, it can evaluate them and go back to Michael and say “this is the car for you, at this dealer – click yes to buy”.

The semantic web is about giving computers the information to be able to interpret data, so that they can do what they do really well – compute.
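To make Mike's car hunt a bit more concrete, here is a minimal sketch in Python of what a machine could do if car listings were published as structured records rather than free text. Everything in it is an assumption made up for illustration: the `Listing` fields, the sample cars and dealers, and the `find_cars` helper. No real car site exposes exactly this.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical structured fields a car site might publish as metadata.
    make: str
    model: str
    colour: str
    price: int
    dealer: str


def find_cars(listings, max_price, colours):
    """Return listings within budget and in an acceptable colour, cheapest first."""
    matches = [
        car for car in listings
        if car.price <= max_price and car.colour in colours
    ]
    return sorted(matches, key=lambda car: car.price)


# Made-up listings standing in for data gathered from several car sites.
listings = [
    Listing("Toyota", "Corolla", "red", 18000, "Dealer A"),
    Listing("Mazda", "3", "blue", 21000, "Dealer B"),
    Listing("Honda", "Civic", "blue", 17500, "Dealer C"),
]

best = find_cars(listings, max_price=20000, colours={"blue", "red"})
```

The filtering itself is trivial. The hard part, and the point of the semantic web, is getting every site to describe its cars in fields a program can rely on.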

4) A worldwide database
What Berners-Lee essentially envisions is turning the entire world wide web into a database that can be queried. Currently, the web looks like Microsoft Word – one slab of text. However, if that slab of text was neatly categorised in an Excel spreadsheet, you could manipulate that data and do what you please – create reports, reorder them, filter, and do whatever until your heart is content.

At university, I was forced to do an Information Systems subject which was essentially about the theory of databases. Damn painful. I learned only two things from that course. The first thing was that my lecturer, tutor, and classmates spoke less intelligible English than a caterpillar. But the second thing was that I learned what information is and how it differs from data. I am now going to share with you that lesson, and save you three months of your life.

You see, data is meaningless. For example, 23 degrees is data. On its own, it’s useless. Another piece of data is Sydney. Again, useless. I mean, you can think all sorts of things when you think of Sydney, but it doesn’t have any meaning.

Now put together 23 degrees and Sydney, and you have just created information. Information is about creating relationships between data. By creating a relationship, an association, between these two different pieces of data – you can determine it’s going to be a warm day in Sydney. And that is what information is: Relationship building; connecting the dots; linking the islands of data together to generate something meaningful.

The semantic web is about allowing computers to query the sum of human knowledge like one big database to generate information.
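The data versus information lesson can be sketched in a few lines of Python. This is only an illustration: the predicate names (`has_temperature_celsius`, `is_a`), the `lookup` and `is_warm` helpers, and the 20 degree threshold for "warm" are my own assumptions, not part of any Semantic Web standard.

```python
# Two isolated pieces of data: on their own they say nothing.
temperature = 23
city = "Sydney"

# Information: explicit relationships (subject, predicate, object) between data.
facts = [
    ("Sydney", "has_temperature_celsius", 23),
    ("Sydney", "is_a", "city"),
]


def lookup(facts, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in facts if s == subject and p == predicate]


def is_warm(facts, place, threshold=20):
    """Decide whether `place` is warm; the threshold is an arbitrary assumption."""
    temps = lookup(facts, place, "has_temperature_celsius")
    return bool(temps) and temps[0] >= threshold
```

Once the relationship is written down, a question like "is it warm in Sydney?" becomes something a machine can answer with `is_warm(facts, "Sydney")` instead of something only a human can read off a page.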

Concluding thoughts
You are probably now starting to freak out and think “Terminator” images with computers suddenly erupting from under your computer desk, and smashing you against the wall as a battle between humans and computers begins. But I don’t see it like that.

I think about the thousands of hours humans spend trying to compute things. I think of cancer research, where all the experimentation occurring in labs is trying to connect new pieces of data with old data to create new information. I think about computers being able to query the entire taxation legislation to make sure I don’t pay any tax, because they know how it all fits together (having studied tax, I can assure you – it takes a lifetime to understand only a portion of tax law). In short, I understand the vision of the Semantic web as a way of linking things together, to enable computers to compute – so that I can sit on my hammock drinking my beer, as I can delegate the duties of my life to the machines.

All the semantic web is trying to do is make sure everything is structured in a consistent manner, with a consistent dictionary behind the content, so that a machine can draw connections. As Berners-Lee said in one of the videos I saw: “it’s all about creating links”.

The process to a Semantic Web is boring. But once we have those links, we can then start talking about those hammocks. And that’s when the power of the internet – the global network – will really take off.

John Hagel – What do you think is the single most important question after everything is connected?

I recently was pointed to a presentation of John Hagel who is a renowned strategy consultant and author on the impact the Internet has on business. He recently joined Deloitte and Touche, where he will head a new Silicon Valley research institute. At the conference (Supernova 2007), John outlined critical research questions regarding the future of digital business that remain unresolved, which revolved around the following:

What happens after everything is connected? What are the most important questions?

I had to watch the video a few times because it’s not possible to capture everything he says in one hit. So I started writing notes each time, which I have reproduced below to help guide your thoughts and give a summary as you are watching the presentation (which I highly recommend).

I also have discovered (after writing these notes – damn it!) that he has written his speech (slightly different however) and posted it on his blog. I’ll try and reference my future postings on these themes here, by pinging or adding links to this posting.

On the future of search

Robert Scoble has put together a video presentation on how Techmeme, Facebook and Mahalo will kill Google in four years’ time. His basic premise is that SEOs who game Google’s algorithm are as bad as spam (and there are some pissed SEO experts waking up today!). People like the ideas he introduces about social filtering, but on the whole – people are a bit more skeptical of his world domination theory.

There are a few good posts like Muhammad’s on why the combo won’t prevail, but on the whole, I think everyone is missing the real issue: the whole concept of relevant results.

Relevance is personal

When I search, I am looking for answers. Scoble uses the example of searching for HDTV and makes note of the top manufacturers as something he would expect at the top of the results. For him – that’s probably what he wants to see – but for me, I want to be reading about the technology behind it. What I am trying to illustrate here is that relevance is personal.

The argument for social filtering is that it makes results more relevant. For example, by having a bunch of my friends associated with me on my Facebook account, an inference engine can determine that if my friend called A is also friends with person B, who is friends with person C – then something I like must also be something that person C likes. When it comes to search results, that sort of social/collaborative filtering doesn’t work because relevance is complicated. The only value a social network can provide is whether the content is spam or not – a yes or no type of answer – and that assumes someone in my network has come across this content. Just because my social network can (potentially) help filter out spam, doesn’t make the search results higher quality. It just means fewer spam results. There is plenty of content that may be on-topic but may as well be classed as spam.

Google’s algorithm essentially works on the popularity of links, which is how it determines relevance. People can game this algorithm, because someone can make a website popular to manipulate rankings through linking from fake sites and other optimisations. But Google’s PageRank algorithm assumes that relevant results are, at their core, purely about popularity. The innovation the Google guys brought to the world of search is something to be applauded, but the extreme lack of innovation in this area since just shows how hard it is to come up with new ways of making something relevant. Popularity is a smart way of determining relevance (because most people would like it) – but since that can be gamed, it no longer is.
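To see why link popularity can be gamed, here is a rough sketch in Python of a simplified PageRank-style score. It is a textbook-style simplification and certainly not what Google runs in production; the `link_popularity` function, the tiny made-up webs and the page names are all assumptions for illustration. The 0.85 damping factor is the value commonly quoted for PageRank.

```python
def link_popularity(links, damping=0.85, iterations=50):
    """Score pages by incoming links using a simplified PageRank-style iteration.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Each page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Made-up web: three honest pages that link to each other.
honest = {"a": ["b"], "b": ["c"], "c": ["a", "b"]}

# The same web plus a spam page, first on its own, then propped up by fake sites.
lonely = dict(honest, spam=["a"])
gamed = dict(honest, spam=["a"], fake1=["spam"], fake2=["spam"], fake3=["spam"])

lonely_rank = link_popularity(lonely)
gamed_rank = link_popularity(gamed)
```

In this toy web, the spam page scores higher once the three fake sites point at it, even though nobody real linked to it. That is the gaming problem in miniature.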

The semantic web

I still don’t quite understand why people don’t realise the potential for the semantic web, something I go on about over and over again (maybe not on this blog – maybe it’s time I did). But if it is something that is going to change search, it will be that – because the semantic web will structure data – moving away from the document approach that webpages represent and more towards the data approach that resembles a database table. It may not be able to make results more relevant to your personal interests, but it will better understand the sources of data that make up the search results, and can match it up to whatever constructs you present it.

Like Google’s PageRank, the semantic web will require humans to structure data, from which a machine will then make inferences – similar to how PageRank makes inferences based on what links people make. However, Scoble’s claim that humans can overtake a machine is silly – yes, humans have a much higher intellect and are better at filtering, but they in no way can match the speed and power of a machine. Once the semantic web gets into full gear a few years from now, humans will have trained the machine to think – and it can then do the filtering for us.

Human intelligence will be crucial for the future of search – but not in the way Mahalo does it, which is like manually categorising pieces of paper into a file cabinet – which is not sustainable. A bit like how when the painters of the Sydney Harbour Bridge finish painting it, they have to start all over again because the other side is already starting to rust. Once we can teach a machine that, for example, a dog is an animal that has four legs and makes a sound like “woof” – the machine can then act on our behalf, like a trained animal, and go fetch what we want; how those paper documents are stored will now be irrelevant and the machine can do the sorting for us.
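The "dog is an animal" idea can be sketched as a tiny inference step in Python. This is a toy under stated assumptions: the `IS_A` and `PROPERTIES` tables, the `poodle` and `cat` entries, and the `ancestors` and `describe` helpers are all invented for illustration, and real semantic web tooling is far richer than a pair of dictionaries.

```python
# Hand-written facts a human might teach the machine (all made up).
IS_A = {"poodle": "dog", "dog": "animal", "cat": "animal"}
PROPERTIES = {
    "dog": {"legs": 4, "sound": "woof"},
    "animal": {"alive": True},
}


def ancestors(thing):
    """Follow is-a links upward, e.g. poodle -> dog -> animal."""
    chain = []
    while thing in IS_A:
        thing = IS_A[thing]
        chain.append(thing)
    return chain


def describe(thing):
    """Collect properties inherited from every category `thing` belongs to."""
    merged = {}
    # Apply the most general category first so specific ones can override it.
    for category in reversed([thing] + ancestors(thing)):
        merged.update(PROPERTIES.get(category, {}))
    return merged
```

Nobody ever told the machine that a poodle says "woof", yet `describe("poodle")` works it out from the links a human did supply. That is the kind of weak inference the paragraph above has in mind.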

The Google killer of the future will be the people that can convert the knowledge on the world wide web into information readable by computers, to create this (weak) form of artificial intelligence. Now that’s where it gets interesting.

Half the problem has been solved with time spent

On Thursday, I attended the internal launch of the Australian Entertainment & Media Outlook for 2007-2011. It was an hour packed with interesting analysis, trends, and statistics across a dozen industry segments. You can leave a comment on my blog if you are interested in purchasing the report and I’ll see if I can arrange it for you.

One valuable thing briefly mentioned, was the irony of online advertising.

Tangler

This is the second post in a series – wizards of oz – which is to highlight the innovation we have down under, and how the business community needs to wake up and realise the opportunities. I review Tangler, a Sydney-based start-up that has recently released their application to the world as a public beta.

Tangler is a web-service that enables discussions over a network. Think of discussions with the immediacy of Instant Messaging (it’s easy), but with the persistency of a forum (messages are permanently stored). Discussions are arranged into communities of interest (groups), which are further broken down into topic areas. Click here to see a video overview.

Value

1) It’s a network application. Although it’s got a great design, and looks like a funky website, the real power of this web service is what it’s working towards: discussions over a network. Imagine a little widget with the topic “What do you think of Elias Bizannes?” placed on my (external) personal blog, my internal work blog, my myspace/facebook/social networking page, as well as its own dedicated forum on the Tangler site. A centralised discussion, in a decentralised manner. That’s big.

2) Its community has great DNA. Communities are not easy things to build – my own experience on a getting-bigger-by-the-day internal project has shown that it is a complex science, touching everything from understanding motivational theory to encouraging the right kind of behaviours (policing without policing). My usage of the site has shown me that the active community building currently occurring is on the right track. Anyone can hire a code monkey, whack on some flashy front-end, and say they have a great product. But not anyone can build a strong community – even Google struggles on this (the acquisition of YouTube happened largely because of community, because the YouTube community beat Google’s own service). Tangler’s community is already turning into a powerful asset – the DNA is there – now it just needs exposure, and the law of cumulative advantage will kick in.

3) The founder and staff are responsive to its community. I posted a question on the feedback forum, to prove this point: I got a response in an hour, on a Saturday. The staff at Tangler are super responsive – which in part, is due to the real-time discussion ability of the software – but also because of their commitment. As I state above – the value of Tangler is the community of users it builds – this type of responsiveness is crucial to keep its users satisfied to come back, because it makes them feel valued. Additionally, the community is driving the evolution of the application, and that’s the most powerful way to create something (adapting to where there is a need by the people that use it)

4) It’s a platform. What makes Tangler powerful is that it encourages discussions around niche content areas. Make that niche content, being created for free. Low cost to produce + highly targeted content = an advertiser’s dream. Link it with a distributed network across the entire Internet (see 1 above), and you’ve got something special.

Conclusion

Social networks, which is what Tangler is, are characterised by:
1) the existence of a repository of user-generated content and
2) the need of members to communicate.

Tangler’s user-generated content and communications web make them an interesting fit for both media conglomerates and telecommunication companies (but for different reasons). I see a Tangler acquisition as a no-brainer for the big Telcos. Integrating a social network like Tangler into Telstra builds on the synergy between the communication needs of social network users and the communications expertise and service infrastructure of the communication companies. Unlike voice calls, which are a commodity now, text-based discussions can be monetised for as long as the content exists (with advertising), so the Telcos need to take advantage of their network infrastructure and accommodate them.

The challenge for Tangler however – as with any other Internet property – is that the scale of the audience of a social network determines the nature of its relationship with a communications company. Micro-sized social networks are not interesting to communication companies. Massive social networks are, but history has shown they would rather be partners than be acquired. To be attractive to the big end of town, Tangler needs to show it has a scale large enough to grow as a business but not so large that it can dictate the terms of the business.

My observations lead me to think that they will be a hit once they open up their application to external developers, which will relieve the development bottleneck faced by their resource and time constrained team. However, they shouldn’t rush this, as I still think their performance issues are not completely ironed out yet. An open API would be taken up by its enthusiastic community, who are technologically orientated. Not to mention the strong relationships the CEO and CMO have forged with the local web entrepreneurial and development community in Australia.

My boss is currently doing a secondment as acting Finance Director at Sensis, Telstra’s media arm. Maybe I need to organise a catch-up with him, before these guys get snatched up by some US conglomerate!

Faraday Media – Particls

This series of blog posts – wizards of oz – is to highlight the innovation we have down under. So I begin with Faraday Media, a Brisbane-based start-up that launched their keynote product today.

Particls is an engine that learns what you are interested in, and alerts you when content on the internet becomes available – through a desktop ‘ticker’ or pop-up alerts.

Value
1) It’s targeted. Particls is an attention engine – it learns what you want to read, and then goes and finds relevant information. That’s a powerful tool, for those of us drowning in information overload, and who don’t have time to read.

2) It catches your attention. Particls is based on the concept of ‘alerts’ – information trickles across your screen seamlessly as you do your work, like a news ticker. For the things that matter, an alert will pop up. The way you deal with information overload is not by shutting yourself out – it’s by adjusting the volume on things that you value more than other things.

3) The founders understand privacy. They started the APML standard – a workgroup I joined because it’s the best attempt I have seen yet that tackles the issue of privacy on the internet. For example, I can see what the Particls attention engine uses to determine my preferences – lists of people and subjects with “relevance scores”. And better yet – it’s stored on my hard-disk.

4) It’s simple. RSS is a huge innovation on the web that only a minority of internet users understand. The problem with RSS (Really Simple Syndication) is that it’s not simple. Particls makes it dead simple to add RSS and track that content.

Conclusion

Why the hell doesn’t Fairfax acquire the start-up, rather than wasting time creating yet another publication (incidentally in the same city) that we don’t have time to read? In my usage of the product, I have been introduced to content that I am interested in, that I never would have realised existed on the web. In my trials, I have mainly used it to keep track of my research interests, and despite my skepticism about how ‘good’ the attention engine is, it has absolutely blown me away.

And it’s not just in the consumer space – a colleague (who happens to hold a lot of influence in enterprise architecture of our 140,000 person firm) was blasting RSS one day on an internal blog – saying how we don’t yet have the technology to ‘filter’ information. I told him about Particls – he’s now in love. If a guy like him, who shapes IT strategy for a $20 billion consulting firm, can get that excited – that’s got to tell you something.

Social networks as the new e-mail

The other day, I received my first spam message within Facebook, which I thought was reminiscent of the Nigerian scam:

Please if you are reliable and Interested in been a commissioned rep with our company we will be glad but you have to be a Trustworthy person. We have sold out to major galleries and private collectors from few parts of the world. We have been facing serious difficulties when it comes to the payment method, i.e The international money transfer tax for legal entities (companies) in Latvia is 25%, whereas for the individual it is only 7%.There is no sense for us to work this way, while tax for international money transfer made by a private individual is 7% .That's why we need you! Branches have been set up in few countries,and the head branch in UK.we are working on setting up a branch in the states, so for now i need a representative in Canada, America,Asia,New Zealand,and Europe who will be handling the payment aspect. so all you need do is cash the Payment,deduct your percentage and wire the rest back.

JOB DESCRIPTION? 1. Receive payment from Clients 2. Cash Payments at your Bank 3. Deduct 10% which will be your percentage/pay on Payment processed. 4. Forward balance after deduction of percentage/pay to any of the offices you will be contacted to send payment to(Payment is to be forwarded either by Money Gram or Western Union Money Transfer).

But unlike spam I would get in my e-mail inbox, I could actually check the profile of the user that sent the message to me. It was empty and a dud – which is how I could assess it was spam. Spam through a closed social networking site like Facebook has very different implications to e-mail spam: it’s accountable.

With e-mail spam, you don’t know who is sending it. Sometimes, the e-mail spammers can make it look like it comes from a certain company you trust (like your bank). This also to some extent happens on myspace, whereby spammers do up their profile and deceivingly make it look like a real profile when it isn’t (ie, a pretty girl with her interests filled out – but as soon as you click somewhere, it takes you to a porn referral site). Facebook is different, because people can’t modify their profiles (yet) like you can on myspace, so the person sending the message is a lot more accountable to their true identity. You can judge how real they are by the number of friends they have, information in their profile, and postings on their profile from other people.

Profile comments are the key aspect – no comments suggests a fake account – because you can’t fake friends to post real discussions. A spammer would need to create a few dozen profiles, to replicate the thread of discussion via people’s profiles, so that it could make someone’s profile look “real”: that’s a lot of effort that a computer robot can’t do on its own.

A new way of communicating

Aside from this, there is something more interesting: I rarely use e-mail to communicate with friends anymore. Messages or comments/wallposts are now the new way people communicate. In the old days, people would forward a funny video – now they “post a bulletin”. People post “notes” and tag their friends if they are mentioned in the note – a bit like writing a story, and alerting those who are involved to have a look. It’s the equivalent of sending an e-mail to a group of people – but leaving it somewhere where all your other friends can have a read as well if they want. That is huge – this open style of communication is something e-mail never did.

I’ve previously written how the “post a comment” feature is one of the most powerful features of social networking sites. When I say these sites are the new e-mail, it’s not just messages that are the means of communicating – it’s actually mostly through these profile comments that people have these discussions. The interesting thing about this new way of communicating, is that two people can be having a discussion, however all their friends can monitor the conversation. For example, I made a tongue-in-cheek comment of a Ukrainian friend of mine on her facebook profile wall, and another mutual (Ukrainian) friend saw the comment and joined in defending Ukrainians!

Social networking sites work because they are creating a community feel, where people interact within a tribe or small village where everyone knows each other, and they communicate in what is like an open forum. If there’s one thing I am sure of, these sites are no longer fads: they are a positive evolution of the Internet as a communications medium. It appears that solutions to e-mail spam with clever algorithms that can filter messages aren’t the way forward; the solution is to be found in new ways of communicating, and that is what social networking sites do really well.

A bit of inspiration

and the text, if you don’t want to see the video:

Here’s to the crazy ones.

The misfits.

The rebels.

The troublemakers.

The round pegs in the square holes.

The ones who see things differently.

They’re not fond of rules.

And they have no respect for the status quo.

You can quote them, disagree with them,

glorify or vilify them.

About the only thing you can’t do is ignore them.

Because they change things.

They push the human race forward.

And while some may see them as the crazy ones,

we see genius.

Because the people who are crazy enough to think
they can change the world, are the ones who do.

– Apple Computer (via workhappy.net)
