Last weekend, I participated in Startup Camp Sydney II, a straight 24-hour hackathon to build and launch a product (in my case, Activity Horizon). Ross Dawson has written a good post about the camp, if you are interested in that.

It’s been a great experience (still going – send us your feedback!) and I’ve learned a lot. But something really struck me that I think should be shared: how little has changed since the last start-up camp, and how stupid companies are. But first, some background.

The product we launched is a service that allows people to discover events and activities they would be interested in. We have a lot of thoughts on how to grow this, and I know for a fact that finding new things to do in a complex city environment, as time-poor adults, is a genuine issue people often complain about. As Mick Liubinskas said, “Matching events with motivation is one of the Holy Grails of online businesses”, and we’re building tools to allow people to filter events with minimal effort.


So as “entrepreneurs” looking to create value in an artificial petri dish, we recognised that existing events services didn’t do enough to filter events with user experience in mind. By pulling data from other websites, we have created a derivative product that creates value without necessarily hurting anyone. Our value proposition comes from the simplicity of the user experience (more is in the works once the core technology is set up), and we are more than happy to access data from other providers in the information value chain on the terms they want.

The problem is that they have no terms! The concept of an API is one of the core aspects of the mashup world we live in, firmly entrenched within the web’s culture and ecosystem. It’s something that I believe is a dramatic way forward for the evolution of the news media, and it’s a complementary trend that is building the vision of the semantic web. However, nearly all the data we have did not come through an API, which could regulate the way we use the data; instead, we’ve had to scrape it.

Scraping is a method of telling a computer how data is structured on a web page, so that it can ‘scrape’ the data out of that template presentation. It is a bit like highlighting every word in a Word document that has a certain characteristic and pulling all the highlighted words into your own database. Scraping has a negative connotation, as people are perceived to be stealing content and re-using it as their own. The truth of the matter is that additional value gets generated when people ‘steal’ information products: data is an object, and connecting it with other objects creates relationships, which are what create information. The potential to create unique relationships between different data sets means no two derivative information products are the same.
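To make the idea concrete, here is a rough sketch of what scraping looks like in code. It is not the code behind Activity Horizon: the page markup, the class names (`event`, `title`, `date`) and the sample events are all made up for illustration, and it assumes the listing page uses a predictable template. It uses only Python’s standard `html.parser` module.

```python
from html.parser import HTMLParser

# Hypothetical markup: a made-up listing page, not any real site's structure.
SAMPLE_PAGE = """
<ul>
  <li class="event"><span class="title">Jazz in the Park</span>
      <span class="date">2008-09-20</span></li>
  <li class="event"><span class="title">Harbour Kayak Tour</span>
      <span class="date">2008-09-21</span></li>
</ul>
"""


class EventScraper(HTMLParser):
    """Collects the text of the title and date spans inside each event item."""

    FIELDS = {"title", "date"}  # assumed class names in the page template

    def __init__(self):
        super().__init__()
        self.events = []    # one dict per scraped event
        self._field = None  # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "event":
            self.events.append({})  # a new event record starts here
        elif tag == "span" and css_class in self.FIELDS and self.events:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside one of the spans we care about.
        if self._field is not None:
            record = self.events[-1]
            record[self._field] = record.get(self._field, "") + data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_events(html):
    """Return a list of {'title': ..., 'date': ...} dicts found in the page."""
    parser = EventScraper()
    parser.feed(html)
    parser.close()
    return parser.events


if __name__ == "__main__":
    for event in scrape_events(SAMPLE_PAGE):
        print(event["date"], event["title"])
```

Notice how fragile this is: the scraper only works while the site keeps the same template, and nothing in it tells us what the data owner is happy for us to do with the result. That is exactly the gap an API would fill.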

So why are companies stupid?
Let’s take, for example, a site that sells tickets and lists information about them. If you are not versed in the economics of data portability (something we are working on with the DataPortability Project), you’d think that Activity Horizon scraping ‘their’ data is a bad thing, as we are stealing their value.

WRONG!

Their revenue model is based on people buying tickets through their site. So by reusing their data and creating new information products, we are actually creating more traffic, more demand and more potential sales. By opening up their data silo, they would actually open up more revenue for themselves. They would also not only control the derivatives better, but could reduce the overall cost of business for everyone.

Let’s use another example: a site that aggregates tickets and doesn’t actually sell them (i.e., their revenue model isn’t through transactions but attention). Activity Horizon could appear to be a competitor, right? Not really – because we are pulling information from them (like they are pulling information from the ticket providers). We’ve extracted and created a derivative product that brings a potential audience to their own website. It’s repurposing information in another way, for a different audience.

The business case for open data is something I could spend hours talking about. But it all boils down to this: data are not like physical objects. Scarcity does not determine the value of data like it does with physical goods. The value of data and information comes through reuse. The easier you make it for others to reuse your data, the more success you will have.