Frequent thinker, occasional writer, constant smart-arse


Data portability and media: explaining the business case

The information value chain I wrote about a while back, although in need of further refinement, underpins my entire thinking on why the business
case for data portability exists.

In this post, I am going to give a brief illustration of how interoperability is a win-win for all involved in the digital media business.

To do this, I am going to explain it using the following companies:
– Amazon (EC2)
– Facebook
– Yahoo! (Flickr)
– Adobe (Photoshop Express)
– Smugmug
– Cooliris

How the world works right now
I’ve listed six different companies, each of which can provide services for your photos. Using a simplistic view of the market, they are all competitors – they ought to be fighting to be the ultimate place where you store your photos. But the reality is, they aren’t.

Our economic system is underpinned by a concept known as “comparative advantage”. It means that even if you are the best at everything, you are better off specialising in one area and letting another entity perform the other functions. In world trade, different countries specialise in different industries, because focusing on what you are uniquely good at and trading with other countries is a lot more efficient for everyone.

Which is why I take a value chain approach when explaining data portability. Different companies and websites should have different areas of focus – after all, we all know one website can’t do everything. Not just because of a lack of resources, but because of the conflict it can create in allocating them. For example, a community site doesn’t want to have to worry about storage costs, because it is better off investing in resources that support its community. Trying to do both may make the community site fail.

How specialisation makes for a win-win
With that theoretical understanding, let’s now look into the companies.

Amazon
They have a service that allows you to store information in the cloud (ie, not on your local computer, and permanently accessible via a browser). The economies of scale of the Amazon business allow it to create the most efficient storage system on the web. I’d love to be able to store all my photos here.

Facebook
Most of the people I know in the offline world are connected to me on Facebook. It’s become a useful way for me to share my life with my friends and family, and to stay permanently connected with them. I often get asked by my friends to make sure I put my photos on Facebook so they can see them.

Yahoo
Yahoo owns a company called Flickr – which is an amazing community of people passionate about photography. I love being able to tap into that community to share and compare my photos (as well as find other people’s photos to use in my blog posts).

Adobe
Adobe makes the industry standard program for graphic design: Photoshop. When it comes to editing my photos – everything from cropping them, removing red-eye or even converting them into different file formats – I love using the functionality of Photoshop to perform that function. They now offer an online Photoshop, which provides similar functionality to the desktop version, in the cloud.

SmugMug
I actually don’t have a SmugMug account, but I’ve always been curious. I’d love to be able to see how my photos look in their interface, and to tap into some of the features they have available, like printing my photos in special ways.

Cooliris
Cooliris is a cool web service I’ve only just stumbled on. I’d love to be able to plug my photos into the system and see what cool results come out.

Putting it together

  • I store my photos on Amazon, including my massive RAW picture files which most websites can’t read.
  • I can pull my photos into Facebook, and tag them how I see fit for my friends.
  • I can pull my photos into Flickr, and get access to the unique community competitions, interaction, and feedback I get there.
  • With Adobe Photoshop Express, I can access my RAW files on Amazon to create edited versions of my photos, based on the feedback I received in the comments from people on Flickr.
  • With those edited photos now sitting on Amazon, and with the tags on Facebook adding better context to my photos (friends tagging people in them), I pull those photos into SmugMug and create really funky prints to send to my parents.
  • Using those same photos I used in SmugMug, I can use them in Cooliris and create a funky screensaver for my computer.

As a customer of all these services – that’s awesome. With the same set of photos, I get the benefit of all these services, each of which uniquely provides something for me.

And as a supplier that is providing these services, I can focus on what I am good at – my comparative advantage – so that I can continue adding value to the people that use my offering.

Sounds simple enough, eh? Well the word for that is “interoperability”, and it’s what we are trying to advocate at the DataPortability Project: a world where data does not have borders, and can be reused again and again. What’s stopping us from having a world like this? Basically, simplistic thinking that one site should try to do everything rather than focus on what it does best.

DataPortability Project

Help us change the market’s thinking and demand for data portability.

Best error message ever (for Data Portability in action)

As we were preparing for the upgrade of the DataPortability Project’s website, we realised we needed to close off some of our legacy mailing lists… but we didn’t want to lose the hundreds of people already on them. So we decided to export the emails and paste them into the new Google group as subscribers.

I then got this error message.

email permissions

That has to be one of the best error messages I have ever seen. Yes, I’m happy that I could port the data from a legacy system/group to a new one using an open standard (CSV). Yes, I was impressed that the Google Groups team supports this functionality (a team which, I am told, is just one Google engineer – completely understaffed). But what blew me away was the fact Google was able to recognise how to treat these emails.

These particular people have opted to not allow someone to reuse their e-mail, other than the intended purpose for which they submitted it (which was to be subscribed to this legacy Group). Google recognised that and told me I wasn’t allowed to do it as part of my batch add.

That’s Google respecting their users, while making life a hell of a lot easier for me as an administrator of these mailing lists.

I’m happy to be helped out like that, because I don’t want to step on any toes. And these people are happy, because they have control of the way their data is used. That’s what I call “Awesome”.
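For the curious, the round trip we did is trivial with the CSV standard. Here’s a minimal Python sketch of that kind of export/import (the addresses and field names are invented; the real export came from our legacy mailing list software):

```python
import csv
import io

# Hypothetical subscriber list exported from the legacy mailing list.
subscribers = [
    {"email": "alice@example.com", "name": "Alice"},
    {"email": "bob@example.com", "name": "Bob"},
]

# Write the subscribers out as CSV -- the open standard that made
# moving between systems possible.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["email", "name"])
writer.writeheader()
writer.writerows(subscribers)

# The new system can then read the same CSV straight back in.
buffer.seek(0)
imported = list(csv.DictReader(buffer))
print(imported[0]["email"])  # alice@example.com
```

Same data, two systems, zero lock-in – which is the whole point.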

Data portability allows mashup for Australian bush fire crisis

Last night in Australia, a series of bush fires broke out in one of the states and have ravaged communities – survivors describe it as “raining fire” that came out of nowhere. As I write this, up to 76 people have been killed.

Victorian AU Fires 2009
Dave Hollis says the sky looked like something out of the movie ‘Independence Day’

An important lesson has come out of this. First, the good stuff.

Googler Pamela Fox has created an invaluable tool to display the bush fires in real time. Using Google technologies like App engine and the Maps API (which she is the support engineer for), she’s been able to create a mashup that helps the public.

She can do so because the Victorian fire department supports the open standard RSS. There are fires in my state of New South Wales as well – it appears states like NSW do support RSS for updates, but not in a consistent form the map can pull data from (which is why you won’t see any data on the map from there). It would be more useful if there was some consistency between states – refer to the discussion about the standards below.

For further information, you can read the Google blog post.

While the fire department’s RSS allows the portability of the data, it doesn’t include geocodes or a clear licence for use. That may not sound like a big deal, but the ability to contextualise a piece of information matters a hell of a lot in a case like this.

As a workaround, Pamela sent addresses through the Google geocoder to develop a database of addresses with latitude and longitude.
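A rough sketch of that kind of workaround, with a stand-in `geocode()` function in place of the real Google geocoder (the lookup table and coordinates here are purely illustrative):

```python
# A stand-in for a real geocoding service; in Pamela's case this was
# the Google geocoder. The lookup table is purely illustrative.
KNOWN_LOCATIONS = {
    "Bendigo VIC": (-36.76, 144.28),
    "Beechworth VIC": (-36.36, 146.69),
}

def geocode(address):
    """Return (latitude, longitude) for an address, or None if unknown."""
    return KNOWN_LOCATIONS.get(address)

def build_geo_database(addresses):
    """Geocode a batch of addresses, skipping any that fail to resolve."""
    database = {}
    for address in addresses:
        coords = geocode(address)
        if coords is not None:
            database[address] = coords
    return database

db = build_geo_database(["Bendigo VIC", "Unknown Place"])
print(db)  # {'Bendigo VIC': (-36.76, 144.28)}
```

If the feed carried geocodes in the first place, none of this batch-lookup step would be needed.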

GeoRSS and KML
In the geo standards world, two dominant standards enable the portability of this data. One is GeoRSS, an extension that allows an RSS feed to carry geodata. The other is Keyhole Markup Language (KML), a standard developed by Google. GeoRSS simply modifies RSS feeds to make them more useful, while KML is a standalone markup language, more akin to HTML.

If the CFA and the other websites had supported either of these standards, it would have made life a lot easier. Pamela has access to Google resources to translate the information into geocodes, and even she had trouble – geocoding the location data was the most time-consuming part of the map-making process.
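To make it concrete, here’s a hypothetical GeoRSS item of the kind the CFA could have published, parsed with Python’s standard library (the fire name and coordinates are made up):

```python
import xml.etree.ElementTree as ET

# A minimal, hypothetical GeoRSS feed item: a standard RSS <item>
# extended with a georss:point carrying "lat lon".
FEED = """<rss version="2.0" xmlns:georss="http://www.georss.org/georss">
  <channel>
    <item>
      <title>Grass fire - Bunyip State Park</title>
      <georss:point>-37.98 145.71</georss:point>
    </item>
  </channel>
</rss>"""

NS = {"georss": "http://www.georss.org/georss"}
root = ET.fromstring(FEED)
for item in root.iter("item"):
    title = item.findtext("title")
    point = item.findtext("georss:point", namespaces=NS)
    # The point element is just "latitude longitude", space-separated.
    lat, lon = (float(x) for x in point.split())
    print(title, lat, lon)
```

That’s all a mashup needs: a title and a pair of coordinates it can drop straight onto a map, no geocoding step required.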

The lessons
1) If you output data, output it in some standard structured format (like RSS, KML, etc).
2) If you want that data to be useful for visualisation, include both time and geographic information (latitude/longitude). Otherwise you’re hindering the public’s ability to use it.
3) Let the public use your data. The Google team spent some time ensuring they were not violating anything by using this data. Websites should be clearer about rights of usage, so that mashers can work without fear.
4) Extend the standards. It would have helped a lot if the CFA site had extended its RSS with some custom elements (in their own namespace) for the structured data about the fires. For example: <cfa:state>Get the hell out of here</cfa:state>.
5) Having all the fire departments using the same standards would have made a world of difference – build the mashup using one method and it is immediately useful for future crises.
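Lesson 4 only takes a few lines: here’s how a feed item could carry structured fire data in its own namespace, using Python’s standard library (the namespace URL and element names are invented for illustration):

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names -- the CFA never published
# these; this just shows what "extend the standards" could look like.
CFA_NS = "http://example.com/cfa-alerts"
ET.register_namespace("cfa", CFA_NS)

item = ET.Element("item")
ET.SubElement(item, "title").text = "Fire warning - Kinglake"
# Structured, machine-readable fields in the feed's own namespace,
# alongside the standard human-readable RSS elements.
ET.SubElement(item, f"{{{CFA_NS}}}status").text = "Evacuate now"
ET.SubElement(item, f"{{{CFA_NS}}}containment").text = "0"

xml_out = ET.tostring(item, encoding="unicode")
print(xml_out)
```

Any plain RSS reader would still show the title; a mashup that understands the extra namespace gets clean structured data for free.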

Pamela tells me that this is the fifth natural disaster she’s dealt with. Every time, there’s been an issue of where to get the data and how to syndicate it. Data portability matters most in natural disasters – people don’t have time to deal with scraping HTML (didn’t we learn this with Katrina?).

Let’s be prepared for the next time an unpredictable crisis like this occurs.

Information age companies losing out due to industrial age thinking

Last weekend, I participated in Startup Camp Sydney II, a straight 24-hour hackathon to build and launch a product (in my case, Activity Horizon). Ross Dawson has written a good post about the camp if you are interested in that.

activity horizon
It’s been a great experience (still going – send us your feedback!) and I’ve learned a lot. But something really struck me that I think should be shared: how little has changed since the last start-up camp, and how stupid companies are. But first, some background.

The above-mentioned product we launched is a service that allows people to discover events and activities they would be interested in. We have a lot of thoughts on how to grow this – and I know for a fact that finding new things to do in a complex city environment, as time-poor adults, is a genuine issue people often complain about. As Mick Liubinskas said, “Matching events with motivation is one of the Holy Grails of online businesses”, and we’re building tools that allow people to filter events with minimal effort.

ActivityHorizon Team

So as “entrepreneurs” looking to create value in an artificial petri dish, we recognised that existing events services didn’t do enough to filter events with user experience in mind. By pulling data from other websites, we have created a derivative product that creates value without necessarily hurting anyone. Our value proposition comes from a simple user experience (more in the works once the core technology is set up), and we are more than happy to access data from other providers in the information value chain on the terms they want.

The problem is that they have no terms! The concept of an API is one of the core aspects of the mashup world we live in, firmly entrenched within the web’s culture and ecosystem. It’s something I believe is a dramatic way forward for the evolution of the news media, and a complementary trend in building the vision of the semantic web. However, nearly all the data we have wasn’t obtained through an API that could regulate the way we use it; instead, we’ve had to scrape it.

Scraping is a method of telling a computer how data is structured on a web page, so that you can then ‘scrape’ data out of that template presentation on a website. It’s a bit like highlighting words in a word document that share a certain characteristic, and pulling all the words you highlighted into your own database. Scraping has a negative connotation, as people are perceived to be stealing content and re-using it as their own. The truth of the matter is that additional value gets generated when people ‘steal’ information products: data is an object, and connecting it with other objects – those relationships – is what creates information. The potential to create unique relationships with different data sets means no two derivative information products are the same.
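A toy scraper illustrates the ‘highlighting’ analogy, using only Python’s standard library (the page layout and class name are invented):

```python
from html.parser import HTMLParser

# A toy events page: each event title sits in an element with
# class="event-title". The layout is entirely made up.
PAGE = """<html><body>
  <h2 class="event-title">Jazz in the Park</h2>
  <p>Free entry, Sunday 2pm.</p>
  <h2 class="event-title">Harbour Film Night</h2>
</body></html>"""

class EventScraper(HTMLParser):
    def __init__(self):
        super().__init__()
        self._capturing = False
        self.events = []

    def handle_starttag(self, tag, attrs):
        # "Highlight" only the elements that match our template rule.
        if ("class", "event-title") in attrs:
            self._capturing = True

    def handle_data(self, data):
        if self._capturing:
            self.events.append(data.strip())
            self._capturing = False

scraper = EventScraper()
scraper.feed(PAGE)
print(scraper.events)  # ['Jazz in the Park', 'Harbour Film Night']
```

Note how brittle this is compared to an API: change the class name in the template and the scraper silently breaks – which is exactly why sites should publish terms and a proper feed instead.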

So why are companies stupid
Let’s take, for example, a site that sells tickets and lists information about them. If you are not versed in the economics of data portability (which we are trying to spread with the DataPortability Project), you’d think that if Activity Horizon is scraping ‘their’ data, that’s a bad thing, as we are stealing their value.

WRONG!

Their revenue model is based on people buying tickets through their site. So by us reusing their data and creating new information products, we are actually creating more traffic, more demand, more potential sales. By opening up their data silo, they’ve actually opened up more revenue for themselves. And by opening up their data silo, they not only control the derivatives better but they can reduce the overall cost of business for everyone.

Let’s use another example: a site that aggregates tickets and doesn’t actually sell them (ie, their revenue model isn’t through transactions but attention). Activity Horizon could appear to be a competitor, right? Not really – because we are pulling information from them (just as they pull information from the ticket providers). We’ve extracted and created a derivative product that brings a potential audience to their own website. It’s repurposing information in another way, for a different audience.

The business case for open data is something I could spend hours talking about. But it all boils down to this: data is not like physical objects. Scarcity does not determine the value of data like it does with physical goods; value out of data and information comes through reuse. The easier you make it for others to reuse your data, the more success you will have.

Facebook needs to be more like the Byzantines

Chris Saad wrote a good post on the DataPortability Project’s (DPP) blog about how the web works on a peering model. Something we do at the DPP is closely monitor the market’s evolution, and having done this actively for a year now as a formal organisation, I feel we are at the cusp of a lot more exciting times to come. These are my thoughts on why Facebook needs to alter its strategy to stay ahead of the game – and by implication, so does everyone else trying to innovate in this sphere.

Let’s start by describing the assertion that owning data is useless, but access is priceless.

It’s a bold statement, and you might need to do some background reading to understand my point of view (link above). However, once you understand it, all the debates about who “owns” what data suddenly become irrelevant. Access, just like ownership, is possible thanks to a sophisticated society that recognises people’s rights. Our society has now got to the point where ownership matters less for the realisation of value, because we have things in place to do more through access.

Accessonomics: where access drives value
Let’s use an example to illustrate the point with data. I am on Facebook, MySpace, Bebo, hi5, Orkut, and dozens of other social networking sites that have a profile of me. Now what happens if all of those social networking sites have different profiles of me? One when I was single, one when I was in a relationship, another engaged, and another “it’s complicated”.

If they are all different, who is correct? The profile I last updated, of course. With the exception of your birth date, any data about you can change in the future. There is nothing ‘fixed’ about someone, and “owning” a snapshot of them at a particular point in time is exactly that: a snapshot. Our interests change, as do our closest friends and our careers.

Recognising the time dimension of information means that unless a company has the most recent data about you, it is effectively carrying dead weight and giving itself a false sense of security (and a false valuation). Facebook’s $3 billion market value is not in the data it held in June 2008, but in the people it has access to – and through that access, the latest version of their data. Sure, they can sell advertisers specific information to target ads, but “single” in May is not as valuable as “single” in November (and even less valuable than single for May and November, but not the months in between).

Facebook Connect and the peering network model
The announcements by Facebook in the last month have been nothing short of brilliant (and when it’s the CEO announcing, it clearly flags a strategic move for their future, not just some web developer fun). What they have created with their Facebook Connect service is shaking up the industry, as they do a dance with Google following the announcement of OpenSocial in November 2007. That’s because what they are doing is creating a permanent relationship with the user, following them around the web in their activities. This network business model means constant access to the user. But the mistake is equating access with ownership: ownership is a permanent state, while access is dependent on a positive relationship – and relationships are not permanent. When something is not permanent, you need strategies to ensure relevance.

When explaining data portability to people, I often use the example of data being like money. Storing your data in a bank allows you better security to house that data (as opposed to under your mattress) and a better ability to reuse it (ie, with a theoretical debit card you can use data about your friends, for example, to filter content on a third-party site). This Facebook Connect model very much appears to follow this line of thinking: you securely store your data in one place, and then you can roam the web with the ability to tap into that data.

However, there is a problem with this: data isn’t the same as money. Money is valuable because of scarcity in the supply system, whilst data becomes valuable through reuse and the creation of derivatives. We generate new information by connecting different types of data together – which, by definition, is how information gets created. Our information economy allows alchemists to thrive: people who can generate value through their creativity in meshing different (data) objects.

Thinking about the information value chain, Facebook would benefit more from being connected to other hubs than from having all activity go through it. Instead of data being stored in the one bank, it’s actually stored across multiple banks (as a person, it probably scares you to store all your personal information with one company: you’d split it if you could). What you want to do as a company is have access to this secure EFT ecosystem. Facebook can access data that flows between other sites because it is party to the same secured transfer system, even though it had nothing to do with generating the information.

Facebook needs to remove itself from being a central node and instead become a linked-up node. The node with the most relationships with other sites and hubs wins, because the more data at your hands, the more potential you have of connecting dots to create unique information.

Facebook needs to think like the Byzantines
A lot more can be said on this, and I’m sure the testosterone within Facebook thinks it can colonise the web. What I am going to conclude with is that you can’t fight the inevitable, and this EFT system is effectively being built around Facebook with OpenSocial. The networked peer model will triumph – the short history and inherent nature of the Internet proves that. Don’t mistake short-term success (ie, five years in the context of the Internet) for the long-term trends.

There was once a time when people thought MySpace was unstoppable. Microsoft unbeatable. IBM unbreakable. No empire in the history of the world has lasted forever. What we can do, however, is learn the lessons of those that lasted longer than most, like the forgotten Byzantine empire.

Also known as the eastern Roman empire, it’s been given a separate name by historians because it outlived its western counterpart by over 1000 years. How did they last that long? Through diplomacy and avoiding war as much as possible. Rather than buying weapons, they bought friends, and ensured they had relationships with those around them who had it in their self-interest to keep the Byzantines in power.

Facebook needs to ensure it stays relevant in the entire ecosystem and does not become a barrier. It is a cashed-up business in growth mode with the potential to be the next Google in terms of impact – but let’s put emphasis on “potential”. Facebook has competitors that are cash-flow positive, have billions in the bank and, most importantly of all, are united in their goals. It can’t afford to fight a colonial war of capturing people’s identities, and it shouldn’t think it needs to.

Trying to be the central node of the entire ecosystem by implementing their own proprietary methods is an expensive approach that will ultimately be beaten one day. However, building a peered ecosystem where you can access all the data is very powerful. Facebook just needs access, as it can create value through its sheer resources to generate innovative information products: that, not lock-in, is what will keep them out in front.

Just because it’s a decentralised system doesn’t mean you can’t rule it. If all the kids on a track are wearing the same special shoes, that doesn’t mean everyone runs the same time in the 100-metre dash. They call the patriarch of Constantinople even to this day “first among equals” – an important figure who worked in parallel to the emperor’s authority during the empire’s reign. And it’s no coincidence that the Byzantines outlived nearly all empires known to date – an empire which, even to this day, arguably still exists in spirit.

Facebook is not going to change its strategy, because its short-term success and perception of dominance blind it. But that doesn’t mean the rest of us need to make the same mistake. Pick your fights: realise the business strategy of being a central node will create more heartache than gain.

It may sound counter intuitive but less control can actually mean more benefit. The value comes not from having everyone walk through your door, but rather you having the keys to everyone else’s door. Follow the peered model, and the entity with the most linkages with other data nodes, will win.

Let’s kill the password anti-pattern before the next web cycle

I’ve just posted an explanation on the DataPortability Blog about delegated authentication and the open standard OAuth. I give poor Twitter a bit of attention by calling them irresponsible – which their password anti-pattern is; a generic example being sites that force people to give up the password to their e-mail account to get functionality like finding your friends on a social network. But with their leadership, they can be a pin-up example we promote going forward, well placed in this rapidly evolving data portability world. I thought the news would have calmed down by now, but new issues have come to light, further highlighting the importance of some security.
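To show why delegated authentication matters, here’s a toy Python model contrasting the password anti-pattern with an OAuth-style token (everything here – accounts, tokens, scopes – is made up; real OAuth involves signed requests and a redirect dance):

```python
# A toy model of why delegated authentication beats the password
# anti-pattern. All names and tokens below are invented.

EMAIL_ACCOUNTS = {"you@example.com": {"password": "hunter2",
                                      "contacts": ["friend@example.com"]}}

# The anti-pattern: the third party holds your real password and can
# do *anything* with your account, forever (or until you change it).
def antipattern_find_friends(email, password):
    account = EMAIL_ACCOUNTS[email]
    assert account["password"] == password  # full credentials handed over
    return account["contacts"]

# Delegated auth (the OAuth idea): you approve a scoped, revocable
# token; the third party never sees your password.
TOKENS = {}

def grant_token(email, scope):
    token = f"token-for-{email}-{scope}"  # a real system issues a random secret
    TOKENS[token] = (email, scope)
    return token

def delegated_find_friends(token):
    email, scope = TOKENS[token]
    if scope != "read-contacts":
        raise PermissionError("token not valid for this action")
    return EMAIL_ACCOUNTS[email]["contacts"]

token = grant_token("you@example.com", "read-contacts")
print(delegated_find_friends(token))  # ['friend@example.com']
```

Both paths return the same contact list, but only one of them leaves a stranger holding the keys to your inbox.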

With the death of Web 2.0, the next wave of growth for the Web (other than ‘faster, better, cheaper’ tech for our existing communications infrastructure) will come from innovation on the data side. Heaven forbid another blanket term for this next period – which I believe we will see rise when Facebook starts monetising and preparing for an IPO – but all existing trends outside of devices (mobile) and visual rendering (the 3D Internet) seem to point to this. That is, innovation in machine-to-machine technologies, as opposed to the people-to-machine and people-to-people technologies we have seen to date. The others have been done and are being refined; machine-to-machine is so big it’s a whole new world we’ve barely scratched the surface of.

But enough about that, because this isn’t a post on the future – it’s on the present, and how pathetic current practices are. I caught up with Carlee Potter yesterday – she’s a young Old Media veteran who, inspired by the Huffington Post, wants to pioneer New Media (go support her!). Following on from our discussion, she writes in her post that she is pressured by her friends to add applications on services like Facebook. We started talking about this massive cultural issue that is now being exported to the mainstream, where people freely give up personal information – not just the apps accessing it under Facebook’s control, but their passwords to add friends.

I came to the realisation of how pathetic this password anti-pattern is. I am very aware that I don’t like various social networking sites asking me for private information like my e-mail account credentials, but I had forgotten how used I’ve become to this situation that’s forced on us (ie, giving up our e-mail account password to get functionality).

Arguments that ‘make it OK’ claim these types of situations are low risk (ie, communication tools). I completely disagree – reputational risk is not something easily measured (unlike financial risk, which has money to quantify it) – but that’s not the point: it’s contributing to a broader cultural acceptance that if we have some trust in a service, we will give it personal information (like passwords to other services) so we can get increased utility out of that service. That is just wrong, and whilst the data portability vision is about getting access to your data from other services, it needs to be done whilst respecting your privacy and that of others.

Inspired by Chris Messina, I would like to see us all agree on making 2009 the year we kill the password anti-pattern. As we now sow the seeds for a new evolution of the web and Internet services, let’s ensure we’ve got things like this right. In a data web where everything is interoperable, the password anti-pattern is not a culture that bodes well for us.

They say privacy is dead. Well, it only is if we let it die – and this is certainly one simple thing we can do to control how personal information about ourselves gets used by others. So here’s to 2009: the year we seek the eradication of the password anti-pattern virus!

Blog posts on Liako.Biz for 2008

I launched this blog in March 2005 as a travel blog. People would flood me with e-mails about my travels, and it made me realise how powerful blogging can be (not to mention fun). I re-started this blog in March 2007 as a "career" blog (whatever that’s supposed to mean). It’s probably best now to describe it as my "passions" blog which evolves as I progress through life and think about things.

What I love about blogging is that it forces me to think; forces me to research and learn; forces me to challenge my ideas by interacting with other people. All the good stuff in life – I hope to give a bit more attention next year.

I also thought it would be good if I summarised what I wrote about this year. Heck – let’s go right back to 2005. This will be the first in a series of three blog posts progressively released – starting with 2008 today, then 2007 tomorrow and finally 2005 two days later. For those that may post comments, bear with me as I have literally 24 hours of New Year’s concerts to attend to (get home at 6am from Shore Thing, ready for Field day at 11am). It may take me some time to recover and get back on a computer!

I’ve given you a brief summary of each to guide you on whether you should make the great leap and click. I was going to rank my articles with a simple "good, average, poor", but I ended up getting stuck reading some, and I think 90% are more or less the same style (so I am either consistently crap or consistently good).

Enjoy!

December 2008

  • A milestone year in my life: Basically, a mini biography of my career. The decisions I made and the experiences I’ve had that will determine where I will be heading
  • The evolution of news and the bootstrapping of the Semantic Web: Highlighting how the New York Times is making available news data in the form of API’s. The significance of this in my eyes is a huge shift in the evolution of the news media, and separately, I mention that this might make the vision of the Semantic Web a reality in an unintended way
  • Thank you 2008, you finally gave New Media a name: I indicate how 2008 was the tipping point for the Information Age’s Social Media to finally trump the Industrial Age’s Mass Media. I researched the history of the concept of Social Media, explained what "media" really is, and how the term "Social Media" is the perfect term to describe what we’ve been calling these evolving communication trends.
  • The makings of a media mogul: Michael Arrington of TechCrunch: A detailed analysis of how a nobody became one of the most influential men in the world as a New Media pioneer. Mr Arrington even thanked me!
  • The future of journalism and media: A look at the Watergate scandal as well as my own personal experience with a university publication, to understand the core dynamic of the media. I argue that what made the mass media tick in the past was a marketplace, and it’s one that can be applied to digital media going forward.
  • So open it’s closed: I make an argument that the term "Open" is being abused and has lost its meaning. We need better guidelines on what constitutes an "Open Standard" before it becomes too late.
  • Social media and that whole “friend” thing: A post about how there is pressure to subscribe to people’s content on various services, even when you don’t want to receive their content. The result is an unusable service. I reflect on how Google Reader’s friends option is a simpler but more effective way of doing social media, as it removes this pressure.

November 2008

  • The broken business model of newspapers: An analysis of problems with the newspaper industry – too much detail in articles creates extra cost, changes to the news cycle have changed their relevance, and incentives and structures are not aligned with what their strategic goals should be
  • Online advertising – a bubble: Long detailed analysis on why advertising is basically screwed in the long term (thanks to the Internet)
  • Liako is everywhere… but not here: Some links to content I have been creating elsewhere, as this blog had been neglected!
  • The Rudd Filter: I wrote an e-mail to every senator of the Australian parliament on the proposed Internet censorship laws. As a postscript, it made an impact as I got responses from the key people who are the balance of power in the Senate 🙂
  • You don’t nor need to own your data: We live in an economy now where you don’t need "ownership" to live your life. This will certainly make you think!

October 2008

  • The mobile 3D future – as clear as mud: Recounting my experience going from the iPhone 1G to the Nokia N96 and back to the iPhone (3G). I conclude that the reason we never got the vision of the mobile web in the past is that the interface had been the missing link for so long

July 2008

  • Silicon Beach Australia – the movie!: An announcement post for the Silicon Beach Australia community which exploded in interest after I created it
  • The DataPortability governance framework: a template: An update, history and recognition post for the many months of hard work by the team that created the governance and workflow model for the DataPortability Project. It was a challenge because existing models aren’t designed for the online, virtual world we operate in.
  • Internet censorship in Australia: The responses from the Federal government to my letter, sent six months earlier, protesting against the proposed Internet censorship regime

The evolution of news and the bootstrapping of the Semantic Web

The other month (as in, one of those where I am working 16-hour days and don’t have time to blog), I read in amazement a stunning move made by the New York Times: the announcement of its first API, where you could query campaign finance data. It turns out this wasn’t an isolated incident, as evidenced by yet another API release, this time for movies, with plenty more to come.

That is massive! Basically, using the same data, people will be able to create completely different information products.
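To make that point concrete, here is a minimal sketch (the candidate names and dollar figures are entirely made up, and no real API is being called) of how a single structured dataset, like the output of a campaign finance API, can be turned into two completely different information products:

```python
# Sketch: one structured dataset, two different information products.
# All records below are illustrative, not real campaign finance data.
campaign_finance = [
    {"candidate": "Smith", "donor_state": "NY", "amount": 500},
    {"candidate": "Smith", "donor_state": "CA", "amount": 1200},
    {"candidate": "Jones", "donor_state": "NY", "amount": 300},
]

def totals_by_candidate(records):
    """Product 1: a fundraising league table."""
    totals = {}
    for r in records:
        totals[r["candidate"]] = totals.get(r["candidate"], 0) + r["amount"]
    return totals

def totals_by_state(records):
    """Product 2: a geographic breakdown of the very same data."""
    totals = {}
    for r in records:
        totals[r["donor_state"]] = totals.get(r["donor_state"], 0) + r["amount"]
    return totals

print(totals_by_candidate(campaign_finance))  # {'Smith': 1700, 'Jones': 300}
print(totals_by_state(campaign_finance))      # {'NY': 800, 'CA': 1200}
```

The data provider writes none of this analysis code – third parties do, and each one can slice the same records into a product the provider never imagined.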

I doubt the journalists toiling away at the Times have any idea what this will do to their antiquated craft (validating my view that to see the future of media, you need to track technology). As the switched-on Marshall Kirkpatrick said in the above-linked article for ReadWriteWeb: "We believe that steps like this are going to prove key if big media is to thrive in the future."

Hell yeah. The web has now evolved beyond ‘destination’ sites as a business model. News organisations need to harness the two emerging business models – platforms and networks. Whilst we’ve seen lots of people trying the platform model (as aggregators – after all, that is what a traditional newspaper has been in society), this is the first real example I have seen of the heritage media doing the network model. The network model means your business thrives by people using *other* people’s sites and services. It sounds counterintuitive, but it’s the evolution of the information value chain.

This will certainly make Sir Tim Berners-Lee happy. The Semantic Web is a vision that information on the web is machine readable, so that computers can truly unleash their power. However, this vision is gaining traction very slowly. We will get there, but I wonder whether the way we get there is not how we expect.

The New Improved Semantic Web: now with added meaning!

These APIs, which allow web services’ data to be reused in a structured way, may be just what the Semantic Web needs to bootstrap it. There’s an assumption in the vision that for it to work, all data needs to be open and publicly accessible. The economics are just not there yet for companies to unlock their data, and my work this year with the DataPortability Project has made me realise that to get value out of your data you simply need access to it (which doesn’t necessarily mean public data).

Either way, for me this was one of the biggest news events of the year, and one that has passed by very quietly. It will certainly be worth tracking in 2009 as we see the evolution of not just the Semantic Web, but also Social Media.

So open it’s closed

The DataPortability Project has successfully promoted the concept of “data portability” in 2008. However, it’s become too successful – people now make announcements that claim to be “data portability” but misleadingly are not. Further, the term “Open” has become the new black. But really, when people say they are open – are they?

Status update on the DataPortability Project & context
The DataPortability Project has now developed a strong, transparent governance model for making decisions, one that embeds a process to achieve outcomes. We have also formulated our vision, which forms the core DNA of the Project and allows us to align our efforts. Organisationally, we are currently working on a legal entity to protect our online community, and we are doing this whilst also ensuring we work with others in the industry, such as through the discussions we’ve had within the IDTBD proposal with Liberty Alliance, Identity Commons and others.

Our brand communications are nearly finalised (this time, legally vetted), and a refreshed website with a new blog has been rolled out. We’ve put out calls for positions and have already finalised our agreement with a new community manager. (Positions for our analyst roles are now open, if you are interested.)

We have a Health Care task force that’s just started, looking to broaden our work into another sector of the economy. We also have a Service Provider Grid task force finalising its work, which, via an online interface and API, will allow people to query which open standards various entities use. We also have a task force that will provide sample EULA and ToS documents that encourage data portability and further our vision.
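To give a feel for what such a queryable grid might look like, here is a minimal sketch – the provider names and the standards attributed to them are entirely hypothetical, not the task force’s actual data or API:

```python
# Hypothetical grid: which open standards each service provider supports.
grid = {
    "ExampleSocialSite": {"OpenID", "OAuth", "APML"},
    "ExamplePhotoHost": {"OpenID", "Atom"},
}

def supports(provider, standard):
    """Does a given provider support a given standard?"""
    return standard in grid.get(provider, set())

def providers_supporting(standard):
    """The reverse query: who has adopted this standard?"""
    return sorted(p for p, standards in grid.items() if standard in standards)

print(providers_supporting("OpenID"))  # ['ExamplePhotoHost', 'ExampleSocialSite']
print(supports("ExamplePhotoHost", "OAuth"))  # False
```

The value of the grid is exactly these two queries: a user (or their software) can check a provider’s standards support before entrusting it with data.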

The DataPortability vision states that people should be able to reuse their data. Traditionally, people have said this means “physically” porting their data between web services. Whilst this applies in some cases, it is also about access, as I recently argued.

So, to complement our work on the EULA/ToS task force, I believe we need a technology equivalent, one which will give additional value to our Service Provider Grid. This is because Open Standards underpin our vision, and we need to ensure we only support efforts that we believe are worthy.

Hi, I’m open
Open Standards have been a core value that the DataPortability Project has advocated since its founding, to the point where they have even been confused with its core mission (they’re not). For us, they are an enabler – and it has always been in our interest to see all of them work together.

Standards are important because they allow interoperability. For people to be able to access their data from multiple systems, we need those systems to be able to communicate with each other easily. Likewise, for people to get value out of any data they export from a system, they need to be able to import it – and this can only occur if the data is structured in a way that is compatible with the other system.
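As a toy illustration of that export/import round trip, here is a sketch (all the field names and systems are hypothetical) of two services with different internal schemas interoperating through a shared, agreed interchange format:

```python
import json

# System A stores a contact using its own internal field names.
system_a_contact = {"fullName": "Ada Lovelace", "emailAddr": "ada@example.com"}

def export_contact(contact):
    """System A serialises to the agreed interchange format (plain JSON here)."""
    return json.dumps({"name": contact["fullName"], "email": contact["emailAddr"]})

def import_contact(payload):
    """System B maps the interchange format onto its own, different schema."""
    data = json.loads(payload)
    return {"display_name": data["name"], "mail": data["email"]}

# The round trip: data leaves one system and lands usable in another.
ported = import_contact(export_contact(system_a_contact))
print(ported)  # {'display_name': 'Ada Lovelace', 'mail': 'ada@example.com'}
```

Neither system needs to know the other’s internals – only the interchange format. That is the whole argument for standards in one line of code.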

We advocate “Open” because we want to minimise the cost of business for those wanting to comply with our vision. However, during 2008 the term "Open Standards" has been overused, to the point of abuse.

An open standard is a standard that is publicly available and has various rights of use associated with it. But really, what’s open?
– Its availability?
– The authority controlling the standard?
– The decision-making process over the standard?

Liberty Alliance defines it as:

– The costs for the use of the standard are low.
– The standard has been published.
– The standard is adopted on the basis of an open decision-making procedure.
– The intellectual property rights to the standard are vested in a not-for-profit organisation, which operates a completely free access policy.
– There are no constraints on the re-use of the standard.

That, I believe, perfectly encapsulates what an Open Standard should be. However, as someone who spends his days applying international accounting standards to what companies report in their financials, I can assure you that simply listing the criteria is only half the fun. Interpreting them is a whole debate in itself.

In my eyes, most of these "open" efforts don’t fit those criteria. To illustrate, I am going to shame myself, as I am a member of a workgroup that claims to be open: the APML workgroup. The group fails the open test because:
– it has a closed workgroup that makes the decisions, without a clearly defined decision-making procedure
– it does not have a non-profit behind it, with the copyright owned by a company (although it’s made clear there is no intention to issue patents)
– it has no clear rights attached to it
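Those five Liberty Alliance criteria can be expressed as a simple checklist. The sketch below is my own encoding of them (the flag names are mine, not any official vocabulary), applied to APML as I’ve just described it:

```python
# The five Liberty Alliance criteria quoted above, as boolean flags.
CRITERIA = [
    "low_cost",              # costs for use of the standard are low
    "published",             # the standard has been published
    "open_decision_making",  # adopted via an open decision-making procedure
    "nonprofit_ipr_holder",  # IP rights vested in a not-for-profit, free access
    "unconstrained_reuse",   # no constraints on re-use of the standard
]

def is_open(standard):
    """A standard is 'open' only if it satisfies every criterion."""
    return all(standard.get(c, False) for c in CRITERIA)

def failures(standard):
    """Which criteria does it fail?"""
    return [c for c in CRITERIA if not standard.get(c, False)]

# APML as described in this post: free and published, but with a closed
# workgroup, company-held copyright and no clear rights of use.
apml = {
    "low_cost": True,
    "published": True,
    "open_decision_making": False,
    "nonprofit_ipr_holder": False,
    "unconstrained_reuse": False,
}

print(is_open(apml))   # False
print(failures(apml))  # ['open_decision_making', 'nonprofit_ipr_holder', 'unconstrained_reuse']
```

Of course, as I said above, the real fun is in interpreting each flag – code only makes the checklist explicit, not the judgment behind each True or False.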

So does that mean every standards group needs to create a legal entity for it to be open? Thankfully no – the Open Web Foundation (OWF) will solve this problem. Or will it? Whilst the decision-making process is "open" (you can read the mailing list where the discussion occurs), what about the way it selects members? It’s dependent on being invited. That’s Open with a big But.

How about OpenID (of which I am also a member) – the poster child for "Open Standards". On the face of it, it fits the bill. But did you know OpenID contains other standards as part of it? As my friend and intellectual mentor Steve Greenberg said:

[Image: Steve Greenberg’s comment pointing out that OpenID depends on the XRDS standard]

Now thankfully, XRDS fits the bill as a safe standard. Well, kind of. It has links to another standard, XRI, which it is alleged is subject to patent claims. Well, sort of. Kinda. Oh God, let’s not get into a discussion about this again. But don’t give poor APML, the OWF or OpenID too much grief – I could raise some nastier questions, especially at other groups. However, this isn’t about shaming – rather, it’s about raising questions.

The standards communities are fraught with politics, they are murky, and they are now creeping into the infrastructure of our online world. As a proponent of these "Open Standards", I think it’s time we started looking at them with a more critical eye. Yes, I recognise all the questions I’m raising are fixable, but that’s exactly why I want to raise them: they are currently being swept under the carpet outside of traditional authorities like the W3C.

It’s time some boundaries were set on what is effectively the brand of Open. It’s also time the term was defined, because quite frankly, it’s lost all meaning now. I’ve listed some criteria – but what we really need is consensus on what ‘the’ criteria for Open should be.

You don’t nor need to own your data

One of the biggest questions the DataPortability Project has grappled with (and where the entire industry has not reached consensus) is a fairly basic question with some profound consequences: who owns your data? Well, I think I have an answer to the question now, which I’ve cross-validated across multiple domains. Given we live in the Information Age, this matters in every respect.

So who owns “your data”? Not you. Or the other guy. Or the government, and the MicroGooHoo corporate monolith. Actually, no one does. And if they do, it doesn’t matter.

People like to conflate the concept of property ownership with that of data ownership. I mean, it’s you, right? You own your house, so surely you own your e-mail address, your name, your date-of-birth records, your identity. However, when you go into the details, at a conceptual level it doesn’t make sense.

Ownership of data
First of all, let’s define property ownership: “the ability to deny use of an asset by another entity”. The reason you can claim to own your house is that you can deny someone else access to your property. Most of us have a fence to separate our property from the public space; others, like the hillbillies, sit in their rocking chair with a shotgun ready to fire. Either way, it’s well understood when someone else owns something, and if you trespass, the dogs will chase after you.


The characteristics of ownership can be described as follows:
1) You have legal title recognising, in your legal jurisdiction, that you own it.
2) You have the ability to enforce your right of ownership in your legal jurisdiction.
3) You can get benefits from the property.

The third point is key. When people cry out “I own my data”, that’s essentially the reason (when you take the Neanderthal, emotionally driven reasoning out of the equation). Where we get a little lost, though, is when we define those benefits. It could be said that you want to be able to control your data so that you can use it somewhere else, and so you can make sure someone else doesn’t use it in a way that causes you harm.

Whilst that might sound like ownership to you, that’s where the house of cards collapses. Unless you can prove the ability to deny use by another entity, you do not have ownership. It’s a trap, because data is not like a physical good, which cannot be easily copied. It’s like a butterfly locked in a safe: the moment you open that safe up, you can say goodbye. If data can only satisfy the ownership definition when you hide it from the world, that means when it’s public to the world, you no longer own it. And that sucks, because data by nature is used for public consumption. But what if you could get the same benefits of ownership – or rather, receive benefits of usage and regulate usage – without actually ‘owning’ it?

Property and data – same same, but different
Both property and data are assets. They create value for those who use them. But that’s where the similarities end.

Property gains value through scarcity. The more unique, the more valuable. Data, on the other hand, gains value through reuse. The more derivative works built off it, the more information generated (as information is simply data connected with other data). The more information, the more knowledge, the more value created – working its way along the information value chain. If data is isolated and not reused, it has little value. For example, if a company has a piece of data but is not allowed to ever use it, there is no value in it.

Data gains value through use, and additional value through reuse and derivative creations. If no one reads this blog, it’s a waste of space; if thousands of people read it, its value increases – as these ideas are disseminated. To give one perspective on this, when people create their own posts reusing the data I’ve created, I generate value through them linking back to me. No linking, no value realised. Of course, I get a lot more value out of it beyond PageRank juice, but hopefully you realise that if you “steal” my content (with at least some acknowledgement of me, the person), then you are actually doing me a favour.

Ignore the above!
Talking about all this ownership stuff doesn’t actually matter; it’s not ownership that we want. Let’s take a step back, and look at this from a broader, philosophical view.

Property ownership is based on the concept that you get value from holding something for an extended period of time. But in an age of rapid change, do you still get value from that? Let’s say we lose the Holy War for people being able to ‘own’ their data. Facebook – you win – you now ‘own’ me. It owns the data about me – my identity, it would appear, is under the control of Facebook – it now owns the fact that “I am in a relationship”. The Holy War might have been lost, but I don’t care, because Facebook owns crap: six months ago, I was in a relationship. Now I’m single and haven’t updated my status. The value for Facebook is not in owning me at a point in time: it’s in having access to me all the time – because one way it translates that data into value is advertising, and targeting ads is pointless if you have the wrong information to base your targeting on. Probably the only data that can be static in my profile is birth date and gender – but with some tampering and cosmetics, even those can be altered now!


Think about this point, raised by Luk Vervenne in response to my above thoughts on the VRM mailing list, by considering employability. A lot of your personal information is actually generated by interactions with third parties, such as the education institution you received your degree from. So do I own the fact that I have a Bachelor of Commerce from the University of Sydney? No, I don’t, as that brand and its authenticity are the university’s. What I do have, however, is access and usage rights to it. Last time I checked, I didn’t own the university, but if someone quizzes me on my academic record, there’s a hotline ready to confirm it – and once validated, I get the recognition that translates into a benefit for me.

Our economy is now transitioning from a goods-producing to a service-performing and experience-generating one. It’s hard for us to imagine this new world, as our conceptual understanding is built on selling, buying and otherwise trading goods in a way that ultimately ends with us owning something. But this market era of the exchange of goods is making way for “networks”, and the concept of owning property will diminish in importance, as our new world will place value on access instead.

This is a broader shift. As a young man building his life, I cannot afford to buy a house in Sydney with its overinflated prices. But that’s fine – I am comfortable in renting – all I want is ‘access’ to the property, not the legal title to it which quite frankly would be a bad investment decision even aside from the current economic crisis. I did manage to buy myself a car, but I am cursing the fact that I wasted my money on that debt which could have gone to more productive means – instead, I could have just paid for access to public transport and taxis when I needed transport. In other words, we now have an economy where you do not need to own something to get the value: you just need access.

That’s not to say property ownership is a dead concept – rather, it’s become less important. When we consider history, the concept of the masses “owning” property was foreign anyway – there was a class system, with a small but influential aristocracy that owned the land and serfs who worked on it. “Ownership”, really, is a recently established concept in our world – and it’s now ready to go out of vogue again. We’ve reached a level of sophistication in our society where we no longer need the security of ownership to get the benefits in our life – and the property owners we get our benefits from may appear to wield power, but they also carry a lot of financial risk, government accountability and public scrutiny (unlike history’s aristocracy).


Take a look at companies and how they outsource a lot of their functions (or even simplify their businesses’ value activities). Every single client of mine – multi-million-dollar businesses at that – pays rent. They don’t own the office space they are in; to get the benefits, they simply need access, which they get through rental. “Owning” the property is not part of the core value of the business. Whilst security is needed – not having ownership can put you at the mercy of the landlord – this doesn’t mean you can’t contract protection, like my clients do as part of their lease agreements.

To bring it back to the topic: access to your data is what matters, but even that needs to be carefully understood. For example, open access to your health records might not be a good thing; rather, what you want is control over who has access to that data. Similarly, whilst no one might own your data, what you do have is the right to demand guidelines and principles – like what we are trying to do at the DataPortability Project – on how “your” data can be used. Certainly, the various governmental privacy and data protection laws around the world do exactly that: they govern how companies can use personally identifiable data.
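That distinction – controlling access rather than owning the bytes – can be sketched in a few lines. Everything here is illustrative (the names, the health record, the grant model), not a description of any real system:

```python
# Sketch: the record holder (e.g. a clinic) keeps the data; the data subject
# never "owns" the bytes, but controls who may read them via grants.
class Record:
    def __init__(self, subject, data):
        self.subject = subject
        self.data = data
        self.grants = {subject}  # the subject can always read their own record

    def grant(self, actor, party):
        """Only the data subject may extend access to another party."""
        if actor != self.subject:
            raise PermissionError("only the data subject can grant access")
        self.grants.add(party)

    def read(self, party):
        """Anyone without a grant is denied, holder and subject aside."""
        if party not in self.grants:
            raise PermissionError(f"{party} has no access")
        return self.data

record = Record("elias", {"blood_type": "O+"})
record.grant("elias", "dr_smith")     # the subject extends access to a doctor
print(record.read("dr_smith"))        # {'blood_type': 'O+'}
# record.read("insurer") would raise PermissionError: no grant was given
```

Note that "ownership" never appears in the model – the benefits flow entirely from who holds access rights, which is exactly the argument of this post.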

Incomplete thoughts, but I hope I’ve made you think. I know I’m still thinking.
