Frequent thinker, occasional writer, constant smart-arse


Three startups in 24 hours – lessons in the costs of innovation

I’m sitting here at Start-up Weekend, a concept instigated by Bart Jellema and his partner Kim Chen as an experiment in bringing the Australian tech community together – part of the broader effort several of us are pursuing to build Silicon Beach in Australia.

I’ve dropped in on the tail-end, and it’s amazing to see what happened. Literally, in the space of 24 hours, three teams of five people have created three separate products. They are all well thought out, with top-notch development, and could genuinely pass as start-ups that have been working on their ideas for weeks, if not months.

• TrafficHawk.com.au is a website delivering up-to-the-minute traffic alerts for over six million drivers in New South Wales, the biggest state in Australia.
• LinkViz.com is a service that enables you to visually determine what’s hot on Twitter, a social media service.
• uT.ag is a service that monetises the links people share with each other.

All three of these products have blown me away, not just in their quality, but in their innovation. For example, uT.ag is a URL-shortening application that competes against the many others on the market (popular with the social media services) but adds an advertisement to the page people click through to view. Such a stupidly simple idea that I can’t believe it hasn’t been created before… and it’s already profitable after just a few hours of operation! LinkViz provides such a stunning visual representation of links that it’s hard to believe it was created – from concept to product – in so short a time. Traffic Hawk is a basic but useful mashup that genuinely adds value for consumers.
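To make the uT.ag concept concrete, here is a minimal sketch of an interstitial-ad URL shortener. This is purely my own illustration – the routes, storage and ad markup are hypothetical, not how the team actually built it:

```python
# Toy URL shortener that shows an ad page before redirecting (the uT.ag idea).
# Flask app; the short codes and ad markup are illustrative only.
from flask import Flask, redirect

app = Flask(__name__)
links = {"abc": "http://liako.biz"}  # short code -> destination URL

@app.route("/<code>")
def interstitial(code):
    # Instead of redirecting immediately, serve a page carrying an ad,
    # with a link that continues on to the real destination.
    return f'<div>[advertisement goes here]</div><a href="/go/{code}">Continue to your link</a>'

@app.route("/go/<code>")
def go(code):
    return redirect(links[code])

if __name__ == "__main__":
    app.run()
```

The monetisation is simply the extra page impression on every shared link, which is why the idea could turn profitable within hours.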

The challenges
This is a clear demonstration that when you get a bunch of people together, anything can happen: pure innovation, in a timewarp of a few hours. Products with real revenue potential. It’s scary to witness how the costs of business in the digital economy can be boiled down to simply a lack of sleep! But talking to the teams, I realised that the costs were not as low as they could have been. It’s also an interesting insight into the costs of doing business on the internet, seen through this artificial prism of reality.

For example, Geoff McQueen (a successful developer/businessman who was also one of the founders of Omnidrive) from the Traffic Hawk team was telling me about the difficulties they experienced. Whilst the service looks like a basic mashup on Google Maps, the actual scraping of data from a government website (real-time traffic incidents) was considerably painful. Without ongoing attention from the team, their site will break, as the scraping script depends on the current structure of the HTML pages displaying the data. The fact that this is a ‘cost’ for them to develop the idea is a wasteful cost of business. The resources lost in developing the custom scraping script, and the future maintenance required, are an inefficient allocation of resources in the economy. And for what good reason? Traffic Hawk is making the same data more useful – it’s not hurting the government service, only helping it.
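To show why such a scraper is so fragile, here is a minimal sketch of the screen-scraping approach. The URL and page structure are hypothetical – I don’t know the actual markup the team targeted:

```python
# Screen-scraping a traffic incidents page: works only as long as the
# page's HTML stays exactly as the script assumes.
import requests
from bs4 import BeautifulSoup

URL = "http://example.gov.au/traffic/incidents.html"  # hypothetical page

def fetch_incidents():
    html = requests.get(URL, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    incidents = []
    # Brittle assumption: incidents sit in <table class="incidents">,
    # one <tr> per incident, columns in a fixed order. If the webmaster
    # renames the class or reorders columns, the scraper silently breaks.
    for row in soup.select("table.incidents tr")[1:]:  # skip the header row
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) >= 3:
            incidents.append({"road": cells[0], "suburb": cells[1], "details": cells[2]})
    return incidents

if __name__ == "__main__":
    for incident in fetch_incidents():
        print(incident)
```

With an open, structured feed, none of this fragile parsing would be needed – which is exactly the point of the next paragraph.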

This is a clear example of the benefits of open data: in a DataPortability-enabled world, the entrepreneurial segments of the economy that create this innovation can focus their efforts on areas that add value. In fact, it is a clear demonstration of the value chain for information.

LinkViz, on the other hand, have a different disadvantage. Whilst they have access to the Twitter API, which enables them to pull data and create their own mashup with it (unlike the hawkers), the API itself is slow. So the price of a better service from them is money for faster access.

Talking with the uT.ag boys, the issue was more a matter of infrastructure. They could have launched the same product with more features 12 hours earlier (ie, 12 hours AFTER conception) if it wasn’t for niggly infrastructure issues. This is more a function of the purposely disorganised nature of the camp, and so the guys lost a lot of time setting up networks and code repositories. I believe, however, this should be a lesson about the impact of the broader ecosystem and government’s influence on innovation.

For example, we need faster broadband… but also an omnipresent connection. A person should be able to flip open their laptop and access the internet in an affordable manner that connects them to their tools, anywhere.

Concluding thoughts
The whole purpose of this camp was not the destination but the process. People getting to know each other, learning new skills – or as Linda Gehard says: “It’s like exercise for entrepreneurs”. However, what started as a quick and dirty post to give the guys some exposure has made me realise the costs of innovation in the economy. When you take out the usual whines about investors and skills shortages, and put together some highly capable people, there are some specific things that still need to be done.

Update 8/9/08: Apologies for the typos and misspellings – my blog locked me out due to a corruption in the software and didn’t save the revised version I thought I saved. And a huge thank you to Joan Lee, who has now become my official proofreader. Liako.Biz, it appears, is no longer a one-man band!

It’s the experience that matters

One of the great things about working on the DataPortability Project is the exposure to some amazing thinking. Today alone, I stumbled on this great piece questioning the point of a music label (via Crosbie Fitch). Separately, I also came across this interesting bit of thinking imagining what a world would look like without copyright. Those pieces helped give me more solid arguments for something that’s been on my mind a lot: consumers don’t pay for content’s representation per se. Instead, they pay for the associated experience.

The digital age has uprooted the traditional content industries, as we have seen with recording and publishing. Our traditional approaches to managing content are being challenged, because we (or rather, they) grew complacent about the technological limitations of content distribution. However, now that we have a new way to distribute content (thanks to computing, the Internet and the web), we are seeing greater potential for content to be consumed – and it’s also exposing something we have forgotten. The digital revolution is changing business practices, but it highlights the true nature of content: it’s about the experience.

To illustrate what I mean, let’s define content as products like music and books.

When you buy an album, you are not buying it for the physical CD or the plastic casing. The reason you are buying it is so you can get access to the music. This access entitles you to experience the music. On a similar note, when you go to a concert to hear a band, you are not paying to stand in a concert hall. You are paying for the experience of hearing the music live, which also incorporates the associated experience of being part of a crowd. Both those experiences trigger an emotional reaction – which can be positive or negative, but regardless, is what makes us feel alive. Humans pay for music because the emotions triggered by that content help them feel human.

[Image: Beyonce’s movements – something you pay to experience]

With books, what you are purchasing is knowledge. The paper you read the novel on, although it can sometimes be done up nicely, isn’t why you buy it. What you are buying is an experience to consume that knowledge. Some books offer intellectual stimulation; other books offer excitement through a riveting storyline. Regardless, the experience of reading the book is what you are purchasing.

It’s about the experience, stupid
Talking about cultural artifacts like music and books is one thing. But there is no reason why we can’t consider this for information in a generic sense – as the initial data is simply a stage earlier in the value chain. In the context of my personal data, this is something that I have generated. Nothing really special about it. But it becomes special when a web application can do interesting things with that data. That is, when an application can process my data in such a way that it gives me a new experience.

For example, there are certain Facebook applications that reveal some interesting information about my friends by generating insight. Knowing that 58% of my friends are male is useful when I’m considering a party (more beer and Beam; less wine and champagne). Knowing that some of my friends are travelling or living in a certain country is useful because it gives me awareness that I can meet up with them. Because Facebook allows applications to process my data in the context of my friends, the information they can generate is a lot more valuable than if Facebook locked this down. The experience of having access to this information is not as emotionally driven as a Jane Austen book; but the experience of insight is still something I get out of it.
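A trivial sketch of the kind of processing I mean – the friend data here is made up, and a real application would pull it from the platform’s API:

```python
# Hypothetical friend data; a real app would fetch this via the platform's API.
friends = [
    {"name": "Geoff", "gender": "male", "country": "Australia"},
    {"name": "Joan", "gender": "female", "country": "South Korea"},
    {"name": "Bart", "gender": "male", "country": "Australia"},
    {"name": "Kim", "gender": "female", "country": "Australia"},
]

# Raw data becomes information once it's aggregated in context...
male_ratio = sum(1 for f in friends if f["gender"] == "male") / len(friends)
print(f"{male_ratio:.0%} of your friends are male")

# ...and insight once it's applied to a decision (what to stock for the party).
print("Stock more beer and Beam" if male_ratio > 0.5 else "Stock more wine and champagne")
```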

The ability to offer a unique experience to a consumer is key to any information-based product. Triggering emotions is a powerful part of humanity, and a consumer of information is looking for an experience which, in reality, can only be captured in their memory. Of course, content in the form of entertainment is more about the emotion, whilst news is more about the access, but that doesn’t take away from the inherent characteristics of information.

Recognising that information products are an experience should give a better understanding of what we do with them. For example, I don’t get any monetary benefit from writing this blog. However, the more people that want to copy my "original work", the better. Whilst that may sound contrary to smart business sense, it’s because I recognise the benefit I get from blogging is reputation (well, one of them at least). And the fact that people can ‘steal’ my content doesn’t mean they can steal my brain. As a content creator, I am being rewarded with the associated benefits of a good reputation, despite the fact I cannot assert ownership over my words.

[Image: "If you put that picture on the Internet I’ll call my lawyer"]

So why do we obsess over control?
If you are a web application, a book author, or a musician, the way you make money isn’t through the information you generate. Instead, what you are being rewarded with is a brand; a relationship of trust with your consumer; or simply attention. Open source developers can appear to be hippies helping the world. But look closely at how they make a living, and it’s on the expertise recognised in them through their brand, which allows them to charge for consulting.

If you operate in the information industry, the way you make money is on the experience you create for the consumer – and by generating that experience, you can then create a monetary stream off it. For example, a band that no one knows about has no demand for their music. A cult following, built because people get obsessed over their songs being played freely everywhere, allows them to make buckets of money on merchandise and concerts. Twitter is a web application that, when I first heard about it, I thought I would never use. Now that I use it, I am willing to pay for certain benefits that make my experience more enjoyable (ie, profiling of tweets, etc). Twitter has an opportunity to make money because I value the experience they offer me, and I’m willing to pay to make it a better experience.

In the information business, experience is ultimately your product. Ignore that, and you will be making decisions that, at best, amount to a huge opportunity cost. Here’s hoping that as we move forward with DataPortability, the thinking of businesses can change. Locking down data is not how you make money; the compelling experience you offer your consumers is the true source of competitive advantage and, ultimately, revenue.

What is data?

The leading voices in technology have exploded in discussion about data portability, data rights, and the future of web applications. As an active member of the DataPortability Policy group, here is my suggestion on how the debate needs to proceed: break it down. Michael Arrington seems pretty convinced you own all your data, but I don’t think that’s a fair thing to say – and at its core, that is the reason he is clashing with Robert Scoble’s view. For things to proceed, I really think a deeper analysis of the issues needs to be made.

1) Define the difference between data, information and knowledge. There’s a big difference.
2) Determine what things are. (is an e-mail address data or information?)
3) Recognise the difference between ownership, rights and their implications.
4) Determine what rights (if that’s what it is) the various entities have over data (users, web apps, etc).

This is a big area and has a lot of abstract concepts – break it down and debate it there.

Some of my own thoughts to give some context

1) Data is an object, and information is generated when you create linkages between different types of data – the ‘relationships’. Knowledge is the application of information.

  • 2000 is data – a symbol with no meaning. Connect it with other data, like the noun "year", and you have information, because the year 2000 now has meaning. Connect that information with other information, like "computer bug" and "HSBC", and you now have an application of that information: there was an issue with the Y2K bug that has something to do with the bank HSBC. (The toy sketch below illustrates the progression.)
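Here is that progression as code – purely my own illustration of the definitions above, with made-up structures:

```python
# Data: a raw symbol with no meaning on its own.
datum = 2000

# Information: meaning emerges when data is linked to other data.
information = {"year": datum}  # "the year 2000"

# Knowledge: information applied to a scenario, by linking it with
# other information ("computer bug", "HSBC").
knowledge = {
    "event": "Y2K computer bug",
    "affected": "HSBC",
    "when": information["year"],
}
print(f"In {knowledge['when']}, the {knowledge['event']} affected {knowledge['affected']}.")
```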

2) Define what things are

What’s an e-mail address, a phone number, a social graph, an image, a podcast… I’m not entirely sure. I wouldn’t be blogging this if I had all the answers. Once we agree on definitions, we can then start categorising them and applying criteria.

3) Ownership:

Here is something Steve Greenberg explained to me

– Ownership is relevant when there is scarcity.
– Ownership is the ability to deny someone else’s use of the asset.
– So, if data is shared and publicly available, it is a practical impossibility for me to deny use
– and if data is available in a form where I can’t control others’ use of it, I cannot really claim to own it

Nitin Borwankar has a very different argument: you should have ownership based on property rights. He explained that to me here.

4) Rights over data

I personally think no one owns data (a view inspired by the definition of data as inherently meaningless); instead, you own things further down the value chain, when that data becomes something with value. You own your blog posts as a whole – but not the words.

But again, this goes back to what is data?

The value chain for information

Lately, I’ve been doing a lot of thinking about the value chain of information, based on the Porter model of value chain analysis. Given there is an undeniable trend towards a knowledge-based economy (that is, if we’re not already there!), it seems pretty valuable to at least understand the different facets of the value chain so we can better understand the information sector.

Below are some thoughts about what I think are the broad aspects of the value system, with some commentary under each to help you understand my thinking. I’ve used common social computing sites to help illustrate the concepts, as everyone can relate to them. Also see my definitions for data, information, and knowledge.

The value chain
1) Data collection
– Value is in the storage
Competitive advantage: who offers consumers the lowest price for the most storage. You should not consider this just in terms of hosting costs, but also whether it costs the user their rights to control some of their data.
Example: MySpace is where you store all your demographic data; SmugMug is where you store all your photos (which I consider data).

2) Data processing
– Value is in the ability to manipulate the data
Competitive advantage: the infrastructure to process vast amounts of data at the highest output with the lowest cost.
Example: Facebook calculates how many friends you have. Calculating that information requires substantial computing power, which is why Friendster fell over even as it captured the imagination of the industry as the first major social networking site.

3) Information generation
– Value is in the type and diversity of information. The connection of data (objects) is what generates information. This requires a unique ability to understand which data inputs to pull.
Competitive advantage: the ability to access the most data (ie, relationships with the data storage components in the chain), and to creatively apply that data in a unique way.
Example: LinkedIn allows me to know that I am two degrees separated from a certain individual. LinkedIn’s ability to do that is a combination of what data they can use and their ability to process it – essentially, the creativity of the company’s management in determining the feature’s value, plus relationships with storage vendors or methods of using their own storage. In a DataPortability-enabled world, it’s not so much how much of a user’s data you can store, but how much you can access from the storage vendors – ie, your relationships with those vendors.

4) Knowledge application
– Value is in the application of information
Competitive advantage: applying information in a unique way that has not been done before.
Example: a network analysis of my social graph. If a social networking site can tell me that 48% of my friends are male, and, as another piece of information, that 98% of them are heterosexual, then it is likely I am a straight male. The ability to derive insight from the multiple pieces of information available is reserved for those with the unique ability to recognise applications of information in certain ways. The determination that I am straight is inference, which is a higher-order type of value as opposed to just information (which is grounded in hard data and based more on fact).

Implications of the value chain
It is important to note – and this is why it may be difficult to conceptualise the above – that the Internet industry, which is the backbone of the information sector of the economy, is still relatively immature. Flickr, for example, covers most of the value chain: they store my photos; they allow me to make changes to the photos and add additional data like tags; they generate information by allowing me to organise my photos into sets (hence giving more value to a photo by putting it into context). And of course, they allow for knowledge application through their community – people passing by and leaving comments is something quite unique to Flickr.

By better understanding the value chain, hopefully we can also realise that businesses can thrive by focussing on specific areas, and that it may not be in their interest to be in all areas. For example, the notion of locking up a person’s user data as a competitive advantage is silly if you can offer value through knowledge application.

To put the above in context, MySpace’s recent data availability announcement is a step in the direction of DataPortability (something that will take until the end of this year, at minimum, to finalise), but whilst Google and Facebook race to offer similar services to ‘lock’ their data, they are in fact missing the point. The value of MySpace, for example, is the community, and they get value from accessing data and information from as many diverse places as possible and applying it in a unique way. Because they think locking in the data is what determines their business strategy, it forces them to compete in the data storage market – and that is not a market I would want to be in, given its potential to be commoditised and the massive compliance demands from government and from users’ expectations about their rights. As highlighted by Nitin, data redundancy is a big issue, so battling in the storage market puts you at risk if you are relying on it solely as your source for information and knowledge.

As always, I write my blog posts to extend on my thoughts. I’d love feedback and people to challenge the assumptions I’ve made, because I think this can be a very valuable tool in how we view businesses on the web.

Update 1 June 2008: Tim Bull made a video of this posting, which does a better job explaining the concepts presented above

It’s all still alpha in my eyes

The invention of hypertext has been the most revolutionary thing since two technologies before it: the printing press and the alphabet. Combined with computing and the Internet, we have seen a new world represented by the World Wide Web that has transformed entire industries in its mere 15-year existence.

The web caught our imagination in the nineties, which became the Dot-Com bubble. Several years after the bust, optimism reawakened when the Google machine listed on the stock exchange – heralding a new era dubbed “web2.0”. This era has now been recognised in the mainstream, elevated by the mass adoption of social computing services, and has once again seen the web transform traditional ideas and generate excitement.

The web2.0 era is far from over – the recent global recession, however, has flagged that the pioneers of the industry are looking for something new. As the mainstream is rejuvenated by web2.0, like the Valley was not that long ago, it’s time to look for what the next big thing will be. Innovation on the web is apparently flattening. Perhaps it is – but the seeds of the next generation of innovation on the web are already here.

Controversy over the meaning of web2.0 – and what its successor will be – should not distract us. We are seeing the web and associated technologies evolve to new heights. So the question is not when web2.0 ends, but what we are seeing now that will dominate in the future.

My view:
• The mobile web. The mobile phone is evolving into a generic entertainment device, becoming a new computing device that extends the reach of the internet. First with the desktop computer, and then with the laptop, new opportunities presented themselves in the way we could use computers. The use of this new computing platform will create new opportunities whose surface we have only scratched.
• The 3D web. Visit Second Life, the virtual world, and you quickly note that the main driver of activity is sex and that it’s just a game. However, porn and games have spearheaded a lot of technological innovation in the past. The 3D web is now emerging through four separate but related trends: virtual worlds, mirror worlds, augmented reality and lifelogging.
• The data web. Data has now become a focus in the industry. The semantic web, eventually, will allow a weak form of artificial intelligence enabling computer agents to work in an automated fashion. Vendor Relationship Management is changing the fundamental assumptions of advertising, with a new way of how we transact in our world. Those trends, when combined with the drive for portability of people’s data, have us seeing the web in a new light with new potential: not as a collection of documents, and not as a platform for computing, but as a database that can be queried.

So to get some discussion going, I thought I might ping some smart people I know in the industry on what they think: Chris Saad, Daniela Barbosa, Ben Metcalfe, Ross Dawson, Mick Liubinskas, Randal Leeb-du Toit, Stewart Mader, Tim Bull, Seth Yates, Richard Giles – as well as you, reading this now.
What do you think is currently in the landscape that will dominate the next generation of the web?

What is the DataPortability Project

When we created the DataPortability workgroup in November 2007, it was after discussion amongst a few of us about further exploring an idea: a vision for the future of the social web. By working together, we thought we could make real change in the industry. What we didn’t realise was how quickly and how much attention this workgroup would generate. A press release has been issued that details the journey to date, and it highlights some interesting tidbits. What I write below is how my own thoughts have evolved over the last few months, and what I think DataPortability is.

1) Getting companies to adopt open, existing standards
RSS, OpenID, APML, oAuth, RDF, and the rest. These technologies exist, some of which have been around for many years. Everyone who understands what they are knows that they rock. So if these standards are all so great – why hasn’t the entire technology industry adopted them yet? We need awareness, education and, in some cases, pressure on the industry heavies to adopt them.

2) Create best practices of implementing these standards
When you are part of a community, you are in the know, and don’t realise how the outside world looks in. Let the standards communities focus their precious energies on creating and maintaining the technologies, and DataPortability can help provide resources for people to implement them. Is providing PHP4 support for oAuth really a priority? It isn’t for them – but by pooling the community with people who have diverse skillsets and are committed to the overall picture, it has a better chance of happening.

3) Synthesise these open standards to play nice with each other.
All these different communities working in isolation have been doing their own thing. An example is how Yadis-XRDS is working on service discovery and has a lacklustre catalogue. Do we just leave them to do their own thing? Does someone else in Bangalore create his own catalogue? (Which is highly likely given the under-exposure of this key aspect to groups needing it for the other standards, and the current state it’s in.) Thanks to Kaliya for mentioning that the XRDS guys have been more than proficient in working with other groups – "how do you think their spec is part of the OpenID spec?". Julian Bond goes on to say: "Yadis-XRDS is only months old and XRDS-Simple is literally days old… Having trouble thinking of a community that is working in isolation. And that isn’t likely to be hugely offended if you suggested it." So let me leave the examples there, and just say the DataPortability Project, when defining technical and policy blueprints, can identify issues and, from the bigger-picture perspective, focus attention where it’s needed. By embracing the broader community, and focusing our attention on weaknesses, we can ensure no one is reinventing wheels.

4) Communicate all the good things the existing communities are doing, under the one brand, to the end user.
RSS is by far the most recognised open standard. Have you ever tried explaining RSS to someone outside the tech industry? I have. Multiple times. It’s like I’ve just told them about a future with flying cars and settlements on Mars. I’ve done it in the corporate world, to friends, family, girls I date, guys I weight train with and anyone else. Moving on to OpenID – does anyone apart from Scoble and the technorati who try all the web services they can, really care? Most people use Facebook, Hotmail (the cutting edge are using Gmail) and that’s it. On your next trip to Europe, ask a cultured French (wo)man if they know what OpenID is; why they need it; what they can do with it. Now add RSS to the mix. And APML. And oAuth. Bonus if you can explain RDF to yourself.

Wouldn’t it just be easier if you explained what DataPortability is, and explained the benefits that can be achieved by using all these standards? Standards are invisible things that consumers shouldn’t need to care about; they just care about the benefits. Do consumers care about the standards behind Wi-Fi – or do they care about clicking "enable wireless" on their laptop and connecting to the Internet? If you go around evangelising the technical standards, the only audience you will get is the corporates in IT departments, who couldn’t care less. The corporate IT guys respond to their customer/client-facing guys, who in turn respond to consumers – and consumers couldn’t care less about how it’s done, just what they can do. Have the consumer channel their demand, and it benefits the whole ecosystem.


[Image: The new DataPortability trustmark]

It has been said the average consumer doesn’t care about DataPortability. Of course they don’t – we are still in the investigation phase of the Project, which will later evolve into the design phases and then the evangelising phases. We know people would want RSS, oAuth, and the rest of the alphabet soup – so let’s use DataPortability as a brand through which we can communicate this. Sales is about creating demand – let’s coordinate our ‘selling’ to make it overwhelming, and make it easy for consumers to channel that want in a way they can relate to. You don’t say "oAuth"; you say "preventing password theft" to them instead.

5) Make the business case that a user should get open access to their data
Why should Facebook let other applications use the data it has on its servers? Why should Google give up all this data they have about their users to a competitor? Why should a Fortune 500 adopt solutions that decentralise their control? Why should a user adopt RDF on their blog when they get no clear benefit from it? Is a self-trained PHP coder who can whack something together going to be able to articulate that to the VCs?

The tech industry has this obsession that nothing gets done unless the developers are on board. No surprises there – if we don’t have an engineer to build the bridge, we are going to have to keep jumping off the cliff hoping we make it to the other side. But at the same time, if you don’t have people persuading those who would fund this bridge, or the broader population about how important it is for them to have this bridge, that engineer can build what he wants but the end result is that no one will ever walk on it. Funny how web2.0 companies suck at the revenue model thing: overhype on the development innovation, with under-hype on the value proposition to the ordinary consumer who funds their business.

Developers need to be on board because they hassle their bosses, and sometimes that evangelising from within works; but imagine if we get the developers’ bosses’ bosses on board because some old bear on the board of directors wants DataPortability after his daughter explained it to him (the same daughter who also told him about Facebook and YouTube). I can assure you, as I’ve seen it first hand with the senior leadership at my own firm, this is exactly what is happening.

Intel is one of the best-selling computer-chip companies in the world. Do you really think, as a consumer, I care what chip my computer works on? Logically – no. But the "Intel Inside" marketing campaign gave them a monopoly, because end consumers would ask "does it have Intel inside?" and this pressure forced Intel’s customers (IBM and the rest) to actually use Intel. Steve Greenberg corrects me by saying: "The Intel Inside campaign came a decade after Intel took over the world. It wasn’t what got them there. It was in response to Microsoft signaling that they liked AMD. Looked like AMD was going to take off… but then they didn’t". So my facts were slightly wrong, but the point still remains.
At the same time, it isn’t just political pressure; it’s also about education. I genuinely believe opening up your data is a smart business strategy that will change the potential of web services.

You make people care by giving them an incentive to do it (business opportunities; customer political pressure; peer pressure as individuals and as an industry, which later evolves into industry norms). The semantic web communities, the VRM communities, the entire open standards communities – all have a common interest in doing this. DataPortability is culture change on an industry-wide level that will improve the entire ecosystem. Apparently innovation has died – I say it’s just beginning.

Information overload: we need a supply side solution

About a month ago, I went to a conference filled with journalists, and I couldn’t help but ask them what they thought about blogs and their impact on their profession. Predictably, they weren’t too happy about it. Unpredictable, however, were their reasons. It wasn’t just a rant, but a genuine care about journalism as a concept – and how the blogging “news industry” is digging a hole for everyone.

Bloggers and social media are replacing the newspaper industry as a source of breaking news. What they still lack is quality, as there have been multiple examples of blogs breaking news that, in the rush to publish, turned out to be false. Personally, I think as blogging evolves (as a form of journalism), the checks and balances will be developed – such as big-name blogs with their brands effectively acting like a traditional masthead. And when a brand is developed, more care is put into quality.

Regardless, the infancy of blogging highlights the broader concern of “quality”. With the freedom for anyone to create, the Information Age has seen us overloaded with information, despite our finite ability to take it all in. The relationship between the producer and the consumer of news is not only blurring – it’s also radically transforming dynamics, with an impact even on the offline world.

Traditionally, the concept of “information overload” has been reduced to a simple analysis of lower costs of entry for producers of content (anyone can create a blog on wordpress.com and away you go). However, what I am starting to realise is that the issue isn’t so much the technological ability for anyone to create their own media empire but, instead, the incentive system we’ve inherited from the offline world.

Whilst there have been numerous companies trying to solve the problem from the demand side with “personalisation” of content (on the desktop, as an aggregator, and about another 1000 different spins), what we really need are attempts on the supply side, from the actual content creators themselves.

[Image: Too much signal can make it all look like noise]

Information overload: we need a supply side solution
Marshall Kirkpatrick, along with his boss Richard McManus, are some of the best thinkers in the industry. The fact they can write makes them not journalists in the traditional sense, but analysts with the ability to clearly communicate their thoughts. Add to the mix Techcrunch don Michael Arrington and his amazing team – they are analysts that give us amazing insight into the industry. I value what they write; but when they feel the pressure of their industry to write more, they are doing a disservice not only to themselves, but also to the humble reader they write for. Quality is not something you can automate – there’s a fixed amount a writer can do, not because of their typing skills, but because quality is a function of self-reflection and research.

The problem is that whilst they want to, can and do write analysis, their incentive system is biased towards a numbers game driven by popularity. More people reading and more content created (which creates more potential to get readers) means more pageviews, and therefore money in the bank, as advertisers pay on the number of impressions. The conflict for the leading blogs churning out content is that their incentive system is inherited from a flawed system of the pre-digital world: known as circulation offline, and now as pageviews online.

A newspaper primarily makes money through its circulation: the number of physical newspapers it sells, but also the audited figures of how many people read the newspaper (readership can be up to three times the physical circulation). With the latter, a newspaper can sell space based on its proven circulation: the higher the readership, the higher the premium. The reason for this is that in the mass media world, the concept of advertising was about hitting as many people as possible. I liken it to the image of flying a plane over a piece of land and dropping leaflets, with the blind faith that of those 100,000 pamphlets, at least 1000 people catch them.

It sounds stupid that an advertiser would blindly drop pamphlets, but they had to: it was the only way they could effectively advertise. To make sales, they need the ability to target buyers and create exposure for the product. The only mechanism available for this was the mass media, as it was a captured audience; at best, an advertiser could place ads in specialist publications hoping to get a better return on their investment (dropping pamphlets about water bottles over a desert makes more sense than over a group of people in a tropical rainforest). Nevertheless, this advertising was done en masse – the technology limited the ability to target.

[Image: Advertising in the mass media – dropping messages, hoping the right person catches them]

The Internet is a completely new way to publish. The technology enables a relationship between a consumer of content, a vendor, and a producer of content unlike anything previously seen in the world. The end goal of a vendor’s advertising is sales, and they no longer need to drop pamphlets – they can now build a one-on-one relationship with that consumer. They can now knock on your door (after you’ve flagged that you want them to), sit down with you, and have a meaningful conversation about buying the product.

“Pageviews” are pamphlets being dropped – a flawed system that we used purely due to technological limitations. We now have the opportunity for a new way of doing advertising, but we fail to recognise it – and so our new media content creators are being driven by an old media revenue model.

It’s not technology that holds us back, but perception
Vendor Relationship Management (VRM) is a fascinating new way of looking at advertising, where the above scenario is possible. A person can maintain a bank of personal information about themselves, as well as flag their intention of what products they want to buy – and vendors don’t need to resort to advertising to sell their product, but can build a relationship with these potential buyers one on one. If an advertiser knows you are a potential customer (by virtue of knowing your personal information – which, might I add, under VRM is something the consumer controls), they can focus their efforts on you rather than blindly advertising to the other 80% of people who would never buy their product. In a world like this, advertising as we know it is dead, because we no longer need it.

VRM requires a cultural change in our world to understand a future like this. Key to it is the ability for companies to recognise that a user controlling their personal data in fact opens up new opportunities for advertising. Companies currently believe that by accumulating data about a user, they are building a richer profile of someone and can therefore better ‘target’ advertising. But companies succeeding technologically on this front are being booed down in a big way by privacy advocates and the mainstream public. The cost of holding this rich data is too much. Privacy by obscurity is no longer possible, and people demand the right to privacy in an electronic age where disparate pieces of their life can be linked online.

One of the biggest things the DataPortability Project is doing is transforming the notion that a company somehow has a competitive advantage by controlling a user’s data. The political pressure, education, and advocacy of this group is going to enable things like VRM. When I spoke to a room of Australia’s leading technologists at BarCamp Sydney about DataPortability, what I realised is that they failed to recognise that what we are doing is not a technological transformation (we are advocating open standards that already exist, not new ones) but a cultural transformation of a user’s relationship with their data. We are changing perceptions, not building new technology.

[Image: To fix a problem, you need to look at the source that feeds the beast]

How the content business will change with VRM
One day, when users control their data and have data portability, and we can have VRM, the content-generating business will see light in the hole currently being dug. Advertising on a “hits” model will no longer be relevant. The pageview will be dead.

Instead, what we may see is an evolution to a subscription model. Rather than measuring success by how many people viewed their content, content producers can focus less on hits and more on quality, as their incentive system will no longer be driven by the pageview. Instead, consumers can build up ‘credits’ under a VRM system for participating (my independent view, not a VRM idea), and can then use those credits to purchase access to content they come across online. Such a model allows content creators to be rewarded for quality, not numbers. They will need to focus on their brand, managing their audience’s expectations of what they create; in return, a user can subscribe with regular payments of the credits they earned in the VRM system.

Content producers can then follow whatever content strategy they want (news, analysis, entertainment) and will no longer be held captive by a legacy system that rewards the number of people, not the types of people.

Will this happen any time soon? With DataPortability, yes – but only once we all realise we need to work together towards a new future. Until we get that broad recognition, I’m just going to have to keep hitting “read all” in my feed reader, because I can’t keep up with the amount of content being generated; whilst the poor content creators strain their lives working in a flawed system that doesn’t reward their brilliance.

Here’s a secret: the semantic web is the boring bit

Marshall Kirkpatrick caused a wave today when he gave a brutally honest assessment of one of the most talked-up semantic web applications, Twine. It was, as per usual, an excellent analysis by Marshall, and I don’t think he needs to hide behind his words, as they are fair. However, now that the semantic web is gaining traction in the mainstream, moving from academic thesis to real-world web applications, I think it’s crucial we do a little bit of stakeholder management.

Ready? The semantic web is as boring as bat shit.

Essentially, the semantic web is about structuring content in a way that computers can interpret the information. It’s a bit like linking every word on the web to a dictionary entry, so that computers understand the language that humans use.

But seriously, how is that exciting? People don’t get the semantic web because it’s the fundamentals – and that’s boring! Take for example RDF, the semantic web’s building block, which is about structuring data into subject, predicate and object. This is straight from primary school grammar lessons, where we learn the fundamentals of the English language (no coincidence I just linked to a grammar guide, not the RDF guide). And if you have heard of subject, predicate and object in the context of the semantic web, you probably didn’t even realise it’s what the entire English language is based on. It’s because you probably did learn it, and forgot – it’s boring as bat shit. But damn, without them, we wouldn’t be communicating right now with each other.
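To make the subject-predicate-object idea concrete, here is a minimal sketch using Python’s rdflib library – the URIs and statements are made-up examples, not any real vocabulary:

```python
# Each statement is a triple: subject (who), predicate (verb), object (what).
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")  # hypothetical namespace

g = Graph()
g.add((EX.Elias, EX.knows, EX.Marshall))      # "Elias knows Marshall"
g.add((EX.Elias, EX.name, Literal("Elias")))  # "Elias has the name 'Elias'"

# Because the grammar is explicit, a computer can now 'read' the statements.
for subject, predicate, obj in g:
    print(subject, predicate, obj)
```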

The point I want to make is that the building blocks are not where the excitement is: the excitement is what you can do once we have those building blocks. In English, we have poetry, literature, and language in general, through which we communicate as human beings. Once we get the basics of information down, we lay the foundation of a whole new world of computational possibilities. Marshall is spot on in saying “…semantics may be best suited to the back end…” because the excitement is what the semantics enable, not the semantics themselves – which are going to take a long time to build up.

Imagine the sum of human knowledge accessible for a computer to query. Semantic web applications are boring, and you won’t ever ‘get’ them – but what they enable is a whole new world of potential which, once we can flick the switch, will mean a world we will barely recognise from today’s standpoint.

DataPortability is about user value, fool!

In a recent interview, VentureBeat asks Facebook creator and CEO Mark Zuckerberg the following:

VB: Facebook has recently joined DataPortability.org, a working group among web companies, that intends to develop common standards so users can access their data across sites. Is Facebook going to let users — and other companies — take Facebook data completely off Facebook?

MZ: I think that trend is worth watching.

It disappoints me to see that, because it seems like a quick journalist’s hit at a contentious issue. On the other hand, we have seen amazing news today with examples of exactly the type of thing we should expect in a data-portability-enabled world: the Google Contacts API, which addresses an issue for data security we have highlighted for months now, and Google Analytics allowing benchmarking, a clear example of a company that understands that by linking different types of data you generate more information, and therefore value, for the user. The DataPortability Project is about advocating new ways of thinking; indeed, we don’t have to formally produce a product so much as maintain the agenda in the industry.

However, the reason I write this is that it worries me a bit that we are throwing around the term “data portability” despite the fact the DataPortability Project has yet to formally define what it means. I can say this because, as a member of the policy action group and the steering action group, which are responsible for making this distinction, I know we have yet to formally decide.

Today, I offer an analysis of what the industry needs to be talking about, because the term is being thrown around like buggery. Whilst it may be weeks or months before we finalise this, it’s starting to bother me that people seem to think the concept means solving the rest of the world’s problems or disrupting the status quo. It’s time for some focus!

Value creation
First of all, we need to determine why the hell we want data portability. DataPortability (note the distinction of the term from ‘data portability’ – the latter represents the philosophy whilst the former is the implementation of that philosophy by DataPortability.org) is not a new utopian ideal; it’s a new way of thinking about things that will generate value in the entire information sector. So to genuinely create value for consumers and businesses alike, we need to apply the thinking we use in the rest of the business world.

A company should be centred on generating value for its customers. Whilst companies may attempt different things to meet their obligations to shareholders, the only long-term way to generate shareholder value is to fund the growth of the business through increased customer utility (setting aside acquisitions and operational efficiency, which are other ways companies generate value, but which are short-term measures). Therefore, an analysis of what value DataPortability creates should be done with the customer in mind.

The economic value of a user having some sort of control over their data is that they can generate more value through their transactions within the information economy. This means better insights (ie, greater interoperability allowing the connection of data to create more information), less redundancy (being able to reuse the same data), and more security (which includes better privacy, the lack of which can compromise a consumer’s existence if not managed).

Secondly, what does it mean for a consumer to have data portability? Since we have realised that the purpose of such an exercise is to generate value, questions about data like “control”, “access” and “ownership” need to be re-evaluated, because on face value, the way they are applied may have either beneficial or detrimental effects for new business models. The international accounting standards state that you can legally “own” an asset but not necessarily receive the economic benefits associated with that asset. The concept of ownership to achieve benefit is something we really need to clarify, because quite frankly, ownership does not translate into economic benefit – which is what we are trying to achieve.

Privacy is a concept that has legal implications, and regardless of what we discuss with DataPortability, it still needs to be considered, because business operates within the frameworks of law. Specifically, the human rights of individuals (who are consumers) need to be given greater priority than any other factor. So although we should be focused on how we can generate value, we also need to be mindful that certain types of data, like personally identifiable data, need to be considered in a different light, as there are social implications in addition to the economic aspects.

The use cases
The technical action group within the DataPortability project has been attempting to create a list of scenarios that constitute use cases for DataPortability enablement. This is crucial because to develop the blueprint, we also need to know what exactly the blueprint applies to.

I think it’s time, however, that we recognise this isn’t merely a technical issue, but an industry issue. So now that we have begun the research phase of the DataPortability Project, I ask you and everyone else to join me in discussing what exactly is the economic benefit that DataPortability creates. Rather than asking if Facebook is going to give up its users’ data to other applications, we need to be thinking about the end value we strive to achieve by having DataPortability.

Portability in context, not location
When the media discuss DataPortability, please understand that a user simply being able to export their data is quite irrelevant to the discussion, as I have outlined in my previous posting. What truly matters is “access”. The ability for a user to command the economic benefits of their data is the ability to determine who else can access it. Companies need to understand that value creation comes from generating information – which is simply relationships between different data ‘objects’. If a user is to get the economic benefits of using their data from other repositories, companies simply need to allow the user to delegate permission for others to access that data. Such a thing does not compromise a company’s competitive advantage, as they won’t necessarily have to delete the data they hold on a user; rather, it requires them to realise that holding a user’s data (or parts of it) in custody gives them an advantage – complete access – which they can use to come up with innovative new information products for the user.
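A toy sketch of what delegated access could look like – this is the idea behind standards like oAuth, but the classes, tokens and scopes here are purely illustrative, not any real API:

```python
# The user, not the data host, decides who may access which of their data.
class DataHost:
    def __init__(self):
        self.data = {}    # user_id -> list of records held in custody
        self.grants = {}  # access token -> (user_id, granted scope)

    def store(self, user_id, record):
        self.data.setdefault(user_id, []).append(record)

    def delegate(self, user_id, third_party, scope):
        # The user grants a third party scoped access; the data never moves.
        token = f"token-{user_id}-{third_party}-{scope}"
        self.grants[token] = (user_id, scope)
        return token

    def access(self, token, scope):
        user_id, granted_scope = self.grants.get(token, (None, None))
        if granted_scope != scope:
            raise PermissionError("the user has not delegated this access")
        return self.data[user_id]

host = DataHost()
host.store("elias", {"friend": "Geoff"})
token = host.delegate("elias", "linkviz.example", "read-contacts")
print(host.access(token, "read-contacts"))  # access is ported; location isn't
```

Note that the host never exports or deletes anything: portability here is a matter of who may access the data in context, not where the data lives.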

So what’s my point? When discussing DataPortability, let’s focus on the value to the user. And the next time the top tech blogs confront the companies supporting the movement with a simplistic “when are you going to let users take their data completely off?”, I am going to burn my bra in protest.

Disclosure: I’m a heterosexual male who doesn’t cross-dress.

Update: I didn’t mean to scapegoat Eric from VentureBeat, who is a brilliant writer. However, I used him as an example of the language being used in the entire community, which now needs to change. With the DP research phase now officially underway for the next few months, the questions we ask should be more open-ended, as we at the DataPortability Project have realised these issues are complex and we need the entire community to come to a consensus. DataPortability is no longer just about exporting your social graph – it’s an entirely new approach to how we will be doing business on the net, and as such, requires us to fundamentally re-examine a lot more than we originally thought.

Can you answer my question?

We at the DataPortability Project have kick-started a research phase, because we’ve realised we need to spend more time consulting with the community, working out issues which don’t quite have one answer.

As Chris Saad and I are also experimenting with a new type of social organisation as we incubate the DataPortability Project – which I call wikiocracy (Chris calls it participant democracy) – I thought I might post these issues on my blog, in line with the decentralised ethos we are encouraging with DataPortability. This is something the entire world should be questioning.

So below are some thoughts I have had. They’ve changed a lot since I first thought about what a user’s data rights are, and no doubt, they will change again. But hopefully my thoughts can act as a catalyst for what people think data rights really are, and focus attention on the issue at stake, which I conclude with as my question. I think the bill of rights for users on the social web is not quite adequate, and we need a more careful analysis of the issues.

It’s the data, stupid
Data is essentially an object. Standalone, it’s useless – take for example the name “Elias”. In the absence of anything else, that piece of data means nothing. However, when you associate that name with my identity (ie, appending my surname Bizannes or linking it to my Facebook profile), it suddenly becomes “information”. Data is an object, and information is generated when you create linkages between different types of data – the ‘relationships’.

Take this data definition from DMReview which defines data (and information):

Items representing facts, text, graphics, bit-mapped images, sound, analog or digital live-video segments. Data is the raw material of a system supplied by data producers and is used by information consumers to create information.

Data is an object and information is a relationship between data – I’ve studied enough database theory at university to be authoritative on that! But since I didn’t do philosophy, what is knowledge?

Knowledge can be considered as the distillation of information that has been collected, classified, organized, integrated, abstracted and value added
(source)

Relationships, facts, assumptions, heuristics and models derived through the formal and informal analysis or interpretation of data
(source)

So in other words, knowledge is the application of information to a scenario. Whilst I apologise if it appears I am splitting hairs, I think clarifying these terms is fundamental to the implementation of DataPortability. Why this is relevant will be seen below, but now we need to move on to what the second concept means.

Portability
On first interpretation, portability means the ability to move something – exporting and importing. But I think we shouldn’t take the ability to move data around as the sole definition of portability: it should also mean being able to port the context in which data is used. After all, information and knowledge are based on the manipulation of data, and you don’t need to move data per se, but merely change the context, to do that. A vendor can add value to a consumer by building unique relationships between data and giving it unique application in other scenarios – where the original data is stored is irrelevant, as long as it’s accessible.

Portability to me means a person needs to have the ability to determine where their data is used. But to do that, they need control over that data – which means determining how it is used. Yet there is little point being able to determine how your data is used if you can’t determine who can access it. Therefore, the concept of portability invokes an understanding of what exactly control and accessibility mean.

So discussing portability requires us to also understand what data control and data accessibility really mean. You can’t “port” something unless you control it; and you can’t “control” something if you can’t determine who can “access” it. As I said, as long as the data is accessible, it can be located on the moon for all I care: for the concept of portability by context to exist, we must ensure as a condition that the data is open to access.

Ownership
Now here is where it gets complicated: who owns what? Maybe the conversation should turn to who owns the information and knowledge generated from that data. Data on its own potentially doesn’t belong to anyone. My name “Elias” is shared by millions of other people in the world. Whilst I may own my identity, of which my name is a representation, is it fair to say I own the name “Elias”? On the flip side, if a picture I took is considered data, I think it’s fair to say I “own” that piece of data.

Information, on the other hand, requires a bit of work to create. Therefore, the generator of that information should get ownership. However, when we start applying this concept to something like a social relationship, it gets a bit tricky. If I add a friend on Facebook, and they accept me, who “owns” that relationship? Effectively both of us – so we become joint partners in ownership of that piece of information. If I was to add someone as a friend on MySpace, they don’t have to reciprocate – it’s a one-way relationship. Does that mean I own that information?

This is where the concept of privacy comes in. If I am generating information about someone, am I entitled to it? If someone owns the underlying data I used to generate that information, then it would be fair to say I am “licensing” usage of that data to generate information which, de facto, is owned by them. But privacy as a concept, and in the legislation of many countries, doesn’t work like that. Privacy is even a right alongside other basic rights like freedom of expression and religion in the constitution of Iraq (Article 17). So what’s privacy in the context of information that relates to someone’s identity?

Perhaps we should define privacy as the right to control information that represents an entity’s identity (being a person or legal body). Such a definition ties in with defamation law, for example, and the principle of privacy: you have control over what’s said about you, as a fundamental human right. But yet again, I’ve just opened up a can of worms: what is “identity”? Maybe the Identity Commons people can answer that. Would it be fair to say that, in the context of an “identity”, an entity like a person ‘owns’ it? So when it comes to information relating to someone’s identity, do we override who owns that information with this human right to privacy, regardless of who generated it?

This posting is a question, rather than an answer. When we say we want “data portability”, we need to be clear what exactly this means. Companies, I believe, are slightly afraid of DataPortability because they think they will lose something, which is not true. Companies’ commercial interests are something I am very mindful of when we have these discussions, and I will ensure with my involvement that DataPortability pioneers not some unrealistic ideal, but a genuine move forward in business thinking. It needs to be clear what constitutes ownership, and of what, so we can design a blueprint that accounts for users’ data rights without ruining the business models of companies that rely on our data.

Which brings me to my question – “who owns what”?
