Frequent thinker, occasional writer, constant smart-arse

Tag: data rights

Thoughts on privacy – possibly just a txt file away

The other week, a good friend of mine from my school and university days dropped me a note. Now that he is transitioning from professional student to legal guru (he’s the type I’d expect to become a judge of the courts), he asked me to pull down the website that hosts our experiment in digital media from university days. According to him, it’s become "a bit of an issue because I have two journal articles out, and its been brought to my attention that a search brings up writing of a very mixed tone/quality!".

In what seems like a different lifetime, I ran a university Journalist’s Society and we experimented with media as a concept. One of our successful experiments was a cheeky weekly digital newsletter that held the student politicians in our community accountable. Our commentary was often hard-hitting, and for $4 a month in web hosting bills and about 10 hours of work each, we became a new power on campus, influencing actions. It was fun, petty, and a big learning experience for everyone involved, including the poor bastards we massacred with accountability.


Privacy in the electronic age: is there an off button?

However, this touches all of us as we progress through life: what we thought was funny at an earlier time may be awkward now that we are all grown up. In this digitally enabled world, privacy has come to the forefront as an issue – and we are suddenly seeing the scary consequences of having all of our information available to anyone at any time.

I’ve read countless articles about this, as I am sure you have. One story I remember is about a guy who contributed to a marijuana discussion board in 2000 and now struggles with jobs, because that drug-taking past of his is the number one search engine result. The digital world can really suck sometimes.

Why do we care?

This is unique and awkward, because it’s not someone defaming us. It’s not someone taking our speech out of context and menacingly putting it in a way that distorts our words. This is 100% us, stating what we think, with full understanding of what the consequences of our actions were. We have no one but ourselves to blame.


Time changes, even if the picture doesn’t: Partner seeing pictures of you – can be ok. Ex seeing pictures of you – likely not ok.

In the context of privacy, is it our right to determine who can see what about us, when we want them to? Is privacy about putting certain information in the "no one else but me" box, or is it more dynamic than that – meaning, it varies according to the person consuming the information?

When I was younger, I would meet attractive girls quite a bit older than me, and as soon as I told them my age, they suddenly felt embarrassed. They either left thinking how could they let themselves be attracted to a younger man, treating me like I was suddenly inferior, or they showed a very visible reaction of distress! Actually, quite memorably when I was 20 I told a girl that I was on a date with that I was 22 – and she responded "thank God, because there is nothing more unattractive I find, than a guy that is younger than me". It turned out, fortunately, she had just turned 22. My theory about age just got a massive dose of validation.

My point in sharing this story is that certain information about ourselves can have adverse effects on us (in this case, my sex life!). I normally could not care less about my age, but with girls I would meet when I went out, I did care because it affected their perception of me. Despite nothing changing, the single bit of information about my age would totally change the interaction I had with a girl. Likewise, when we are interacting with people in our lives, the sudden knowledge of a bit of information could adversely affect their perception.


Some doors are best kept shut. Kinky for some; stinky for others

A friend of mine recently admitted to his girlfriend of six months that he’s used drugs before, which had her break down crying. This bit of information doesn’t change him in any way; but it shapes her perception of him, and the clash of her perception with the truth creates an emotional reaction. Contrast this with two party girls I met in Spain during my nine months away, who found out I had never tried drugs before at the age of 21. I disappointed them, and in fact, one of them (initially) lost respect for me. These girls and my friend’s girlfriend have two different value systems. And that piece of information generates completely different perceptions – taking drugs can be seen as a "bad person" thing or an "open minded" person thing, depending on who you talk to.

As humans, we care about what other people think. It influences our standing in society, our self-confidence, our ability to build rapport with other people. But the issue is, how can you control your image in an environment that is uncontrollable? What I tell one group of people for the sake of building rapport with them, I should also have the ability of ensuring that conversation is not repeated to others, who may not appreciate the information. If I have a fetish for women in red heels which I share with my friends, I should be able to prevent that information from being shared with my boss who loves wearing red heels and might feel a bit awkward the next time I look at her feet.

Any solutions?

Not really. We’re screwed.

Well, not quite. To bring it back to the e-mail exchange I had with my friend, I told him that the historian and technologist in me, couldn’t pull down a website for that reason. After all, there is nothing we should be ashamed about. And whilst he insisted, I made a proposal to him: what about if I could promise that no search engine would include those pages in their index, without having to pull the website down?

He responded with appreciation, as that was what the issue was. Not that he was ashamed of his prior writing, but that he didn’t want academics of today reading his leading edge thinking about the law, to come across his inflammatory criticism of some petty student politicians. He wanted to control his professional image, not erase history. So my solution of adding a robots.txt file was enough to get his desired sense of privacy, without fighting a battle with the uncontrollable.

Who knew, that privacy can be achieved with a text file that has two lines:

User-agent: *
Disallow: /

Those two lines are enough to turn the search engines from a beast that ruins our reputation into a mechanism for enforcing our right to privacy. Open standards across the Internet, enabling us to determine how information is used, are what DataPortability can help us achieve so we can control our world. The issue of privacy is not dead – we just need some creative applications, once we work out what exactly it is we think we are losing.
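As a rough way to check what those two lines do, here is a minimal sketch using Python's standard `urllib.robotparser` module. The crawler name and the example.com URL are placeholders I made up for illustration, not the real site. Keep in mind that robots.txt is a request that well-behaved crawlers honour rather than something the web server enforces.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt, with the lines kept adjacent
# (Python's parser treats a blank line as the end of a record).
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honours robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical crawler name and page URL, for illustration only.
PAGE = "http://example.com/newsletter/issue1.html"
blocked = is_allowed(ROBOTS_TXT, "ExampleBot", PAGE)

# For contrast: an empty Disallow value permits everything.
open_site = is_allowed("User-agent: *\nDisallow:\n", "ExampleBot", PAGE)

print("blocked site allows fetch:", blocked)
print("open site allows fetch:", open_site)
```

The same check could be pointed at a live site with `set_url()` and `read()`, but parsing the text directly keeps the sketch self-contained.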

What is data?

The leading voices in technology have exploded in discussion about data portability, data rights, and the future of web applications. As an active member in the DataPortability Policy group, here is my suggestion on how the debate needs to proceed: break it down. Michael Arrington seems pretty convinced you own all your data, but I don’t think that’s a fair thing to say – and at its core, that is the reason he is clashing with Robert Scoble’s view. For things to proceed, I really think a deeper analysis of the issues needs to be made.

1) Define the difference between data, information and knowledge. There’s a big difference.
2) Determine what things are. (is an e-mail address data or information?)
3) Recognise the difference between ownership, rights and their implications.
4) Determine what rights (if that’s what it is) the various entities have over data (users, web apps, etc).

This is a big area and has a lot of abstract concepts – break it down and debate it there.

Some of my own thoughts to give some context

1) Data is an object and information is generated when you create linkages between different types of data – the ‘relationships’. Knowledge is the application of information.

  • 2000 is data – a symbol with no meaning. Connect it with other data, like the noun "year", and you have information because 2000 now has meaning. Connect that information with other information, like "computer bug" and "HSBC", and you now have an application of that information: namely, that there was an issue with the Y2K bug that has something to do with the bank HSBC.

2) Define what things are

What’s an e-mail address, a phone number, a social graph, an image, a podcast…I’m not entirely sure. I wouldn’t be blogging this if I had all the answers. Once we agree on definitions, we can then start categorising them and applying a criteria.

3) Ownership:

Here is something Steve Greenberg explained to me

– Ownership is relevant when there is scarcity.
– Ownership is the ability to deny someone else’s use of the asset.
– So, if data is shared and publicly available, it is a practical impossibility for me to deny use
– and if data is available in a form where I can’t control others’ use of it, I cannot really claim to own it

Nitin Borwankar has a very different argument: you should have ownership based on property rights. He explained that to me here.

4) Rights over data

I personally think no one owns data (which is inspired by the definition of data being inherently meaningless); instead you own things further down the value chain when that data becomes something with value. You own your overall blog posts – but not the words.

But again, this goes back to what is data?

DataPortability is about user value, fool!

In a recent interview, VentureBeat asks Facebook creator and CEO Mark Zuckerberg the following:

VB: Facebook has recently joined DataPortability.org, a working group among web companies, that intends to develop common standards so users can access their data across sites. Is Facebook going to let users — and other companies — take Facebook data completely off Facebook?

MZ: I think that trend is worth watching.

It disappoints me to see that, because it seems like a quick journalist’s hit at a contentious issue. On the other hand, we have seen amazing news today that gives examples of exactly the type of thing we should expect in a data portability enabled world: the Google contacts API, which relates to something we have highlighted for months now as an issue for data security, and Google Analytics allowing benchmarking, which is a clear example of a company that understands that by linking different types of data you generate more information and therefore value for the user. The DataPortability project is about advocating new ways of thinking, and indeed, we don’t have to formally produce a product so much as maintain the agenda in the industry.

However, the reason I write this is that it worries me a bit that we are throwing around the term “data portability” despite the fact the DataPortability Project has yet to formally define what that means. I can say this because, as a member of the policy action group and the steering action group, which are responsible for making this distinction, I know we have yet to formally decide.

Today, I offer an analysis of what the industry needs to be talking about, because the term is being thrown around like buggery. Whilst it may be weeks or months before we finalise this, it’s starting to bother me that people seem to think the concept means solving the rest of the world’s problems or to disrupt the status quo. It’s time for some focus!

Value creation
First of all, we need to determine why the hell we want data portability. DataPortability (note the distinction of the term with that of ‘data portability’ – the latter represents the philosophy whilst the former is the implementation of that philosophy by DataPortability.org) is not a new utopian ideal; it’s a new way of thinking about things that will generate value in the entire Information sector. So to genuinely want to create value for consumers and businesses alike, we need to apply thinking that we use in the rest of the business world.

A company should be centered on generating value for its customers. Whilst it may have obligations to generate returns for its shareholders, and may attempt different things to meet those obligations, it also has an obligation to generate shareholder value. Generating shareholder value means funding the growth of the business, ultimately through increased customer utility, which is the only long-term way of doing so (leaving aside acquisitions and operational efficiency, which are other ways companies generate more value but are short-term measures). Therefore an analysis of what value DataPortability creates should be done with the customer in mind.

The economic value of a user having some sort of control over their data is that they can generate more value through their transactions within the Information economy. This means better insights (ie, greater interoperability allowing the connection of data to create more information), less redundancy (being able to use the same data), and more security (which includes better privacy, the lack of which can compromise a consumer’s existence if not managed).

Secondly, what does it mean for a consumer to have data portability? Since we have realised that the purpose of such an exercise is to generate value, questions about data like “control”, “access” and “ownership” need to be reevaluated because, on face value, the way they are applied may have either beneficial or detrimental effects for new business models. The international accounting standards state that you can legally “own” an asset but not necessarily receive the economic benefits associated with that asset. The concept of ownership to achieve benefit is something we really need to clarify, because quite frankly, ownership does not translate into economic benefit, which is what we are trying to achieve.

Privacy is a concept that has legal implications, and regardless of what we discuss with DataPortability, it still needs to be considered because business operates within the frameworks of law. Specifically, the human rights of individuals (who are consumers) need to be given greater priority than any other factor. So although we should be focused on how we can generate value, we also need to be mindful that certain types of data, like personally identifiable data, need to be considered in a different light as there are social implications in addition to the economic aspects.

The use cases
The technical action group within the DataPortability project has been attempting to create a list of scenarios that constitute use cases for DataPortability enablement. This is crucial because to develop the blueprint, we also need to know what exactly the blueprint applies to.

I think it’s time, however, that we recognise this isn’t merely a technical issue, but an industry issue. So now that we have begun the research phase of the DataPortability Project, I ask you and everyone else to join me as we discuss what exactly is the economic benefit that DataPortability creates. Rather than asking if Facebook is going to give up its users’ data to other applications, we need to be thinking about the end value that we strive to achieve by having DataPortability.

Portability in context, not location
When the media discuss DataPortability, please understand that a user simply being able to export their data is quite irrelevant to the discussion, as I have outlined in my previous posting. What truly matters is “access”. The ability for a user to command the economic benefits of their data is the ability to determine who else can access their data. Companies need to be thinking that value creation comes from generating information – which is simply relationships between different data ‘objects’. If a user is to get the economic benefits of using their data from other repositories, companies simply need to allow the user to delegate permission for others to access that data. Such a thing does not compromise a company’s competitive advantage, as they won’t necessarily have to delete the data they have on a user; rather, it requires them to realise that holding a user’s data, or parts of it, in custody gives them an advantage, since hosting that data gives them complete access to it with which to come up with innovative new information products for the user.

So what’s my point? When discussing DataPortability, let’s focus on the value to the user. And the next time the top tech blogs confront the companies that are supporting the movement with a simplistic “when are you going to let users take their data completely off ” I am going to burn my bra in protest.

Disclosure: I’m a heterosexual male that doesn’t cross-dress

Update: I didn’t mean to scapegoat Eric from VentureBeat who is a brilliant writer. However I used him to give an example of the language being used in the entire community which now needs to change. With the DP research phase now officially underway for the next few months, the questions we should be asking should be more open-ended as we at the DataPortability project have realised these issues are complex, and we need to get the entire community to come to a consensus. DataPortability is no longer just about exporting your social graph – it’s an entirely new approach to how we will be doing business on the net, and as such, requires us to fundamentally reexamine a lot more than we originally thought.

Can you answer my question?

We at the DataPortability project have kick started a research phase, because we’ve realised we need to spend more time consulting with the community working out issues which don’t quite have one answer.

As Chris Saad and I are also experimenting with a new type of social organisation as we incubate the DataPortability project, which I call wikiocracy (Chris calls it participant democracy), I thought I might post these issues on my blog to keep in line with the decentralised ethos we are encouraging with DataPortability. This is something the entire world should be questioning.

So below are some thoughts I have had. They’ve changed a lot since I first thought about what a user’s data rights are, and no doubt, they will change again. But hopefully my thoughts can act as a catalyst for what people think data rights really are, and bring focus to the issue at stake, which I conclude with as my question. I think the bill of rights for users on the social web is not quite adequate, and we need a more careful analysis of the issues.

It’s the data, stupid
Data is essentially an object. Standalone it’s useless – take for example the name “Elias”. In the absence of anything else, that datum means nothing. However, when you associate that name with my identity (ie, appending my surname Bizannes or linking it to my Facebook profile), it suddenly becomes “information”. Data is an object and information is generated when you create linkages between different types of data – the ‘relationships’.

Take this data definition from DMReview which defines data (and information):

Items representing facts, text, graphics, bit-mapped images, sound, analog or digital live-video segments. Data is the raw material of a system supplied by data producers and is used by information consumers to create information.

Data is an object and information is a relationship between data – I’ve studied database theory at university to be authoritative on that! But since I didn’t do philosophy, then what is knowledge?

Knowledge can be considered as the distillation of information that has been collected, classified, organized, integrated, abstracted and value added
(source)

Relationships, facts, assumptions, heuristics and models derived through the formal and informal analysis or interpretation of data
(source)

So in other words, knowledge is the application of information to a scenario. Whilst I apologise if it appears that I am splitting hairs, I think clarifying what these terms are is fundamental to the implementation of DataPortability. Why this is relevant will be seen below, but now we need to move on to what the second concept means.
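To make the three terms concrete, here is a minimal sketch in Python. The `Link` type, the `has_surname` relation and the `full_name` function are names I have made up for illustration; it is one way of modelling the distinction, not a formal definition.

```python
from dataclasses import dataclass
from typing import Optional

# Data: standalone objects that mean nothing on their own.
data = {"Elias", "Bizannes"}


@dataclass(frozen=True)
class Link:
    """Information: a relationship that connects two pieces of data."""
    subject: str
    relation: str
    obj: str


# Linking the two pieces of data generates information.
information = [Link("Elias", "has_surname", "Bizannes")]


def full_name(first_name: str, links: list) -> Optional[str]:
    """Knowledge: applying information to a scenario (working out a full name)."""
    for link in links:
        if link.subject == first_name and link.relation == "has_surname":
            return f"{first_name} {link.obj}"
    return None  # With data alone and no relationships, nothing can be concluded.


print(full_name("Elias", information))
print(full_name("Elias", []))
```

With the relationship in place the name resolves to a person; with the same data and no relationships, the function has nothing to work with.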

Portability
On first interpretation, portability means the ability to move something – exporting and importing. I think we shouldn’t take the ability to move data around as the sole definition of portability; it should also mean being able to port the context in which data is used. After all, information and knowledge are based on the manipulation of data, and you don’t need to move data per se but merely change the context to do that. A vendor can add value for a consumer by building unique relationships between data and giving unique application to other scenarios – where the original data is stored is irrelevant as long as it’s accessible.

Portability to me means a person needs to have the ability to determine where their data is used. But to do that, they need control over that data – which means determining how it is used. Yet there is little point being able to determine how your data is used, if you can’t determine who can access your data. Therefore, the concept of portability invokes an understanding of what exactly control and accessibility means.

So discussing portability requires us to also understand what data control and data accessibility really mean. You can’t “port” something unless you control it; and you can’t “control” something if you can’t determine who can “access” it. As I state, as long as the data is accessible, the location of it can be on the moon for all I care: for the concept of portability by context to exist, we must ensure as a condition that the data is open to access.
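Here is a minimal sketch of that chain of port, control and access, written in Python. The `HostedData` class and the actor names are hypothetical; the point is only that the data never leaves its host, and what moves is the permission to use it.

```python
class HostedData:
    """Data that stays with one host; its owner decides who else may access it."""

    def __init__(self, owner: str, value: str) -> None:
        self.owner = owner
        self._value = value
        self._readers = {owner}

    def grant(self, actor: str, reader: str) -> None:
        """Control: only the owner can open the data up to someone else."""
        if actor != self.owner:
            raise PermissionError("only the owner can grant access")
        self._readers.add(reader)

    def revoke(self, actor: str, reader: str) -> None:
        """Control also means being able to withdraw access later."""
        if actor != self.owner:
            raise PermissionError("only the owner can revoke access")
        self._readers.discard(reader)

    def read(self, reader: str) -> str:
        """Access: the data is used in a new context without being moved."""
        if reader not in self._readers:
            raise PermissionError(f"{reader} has not been granted access")
        return self._value


# Hypothetical names, for illustration only.
purchases = HostedData(owner="elias", value="books about penguins")
purchases.grant("elias", "movie_service")
print(purchases.read("movie_service"))
```

Nothing is exported or imported here: "porting" is the owner granting another service access to data that stays where it is.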

Ownership
Now here is where it gets complicated: who owns what? Maybe the conversation should come to who owns the information and knowledge generated from that data. Data on its own potentially doesn’t belong to anyone. My name “Elias” is shared by millions of other people in the world. Whilst I may own my identity, of which my name is a representation, is it fair to say I own the name “Elias”? On the flip side, if a picture I took is considered data – I think it’s fair to say I “own” that piece of data.

Information, on the other hand, requires a bit of work to create. Therefore, the generator of that information should get ownership. However, when we start applying this concept to something like a social relationship, it gets a bit tricky. If I add a friend on Facebook, and they accept me, who “owns” that relationship? Effectively both of us – so we become joint partners in ownership of that piece of information. If I was to add someone as a friend on MySpace, they don’t necessarily have to reciprocate – therefore it’s a one way relationship. Does that mean I own that information?

This is when the concept of privacy comes in. If I am generating information about someone, am I entitled to it? If someone owns the underlying data I used to generate that information – then it would be fair to say, I am “licensing” usage of that data to generate information which de-facto is owned by them. But privacy as a concept and in the legislation of many countries doesn’t work like that. Privacy is even a right alongside other basic rights like freedom of expression and religion in the constitution of Iraq (Article 17). So what’s privacy in the context of information that relates to someone’s identity?

Perhaps we should define privacy as the right to control information that represents an entity’s identity (being a person or legal body). Such a definition ties in with defamation law for example, and the principle of privacy: you have control over what’s been said about you, as a fundamental human right. But yet again, I’ve just opened up a can of worms: what is “identity”? Maybe the Identity commons people can answer that? Would it be fair to say that in the context of an “identity”, an entity like a person ‘owns’ that? So when it comes to information relating to someone’s identity, do we override it with this human right to privacy as to who owns that information, regardless of who generated that information?

This posting is a question, rather than an answer. When we say we want “data portability”, we need to be clear what exactly this means. Companies, I believe, are slightly afraid of DataPortability because they think they will lose something, which is not true. Companies’ commercial interests are something I am very mindful of when we have these discussions, and I will ensure with my involvement that DataPortability pioneers not some unrealistic ideal but a genuine move forward in business thinking. It needs to be clear what constitutes ownership and of what, so we can design a blueprint that accounts for users’ data rights without ruining the business models of companies that rely on our data.

Which brings me to my question – “who owns what”?

Control doesn’t necessarily mean access

I was approached by multiple people – PR professionals and journalists alike – after I gave my presentation at the kickstart forum yesterday. Whilst I doubt DataPortability is something they will pick up on for feature stories given the product focus these journalists have, the conversations with them were extremely encouraging and I am thankful to get their feedback.

One conversation particularly stood out for me, which was with John Hepworth – a former engineer who has been freelance writing for over 20 years – and it was in the context of the ability to port your health information. I’ve been thinking a lot about the scenario whereby consumers can move their health records from clinics, and with Google Health launching and the discussions in the DataPortability forums I am certainly not alone. Something that caught my attention was an interesting perspective recently posted by Deepak Singh: we shouldn’t give users access to their health records, because they will make uninformed judgments if they have control of them. That’s an excellent point, but one which prickles the whole issue of not just who owns your data, but who should have access to it (including yourself).

Hepworth provided a simple but extremely insightful position to this issue: you don’t need to give users the ability to see their data, for them to control it. Brilliant!

The benefits of controlling your data need to be looked at not just in the context of the laws of a country, but in terms of the net benefit it provides to an individual. Comments provided by your physicians in your medical history, whilst their ownership deserves to be given to the individual they are about, also need to be accessible to people who are qualified to make educated judgments. In other words, you should have the right to port your data to another doctor, but you should only have access to it in the presence of a qualified doctor.

DataPortability should not equate to you seeing your data all the time – rather it should be about determining how it gets used by others.
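A minimal sketch of Hepworth's point in Python follows. The `HealthRecord` class, its method names and the people in it are all hypothetical; it only illustrates that deciding who may use a record is a separate ability from reading it yourself.

```python
from typing import Optional


class HealthRecord:
    """A record the patient controls but cannot read on their own."""

    def __init__(self, patient: str, notes: str) -> None:
        self.patient = patient
        self._notes = notes
        self._doctors = set()

    def port_to(self, actor: str, doctor: str) -> None:
        """Control: only the patient decides which doctors may use the record."""
        if actor != self.patient:
            raise PermissionError("only the patient can port the record")
        self._doctors.add(doctor)

    def read(self, reader: str, with_doctor: Optional[str] = None) -> str:
        """Access: doctors chosen by the patient, or the patient alongside one of them."""
        if reader in self._doctors:
            return self._notes
        if reader == self.patient and with_doctor in self._doctors:
            return self._notes
        raise PermissionError("access requires an authorised doctor")


# Hypothetical people, for illustration only.
record = HealthRecord(patient="elias", notes="physician comments")
record.port_to("elias", "dr_jones")
print(record.read("dr_jones"))
print(record.read("elias", with_doctor="dr_jones"))
```

The patient never gets a standalone read, yet nobody else can read the record without the patient's say-so.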

My presentation at Kickstart forum

I’m currently at Kickstart forum (along with the Mickster), and I just gave a presentation on DataPortability to a bunch of Aussie journalists. I didn’t write a speech, but I did jot down some points on paper before I spoke, so I thought I might share them here given I had a good response.

My presentation had three aspects: background, explanation, and implications of DataPortability. Below is a summary of what I said

Background

  • Started by a bunch of Australians and a few other people overseas in November 2007 out of a chatroom. We formed a workgroup to explore the concept of social network data portability
  • In January 2008, Robert Scoble had an incident, which directed a lot of attention to us. As a consequence, we’ve seen major companies such as Google, Microsoft, Yahoo, Facebook, Six Apart, LinkedIn, Digg, and a host of others pledge support for the project.
  • We now have over 1000 people contributing, and have the support of a lot of influential people in the industry who want us to succeed.

Explanation

  • The goal is to not invent anything new. Rather, it’s to synthesise existing standards and technologies, into one blueprint – and then we push it out to the world under the DataPortability brand
  • When consumers see the DataPortability brand, they will know it represents certain things – similar to how users recognise the Centrino brand represents Intel, mobility, wireless internet, and a long battery life. The brand is to communicate some fundamental things about a web service, which will allow a user to recognise that a supporting site respects its users’ data rights and provides certain functionality.
  • Analogy of zero-networking: before the zeroconf initiative it was difficult to connect to the internet (wirelessly). Due to the standardisation of policies, we can now connect on the internet wirelessly at the click of a button. The consequence of this is not just a better consumer experience, but the enablement of future opportunities such as what we are seeing with the mobile phone. Likewise, with DataPortability we will be able to connect to new applications and things will just “work” – and it will see new opportunity for us
  • Analogy of the bank: I stated how the attention economy is something we give our attention to ie, we put up with advertising, and in return we get content. And that the currency of the attention economy is data. With DataPortability, we can store our data in a bank, and via “electronic transfer”, we can interact with various services controlling the use of that data in a centralised manner. We update our data at the bank, and it automatically synchronises with the services we use ie, automatically updating your Facebook and MySpace profiles
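The bank analogy can be sketched in a few lines of Python. The `DataBank` and `Service` classes are hypothetical stand-ins, and real services would of course sit behind network APIs rather than in-memory objects; this only shows the "update once, synchronise everywhere" idea.

```python
class Service:
    """A web service that keeps a copy of the profile it has been sent."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.profile = {}

    def receive(self, profile: dict) -> None:
        self.profile = profile


class DataBank:
    """One central place where the user updates their data."""

    def __init__(self) -> None:
        self._profile = {}
        self._services = []

    def connect(self, service: Service) -> None:
        """The user chooses which services the bank keeps in sync."""
        self._services.append(service)
        service.receive(dict(self._profile))

    def update(self, **fields: str) -> None:
        """Update once at the bank; every connected service receives a fresh copy."""
        self._profile.update(fields)
        for service in self._services:
            service.receive(dict(self._profile))


# Hypothetical service objects, for illustration only.
bank = DataBank()
facebook, myspace = Service("Facebook"), Service("MySpace")
bank.connect(facebook)
bank.connect(myspace)
bank.update(city="Sydney")
print(facebook.profile, myspace.profile)
```

Each service receives its own copy of the profile, so a change made at the bank is the only edit the user ever has to make.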

Implications

  1. Interoperability: When diverse systems and organisations work together. A DataPortability world will allow you to use your data generated from other sites ie, if you buy books on Amazon about penguins, you can get movie recommendations on your pay TV movie catalog for penguins. Things like the ability to log in across the web with one sign-on, creates a self-supporting ecosystem where everyone benefits.
  2. Semantic web: I gave an explanation of the semantic web (which generated a lot of interest afterwards in chats), and then I proceeded to explain that the problem for the semantic web is there hasn’t been this uptake of standards and technologies. I said that when a company adopts the DataPortability blueprint, they will effectively be supporting the semantic web – and hence enabling the next phase of computing history
  3. Data rights: I claimed the DataPortability project is putting data rights in the spotlight, and it’s an issue that has generated interest from other industries like the health and legal sectors, and not just the Internet sector. Things like what is privacy, and what exactly does my “data” mean. DataPortability is creating a discussion on what this actually means
  4. Wikiocracy: I briefly explained how we are doing a social experiment with a new type of governance model, which can be regarded as an evolution of the open source model. It is “decentralised” and “non-hierarchical”, and with time it will become more evident what we are trying to do

Something that amused me was in the sessions I had afterwards when the journalists had a one-on-one session with me, one woman asked: “So why are you doing all of this?”. I said it was an amazing opportunity to meet people and build my profile in the tech industry, to which she concluded: “you’re doing this to make history, aren’t you?”. I smiled 🙂