Frequent thinker, occasional writer, constant smart-arse

Tag: apml

Blog posts on Liako.Biz for 2007

Continued on – a series of posts that summarises content created on Liako.Biz

You can also read 2008 and 2005 summaries.

So open it’s closed

In 2008 the DataPortability Project successfully promoted the concept of “data portability”. However, it has become too successful – people now make announcements that claim to be “data portability” when, misleadingly, they are not. Further, the term “Open” has become the new black. But really, when people say they are open – are they?

Status update on the DataPortability Project & context
The DataPortability Project has now developed a strong, transparent governance model for making decisions, which embeds a process to achieve outcomes. We have also formulated our vision, which forms the core DNA of the Project and allows us to align our efforts. Organisationally, we are currently working on a legal entity to protect our online community, and we are doing this whilst also ensuring we are working with others in the industry, such as through the discussions we’ve had within the IDTBD proposal with Liberty Alliance, Identity Commons and others.

Our brand communications are nearly finalised (this time, legally vetted), and a refreshed website with a new blog has been rolled out. We’ve put out calls for positions and have already finalised our agreement with a new community manager. (Now open are positions for our analyst roles if you are interested.)

We have a Health Care task force that’s just started, looking to broaden our work into another sector of the economy. We also have a Service Provider Grid task force finalising its work which, via an online interface and API, will allow people to query what various entities use in terms of open standards. We also have a task force that will provide sample EULA and ToS documents that encourage data portability and further our vision.

The DataPortability vision states that people should be able to reuse their data. Traditionally, people have said this means “physically” porting their data amongst web services. Whilst this applies in some cases, it is also about access, as I recently argued.

So to synchronise with our work on the EULA/ToS task force, I believe we need a technology equivalent, which will also give additional value to our Service Provider Grid. This is because Open Standards comply with our vision, and we need to ensure we only support efforts that we believe are worthy.

Hi, I’m open
Open Standards have been a core value that the DataPortability Project has advocated since its founding, to the point where it’s even been confused with our core mission (it’s not). For us, they are an enabler – and it has always been in our interest to see all of them work together.

Standards are important because they allow interoperability. For people to be able to access their data from multiple systems, we require systems to be able to easily communicate with each other. Likewise, for people to get value from any data they export from a system, they need to be able to import it – and this can only occur if the data is structured in a way that is compatible with another system.

We advocate “Open” because we want to minimise the cost to businesses that want to comply with our vision. However, during 2008 the term "Open" Standards has been over-used, to the point of abuse.

An open standard is a standard that is publicly available and has various rights of use associated with it. But really, what’s open?
– its availability?
– the authority controlling the standard?
– the decision making process over the standard?

Liberty Alliance defines it as:

– The costs for the use of the standard are low.
– The standard has been published.
– The standard is adopted on the basis of an open decision-making procedure.
– The intellectual property rights to the standard are vested in a not-for-profit organisation, which operates a completely free access policy.
– There are no constraints on the re-use of the standard.

That, I believe, perfectly encapsulates what an Open Standard should be. However, as someone who spends his days applying international accounting standards to what companies report in their financials, I can assure you that simply flagging the criteria is only half the fun. Interpreting them is a whole debate in itself.

In my eyes, most of these "open" efforts don’t fit those criteria. To illustrate, I am going to shame myself as I am a member of a workgroup that claims to be open: the APML workgroup. The group fails the open test because:
– it has a closed workgroup that makes the decisions, without a clearly defined decision making procedure
– it does not have a non-profit behind it, with the copyright owned by a company (although it’s made clear there is no intention to issue patents)
– it has no clear rights attached to it

So does that mean every standard group needs to create a legal entity for it to be open? Thankfully no – the Open Web Foundation (OWF) will solve this problem. Or does it? Whilst the decision making process is "open" (you can read the mailing list where the discussion occurs), what about the way it selects members? It’s dependent on being invited. That’s Open with a big But.

How about OpenID (which I am also a member of) – that poster child for "Open Standards". On the face of it, it fits the bill. But did you know OpenID contains other standards as part of it? As my friend and intellectual mentor Steve Greenberg said:

[Image: Steve Greenberg’s comment on OpenID and XRDS]

Now thankfully, XRDS fits the bill as a safe standard. Well, kind of. It has links to another standard, XRI, which it is alleged is subject to patent claims. Well, sort of. Kinda. Oh God, let’s not get into a discussion about this again. But don’t give poor APML, the OWF or OpenID too much grief – I could indeed raise some nastier questions, especially of other groups. However, this isn’t about shaming – rather, it’s about raising questions.

The standards communities are fraught with politics, they are murky, and now they are creeping into the infrastructure of our online world. As a proponent for these "Open Standards", I think it’s time we start looking at them with a more critical eye. Yes, I recognise all these questions I’m raising are fixable, but that’s why I want to raise the point, because they are currently being swept under the carpet outside of the traditional authorities like the W3C.

It’s time some boundaries were set on what is effectively the brand of Open. It’s also time the term is defined, because quite frankly, it’s lost all meaning now. I’ve listed some criteria – but what we really need is some consensus on what ‘the’ criteria for Open should be.

How Google reader can finally start making money

Today, you would have heard that Newsgator, Bloglines, Me.dium, Peepel, Talis and Ma.gnolia have joined the APML workgroup and are in discussions with workgroup members on how they can implement APML into their product lines. Bloglines created some news the other week on their intention to adopt it, and the announcement today about Newsgator means APML is now fast becoming an industry standard.

Google, however, is still sitting on the sidelines. I really like using Google reader, but if they don’t announce support for APML soon, I will have to switch back to my old favourite Bloglines, which is doing some serious innovating. Seeing as Google reader came out of beta recently, I thought I’d help them out to finally add a new feature (APML) that will see it generate some real revenue.

What a Google reader APML file would look like
Read my previous post on what exactly APML is. If the Google reader team was to support APML, what they could add to my APML file is a ranking of blogs, authors, and key-words. First an explanation, and then I will explain the consequences.

In terms of blogs I read, the percentage of a particular blog’s posts that I read will determine its relevancy score in my APML file. So if I was to read 89% of Techcrunch posts – which is information already provided to users – it would convert this into a relevancy score for Techcrunch of 89% or 0.89.
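To make the read-percentage idea concrete, here is a minimal sketch in Python. The function name and the input dictionaries are my own assumptions for illustration; nothing here reflects how Google Reader actually stores its reading statistics.

```python
def read_ratio_scores(posts_published, posts_read):
    """Map each blog to the fraction of its posts the user actually read.

    Both arguments are hypothetical dicts of blog name -> post count.
    """
    scores = {}
    for blog, published in posts_published.items():
        if published == 0:
            continue  # nothing published, so nothing to score
        read = posts_read.get(blog, 0)
        # Cap at the published count so the score never exceeds 1.0
        scores[blog] = round(min(read, published) / published, 2)
    return scores


published = {"Techcrunch": 100, "Valleywag": 40}
read = {"Techcrunch": 89, "Valleywag": 2}
print(read_ratio_scores(published, read))
```

Reading 89 of 100 Techcrunch posts gives the 0.89 score from the example above; a blog I mostly skip ends up near zero.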

[Image: APML: pulling rank]

In terms of authors I read, it can extract who posted the entry from the individual blog postings I read, and like the blog ranking above, perform a similar procedure. I don’t imagine it would be too hard to do this; however, given it’s a small team running the product, I would put this on a lower priority to support.

In terms of key-words, Google could employ its contextual analysis technology from each of the postings I read and extract key words. By performing this on each post I read, the frequency of extracted key words determines the relevance score for those concepts.

So that would be the how. The APML file generated from Google Reader would simply rank these blogs, authors, and key-words – and the relevance scores would update over time. The data is periodically indexed and re-calculated from scratch, so as concepts stop being viewed, they start to diminish in value until they drop off.
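A rough sketch of the key-word scoring could look like the following. It assumes the key words have already been extracted from each post (the extraction itself is the hard part and is not shown), and the normalisation against the most frequent key word is my own choice, not something the APML spec dictates.

```python
from collections import Counter


def keyword_scores(recent_posts_keywords, top_n=5):
    """Score key words by how many recently read posts mention them.

    recent_posts_keywords: one list of extracted key words per post read.
    Scores are recalculated from scratch on every call, so a concept that
    no longer shows up in the recent posts simply drops out.
    """
    counts = Counter()
    for keywords in recent_posts_keywords:
        counts.update(set(keywords))  # count each key word once per post
    if not counts:
        return {}
    highest = max(counts.values())
    return {word: round(n / highest, 2) for word, n in counts.most_common(top_n)}


recent = [
    ["apml", "privacy"],
    ["apml", "rss"],
    ["apml", "privacy", "google"],
]
print(keyword_scores(recent))
```

Because only recent posts are passed in, the "drop off" behaviour comes for free: stop reading about a concept and it disappears the next time the scores are rebuilt.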

What Google reader can do with that APML file
1. Ranking of content
One of the biggest issues facing consumers of RSS is information overload. I am quite confident that people would pay a premium for any attempt to help rank what can be hundreds of items per day that a user needs to read. By having an APML file, over time Google Reader can match postings to a user’s ranked interests. So rather than presenting the content in reverse chronological order (most recent to oldest), it can instead organise content by relevancy (items of most interest to least).

This won’t reduce the amount of RSS consumption by a user, but it will enable them to know how to allocate their attention to content. There are a lot of innovative ways you can rank the content, down to the way you extract key words and rank concepts, so there is scope for competing vendors to have their own methods. However the point is, a feature to ‘Sort by Personal Relevance’ would be highly sought after, and I am sure quite a few people will be willing to pay the price for this godsend.
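As one illustration of what ‘Sort by Personal Relevance’ might do, the sketch below sums the concept scores of each item’s key words and sorts on that total. The item structure and the summing rule are assumptions of mine; as noted, every vendor could rank differently.

```python
def sort_by_relevance(items, concept_scores):
    """Order feed items from most to least relevant.

    items: dicts with a "title" and a list of "keywords" (hypothetical shape),
           assumed to arrive in reverse chronological order.
    concept_scores: concept -> score between -1.0 and 1.0.
    """
    def relevance(item):
        # Unknown key words contribute nothing either way
        return sum(concept_scores.get(word, 0.0) for word in item["keywords"])

    # sorted() is stable, so equally relevant items keep their recency order
    return sorted(items, key=relevance, reverse=True)


inbox = [
    {"title": "Celebrity gossip roundup", "keywords": ["gossip"]},
    {"title": "New RSS reader launches", "keywords": ["rss"]},
    {"title": "APML and privacy", "keywords": ["apml", "privacy"]},
]
scores = {"apml": 1.0, "privacy": 0.67, "rss": 0.33, "gossip": -1.0}
for item in sort_by_relevance(inbox, scores):
    print(item["title"])
```

Negative scores do useful work here: anything tagged with a concept I dislike sinks to the bottom rather than merely being neutral.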

I know Google seems to think contextual ads are everything, but maybe the Google Reader team can break from the mould and generate a different revenue stream through a value add feature like that. Google should apply its contextual advertising technology to determine key words for filtering, not advertising. It can use this pre-existing technology to generate a different revenue stream.

2. Enhancing its AdSense programme

[Image: Targeted advertising is still bloody annoying]

One of the great benefits of APML is that it creates an open database about a user. Contextual advertising, in my opinion, is actually a pretty sucky technology and its success to date is only because all the other types of targeted advertising models are flawed. As I explain above, the technology should instead be used to better analyse what content a user consumes, through keyword analysis. Over time, a ranking of these concepts can occur – as well as being shared from other web services that are doing the same thing.

An APML file that ranks concepts is exactly what Google needs to enhance its AdWords technology. Don’t use it to analyse a post to show ads; use it to analyse a post to rank concepts. Then, in aggregate, the contextual advertising will work because it can be based off this APML file with great precision. And even better, a user can tweak it – which will be the equivalent of tweaking what advertising a user wants to get. The transparency of a user being able to see what ‘concept ranking’ you generate for them is powerful, because a user is likely to monitor it to make sure it is accurate.

APML is contextual advertising’s biggest friend, because it profiles a user in a sensible way that can be shared across applications and monitored by the user. Allowing a user to tweak their APML file in return for more targeted content aligns their self-interest with ensuring that the targeted ads thrown at them, based on those ranked concepts, are in fact relevant.

3. Privacy credibility
Privacy is the inflation of the attention economy. You can’t proceed to innovate with targeted advertising technology, whilst ignoring privacy. Google has clearly realised this the hard way by being labeled one of the worst privacy offenders in the world. By adopting APML, Google will go a long way to gain credibility in privacy rights. It will be creating open transparency with the information it collects to profile users, and it will allow a user to control that profiling of themselves.

APML is a very clever approach to dealing with privacy. It’s not the only approach, but it is one of the most promising. Even if Google never uses an APML file as I describe above, the pure brand-enhancing value of giving its users some control over their rightful attention data is something that alone would benefit the Google Reader product (and Google’s reputation itself) if they were to adopt it.

[Image: Privacy. Stop looking.]

Conclusion
Hey Google – can you hear me? Let’s hope so, because you might be the market leader now, but so was Bloglines once upon a time.

Explaining APML: what it is & why you want it

Lately there has been a lot of chatter about APML. As a member of the workgroup advocating this standard, I thought I might help answer some of the questions on people’s minds. Primarily – “what is an APML file”, and “why do I want one”. I suggest you read the excellent article by Marjolein Hoekstra on attention profiling that she recently wrote, if you haven’t already done so, as an introduction to attention profiling. This article will focus on explaining what the technical side of an APML file is and what can be done with it. Hopefully by understanding what APML actually is, you’ll understand how it can benefit you as a user.

APML – the specification
APML stands for Attention Profile Markup Language. It’s an attention economy concept, based on the XML technical standard. I am going to assume you don’t know what attention means, nor what XML is, so here is a quick explanation to get you on board.

Attention
There is this concept floating around on the web about the attention economy. It means as a consumer, you consume web services – e-mail, RSS readers, social networking sites – and you generate value through your attention. For example, if I am on a MySpace band page for Sneaky Sound System, I am giving attention to that band. Newscorp (the company that owns MySpace) is capturing that implicit data about me (ie, it knows I like Electro/Pop/House music). By giving my attention, Newscorp has collected information about me. Implicit data are things you give away about yourself without saying it, like how people can determine what type of person you are purely off the clothes you wear. That is different from explicit data – information you knowingly give up about yourself (like your gender when you signed up to MySpace).

[Image: I know what you did last Summer]

XML
XML is one of the core standards on the web. The web pages you access are probably using a form of XML to provide the content to you (XHTML). If you use an RSS reader, it pulls a version of XML to deliver that content to you. I am not going to get into a discussion about XML because there are plenty of other places that can do that. However, I just want to make sure you understand that XML is a very flexible way of structuring data. Think of it like a street directory. A map with no street names is useless if you are trying to find a house. But a map with the street names suddenly becomes a lot more useful, because you can make sense of the houses (the content). It’s a way of describing a piece of content.

APML – the specification
So all APML is, is a way of converting your attention into a structured format. The way APML does this, is that it stores your implicit and explicit data – and scores it. Lost? Keep reading.

Continuing with my example about Sneaky Sound System. If MySpace supported APML, they would identify that I like pop music. But just because someone gives attention to something, that doesn’t mean they really like it; the thing about implicit data is that companies are guessing because you haven’t actually said it. So MySpace might say I like pop music but with a score of 0.2 or 20% positive – meaning they’re not too confident. Now let’s say directly after that, I go onto the Britney Spears music space. Okay, there’s no doubting now: I definitely do like pop music. So my score against “pop” is now 0.5 (50%). And if I visited the Christina Aguilera page: forget about it – my APML rank just blew to 1.0! (Note that the scoring system is a percentage, with a range from -1.0 to +1.0, or -100% to +100%.)
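The pop music example can be played out in a few lines of Python. To be clear, the increments (0.2, 0.3, 0.5) are just the numbers from my example; how an application weighs each page visit is entirely up to that application, and the function names here are hypothetical. The only firm rule is the range: a score must stay between -1.0 and +1.0.

```python
def clamp(score, low=-1.0, high=1.0):
    """Keep a score inside the -1.0 to +1.0 range."""
    return max(low, min(high, score))


def record_attention(profile, concept, weight):
    """Add some evidence for (or against) a concept and return the new score."""
    updated = round(profile.get(concept, 0.0) + weight, 2)
    profile[concept] = clamp(updated)
    return profile[concept]


profile = {}
record_attention(profile, "pop", 0.2)  # Sneaky Sound System page
record_attention(profile, "pop", 0.3)  # Britney Spears page
record_attention(profile, "pop", 0.5)  # Christina Aguilera page
record_attention(profile, "pop", 0.5)  # more visits cannot push past the cap
print(f"{profile['pop']:+.0%}")
```

A negative weight works the same way in the other direction, which is how a score can slide towards -1.0.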

APML ranks things, but the concepts are not just things: it will also rank authors. In the case of Marjolein Hoekstra, who wrote that post I mention in my intro, because I read other things from her it means I have a high regard for her writing. Therefore, my APML file gives her a high score. On the other hand, I have an allergic reaction whenever I read something from Valleywag because they have cooties. So Marjolein’s rank would be 1.0 but Valleywag’s -1.0.

Aside from the ranking of concepts (which is the core of what APML is), there are other things in an APML file that might confuse you when reviewing the spec. “From” means ‘from the place you gave your attention’. So with the Sneaky Sound System concept, it would be ‘from: MySpace’. It’s simply describing the name of the application that added the implicit node. Another thing you may notice in an APML file is that you can create “profiles”. For example, the concepts about me in my “work” profile are not something I want to mix with my “personal” profile. This allows you to segment the ranked concepts in your APML into different groups, allowing applications access to only a particular profile.

Another thing to take note of is ‘implicit’ and ‘explicit’, which I touched on above – implicit being things you give attention to (ie, the clothes you wear – people guess from what you wear that you are a certain personality type); explicit being things you gave away (the words you said – when you say “I’m a moron” it’s quite obvious you are). APML categorises concepts based on whether you explicitly said it, or it was implicitly determined by an application.
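To show how profiles, ‘from’, and the implicit/explicit split fit together, here is a simplified sample file and a small Python reader for it. The layout is modelled loosely on the public APML drafts, but treat the element and attribute names as assumptions for illustration and check the spec before relying on them; I have also left out the XML namespace and header details to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative APML-style document (not a validated APML file)
SAMPLE = """
<APML version="0.6">
  <Body defaultprofile="personal">
    <Profile name="personal">
      <ImplicitData>
        <Concepts>
          <Concept key="pop music" value="0.5" from="MySpace"/>
        </Concepts>
      </ImplicitData>
      <ExplicitData>
        <Concepts>
          <Concept key="privacy" value="1.0" from="me"/>
        </Concepts>
      </ExplicitData>
    </Profile>
    <Profile name="work">
      <ImplicitData>
        <Concepts>
          <Concept key="accounting standards" value="0.8" from="Bloglines"/>
        </Concepts>
      </ImplicitData>
    </Profile>
  </Body>
</APML>
"""


def concepts_for_profile(apml_text, profile_name):
    """Return (kind, concept, score, from) tuples for one named profile only."""
    root = ET.fromstring(apml_text.strip())
    rows = []
    for profile in root.iter("Profile"):
        if profile.get("name") != profile_name:
            continue  # an application only sees the profile it was given
        for kind in ("ImplicitData", "ExplicitData"):
            section = profile.find(kind)
            if section is None:
                continue
            for concept in section.iter("Concept"):
                rows.append((kind, concept.get("key"),
                             float(concept.get("value")), concept.get("from")))
    return rows


for row in concepts_for_profile(SAMPLE, "personal"):
    print(row)
```

An application handed only the “work” profile never sees my pop music habit, which is the whole point of segmenting.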

Okay, big whoop – what can APML do for me?
In my eyes, there are five main benefits of APML: filtering, accountability, privacy, shared data, and you being boss.

1) Filtering
If a company supports APML, they are using a smart standard that other companies use to profile you. By ranking concepts and authors, for example, they can use your APML file in the future to filter things that might interest you. As I have such a high ranking for Marjolein, when Bloglines implements APML, they will be able to use this information to start prioritising content in my RSS reader. Meaning, of the 1000 items in my Bloglines reader, all the blog postings from her will have more emphasis for me to read whilst all the ones from Valleywag will sit at the bottom (with last night’s trash).

2) Accountability
If a company is collecting implicit data about me and trying to profile me, I would like to see that information, thank you very much. It’s a bit like me wearing a pink shirt at a party. You meet me at a party, and think “Pink – the dude must be gay”. Now I am actually as straight as a doornail, and wearing that pink shirt is me trying to be trendy. However, what you have done is, by observation, profiled me. Now imagine if that was a web application, where this happens all the time. By being given access to your data – your APML file – you can change that. I’ve actually done this with Particls before, which supports APML. It had ranked some concepts as high based on things I had read, which was wrong. So what I did was change the score to -1.0 for one of them, because that way Particls would never show me content on something it wrongly thought I would like.
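Here is a small sketch of that kind of correction. It is not how Particls works internally (I have no idea how they implement it); the profile structure and function names are made up to show the principle that a user-set score of -1.0 overrides the application’s guess and hides matching content.

```python
def override_score(profile, concept, value):
    """Let the user replace an application's guess with their own score."""
    if not -1.0 <= value <= 1.0:
        raise ValueError("score must be between -1.0 and 1.0")
    profile[concept] = {"value": value, "source": "explicit"}


def visible_items(items, profile):
    """Drop any item tagged with a concept the user scored at -1.0."""
    blocked = {concept for concept, entry in profile.items()
               if entry["value"] <= -1.0}
    return [item for item in items if not blocked.intersection(item["keywords"])]


# The application guessed (wrongly) that I love dogs
profile = {"dogs": {"value": 0.9, "source": "implicit"}}
override_score(profile, "dogs", -1.0)

items = [
    {"title": "Puppy training tips", "keywords": ["dogs"]},
    {"title": "APML news", "keywords": ["apml"]},
]
print([item["title"] for item in visible_items(items, profile)])
```

Recording the source as explicit matters: it tells any application reading the file that this score came from me, not from a guess.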

3) Privacy
I joined the APML workgroup for this reason: it was, to me, a smart way to deal with the growing privacy issue on the web. It fits my requirements about being privacy compliant:

  • who can see information about you
  • when can people see information about you
  • what information they can see about you

The way APML does that is by allowing me to create ‘profiles’ within my APML file; allowing me to export my APML file from a company; and by allowing me to access my APML file so I can see what profile I have.

[Image: Here is my APML, now let me in. Biatch.]

4) Shared data
An APML file can, with your permission, share information between your web services. My concept rankings for books on Amazon.com can sit alongside my RSS feed rankings. What’s powerful about that is the unintended consequences of sharing that data. For example, if Amazon ranked what my favourite book genres were, this could be useful information to help me filter my RSS feeds by blog topic. The data generated in Amazon’s ecosystem can help me enjoy a product in another ecosystem, in a mutually beneficial way.
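As a sketch of what combining data from two services could look like, the function below averages the scores each service holds for the same concept. Averaging is just one assumption of mine for resolving overlaps (an application could equally weight services differently), and the service data shown is invented.

```python
def merge_profiles(profiles_by_service):
    """Combine concept scores from several services into one ranking.

    profiles_by_service: service name -> {concept: score}.
    Where services overlap on a concept, their scores are averaged.
    """
    collected = {}
    for service, concepts in profiles_by_service.items():
        for concept, value in concepts.items():
            collected.setdefault(concept, []).append(value)
    return {concept: round(sum(values) / len(values), 2)
            for concept, values in collected.items()}


shared = merge_profiles({
    "Amazon": {"science fiction": 0.9, "cooking": 0.2},
    "RSS reader": {"science fiction": 0.5},
})
print(shared)
```

The RSS reader never saw me buy a cookbook, yet after the merge it knows enough to nudge cooking blogs slightly up my list.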

5) You’re the boss!
By being able to generate APML for the things you give attention to, you are recognising the value your attention has – something companies already place a lot of value on. Your browsing habits can reveal useful information about your personality, and the ability to control your profile is a very powerful concept. It’s like controlling the image people have of you: you don’t want the wrong things being said about you. 🙂

Want to know more?
Check the APML FAQ. Otherwise, post a comment if you still have no idea what APML is. I, or one of the other APML workgroup members, would be more than happy to answer your queries.

Bloglines to support APML

Tucked away in a post by Bloglines, one of the leading RSS readers in the world, was an announcement that they will be investigating how they can implement APML into their service. The thing about standards is that, as fantastic as they are, if no one uses them, they are not a standard. Over the last year, dozens of companies have implemented APML support, and this latest announcement by a revitalised Bloglines team that is set to take back what Google took from them means we are going to be seeing a lot more innovation in an area that has largely gone unanswered.

The announcement has been covered by Read/WriteWeb and APML founders Faraday Media, and a thoughtful analysis has been done by Ross Dawson. Ben Metcalfe has also written a thought-provoking analysis of the merits of APML.

What this means?

APML is about taking control of data that companies collect about you. For example, if you are reading lots of articles about dogs, RSS readers can make a good guess you like dogs – and will tick the “likes dogs” box on the profile they build of you, which they use to determine advertising. Your attention data is anything you give attention to – when you click on a link within Facebook, that’s attention data that reveals things about you implicitly.

The big thing about APML is that it solves a massive problem when it comes to privacy. If you look at my definition of what constitutes privacy, the ability to control what data is collected with APML completely fits the bill. I was so impressed when I first heard about it, because it’s a problem I have been thinking about for years, that I immediately joined the APML workgroup.

Privacy is the inflation of the attention economy, and companies like Google are painfully learning about the natural tension between privacy and targeted advertising. (Targeted advertising being the thing that Google is counting on to fund its revenue.) The web has seen a lot of technological innovation, which has disrupted a lot of our culture and society. It’s time that the companies that are disrupting the world’s economies started innovating to answer the concerns of the humans that are using their services. Understanding how to deal with privacy is a key competitive advantage for any company in the Internet sector. It’s good to see some finally realising that.

I’m on the APML workgroup

As Chris announced, I’m now a member of the APML work-group. So the question is, why have I joined it? Because profiling is huge. People are only starting to get to grips with the loss of privacy on the web – I suppose an externality of an electronic world. I remember reading about some guy who posted on a marijuana bulletin board in 2000, and that it still comes up in Google searches. Prospective employers, prospective girlfriends, prospective anything – he now cannot control the information that he was once a pot head. It’s like someone watching you get changed, and you don’t have the option of pulling the curtain. Privacy is about giving you the choice to use that curtain – whether you’re an exhibitionist or not!

Something a lot of people aren’t aware of is the amount of data other companies are collecting – and you can’t control it. You reading this blog posting – I can find out what browser you have, what city you are viewing this from, who your Internet service provider is – heck, I even know what version of Windows you use. And I’m not even trying to profile you – think about Google or DoubleClick, which know of every website you visit by placing a cookie on your computer.

Why do people want to collect information about you, known as your “attention data”? Because they can profile you – and when you can profile someone, you can personalise the experience for them…and target their advertising better.

The APML standard does a very simple thing: it allows you to control your “attention”. It’s still early days, and although there are some smart people discussing some deep issues on it, everyone on the work-group is still feeling their way towards where this standard is going to go.

If you have thought about targeted advertising – and if you haven’t, you should – I would watch this standard. Or better still, start discussing it – this is a huge opportunity to set things right, before the Internet dominates our lives.