Tag Archive for 'rss'

Data portability allows mashup for Australian bush fire crisis

Last night in Australia, one of the states developed a series of bush fires that have ravaged communities - survivors describe it as "raining fire" that came out of nowhere. As I write this, up to 76 people have been killed.

Victorian AU Fires 2009
Dave Hollis says the sky looks like it does in the movie 'Independence Day'

An important lesson has come out of this. First, the good stuff.

Googler Pamela Fox has created an invaluable tool to display the bush fires in real time. Using Google technologies like App Engine and the Maps API (which she is the support engineer for), she's been able to create a mashup that helps the public.

She can do so because the Victorian Fire Department supports the open standard RSS. There are fires in my state of New South Wales as well, but as with other Fire Departments in Australia, there is no RSS feed to pull the data from (which is why you won't see any data on the map from there). It appears states like NSW do support RSS for updates, but it would be more useful if there was some consistency - refer to the discussion below about the standards.

For further information, you can read the Google blog post.

While the Fire Department's RSS allows the portability of the data, it doesn't have geocodes or a clear licence for use. That may not sound like a big deal, but the ability to contextualise a piece of information in this case matters a hell of a lot.

As a workaround, Pamela sent addresses through the Google geocoder to develop a database of addresses with latitude and longitude.
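To make the workaround concrete, here is a minimal sketch of building such an address table. It is not Pamela's code: the `geocode` callable is a stand-in for whatever geocoding service is used, and the address and coordinates in the demo are placeholders invented for illustration.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

LatLon = Tuple[float, float]


def build_geocode_table(
    addresses: Iterable[str],
    geocode: Callable[[str], Optional[LatLon]],
) -> Dict[str, LatLon]:
    """Look up each distinct address once and remember the result."""
    table: Dict[str, LatLon] = {}
    for address in addresses:
        key = address.strip().lower()
        if not key or key in table:
            continue  # skip blanks and addresses already resolved
        result = geocode(address)
        if result is not None:  # unresolved addresses are simply left out
            table[key] = result
    return table


# Stand-in geocoder (assumption): a real one would call a geocoding service.
def fake_geocode(address: str) -> Optional[LatLon]:
    known = {"1 example rd, sampletown vic": (-37.0, 145.0)}  # placeholder data
    return known.get(address.strip().lower())


table = build_geocode_table(
    ["1 Example Rd, Sampletown VIC", "1 example rd, sampletown vic ", "Unknown place"],
    fake_geocode,
)
```

Keeping the table keyed by a normalised address means repeated feed updates for the same location do not trigger repeated lookups.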

GeoRSS and KML
In the geo standards world, two dominant standards exist that enable the portability of data. One is GeoRSS, an extension to RSS that lets a feed carry geodata. The other is Keyhole Markup Language (KML), which was a standard developed by Google. GeoRSS simply modifies RSS feeds to be more useful, while KML is more like HTML: a markup language in its own right.
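As a rough illustration of how small the GeoRSS addition is, the sketch below attaches a GeoRSS-Simple point (latitude and longitude separated by a space) to an ordinary RSS item using Python's standard library. The item title and coordinates are placeholders, not data from any real feed.

```python
import xml.etree.ElementTree as ET

# Namespace used by GeoRSS-Simple elements such as <georss:point>.
GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)


def add_point(item: ET.Element, lat: float, lon: float) -> ET.Element:
    """Attach a GeoRSS-Simple point ("lat lon") to an RSS <item>."""
    point = ET.SubElement(item, f"{{{GEORSS_NS}}}point")
    point.text = f"{lat} {lon}"
    return point


item = ET.Element("item")
ET.SubElement(item, "title").text = "Example fire incident"  # placeholder title
add_point(item, -37.0, 145.0)  # placeholder coordinates

xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

A feed reader that does not understand the `georss` namespace just ignores the extra element, which is why extending RSS this way does not break existing subscribers.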

If the CFA and any other websites had supported either of these standards, it would have made life a lot easier. Pamela has access to Google resources to translate the information into a geocode and even she had trouble. (Geocoding the location data was the most time-consuming part of the map-making process.)

The lessons
1) If you output data, output it in some standard structured format (like RSS, KML, etc).
2) If you want that data to be useful for visualisation, include both time and geographic (latitude/longitude) information. Otherwise you're hindering the public's ability to use it.
3) Let the public use your data. The Google team spent some time ensuring they were not violating anything by using this data. Websites should be clearer about their rights of usage to enable mashers to work without fear.
4) Extend the standards. It would have helped a lot if the CFA site had extended their RSS with some custom elements (in their own namespace) for the structured data about the fires. Like for example <cfa:State>Get the hell out of here</cfa:State>.
5) Having all the Fire Departments use the same standards would have made a world of difference - build the mashup using one method and it can be immediately useful for future uses.
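
To show why lesson 4 pays off for whoever consumes the feed, here is a small sketch of reading custom namespaced elements out of an RSS feed. Everything specific in it is an assumption made for illustration: the namespace URI, the `cfa:State` and `cfa:Size` element names, and the sample values are invented, not taken from the real CFA feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element names, invented for this sketch.
NS = {"cfa": "http://example.org/cfa-feed"}

FEED = """<rss version="2.0" xmlns:cfa="http://example.org/cfa-feed">
  <channel>
    <title>Example incident feed</title>
    <item>
      <title>Example incident</title>
      <cfa:State>Going</cfa:State>
      <cfa:Size>Large</cfa:Size>
    </item>
  </channel>
</rss>"""


def read_incidents(feed_xml: str) -> list:
    """Return one dict per <item>, pulling the custom fields by qualified name."""
    root = ET.fromstring(feed_xml)
    incidents = []
    for item in root.iter("item"):
        incidents.append({
            "title": item.findtext("title"),
            "state": item.findtext("cfa:State", namespaces=NS),
            "size": item.findtext("cfa:Size", namespaces=NS),
        })
    return incidents


incidents = read_incidents(FEED)
```

With structured fields like these there is no need to scrape status text out of a free-form description, and the same few lines would work for any department publishing the same elements.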

Pamela tells me that this is the fifth natural disaster she's dealt with. Every time, there's been an issue of where to get the data and how to syndicate it. Data portability matters most for natural disasters - people don't have time to deal with scraping HTML (didn't we learn this with Katrina?).

Let's be prepared for the next time an unpredictable crisis like this occurs.

Liako is everywhere…but not here

Life's been busy, and this blog has been neglected. Not a bad thing - a bit of life-living, work-smacking, exposure to new experiences, and active osmosis from the things I am involved in - is what makes me generate the original perspectives I try to create on this blog.

However, to my subscribers (Hi Dad!), let this post make it up to you with some content I've created elsewhere.

You already know about the first podcast I did with the Perth baroness Bronwen Clune and the only guy I know who can pull off a mullet, Mike Cannon-Brookes of Atlassian. Here's a recap of some other episodes I've done:

  • Episode two: ex-PwC boy Matthew Macfarlane talks to me (a current PwC boy) and Bronwen in his new role as partner of the newly created investment fund Yuuwa Capital. He told us what he's looking for in startups, as he's about to spend $40 million on innovative ones!
  • Episode three: marketing guru Steve Sammartino tells us about building a business and his current startup Rentoid.com.
  • Episode four: experienced entrepreneur Martin Hosking shares lessons and insight with us, whilst talking about his social commerce art service Red Bubble.
  • Episode five: "oh-my-God-that-dude-from-TV!" Mark Pesce joins us in discussing that filthy government filter to censor the Internet.
  • Episode six: ex-Fairfax Media strategist Rob Antulov tells us about 3eep - a social networking solution for the amateur and semi-professional sports world.

I've also put my data portability hat on beyond mailing list arguments and helped out a new social media service called SNOBS - a Social Network for Opportunistic Business women - with a beginner's guide to RSS. You might see me contribute there in future, because I love seeing people pioneer New Media and think Carlee Potter is doing an awesome job - so go support her!

Over and out - regular scheduling to resume after this...

How Google Reader can finally start making money

Today, you would have heard that Newsgator, Bloglines, Me.dium, Peepel, Talis and Ma.gnolia have joined the APML workgroup and are in discussions with workgroup members on how they can implement APML into their product lines. Bloglines created some news the other week on their intention to adopt it, and the announcement today about Newsgator means APML is now fast becoming an industry standard.

Google, however, is still sitting on the sidelines. I really like using Google Reader, but if they don't announce support for APML soon, I will have to switch back to my old favourite Bloglines, which is doing some serious innovating. Seeing as Google Reader came out of beta recently, I thought I'd help them out to finally add a new feature (APML) that will see it generate some real revenue.

What a Google Reader APML file would look like
Read my previous post on what exactly APML is. If the Google Reader team was to support APML, what they could add to my APML file is a ranking of blogs, authors, and key-words. First an explanation, and then I will explain the consequences.

In terms of blogs I read, the percentage of posts I read from a particular blog will determine the relevancy score in my APML file. So if I was to read 89% of Techcrunch posts - which is information already provided to users - it would convert this into a relevancy score for Techcrunch of 89% or 0.89.
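
The arithmetic here is just the fraction of a blog's posts that were read. A minimal sketch, assuming the reader already tracks how many posts a blog published and how many of them were opened (the function name and the rounding to two decimals are my own choices):

```python
def source_relevance(posts_read: int, posts_published: int) -> float:
    """Fraction of a blog's posts that were read, as a score between 0 and 1."""
    if posts_published <= 0:
        return 0.0  # nothing published yet, so nothing to score
    # Cap at the published count so bad input cannot push the score above 1.
    read = max(0, min(posts_read, posts_published))
    return round(read / posts_published, 2)


techcrunch_score = source_relevance(89, 100)
```

Reading 89 of 100 posts gives the 0.89 used in the example above.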

APML: pulling rank

In terms of authors I read, it can extract who posted the entry from the individual blog postings I read, and like the blog ranking above, perform a similar procedure. I don't imagine it would be too hard to do this; however, given it's a small team running the product, I would put this on a lower priority to support.

In terms of key-words, Google could employ its contextual analysis technology from each of the postings I read and extract key words. By performing this on each post I read, the frequency of extracted key words determines the relevance score for those concepts.

So that would be the how. The APML file generated from Google Reader would simply rank these blogs, authors, and key-words, and the relevance scores would update over time. The data is periodically re-indexed and re-calculated from scratch, so as concepts stop being viewed, they diminish in value until they drop off.
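
One way the key-word scoring could work is sketched below. It assumes key words have already been extracted for each post read (the extraction itself is out of scope here), counts how many recent posts mention each key word, and recomputes every score from scratch over a sliding window so that concepts no longer being read fall away. The window size and the normalisation against the most frequent key word are illustrative choices, not anything Google Reader or APML prescribes.

```python
from collections import Counter
from typing import Dict, List, Sequence


def concept_scores(read_posts: Sequence[List[str]], window: int = 100) -> Dict[str, float]:
    """Score key words by how many of the last `window` read posts mention them."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = read_posts[-window:]  # older posts no longer contribute
    counts = Counter(
        kw for post in recent for kw in {k.lower() for k in post}
    )
    if not counts:
        return {}
    top = max(counts.values())
    # Normalise so the most frequent concept scores 1.0.
    return {kw: round(n / top, 2) for kw, n in counts.items()}


history = [["RSS", "APML"], ["apml"], ["privacy"]]
all_scores = concept_scores(history)
latest_only = concept_scores(history, window=1)
```

With the full history "apml" ranks highest, while with a window of one post only "privacy" remains, which is the drop-off behaviour described above.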

What Google Reader can do with that APML file
1. Ranking of content
One of the biggest issues facing consumers of RSS is information overload. I am quite confident that people would pay a premium for any attempt to help rank what can be hundreds of items per day that a user needs to read. By having an APML file, over time Google Reader can match postings to a user's ranked interests. So rather than presenting the content in reverse chronological order (most recent to oldest), it can instead organise content by relevancy (items of most interest to least).
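
A sketch of what sorting by relevancy could look like, assuming each unread item carries a list of key words and the user's profile is a simple mapping of concept to score. Averaging the matching scores is just one possible ranking method; the item structure and function names are hypothetical.

```python
from typing import Dict, List


def item_relevance(keywords: List[str], scores: Dict[str, float]) -> float:
    """Average profile score of an item's key words (unknown key words count as 0)."""
    if not keywords:
        return 0.0
    return sum(scores.get(kw.lower(), 0.0) for kw in keywords) / len(keywords)


def sort_by_personal_relevance(items: List[dict], scores: Dict[str, float]) -> List[dict]:
    """Order items from most to least relevant; ties keep their original order."""
    return sorted(
        items,
        key=lambda item: item_relevance(item["keywords"], scores),
        reverse=True,
    )


profile = {"apml": 1.0, "rss": 0.5}  # illustrative concept scores
unread = [
    {"title": "Celebrity gossip", "keywords": ["celebrity"]},
    {"title": "RSS tips", "keywords": ["rss"]},
    {"title": "APML adoption", "keywords": ["apml", "rss"]},
]
ranked_titles = [item["title"] for item in sort_by_personal_relevance(unread, profile)]
```

Vendors could swap in their own `item_relevance` without changing anything else, which is where the scope for competing ranking methods comes from.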

This won't reduce the amount of RSS consumption by a user, but it will enable them to know how to allocate their attention to content. There are a lot of innovative ways you can rank the content, down to the way you extract key words and rank concepts, so there is scope for competing vendors to have their own methods. However, the point is that a feature to 'Sort by Personal Relevance' would be highly sought after, and I am sure quite a few people will be willing to pay the price for this godsend.

I know Google seems to think contextual ads are everything, but maybe the Google Reader team can break from the mould and generate a different revenue stream through a value add feature like that. Google should apply its contextual advertising technology to determine key words for filtering, not advertising. It can use this pre-existing technology to generate a different revenue stream.

2. Enhancing its AdSense programme

Targeted advertising is still bloody annoying

One of the great benefits of APML is that it creates an open database about a user. Contextual advertising, in my opinion, is actually a pretty sucky technology and its success to date is only because all the other types of targeted advertising models are flawed. As I explain above, the technology should instead be used to better analyse what content a user consumes, through keyword analysis. Over time, a ranking of these concepts can occur - as well as being shared from other web services that are doing the same thing.

An APML file that ranks concepts is exactly what Google needs to enhance its AdWords technology. Don't use it to analyse a post to show ads; use it to analyse a post to rank concepts. Then, in aggregate, the contextual advertising will work because it can be based off this APML file with great precision. And even better, a user can tweak it - which will be the equivalent of tweaking what advertising a user wants to get. The transparency of a user being able to see what 'concept ranking' you generate for them is powerful, because a user is likely to monitor it to be accurate.

APML is contextual advertising's biggest friend, because it profiles a user in a sensible way that can be shared across applications and monitored by the user. Allowing a user to tweak their APML file for the motivation of more targeted content aligns their self-interest with ensuring the targeted ads thrown at them, based on those ranked concepts, are in fact relevant.

3. Privacy credibility
Privacy is the inflation of the attention economy. You can't proceed to innovate with targeted advertising technology, whilst ignoring privacy. Google has clearly realised this the hard way by being labeled one of the worst privacy offenders in the world. By adopting APML, Google will go a long way to gain credibility in privacy rights. It will be creating open transparency with the information it collects to profile users, and it will allow a user to control that profiling of themselves.

APML is a very clever approach to dealing with privacy. It's not the only approach, but it is one of the most promising. Even if Google never uses an APML file as I describe above, the pure brand-enhancing value of giving some control to its users over their rightful attention data is something that alone would benefit the Google Reader product (and Google's reputation itself) if they were to adopt it.

Privacy. Stop looking.

Conclusion
Hey Google - can you hear me? Let's hope so, because you might be the market leader now, but so was Bloglines once upon a time.