Goodbye XML-Deviant

I see Micah’s latest XML-Deviant is up on this week, and its also to be the last in the series. It’s a shame to see it go as I’ve enjoyed reading the column over the last few years. I also thoroughly enjoyed contributing to the column during my own period of XML-Deviancy. But all things come to an end; I’m looking forward to seeing what replaces the column in future.
Tip of the hat to the other XML-Deviants: Edd, Kendall and Micah for all of their efforts along the way; especially Edd for originally conceiving of the column.

Simple List Extensions Critique

Some thoughts on the Simple List Extensions Specification. I’ve been waiting a few days as I wanted to get a feel for what problems are being addressed by the new module; it’s not clear from the specification itself. Dare Obasanjo has summarised the issues, so I now feel better armed to comment.
My first bits of feedback on the specification are mainly editorial: include some rationale, include some examples, and include a contact address for the author/editor so feedback can be directly contributed. There’s at least one typo: where do I send comments?
The rest of the comments come from two perspectives: as an XML developer and as an RSS 1.0/RDF developer. I’ll concentrate on the XML side as others have already made most of the RDF related points.

Continue reading “Simple List Extensions Critique”

XTech Day Three

Belatedly (I only got back from Amsterdam last Monday), here are some notes from XTech Day 3.
On the Friday morning I initially attended two talks about RDF frameworks, firstly Dave Beckett’s Bootstrapping RDF applications with Redland and then David Wood’s introduction to
Kowari: A Platform for Semantic Web Storage and Analysis. I’ve not really used either of these toolkits yet, but at work we’re looking at trying out Kowari as one of the candidate triple stores for holding our massive dataset. John Barstow‘s work on the port of Redland to windows makes it more likely that I’ll be trying out Dave’s toolkit for some personal hacking projects too.

Continue reading “XTech Day Three”

Where Should XML Go?

Liam Quin has been thinking about XML 2.0 and has posted an article to Advogato titled “Where Should XML Go?“.
Quin is obviously trying to reach a wider community than just the hardcode XML users, noting in his diary that: Where would you go (or post) to ask people why they’re not using XML? There are lots of good reasons not to use XML, and lots of good reasons to use it, so I’m particularly interested in people who would like to go with XML but who feel they can’t.
Advogato seems like a good starting place to me. Of course there’s an XML-DEV thread starting on the topic already, so the usual suspects will be weighing in very shortly.
I’m not sure what my most requested improvement to the core specification(s) would be. When asked about this before I’ve often responded that I’d be happy to see the work on packaging resume. Especially as there’s work continuing in the area that could be standardised, such as the Open Office format and Rick Jelliffe’s DZIP2.
I loved Jelliffe’s From Wiki to XML via SGML article demonstrating how to use SGML SHORTREFs to parse Wiki markup as SGML interesting, and thats made me wonder whether that might be an SGML feature worth unearthing. Not likely to be a popular suggestion though! And of course one can simply use an SGML parser when one needs the extra power.
But the syntax could certainly be friendlier, and I wonder whether that might address some users dislike of XML, the format; the can still use XML tools to process their config files, Wiki markup, CSV documents, etc.

Tailored Feeds

Tim Bray posted some notes on Private Syndication, referring to this ZDNet piece by David Berlind.
I’m inclined to agree that this kind of syndication is as yet a largely untapped application agree and that it’s one with a great deal of possibilities. I’d love to have a feed of my bank balance, credit card statements, etc. Might help me curb my spending 🙂
The kind of private syndication Bray and Berlind are talking about is the opposite end of the spectrum from the public feeds that most of us are consuming. It’s important not to ignore the space in-between though: between per-user and mass-audience feeds there’s a lot of other possibilities. E.g feeds tailored to a particular community or market. There’s an uneasy relationship between advertising/marketing here.
For example a publisher may want to have one feed for content subscribers, Amazon may want separate feeds for regular purchasers, etc. The content of these feeds needn’t be entirely marketing oriented though, there’s scope for “premium” content feeds, e.g. pushing out entire articles or other relevant updates. In my own application area, it would be useful for publishers to be able to produce RSS feeds tailored to subscribers/non-subscribers. A subscriber feed may have the entire content, or direct links to it. A non-subscriber feed may have limited content and links to purchasing options instead.
Whatever these “tailored” feeds contain, and whether they’re tailored for a restricted audience or individual user, the key to their success is going to be authentication support in aggregators. SSL, HTTP Auth, etc are all pre-requisites. And not only that: web based aggregators such as Bloglines (my own favourite) will have to ensure that these feeds are not shared with the rest of the user base. I’ve heard several stories of private RSS feeds being accidentally shared with the entire Bloglines community; as I understand it, they automatically add any feed to their global directory.
In fact a move to tailored feeds may take away some of the supposed value of RSS aggregators such as Bloglines. There won’t be much they can share between users. There’s been a lot written about the network overheads of RSS, this can only get worse with more tailored feeds.
Even without tailored feeds, support for authentication and non-shareable feeds would be a useful feature. At the moment I publish several private feeds internally to our company which are being rarely used as many of my colleagues are using Bloglines, or they are mobile and their desktop aggregator doesn’t support HTTP Authorisation.
Bray closes his posting my stating his belief that Atom is best suited to producing “content-critical, all-business” feeds. It’s a bold statement and I’d like to hear more about this: what exactly makes Atom better suited to carrying personalised/tailored content than any of the other RSS flavour? Kellan Elliott-McCrea has raised one issue already.
In his article, Berlind suggests that a delivery company might produce an RSS feed for every package they ship. I wonder whether, instead of requiring each company to produce fine grained feeds for all of their actions, whether it might be easier for credit card companies to act as the point of co-ordination. Actions relating to a purchase made on a card, e.g. dispatched, delivered, warranty expired, could be sent as a notification to the card company, who could then produce a secure tailored feed which aggregates all the relevant activities.
There’s already some degree of communication between the companies (the actual transaction) so really this would only require a standard interface to exchange data suitable for packaging into a feed. I can definitely see a role for the Atom API there, but I’m not clear on the unique benefits of the Atom format.

XML Hacks

I see by the fact that my complementary copy arrived today that XML Hacks has hit the stores. This makes me incredibly pleased as my two contributed hacks mean that this is the most I’ve ever had in print, and that’s like, proper writing, not this new-fangled web malarkey.
My two hacks are #64 (“Identify Yourself with FOAF”) and #93 (“Use Cocoon to Create a Well-Formed View of a Web Page, Then Scrape It For Data”). Both are RDF flavoured. The first is basically an edited version of my article, “Introduction to FOAF“, while the second is an original piece that provides a lightning introduction to Cocoon then shows how to create a simple web service that will scrape RDF metadata from a web page using a combination of HTML Tidy, XSLT and some rummaging around in the head element.
Kudos to Michael Fitzgerald for pulling together a book that contains such a wide range of useful hacks, and having the patience to do it whilst working with a number of very, very busy people!

Programmers Are Interesting

Another great article from Sean McGrath: The mysteries of flexible software. Bang on the money.
I don’t know how many times I’ve encounted software (and yes, some of my own devising) that has all sorts of wonderful flexibility but in all the wrong places. Time spent factoring applications into layer cakes, introducing endless layers of abstraction may have some benefits, but exactly how often do you go through an application and rip out the entire persistence layer? And when you do, what’s the biggest hurdle: changing the code, or the data migration and testing involved to guarantee that you’ve not broken any of the data integrity? Exactly how often do you swap in and out XML parsers and XSLT engines?
I’ve been a keen advocate of design patterns for some time, but it’s easy to get carried away: achieving a particular design pattern (or set of patterns) becomes a requirement in itself and that in all likelihood isn’t going to affect the success of a product. The “Just Do It” aspect to XP is one obvious reaction to that experience. Renewed interest in more flexible, easy to author languages like Python is perhaps another.
Abstraction ought to be a defense mechanism. If a particular area of code or functionality is continually under change, then introduce some abstraction to help manage that change. Trust your ability to refactor. Don’t over architect too early.