Simple List Extensions Critique

Some thoughts on the Simple List Extensions Specification. I’ve been waiting a few days as I wanted to get a feel for what problems are being addressed by the new module; it’s not clear from the specification itself. Dare Obasanjo has summarised the issues, so I now feel better armed to comment.
My first bits of feedback on the specification are mainly editorial: include some rationale, include some examples, and include a contact address for the author/editor so feedback can be directly contributed. There’s at least one typo: where do I send comments?
The rest of the comments come from two perspectives: as an XML developer and as an RSS 1.0/RDF developer. I’ll concentrate on the XML side as others have already made most of the RDF related points.

The XML Perspective

First up, I think the namespace URI should resolve to the specification.

The Processing Model

The intent of the cf:treatAs element is unclear. The specification merely says that a consuming application should “treat the content of the feed as if it represents a complete, ordered list of content from the server”. One presumes that the intent is that this element is a switch: if it’s present, then the application should be prepared to apply some specific processing rules. But it’s unclear.
Dare Obasanjo explains that:
To solve the first problem Microsoft has provided the cf:treatAs element with the value “list” to be used as a signal to aggregators that whenever the feed is updated that the previous contents should be dumped or archived and replaced by the new contents of the list.

But the behaviour he describes — dumping, archiving, or replacing the contents of the list — is not in the specification. That’s a big hole in my view.
The background section in the specification also alludes to different processing models for certain types of feed, but again these are not properly described:

…a feed that contains the entire collection of items on the server should be processed differently from a feed that contains only the most recently added or updated items.

My suggestion is to remove the need for the cf:treatAs element to have content: make it an empty element. Unless there are going to be future revisions of the specification that allow for other types of treatment, in which case this ought to be described.
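To make the “switch” reading concrete, here is a rough Python sketch of how an aggregator might act on it. None of this comes from the specification: the namespace URI bound to the cf prefix is my assumption and should be checked against the document, and the update_store rule is a hypothetical reading of the replace-on-update behaviour Dare describes.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the cf: prefix; verify against the specification.
CF_NS = "http://www.microsoft.com/schemas/rss/core/2005"


def is_list_feed(feed_xml: str) -> bool:
    """Return True if the channel carries a cf:treatAs element whose value is 'list'."""
    root = ET.fromstring(feed_xml)
    treat_as = root.find(f"channel/{{{CF_NS}}}treatAs")
    return treat_as is not None and (treat_as.text or "").strip() == "list"


def update_store(stored: list, incoming: list, list_feed: bool) -> list:
    """Hypothetical update rule: replace everything for list feeds, otherwise merge."""
    if list_feed:
        # The feed claims to be the complete list, so drop what we had.
        return list(incoming)
    # Ordinary feed: keep old items and put unseen ones in front.
    new_items = [item for item in incoming if item not in stored]
    return new_items + stored
```

If the element were empty, as suggested above, is_list_feed would only need the presence check and not the comparison with "list".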

On Redundancy

The rest of the specification has several aims:

Identify elements that are sortable
Identify elements that are groupable
Declare the types of elements
Map element names to human-readable labels

Back in the day this is the kind of data that would go in a schema and not an instance document. This avoids redundancy in terms of repeated definition of the same data, and removes verbosity from feeds.
In other words, it avoids millions of RSS feeds all having declarations that dc:created is a date and can be sorted. Weren’t we all recently worried about the internet grinding to its knees under the volume of RSS traffic? Or did that problem get solved already?
RSS aggregators really ought to rely on a schema, processed at runtime or baked into the application, to guide these decisions.
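As a sketch of the “baked into the application” option, an aggregator could carry a small table keyed by qualified element name. The FieldInfo structure, the table contents and the describe helper are all hypothetical; only the two Dublin Core namespace URIs are real.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldInfo:
    datatype: str            # "date", "text" or "number"
    label: str               # human-readable label for the UI
    sortable: bool = True
    groupable: bool = False


# Hypothetical built-in table; a real aggregator might load this from a schema instead.
KNOWN_FIELDS = {
    "{http://purl.org/dc/terms/}created": FieldInfo("date", "Created"),
    "{http://purl.org/dc/elements/1.1/}creator": FieldInfo("text", "Author", groupable=True),
}


def describe(qname: str) -> Optional[FieldInfo]:
    """Look up what the application already knows about an element, if anything."""
    return KNOWN_FIELDS.get(qname)
```

With something like this in place, the feed itself need not repeat that a creation date is a date on every fetch.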

On Sorting and Typing

The specification currently allows feed authors to associate three different data types with elements: date, text, number.
XML Schema provides a way for instance elements to be labelled with types: the xsi:type attribute. I’m not immersed enough in XML Schema to seriously suggest it as an alternative — there are issues, apparently — but it did spring to mind immediately when I read the specification.
The full breadth of XML Schema Datatypes is surely overkill for RSS feeds, but one would assume that tying descriptions to a formal type system would be a good thing, making it easier to define sort orders, legal lexical values, etc.
I think there are also some internationalisation issues lurking here.
The specification also leaves it unclear which of the sorting options specified in the RSS feed had already been applied when the feed was produced. One must assume it’s one of the options, otherwise aggregators may have to add “original order” as an option as well as the feed-specified sort options.
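To show why the declared type matters for sort order, here is a small Python sketch. The three type names are the ones the specification allows; the lexical forms I’ve picked for each (RFC 822 style dates, plain decimal numbers) are my assumptions, which is rather the point: the specification would need to pin them down.

```python
from email.utils import parsedate_to_datetime


def sort_key(value: str, datatype: str):
    """Convert a lexical value into something comparable, based on its declared type."""
    if datatype == "number":
        # Assumes plain decimal notation; "1,5" or "1.000,00" would fail here.
        return float(value)
    if datatype == "date":
        # Assumes RFC 822 style dates, as used by RSS 2.0's pubDate.
        return parsedate_to_datetime(value)
    # Text: a simple case-insensitive comparison, not locale-aware collation.
    return value.casefold()


def sort_items(items, field, datatype):
    """Return items sorted on one field; sorted() is stable, so ties keep feed order."""
    return sorted(items, key=lambda item: sort_key(item[field], datatype))
```

Sorting the values "10" and "9" as numbers puts 9 first, while sorting them as text puts "10" first. The comments also mark where the internationalisation problems would bite: number formats and text collation both vary by locale.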

Local Processing, or Is This Really Necessary?

The following may result from reading too much Walter Perry on XML-DEV (time well spent though!). Perry has always advocated the position (and I’m paraphrasing many a dense posting here, so apologies if I’m misrepresenting his views) that ultimately it’s the consumer who decides how data is processed, not the producer. In short, the consumer may have very different ideas about how the data will be used, and all the producer can provide are suggestions, or cues, on how the data could be processed.
In the context of the Simple List Extensions specification one has to wonder why the “sort by any element” and “completely replace all items when reloading this feed” features aren’t already provided for in RSS aggregators. There’s no need for a simple list specification at all.
The two issues that Dare describes can be implemented by adding additional options on the client, without requiring changes to feed content.
To express this differently, once an aggregator developer has extended her application to include sorting, grouping and other options, are these going to be limited solely to those feeds with a cf:treatAs element, or supported, albeit in limited form, across all feeds?
What’s needed, IMO, are better controls over how aggregators manage and process lists on my behalf.
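A minimal sketch of what such a control might look like: a per-subscription setting, held by the aggregator, that treats anything in the feed as a hint and nothing more. The Subscription class and effective_replace function are hypothetical names of my own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subscription:
    url: str
    # None means the user has expressed no preference for this feed.
    replace_on_reload: Optional[bool] = None
    sort_by: Optional[str] = None


def effective_replace(sub: Subscription, feed_hint: bool) -> bool:
    """The consumer's choice wins; the producer's hint only fills the gap."""
    if sub.replace_on_reload is not None:
        return sub.replace_on_reload
    return feed_hint
```

The same settings would work for any feed, whether or not it carries the extension.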

The RDF/RSS 1.0 Perspective

The “Background” section opens with this statement:
A feed is a collection of items
This is true. But, to be precise, an RSS 1.0 feed is a specific type of collection: a list. So there’s some misunderstanding here already. Sure, many RSS 1.0 consumers are almost certainly ignoring the rdf:Seq in the syntax, but it’s still there and there’s a defined meaning. Even if you’re not using an RDF parser. An RSS 1.0 feed is an ordered list of items. The ordering criteria are unspecified, but it’s a list nevertheless.
Just pretend that there’s an “RDF List Extensions” module with two elements: rdf:Seq and rdf:li and go at it.
The other points to make relate to how the RDF syntax and model makes it easier to add the kind of annotation that the Simple List Extension specification is trying to make:

Describe a list: use rdf:Seq
Declare the types of elements: use rdf:datatype
Map element names to human-readable labels: use rdfs:label

Only the first is intrusive in the syntax, the last two being attributes. Once you’ve assigned a type, there’s really no need to declare whether a value is sortable or groupable: everything should be ripe for sorting and grouping.
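For those not using an RDF parser, the list in an RSS 1.0 feed is still easy to get at. A rough Python sketch, using only a plain XML parser; item_order is my own name for the helper, and I accept both the qualified and unqualified forms of the resource attribute since I believe both turn up in feeds.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"


def item_order(feed_xml: str) -> list:
    """Return the item URIs in the order given by the channel's rdf:Seq."""
    root = ET.fromstring(feed_xml)
    seq = root.find(f"{{{RSS}}}channel/{{{RSS}}}items/{{{RDF}}}Seq")
    if seq is None:
        return []
    order = []
    for li in seq.findall(f"{{{RDF}}}li"):
        # Accept rdf:resource as well as a bare resource attribute.
        uri = li.get(f"{{{RDF}}}resource") or li.get("resource")
        if uri:
            order.append(uri)
    return order
```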

Summary

In short, I think that while the Simple List Extensions module is generally trying to solve some real problems, I don’t believe that this necessarily requires a new RSS module, just new aggregator functionality, perhaps supplemented by schema annotation.
However, innovation in the RSS world seems inextricably linked to the creation of new modules, so perhaps this is inevitable.
The specification itself needs work: despite being very simple, it has several grey areas that need clarifying.
