I came across RDF vs XML Illustrated via both Dave Beckett’s Journal and the RDF IG IRC Scratchpad today. And it’s brought forward a question I’ve been meaning to ask for a couple of weeks now.
Take a look at the bottom right of the diagram (e.g. the JPEG version); it says:
Some projects are better suited for XML data; others scream out for RDF. RDF will not replace XML, each has its advantages in certain scenarios.
My question is simple: what makes a project scream out for RDF? What properties of an application or its data make it better suited to an RDF rather than an XML vocabulary?
I honestly don’t have any feeling for the right answers.
I’m working with RDF tools now, but that’s because FOAF is an RDF vocabulary. I’m just using the right tools for the job. If I were given the task of designing a new system, I don’t have any feel for why I might choose RDF over XML. I haven’t had that “aha” moment yet.
We might loosely classify markup vocabularies into three types:
- Pure XML vocabularies
- RDF Friendly vocabularies, e.g. RSS 1.0
- Pure RDF vocabularies
And we could then generalise the question to: which type of vocabulary is best suited to which applications? Are RDF Friendly vocabularies just a transition step?
And if RDF will never supplant XML, then surely we’re going to have to invest a lot of time in RDF Data Mining?
I’ve been wondering whether the answer might be in Shelley’s book somewhere, but haven’t had time to get beyond the opening chapter.
Maybe I’m just being dumb, I dunno. But I’d love to know what others think.
A few things that might indicate RDF-ishness of an application area.
i) if it overlaps with lots of other similar-but-different applications that have related but distinct needs for XML-based interchange.
ii) if there are clear benefits to be had from data merging across diverse sources.
iii) if doing a big, crisp and final ‘up front design’ before deploying isn’t easy (when is it ever!).
…such characteristics imho indicate that RDF’s scruffy flexibility could suit, especially if accompanied by XML doc types that capture some particular descriptive task. RSS does this nicely; it isn’t a compromise that we have XML and RDF views of RSS. RSS is all the more useful because it lends itself to both a document-typing view (you know what to expect in an RSS doc) and a data-merging view (you know how to merge data from RSS docs that use unknown namespaces).
This is just a quick scribble, hope it makes some sense…
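To make the data-merging point a bit more concrete, here is a minimal sketch in Python. It assumes statements are modelled as plain (subject, predicate, object) tuples and uses no RDF library; the example.org URIs and the two sources are made up for illustration, and real RDF merging also has to deal with things like blank nodes, which this ignores.

```python
# Namespaces are written out as plain strings; no RDF toolkit is assumed.
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical subject URI, used only for this illustration.
ALICE = "http://example.org/people#alice"

# Two independent sources, each a set of (subject, predicate, object) triples.
source_a = {
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "mbox", "mailto:alice@example.org"),
}
source_b = {
    (ALICE, FOAF + "name", "Alice"),  # same statement as in source_a
    ("http://example.org/doc/1", DC + "creator", ALICE),
}


def merge(*graphs):
    """Merge any number of triple sets; duplicate statements collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return the sorted (predicate, object) pairs stated about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge(source_a, source_b)
```

The merge needs no knowledge of either vocabulary: a consumer that has never seen the Dublin Core namespace can still combine the two sets, and `describe(merged, ALICE)` simply returns whatever has been said about that URI.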
Uses of RDF
I’ve been meaning to toss some thoughts at Leigh for a few days now. He asks “When would I use RDF in preference to a non-RDF XML vocabulary?” As usual, there can’t be a hard-and-fast answer to that. I do see a few glimmerings o…
Pat Hayes actually grabbed a quote from the book and posted it to the RDF WG core mailing list about RDF’s usefulness:
“RDF is a technique to record statements about resources so that machines can easily pick up the statements. Not only that, but RDF is based on a domain-neutral model that allows one set of statements to be merged with another set of statements, even though the information contained in each set of statements may differ dramatically.”
XML gives us the format to record domain-neutral data, but RDF gives us the methodology to record complete domain-neutral statements — data in action as it were.
Ontologies are then domain-specific views built on top of the domain-neutral model that is RDF.
It’s all layers. Taking a cross-section:
Knowledge can be split into domain-specific views (ontology) based on complete statements (RDF) consisting of separate pieces of syntactically valid data (XML).
Sometimes you feel like RDF, sometimes you don’t
Semaview came out with this illustrated RDF vs XML graphic (http://www.semaview.com/d/RDFvsXML_6000x1024.jpg) which shows the ‘differences’ between RDF and XML. At least one assumes so. This might be confusing for some people th…
On top of the previous comments, my quick run on this question would be: use RDF when the application domain model is more like a graph than a tree (XML) or tables (RDBMS).
When you’re talking about the web environment, with data from loads of different sources and structures hard to pin down this translates into *most of the time*. ‘Course this gets watered down by practical considerations…
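One rough way to test the "graph rather than tree" idea is to write the domain's relationships down as directed edges and check whether they form a single rooted tree, which is the shape one XML document nests naturally. The sketch below is only an illustration under that assumption; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def is_tree_shaped(edges):
    """True if the (parent, child) edges form one rooted tree."""
    parents = defaultdict(set)
    children = defaultdict(set)
    nodes = set()
    for parent, child in edges:
        parents[child].add(parent)
        children[parent].add(child)
        nodes.update((parent, child))
    if not nodes:
        return True
    roots = [n for n in nodes if not parents[n]]
    # A tree has exactly one root and no node with two parents.
    if len(roots) != 1 or any(len(parents[n]) > 1 for n in nodes):
        return False
    # Every node must be reachable from the root.
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children[node])
    return seen == nodes


# A feed nests cleanly: one channel containing items.
feed = [("channel", "item1"), ("channel", "item2"), ("item1", "title1")]

# "Knows" relationships loop back on themselves.
social = [("alice", "bob"), ("bob", "carol"), ("carol", "alice")]

# One person referenced from two separate documents.
shared = [("doc1", "alice"), ("doc2", "alice")]
```

Here `is_tree_shaped(feed)` is true, while `social` and `shared` both fail the check, which is the kind of data where a set of statements tends to be easier to work with than nesting.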