Florescu: Re-evaluating the Big Picture

Ken North just posted this email to XML-DEV drawing attention to a presentation by Daniela Florescu titled Declarative XML Processing with XQuery — Re-evaluating the Big Picture (Warning: PDF). It makes for interesting reading.
In the presentation, Florescu argues that XML is in a growth crisis and that more architectural work is needed to tie together the components of the XML landscape, ranging from XQuery and XSLT through to RDF and OWL. Florescu believes that XML is about more than syntax and will in fact become the key model for information, not just bits on a wire. In short, Florescu believes that XML has yet to achieve its full potential, and that further work is needed to get it there.
The presentation is worth reading in its entirety. Most of it focuses on XQuery, in particular the fact that it's not really a query language: it's a programming language, and folk are already using it as one. But there's much more to it. Semantic web folk will find much that will have them nodding in agreement.
Florescu suggests a number of concrete areas that require work. Amongst these are:

  • Make XML a graph, not a tree, by making links a first-class part of the model
  • Integrate the XML data model with RDF
  • Extend the programming capabilities of XQuery, e.g. to include assertions, error handling, metadata extraction functions and continuous queries. This last area is interesting, as it would allow an XQuery to run continuously, acting on a stream of XML documents as they arrive
  • Integrate XQuery with OWL and RDF, e.g. to allow searching an XML document by the semantic classification of nodes rather than by their names
  • Make browsers XQuery-aware, and develop a simple HTTP protocol for invoking XQuery on a remote repository. (I've been working with the SPARQL protocol recently and it's occurred to me several times that an equivalent for XQuery is an obvious area for further work)
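
To make that last point a little more concrete, here's a rough sketch of what invoking XQuery over HTTP might look like if it simply borrowed the SPARQL protocol's habit of passing the query text in a "query" URL parameter. This is purely my own illustration, not something from the presentation: the endpoint URL, the build_xquery_request helper and the reuse of the "query" parameter name are all assumptions, and no such XQuery protocol is being described as existing.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_xquery_request(endpoint: str, xquery: str) -> Request:
    """Build a GET request that carries an XQuery expression in the URL.

    Hypothetical protocol: the "query" parameter name is borrowed from the
    SPARQL protocol and is an assumption, not part of any XQuery standard.
    """
    # urlencode percent-escapes the expression so it is safe in a query string.
    url = endpoint + "?" + urlencode({"query": xquery})
    # Ask for XML back; a real protocol would need to define its result formats.
    return Request(url, headers={"Accept": "application/xml"}, method="GET")


# Hypothetical repository endpoint; the request is only built, never sent.
request = build_xquery_request(
    "http://example.org/xquery",
    'for $b in doc("books.xml")//book return $b/title',
)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it, given a server that understood the convention. The appeal of the approach is that any HTTP client, including a browser, could invoke a query with nothing more than a URL.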

All in all, I find this to be a very thought-provoking presentation; there are a lot of interesting ideas in there. For the Semantic Web crowd many of these will be old news: being able to query and manipulate data based on semantics is the core of RDF, and linking as a first-class model element is something we rely on constantly. But there are also some new angles to consider. For example, there's a lot of work happening to tie programming languages in with XML, and XML vocabularies such as XQuery are becoming more like scripting languages: what's the equivalent in semantic web circles? Could an ontology-aware version of XQuery provide a useful data manipulation environment?
I expect the XML-DEV thread to grow pretty quickly. I'll be interested to see whether this gets picked up and discussed by other communities too.