Developers often struggle with SPARQL queries. There aren’t always enough good examples to play with when learning the language or when trying to get to grips with a new dataset. Data publishers often overlook the need to publish examples or, if they do, rarely include much descriptive documentation.
I’ve also been involved with projects that make heavy use of SPARQL queries. These are often externalised into separate files to allow them to be easily tuned or tweaked without having to change code. Having documentation on what a query does and how it should be used is useful. I’ve seen projects that have hundreds of different queries.
It occurred to me that while we have plenty of tools for documenting code, we don’t have a tool for documenting SPARQL queries. If generating and publishing documentation were a little more frictionless, then perhaps people would do it more often. Services like SPARQLbin are useful, but address a slightly different use case.
Today I’ve hacked up the first version of a tool that I’m calling sparql-doc. Its primary usage is likely to be helping to publish good SPARQL examples, but might also make a good addition to existing code/project documentation tools.
The code is up on GitHub for you to try out. It’s still very rough-and-ready but already produces some useful output. You can see a short example here.
It’s very simple and adopts the same approach as tools like rdoc and Javadoc: it just specifies some conventions for writing structured comments. Currently it supports adding a title, description, list of authors, tags, and related links to a query. Because the syntax is simple, I’m hoping that other SPARQL tools and IDEs will support it.
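To give a feel for the structured-comment approach, here is a minimal Python sketch that pulls a title, description, authors, tags and links out of the leading comment block of a query. The `@title`, `@author`, `@tag` and `@see` annotation names, the sample query and the `parse_doc_comments` helper are all assumptions made for illustration; they are not taken from the sparql-doc code, so check the project itself for the real conventions.

```python
import re

# Hypothetical annotated query; the annotation names are assumed for illustration.
QUERY = """\
# List the names of all people in the dataset.
#
# @title List people
# @author Jane Example
# @tag people
# @tag list
# @see http://example.org/docs/people
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE { ?person foaf:name ?name }
"""

ANNOTATION = re.compile(r"^@(\w+)\s+(.*)$")


def parse_doc_comments(query):
    """Collect documentation from the comment block at the top of a query."""
    doc = {"title": None, "description": [], "author": [], "tag": [], "see": []}
    for line in query.splitlines():
        if not line.startswith("#"):
            break  # the documentation block ends at the first non-comment line
        text = line[1:].strip()
        match = ANNOTATION.match(text)
        if match:
            key, value = match.groups()
            if key == "title":
                doc["title"] = value
            elif key in ("author", "tag", "see"):
                doc[key].append(value)  # these annotations may repeat
            # unknown annotations are silently ignored in this sketch
        elif text:
            doc["description"].append(text)  # free text is the description
    doc["description"] = " ".join(doc["description"])
    return doc


doc = parse_doc_comments(QUERY)
print(doc["title"], doc["tag"])
```

Keeping the metadata in ordinary `#` comments means the query stays valid SPARQL, so any endpoint or editor can still run it unchanged.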
I plan to improve the documentation output to provide more ways to navigate the queries, e.g. by tag, query type, prefix, etc.
Let me know what you think!
Things to think about – basic hyperlinks are generally useful, as can be bold/underline/italics/lists, etc.
Explicitly whitelisting HTML tags, automagically matching hyperlinks, or looking at something like Markdown may all be worth investigating.
Of course, you then edge closer to wanting the Dublin Core style attributes (title, description) expressed in RDFa on the HTML output, so you can SPARQL your SPARQL.
Argh, recursion!
The description element already supports Markdown, so I think all of those features are already supported. I may need to do a little more to improve the output options though.
Markup for individual attributes is currently just plain text. Easy to change that though.
Thanks for the feedback!
Yeah, I saw that once I got into the code. It’d be great if there was an example of it, as it was non-obvious.
I’ve pushed out a change that adds dc:title, dc:description and dc:subject as embedded metadata to the page.
This can be improved later, e.g. to include license, URIs for authors, and perhaps reference to SPARQL endpoints.
Oh, and… consider automatically making prefix URIs into hyperlinks.
http://purl.org/vocab/bio/0.1/ for example does the negotiation to quickly push a user to the human docs. A one click, target=_blank hyperlink would rock.
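As a rough sketch of that suggestion, the Python below scans a query for `PREFIX` declarations and turns each one into a `target="_blank"` hyperlink. The `prefix_links` helper and the regular expression are hypothetical and simplified (the pattern does not cover every prefix name the SPARQL grammar allows); this is not how sparql-doc itself does it.

```python
import html
import re

# Simplified pattern for "PREFIX name: <uri>"; the name part is optional
# so that the default prefix (":") is also picked up.
PREFIX_DECL = re.compile(
    r"^\s*PREFIX\s+([A-Za-z][\w-]*)?:\s*<([^>]+)>",
    re.IGNORECASE | re.MULTILINE,
)


def prefix_links(query):
    """Return one HTML anchor per PREFIX declaration found in the query."""
    links = []
    for name, uri in PREFIX_DECL.findall(query):
        safe_uri = html.escape(uri, quote=True)  # safe inside an attribute
        label = html.escape(f"{name}:")
        links.append(f'<a href="{safe_uri}" target="_blank">{label}</a>')
    return links


sample = """PREFIX bio: <http://purl.org/vocab/bio/0.1/>
SELECT ?event WHERE { ?person bio:event ?event }
"""
links = prefix_links(sample)
print(links[0])
```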