Today’s lunchtime special involves fun with the Jena 2 rule engine.
I’ve been wondering for a while whether it’d be possible to extract richer metadata from tagging conventions. Of course it’s possible; I’m just playing with different ways to achieve it. Quick XSLT conversions are my normal method of choice, but I wanted to have a play with RDF rules, and this seemed like an opportune time.
Actually, what triggered this was something that Damian Steer said at XTech (Damian, apologies if I’m misquoting you or misattributing the idea): “Rules are like XSLT for RDF”. It’s a loose analogy of course, as rule engines don’t typically have the power of XSLT, which is a complete language, although Jena is extensible.
So I decided to see how far I could go. This RSS 1.0 feed is being produced by del.icio.us. I’ve bookmarked some friends’ blogs, tagging them with “Me/Friends”. I’ve also entered the name of the author in the description field.
Can I turn this into an RDF document that does the following?
- Names me
- Names each of my friends
- Asserts that I know each of them
- Asserts that they are the makers of their blogs
- Associates them with their weblogs using foaf:weblog
It turns out I can…
I don’t have time to write this up fully, so for now, here are the rules:
@prefix rss: <http://purl.org/rss/1.0/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix dc: <http://purl.org/dc/elements/1.1/>.
[rssItemsAreFoafDocuments:
(?C rdf:type rss:item)
->
(?C rdf:type foaf:Document) ]
[preferDCTitles:
(?C rdf:type rss:item), (?C rss:title ?title)
->
(?C dc:title ?title)]
[createPersonFromDescription:
(?C rss:description ?author),
(?C dc:creator ?myNick),
makeTemp(?person)
->
(?person rdf:type foaf:Person),
(?C foaf:maker ?person),
(?person foaf:weblog ?C),
(?person foaf:name ?author)]
[thisChannelIsMe:
(?C rdf:type rss:channel),
makeTemp(?person)
->
(?person rdf:type foaf:Person),
(?person foaf:maker ?C),
(?person foaf:nick 'ldodds')]
[iKnowMyFriends:
(?C rdf:type rss:item),
(?C foaf:maker ?friend),
(?C dc:creator ?creator),
(?me foaf:nick ?creator)
->
(?me foaf:knows ?friend)]
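If the rule syntax is unfamiliar, here is a toy sketch in Python of what a forward rule like iKnowMyFriends does: match every pattern in the body against the triples, carry the variable bindings along, then fill in the head. This is not how Jena implements it, just the general idea. The `match` and `apply_rule` helpers and the `_:me` and `_:matt` node labels are made up for illustration, and the triples are a hand-written stand-in for the feed after the earlier rules have fired.

```python
# Toy forward-chaining over (subject, predicate, object) tuples.
# Terms starting with "?" are variables. An illustration only, not Jena's engine.

def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def apply_rule(body, head, triples):
    """Fire one rule once over `triples` and return the triples it produces."""
    solutions = [{}]
    for pattern in body:
        extended = []
        for bindings in solutions:
            for triple in triples:
                result = match(pattern, triple, bindings)
                if result is not None:
                    extended.append(result)
        solutions = extended
    # Substitute the bindings into each head pattern.
    return {tuple(b.get(term, term) for term in h)
            for b in solutions for h in head}

# Hand-written sample data (hypothetical node labels).
ITEM = "http://www.hackdiary.com/"
triples = {
    (ITEM, "rdf:type", "rss:item"),
    (ITEM, "foaf:maker", "_:matt"),
    (ITEM, "dc:creator", "ldodds"),
    ("_:me", "foaf:nick", "ldodds"),
}

# The iKnowMyFriends rule, written as body and head patterns.
body = [
    ("?C", "rdf:type", "rss:item"),
    ("?C", "foaf:maker", "?friend"),
    ("?C", "dc:creator", "?creator"),
    ("?me", "foaf:nick", "?creator"),
]
head = [("?me", "foaf:knows", "?friend")]

inferred = apply_rule(body, head, triples)
print(inferred)
```

The point to notice is that ?creator appears in two body patterns, so it acts as a join: only a person whose foaf:nick equals the item's dc:creator gets the foaf:knows triple.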
You can test the rules out using the jena.RuleMap command-line tool that’s bundled with Jena 2.2. Pass it a parameter of “-ol RDF/XML” to get RDF/XML output and a “-d” to only see the inferred triples and you end up with this:
<rdf:RDF
xmlns="http://purl.org/rss/1.0/"
xmlns:admin="http://webns.net/mvcb/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
xmlns:dc="http://purl.org/dc/elements/1.1/" >
<rdf:Description rdf:nodeID="A0">
<foaf:knows rdf:nodeID="A1"/>
<foaf:nick>ldodds</foaf:nick>
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<foaf:knows rdf:nodeID="A2"/>
<foaf:knows rdf:nodeID="A3"/>
<foaf:maker rdf:resource="http://del.icio.us/ldodds/Me/Friends"/>
<foaf:knows rdf:nodeID="A4"/>
<foaf:knows rdf:nodeID="A5"/>
</rdf:Description>
<rdf:Description rdf:nodeID="A1">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<foaf:name>Geoff Bilder</foaf:name>
<foaf:weblog rdf:resource="http://breakawayrepublic.com/blog/"/>
</rdf:Description>
<rdf:Description rdf:about="http://journal.dajobe.org/journal/">
<dc:title>Dave Beckett - Journalblog</dc:title>
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Document"/>
<foaf:maker rdf:nodeID="A2"/>
</rdf:Description>
<rdf:Description rdf:nodeID="A5">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<foaf:weblog rdf:resource="http://www.hackdiary.com/"/>
<foaf:name>Matt Biddulph</foaf:name>
</rdf:Description>
<rdf:Description rdf:about="http://planb.nicecupoftea.org/">
<dc:title>Plan B</dc:title>
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Document"/>
<foaf:maker rdf:nodeID="A3"/>
</rdf:Description>
<rdf:Description rdf:about="http://breakawayrepublic.com/blog/">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Document"/>
<foaf:maker rdf:nodeID="A1"/>
<dc:title>Louche Cannon</dc:title>
</rdf:Description>
<rdf:Description rdf:nodeID="A2">
<foaf:name>Dave Beckett</foaf:name>
<foaf:weblog rdf:resource="http://journal.dajobe.org/journal/"/>
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
</rdf:Description>
<rdf:Description rdf:nodeID="A3">
<foaf:weblog rdf:resource="http://planb.nicecupoftea.org/"/>
<foaf:name>Libby Miller</foaf:name>
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
</rdf:Description>
<rdf:Description rdf:about="http://usefulinc.com/edd/blog">
<foaf:maker rdf:nodeID="A4"/>
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Document"/>
<dc:title>Edd Dumbill's Weblog: Behind the Times</dc:title>
</rdf:Description>
<rdf:Description rdf:about="http://www.hackdiary.com/">
<dc:title>hackdiary</dc:title>
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Document"/>
<foaf:maker rdf:nodeID="A5"/>
</rdf:Description>
<rdf:Description rdf:nodeID="A4">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<foaf:weblog rdf:resource="http://usefulinc.com/edd/blog"/>
<foaf:name>Edd Dumbill</foaf:name>
</rdf:Description>
</rdf:RDF>
Which is pretty much exactly what I wanted.
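As a quick sanity check on output like this, here is a rough Python sketch that lists the people a given nick knows, along with their weblogs. It treats the RDF/XML as plain XML, which only works because this serialization is flat (one rdf:Description per node); a real RDF parser would be the safer choice. The `friends_of` and `q` helpers are my own names, and the sample is a cut-down copy of the output above.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

def q(ns, local):
    """Build an ElementTree-style qualified name, e.g. {ns}local."""
    return "{%s}%s" % (ns, local)

# Cut-down copy of the inferred output, for illustration only.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:nodeID="A0">
    <foaf:nick>ldodds</foaf:nick>
    <foaf:knows rdf:nodeID="A1"/>
    <foaf:knows rdf:nodeID="A5"/>
  </rdf:Description>
  <rdf:Description rdf:nodeID="A1">
    <foaf:name>Geoff Bilder</foaf:name>
    <foaf:weblog rdf:resource="http://breakawayrepublic.com/blog/"/>
  </rdf:Description>
  <rdf:Description rdf:nodeID="A5">
    <foaf:name>Matt Biddulph</foaf:name>
    <foaf:weblog rdf:resource="http://www.hackdiary.com/"/>
  </rdf:Description>
</rdf:RDF>"""

def friends_of(xml_text, nick):
    """Return sorted (name, weblog URL) pairs for everyone `nick` knows."""
    root = ET.fromstring(xml_text)
    # Index the blank nodes by their rdf:nodeID.
    by_id = {}
    for desc in root.findall(q(RDF, "Description")):
        node_id = desc.get(q(RDF, "nodeID"))
        if node_id is not None:
            by_id[node_id] = desc
    result = []
    for desc in by_id.values():
        if desc.findtext(q(FOAF, "nick")) != nick:
            continue
        # Follow each foaf:knows link to the friend's description.
        for knows in desc.findall(q(FOAF, "knows")):
            friend = by_id.get(knows.get(q(RDF, "nodeID")))
            if friend is None:
                continue
            name = friend.findtext(q(FOAF, "name"))
            weblog = friend.find(q(FOAF, "weblog"))
            url = weblog.get(q(RDF, "resource")) if weblog is not None else None
            result.append((name, url))
    return sorted(result)

friends = friends_of(SAMPLE, "ldodds")
for name, url in friends:
    print(name, url)
```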
I could achieve exactly the same with XSLT, but the rules are much more succinct. Obviously this example is a bit contrived as I’m unlikely to extract my social network from an RSS feed; del.icio.us only keeps the most recent entries anyway. But as a means to create and use microcontent, e.g. book reviews, event metadata, etc., it seems pretty useful.