Monthly Archives: June 2005


Came across DinnerBuzz after reading about it on You’re It: Yummier and Yummier.
OK, cool, this is somewhere I can collect my restaurant recommendations/reviews, perhaps more usefully than just tagged as Restaurant on
So I go to add a review of the Orient Cafe in Oxford, and what do I get:

You’ve submitted a place that we’ve never heard of! We’re going to try to figure out why, and will publish your post soon.

I should add this one to my other recommendations: if you’re building a social content application that includes use of geographical data, then make sure that you’re aware of geography outside of the United States! At least 43 Places does (link via the pants people).
So another couple of silos in which I can place my microcontent.
Wonder how far we are from the tipping point where it's more cost-effective and easier to build new social content sites, similar to these efforts, from data that's already published, wild on the web, than to grow a community from scratch.
Personally I don’t think we’re actually that far away.

My Web 2.0

Just came across this via Flickr (“Shiny New Toy”): Yahoo Search My Web 2.0 Beta. See also the obligatory product blog and developer APIs.
Combines social networking and search to limit your search results based on trust metrics. I’ve not been able to get very far into the site yet to try it out. At first pass it looks like it’s relying on your cached (you can save pages from the web) and tagged pages to seed recommendations to others. It also looks like your social network is limited to other people with Yahoo accounts, which is a limited view of my social network.
Read more on Many-to-Many.
I’d probably say that “Act III”, to borrow Mayfield’s term, would be letting go of the ownership of the social network. Just let me import or point to contacts, various tagging systems, etc. Use the FOAF.
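A FOAF description already carries the necessary hooks for this. Here's a minimal hand-written sketch; the names and URLs are placeholders of my own, not real data:

```xml
<!-- Minimal sketch of a FOAF contact: all names, hashes and URLs
     here are hypothetical placeholders. -->
<foaf:Person rdf:about="#me">
  <foaf:name>Example Person</foaf:name>
  <foaf:knows>
    <foaf:Person>
      <foaf:mbox_sha1sum>...</foaf:mbox_sha1sum>
      <rdfs:seeAlso rdf:resource="http://example.org/foaf.rdf"/>
    </foaf:Person>
  </foaf:knows>
</foaf:Person>
```

A service that let me point at something like this, rather than rebuilding my contact list inside its own silo, would cover most of what that third act needs.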

Simple List Extensions Critique

Some thoughts on the Simple List Extensions Specification. I’ve been waiting a few days as I wanted to get a feel for what problems are being addressed by the new module; it’s not clear from the specification itself. Dare Obasanjo has summarised the issues, so I now feel better armed to comment.
My first bits of feedback on the specification are mainly editorial: include some rationale, include some examples, and include a contact address for the author/editor so feedback can be directly contributed. There’s at least one typo: where do I send comments?
The rest of the comments come from two perspectives: as an XML developer and as an RSS 1.0/RDF developer. I’ll concentrate on the XML side as others have already made most of the RDF related points.

Continue reading

Fun with Jena Rules

Today’s lunchtime special involves fun with the Jena 2 rule engine.

I’ve been wondering for a while whether it’d be possible to extract richer metadata from tagging conventions. Of course it’s possible, I’m just playing with different ways to achieve it. Quick XSLT conversions are my normal method of choice, but I wanted to have a play with RDF rules, and this seemed like an opportune time.

Actually what triggered this was something that Damian Steer said at XTech (Damian, apologies if I’m misquoting you, or misattributing the idea): “Rules are like XSLT for RDF”. It’s a loose analogy of course, as rule engines don’t typically have the power of XSLT, which is a complete language. Although Jena is extensible.

So I decided to see how far I could go. This RSS 1.0 feed is produced from my bookmarks: I’ve bookmarked some friends’ blogs, tagging them with “Me/Friends”. I’ve also entered the name of the author in the description field.

Can I turn this into an RDF document that contains the following data:

  • Names me
  • Names each of my friends
  • Asserts that I know each of them
  • Asserts that they are the maker of their blogs
  • Associates them with their weblog using foaf:weblog

It turns out I can…
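To give a flavour of what’s behind the cut, a rule along these lines would do part of the job. This is an illustrative sketch in Jena’s rule syntax, with the vocabulary guessed from the shape of the feed; it’s not necessarily the actual rule set:

```
# Sketch only: infer FOAF data from bookmarked items tagged "Me/Friends".
# The property names are my assumptions about the feed's vocabulary.
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rss:  <http://purl.org/rss/1.0/>.
@prefix dc:   <http://purl.org/dc/elements/1.1/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.

[friendFromTag:
  (?item rdf:type rss:item),
  (?item dc:subject 'Me/Friends'),
  (?item dc:description ?name),
  makeTemp(?person)
  ->
  (?person rdf:type foaf:Person),
  (?person foaf:name ?name),
  (?person foaf:weblog ?item),
  (?person foaf:made ?item)
]
```

Run with Jena’s GenericRuleReasoner over the feed model, the inferred model then contains these extra FOAF statements alongside the original RSS data.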

Continue reading

foaf:OnlineAccount Generator

Phil Wilson is also thinking about how to subscribe to someone’s brain.
Phil has hacked his aggregator to attempt to discover as many RSS feeds as possible starting from autodiscovery of someone’s FOAF description. Nice work. It’s also closer to the original lazyweb request as it’s applying some rules to try to find additional data not necessarily linked from the original FOAF description.
Phil writes that:

What’s really needed is a quick and simple form for people to either create a new FOAF file with their online account details in it, or which will accept a FOAF URL as well, and merge the details in…

I’d add this to the FOAF-a-Matic but have been wary about extending it due to the need to chase down all the different translations. So for now, here’s the result of today’s lunchtime hack: FOAF Online Account Description Generator. Which is a long name for a very hacky JSP page and an even worse HTML form.
Anyway, the idea is that you fill in the location of your FOAF file, sha1sum (for smushing) and a bunch of usernames. Click the button and it generates simple metadata connecting you to each service, including links to RSS channels where these can be easily guessed from the username. Unfortunately this isn’t possible for all services, especially those like Flickr that use an internal id in the links.
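For reference, the generated metadata is along these lines. This is a hand-written sketch using the standard FOAF account vocabulary; the hash, account name and feed URL are placeholders rather than the generator’s exact output:

```xml
<!-- Sketch only: FOAF account metadata of the kind the generator
     produces. All values are hypothetical placeholders. -->
<foaf:Person>
  <foaf:mbox_sha1sum>...</foaf:mbox_sha1sum>
  <foaf:holdsAccount>
    <foaf:OnlineAccount>
      <foaf:accountServiceHomepage rdf:resource="http://www.flickr.com/"/>
      <foaf:accountName>example</foaf:accountName>
    </foaf:OnlineAccount>
  </foaf:holdsAccount>
  <rdfs:seeAlso rdf:resource="http://example.org/rss/example.rdf"/>
</foaf:Person>
```

The rdfs:seeAlso points at the guessed RSS channel, where the service’s URL scheme makes guessing possible.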
I’ll leave you with an example URL.
You can also pipe these URLs through yesterday’s hack to generate an OPML file for importing.

Subscribe To My Brain

Over Thai food last week, Geoff and I were chatting about subscribing to RSS feeds for all of a person’s outputs. Not just blogging but bookmarking, listening and other activities.
Its a topic that’s seen some previous discussion. I’ve written about my life in RDF, Jo Walsh has discussed externalising absolutely everything and Morten Frederiksen maintains his own personal planet feed which is similar to John Resig’s Life as RSS plans. Of course there’s also MeNow which aims for a more real-time view of someone’s activities. Feedburner’s ability to splice RSS feeds into a single synthetic feed operates in a similar area.
Geoff had a nice phrase for this: subscribing to someone’s brain. As he notes in his blog entry, the problem is in discovering someone’s output and then getting to the point where it’s easy to add to an RSS aggregator.
I’ve taken an initial stab at implementing this by writing a little JSP page that, given a FOAF URL and an mbox_sha1sum, produces an OPML document listing all the RSS channels seeAlso’d from that document. This OPML feed can then be imported directly into an aggregator.
The URL is this:
and here’s a live example.
Here’s a quick and dirty form that’ll help you along. This was just a lunchtime hack, so it’s still very rough. An autodiscovery bookmarklet would be nice also.
To add data to your FOAF document to enable people to use this service, you need to add sections like the following within your foaf:Person description:

<rss:channel rdf:about="">
  <dc:description> bookmarks as an RSS 1.0 news feed</dc:description>
</rss:channel>

See my FOAF file for a number of examples.
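The transformation itself is simple enough to sketch. Assuming the FOAF file describes RSS channels as in the snippet above, here’s a rough Python equivalent of the JSP; the function name and structure are my own illustration, not the actual implementation:

```python
# Sketch: turn rss:channel descriptions found in a FOAF file into an
# OPML subscription list. Illustrative reconstruction of the hack,
# not the actual JSP; element names follow the snippet in the post.
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

def foaf_to_opml(foaf_xml: str) -> str:
    """Extract every rss:channel from an RDF/XML document, emit OPML."""
    root = ET.fromstring(foaf_xml)
    opml = ET.Element("opml", version="1.1")
    ET.SubElement(opml, "head")
    body = ET.SubElement(opml, "body")
    for channel in root.iter("{%s}channel" % RSS):
        url = channel.get("{%s}about" % RDF, "")
        desc = channel.findtext("{%s}description" % DC, default=url)
        ET.SubElement(body, "outline", type="rss",
                      text=desc.strip(), xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")

if __name__ == "__main__":
    # A tiny stand-in for a real FOAF document (placeholder URL).
    sample = """<rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:rss="http://purl.org/rss/1.0/"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rss:channel rdf:about="http://example.org/feed.rdf">
        <dc:description>Example bookmarks feed</dc:description>
      </rss:channel>
    </rdf:RDF>"""
    print(foaf_to_opml(sample))
```

The real version would obviously fetch the FOAF URL and filter on the supplied mbox_sha1sum first; the OPML it spits out imports straight into an aggregator.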
Once you’ve done this you can add a new button to your blog: My brain via OPML

Triple Store Test Suites

I was very pleased to see this post pop up on PlanetRDF: Stress test your triple store. Ten million triples from the Swoogle cache ready for download.
As it happens I’m trying to get sign off at the moment to release part of our data set for research purposes. Not confident of how far I’m going to get as there are a number of different parties that would have to agree, but I have my fingers crossed.
Katie and Priya have been doing some sterling work; a 200M triple data set ain’t that easy to work with. So far we’ve found that Jena on Postgres has proved to be the most stable. We’ve had problems with both Kowari and Sesame. In some cases we’ve been able to resolve them. Query performance times on that size of data set are (not surprisingly) really slow, but accessing resources directly (i.e. by URI) is just fine. We’ll produce a more structured report as soon as we can.
It strikes me that the search/text retrieval community benefited from having large test collections; the RDF community needs something similar. It’s not hard to generate synthetic triples, but they can’t compare to real data sets for comparison purposes. Seeing the Swoogle data released is great news.

Revenge of the Sith

I have a confession to make: Revenge of the Sith is the first Star Wars film I’ve seen at the cinema. And I’m 33.
That’s probably enough to get me ostracised from geek society. I was old enough to see at least Jedi when it came out. And surely I should have been frothing at the mouth to see the recent films? I can’t put my finger on why I never bothered though.
For the earlier films it’s easy: as a family we were never great cinema-goers, but we did enjoy gathering to watch the family movies on TV at Christmas. So Star Wars for me brings back memories of sitting with the rest of the extended family in a darkened room, post Xmas slap-up dinner. It was never really connected with the cinema experience.
For the later films I think my expectations were just low. And Lucas managed to fall short of even those. Episode I was pants, but Episode II was better, simply because it was slightly darker and, frankly, more mature.
So you can see why I was grinning all over my fizzog after watching The Sith last week. And why my wife asked me, the next morning, “Are you going to do that all day?” in response to my unconsciously humming the Imperial March whilst giving the kids their breakfast.
“Dark” doesn’t quite cover it.
Sith repaints Vader as quite a different character. IMO, in the later films Vader was more of a comic-book villain: evil, but in an implied, even reserved kind of way. OK, so he offed a few people, but that’s par for the course for your average villain, let alone a Dark Lord of the Sith. In contrast, the Vader in Episode III is a nasty piece of work. He lives up to his potential.
I’m definitely going to have to go see it again. Not for the effects, as I found them to be standard Star Wars fare. The segue into the stylings of the later films was very nicely handled though. No, amazingly, I’m going to see it for the story. Really.
Here’s another final confession that will definitely seal my fate: Revenge of the Sith is better than The Empire Strikes Back. It’s. The. Best. Star. Wars. Ever. Evah I say.
Mind you, I still think Lucas should have given Dave Prowse a pop at wearing the Vader suit for old times’ sake.
And did you know that Vader has a blog now? Subscribed.


In an attempt to put my various projects into the Public Domain, it seems that I’ve caused some confusion.
All I want to do is the following: label my code as being in the Public Domain, but require that people at least acknowledge that they’re using something I wrote. I’d prefer it if people didn’t take anything I wrote and make a quick buck out of it, but I’m not averse to my code being bundled in a payware application. That’s a nice-to-have, though; I basically just want to give stuff away.
This led me to start adding Creative Commons licences to my work. The Attribution-ShareAlike licence seemed to exactly cover my requirements. Previously I’d either not included a licence, or labelled it as “Public Domain”. But I’d seen some code I wrote used verbatim with someone else’s name on it, and that naturally upset me. I won’t go into details about who or where, but how hard is it to add an @author tag to Java source (or better yet, leave the one that’s already in there)?
So anyway, the CC licence seemed to fit. However when the Jaikoz developers contacted me a few months ago about reusing my MusicBrainz API they weren’t sure whether they could, as their application is payware. I said they could.
This week Henning Koch emailed me with similar confusion: would his application have to be similarly licensed? I didn’t think so, and that certainly wasn’t my intention. Koch pointed me at the CC FAQ entry that I’d stupidly overlooked:
CC licenses are not written for software. They should not be used for software…
But which one of the many licences should I use? Why does it have to be so difficult to give stuff away? I know creating new open source licences is discouraged, but to be frank it’s not that easy to pick and choose, and I’m not sure I want to wade through endless legal documents. I want to give stuff away, but be acknowledged. That’s it.
Why does open sourcing software have to be so difficult? It seems to me that the Creative Commons folk could help clean up this mess. They’re wading into scientific research, so why not software?
A nice example of how broken software licensing is: this summary of the Creative Commons licences from a Debian perspective. Conclusion: they’re not free. Debian has a reputation for being particularly prescriptive, but this seems a little barking.
I guess the answer is either RTFL (Read the F’ing Licence), or just switch back to a plain “This work is in the Public Domain” statement.

Sample Sparql Queries, Updated

Back in February I posted some sample SPARQL queries that might be useful as additional examples of the SPARQL syntax. Since then we’ve had several new drafts, including some syntax changes. In this post I’m including updated versions of all the queries (except for one; see the later discussion). I’ve also thrown in a few more for good measure, and some notes on other things that I can’t find a way to do (so that’s where you can help, dear reader…).
Oh, and despite my initial grumblings I’ve not found the tweaked syntax too troublesome.
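As a flavour of the tweaked syntax, this is the shape the queries now take. The example is my own illustration rather than one from the sample set:

```sparql
# Illustrative only: names and, where present, weblogs of people
# in some FOAF data, using the revised draft syntax.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?blog
WHERE {
  ?person foaf:name ?name .
  OPTIONAL { ?person foaf:weblog ?blog }
}
```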

Continue reading

