Came across DinnerBuzz after reading about it on You’re It: Yummier and Yummier.
OK, cool, this is somewhere I can collect my restaurant recommendations/reviews, perhaps more usefully than just tagged as Restaurant on del.icio.us.
So I go to add a review of the Orient Cafe in Oxford, and what do I get:

You’ve submitted a place that we’ve never heard of! We’re going to try to figure out why, and will publish your post soon.

I should add this one to my other recommendations: if you’re building a social content application that makes use of geographical data, then make sure that you’re aware of geography outside of the United States! At least 43 Places does (link via the pants people).
So another couple of silos in which I can place my microcontent.
Wonder how long until we reach the tipping point where it’s more cost-effective and easier to build new social content sites, similar to these efforts, from data that’s already published, wild on the web, than to grow a community from scratch.
Personally I don’t think we’re actually that far away.

My Web 2.0

Just came across this via Flickr (“Shiny New Toy”): Yahoo Search My Web 2.0 Beta. See also the obligatory product blog and developer APIs.
Combines social networking and search to limit your search results based on trust metrics. I’ve not been able to get very far into the site yet to try it out. At first pass it looks like it’s relying on your cached (you can save pages from the web) and tagged pages to seed the recommendations to others. It also looks like your social network is limited to other people with Yahoo accounts, which is a narrow view of my actual social network.
Read more on Many-to-Many.
I’d probably say that “Act III”, to borrow Mayfield’s term, would be letting go of the ownership of the social network. Just let me import or point to contacts, various tagging systems, etc. Use the FOAF.
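For instance, a plain FOAF description can already carry both your contacts and pointers to where their own data lives, so a service could bootstrap a network from something as simple as this sketch (the friend’s name and URL here are invented):

<foaf:Person>
  <foaf:name>Leigh Dodds</foaf:name>
  <foaf:knows>
    <foaf:Person>
      <foaf:name>A. Friend</foaf:name>
      <rdfs:seeAlso rdf:resource="http://example.org/friend/foaf.rdf"/>
    </foaf:Person>
  </foaf:knows>
</foaf:Person>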

Simple List Extensions Critique

Some thoughts on the Simple List Extensions Specification. I’ve been waiting a few days as I wanted to get a feel for what problems are being addressed by the new module; it’s not clear from the specification itself. Dare Obasanjo has summarised the issues, so I now feel better armed to comment.
My first bits of feedback on the specification are mainly editorial: include some rationale, include some examples, and include a contact address for the author/editor so feedback can be directly contributed. There’s at least one typo: where do I send comments?
The rest of the comments come from two perspectives: as an XML developer and as an RSS 1.0/RDF developer. I’ll concentrate on the XML side as others have already made most of the RDF related points.

Read More »

Fun with Jena Rules

Today’s lunchtime special involves fun with the Jena 2 rule engine.

I’ve been wondering for a while whether it’d be possible to extract richer metadata from tagging conventions. Of course it’s possible; I’m just playing with different ways to achieve it. Quick XSLT conversions are my normal method of choice, but I wanted to have a play with RDF rules, and this seemed like an opportune time.

Actually what triggered this was something that Damian Steer said at XTech (Damian, apologies if I’m misquoting you, or misattributing the idea): “Rules are like XSLT for RDF”. It’s a loose analogy of course, as rule engines don’t typically have the power of XSLT, which is a complete programming language, although Jena’s rule engine is extensible.

So I decided to see how far I could go. This RSS 1.0 feed is being produced by del.icio.us. I’ve bookmarked some friends’ blogs, tagging them with “Me/Friends”. I’ve also entered the name of the author in the description field.
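Roughly speaking, each bookmark comes through in the feed as an item along these lines (the blog URL and name here are made up):

<item rdf:about="http://example.org/friends-blog/">
  <title>A Friend's Weblog</title>
  <link>http://example.org/friends-blog/</link>
  <description>A. Friend</description>
  <dc:subject>Me/Friends</dc:subject>
</item>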

Can I turn this into an RDF document that contains the following data:

  • Names me
  • Names each of my friends
  • Asserts that I know each of them
  • Asserts that they are the maker of their blogs
  • Associates them with their weblog using foaf:weblog

It turns out I can…
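The details are below the fold, but to give a flavour, the forward rule ends up looking something like this sketch (the dc:subject and rss:description matching reflects my tagging conventions above, the ex:me URI is just a placeholder for my own URI, and makeTemp is the Jena builtin that mints a fresh blank node for each person):

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rss:  <http://purl.org/rss/1.0/>.
@prefix dc:   <http://purl.org/dc/elements/1.1/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix ex:   <http://example.org/>.

# one match per bookmarked friend: mint a blank node for the person,
# name them, link them to their weblog and assert that I know them
[friendBlog:
  (?item rdf:type rss:item),
  (?item dc:subject 'Me/Friends'),
  (?item rss:description ?name),
  makeTemp(?person)
  ->
  (?person rdf:type foaf:Person),
  (?person foaf:name ?name),
  (?person foaf:weblog ?item),
  (?person foaf:made ?item),
  (ex:me foaf:knows ?person)
]

Feed rules like these to a GenericRuleReasoner, wrap the feed model in an inference model, and the extra FOAF statements drop out.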

Read More »

foaf:OnlineAccount Generator

Phil Wilson is also thinking about how to subscribe to someone’s brain.
Phil has hacked his aggregator to attempt to discover as many RSS feeds as possible starting from autodiscovery of someone’s FOAF description. Nice work. It’s also closer to the original lazyweb request as it’s applying some rules to try and find additional data not necessarily linked from the original FOAF description.
Phil writes that:

What’s really needed is a quick and simple form for people to either create a new FOAF file with their online account details in it, or which will accept a FOAF URL as well, and merge the details in…

I’d add this to the FOAF-a-Matic but have been wary about extending it due to the need to chase down all the different translations. So for now, here are the results of today’s lunchtime hack: FOAF Online Account Description Generator. Which is a long name for a very hacky JSP page and an even worse HTML form.
Anyway, the idea is that you fill in the location of your FOAF file, your sha1sum (for smushing) and a bunch of usernames. Click the button and it generates simple metadata connecting you to each service, including links to RSS channels where these can be easily guessed from the username. Unfortunately this isn’t possible for all services, notably Upcoming.org and Flickr, which use an internal id in their links.
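The output is nothing fancy; for each username it’s basically a fragment along these lines (del.icio.us shown here, with the sha1sum as a placeholder), ready to be merged into your FOAF description:

<foaf:Person>
  <foaf:mbox_sha1sum>your-sha1-sum</foaf:mbox_sha1sum>
  <foaf:holdsAccount>
    <foaf:OnlineAccount>
      <foaf:accountServiceHomepage rdf:resource="http://del.icio.us/"/>
      <foaf:accountName>ldodds</foaf:accountName>
    </foaf:OnlineAccount>
  </foaf:holdsAccount>
  <rdfs:seeAlso rdf:resource="http://del.icio.us/rss/ldodds"/>
</foaf:Person>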
I’ll leave you with an example URL.
You can also pipe these URLs through yesterday’s hack to generate an OPML file for importing.

Subscribe To My Brain

Over Thai food last week, Geoff and I were chatting about subscribing to RSS feeds for all of a person’s outputs. Not just blogging, but bookmarking, listening and other activities.
It’s a topic that’s seen some previous discussion. I’ve written about my life in RDF, Jo Walsh has discussed externalising absolutely everything and Morten Frederiksen maintains his own personal planet feed, which is similar to John Resig’s Life as RSS plans. Of course there’s also MeNow, which aims for a more real-time view of someone’s activities. Feedburner’s ability to splice RSS feeds into a single synthetic feed operates in a similar area.
Geoff had a nice phrase for this: Subscribing to someone’s brain. As he notes in his blog entry, the problem is in discovering someone’s output and then getting to the point where it’s easy to add to an RSS aggregator.
I’ve taken an initial stab at implementing this by writing a little JSP page that, given a FOAF URL and an mbox_sha1sum, produces an OPML document listing all the RSS channels seeAlso’d from that document. This OPML feed can then be imported directly into an aggregator.
The URL is this: http://www.ldodds.com/micro-util/brain-subscribe.jsp?foaf=URL&mbox_sha1sum=SHA1
and here’s a live example.
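The OPML that comes back is the bare minimum an aggregator needs, one outline per seeAlso’d channel, roughly like this:

<?xml version="1.0" encoding="UTF-8"?>
<opml version="1.1">
  <head>
    <title>Subscribe to my brain</title>
  </head>
  <body>
    <outline type="rss"
      text="del.icio.us bookmarks as an RSS 1.0 news feed"
      xmlUrl="http://del.icio.us/rss/ldodds"/>
  </body>
</opml>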
Here’s a quick and dirty form that’ll help you along. This was just a lunchtime hack, so it’s still very rough. An autodiscovery bookmarklet would be nice also.
To add data to your FOAF document to enable people to use this service, you need to add sections like the following within your foaf:Person description:

<rdfs:seeAlso>
  <rss:channel rdf:about="http://del.icio.us/rss/ldodds">
    <dc:description>del.icio.us bookmarks as an RSS 1.0 news feed</dc:description>
  </rss:channel>
</rdfs:seeAlso>

See my FOAF file for a number of examples.
Once you’ve done this you can add a new button to your blog: My brain via OPML

Triple Store Test Suites

I was very pleased to see this post pop up on PlanetRDF: Stress test your triple store. Ten million triples from the Swoogle cache ready for download.
As it happens I’m trying to get sign off at the moment to release part of our data set for research purposes. Not confident of how far I’m going to get as there are a number of different parties that would have to agree, but I have my fingers crossed.
Katie and Priya have been doing some sterling work; a 200M triple data set ain’t that easy to work with. So far we’ve found that Jena on Postgres has proved to be the most stable. We’ve had problems with both Kowari and Sesame. In some cases we’ve been able to resolve them. Query performance times on that size of data set are (not surprisingly) really slow, but accessing resources directly (i.e. by URI) is just fine. We’ll produce a more structured report as soon as we can.
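For anyone wanting to repeat the experiment, the Jena side of the setup is only a few lines. A minimal sketch (connection details, model name and input file are placeholders) looks like this:

import com.hp.hpl.jena.db.DBConnection;
import com.hp.hpl.jena.db.IDBConnection;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.ModelMaker;
import java.io.FileInputStream;

public class LoadStore {
  public static void main(String[] args) throws Exception {
    // register the PostgreSQL JDBC driver
    Class.forName("org.postgresql.Driver");
    // open a connection to the database that will hold the triples
    IDBConnection conn = new DBConnection(
      "jdbc:postgresql://localhost/rdfstore", "user", "password", "PostgreSQL");
    // create (or re-open) a named persistent model backed by the database
    ModelMaker maker = ModelFactory.createModelRDBMaker(conn);
    Model model = maker.createModel("test-data");
    // bulk load an RDF/XML dump into the store
    model.read(new FileInputStream(args[0]), null);
    model.close();
    conn.close();
  }
}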
It strikes me that the search/text retrieval community benefited from having large test collections; the RDF community needs something similar. It’s not hard to generate synthetic triples, but they’re no substitute for real data sets when comparing systems. Seeing the Swoogle data released is great news.