Monthly Archives: January 2008

Bee Node: A FOAF Tale

Detective Piotr Sparql leant back in his chair, cradling a tumbler of vodka, and reflected on his most recent case. It had started as a simple missing person; he’d been assigned to investigate the disappearance of Beatrice “Bee” Node:


@prefix foaf: <http://xmlns.com/foaf/0.1/> .
</person/bnode> a foaf:Person;
  foaf:name "Beatrice Node";
  foaf:nick "Bee";
  foaf:mbox <mailto:bnode@example.com>.

His investigation had started out routinely enough: trawling his usual sources to see if any of them had word of Bee’s location:


PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
ASK WHERE {
  {
    </person/bnode>
      geo:lat ?lat;
      geo:long ?long.
  }
  UNION
  {
    ?person
      foaf:mbox <mailto:bnode@example.com>;
      geo:lat ?lat;
      geo:long ?long.
  }
}

And then Bee turned up. Dead.


@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix bio: <http://purl.org/vocab/bio/0.1/> .
</person/bnode> a foaf:Person;
  foaf:name "Beatrice Node";
  bio:event [ a bio:Death;
    bio:date "2008-01-29"^^xsd:date
  ].

So he’d begun leaning on his sources harder, attempting to find those that had anything on Bee that might be useful in tracking down her murderer:


PREFIX foaf: <http://xmlns.com/foaf/0.1/>
ASK WHERE {
  {
    </person/bnode> ?p ?o.
  }
  UNION
  {
    ?person
      foaf:mbox <mailto:bnode@example.com>;
      ?p ?o.
  }
}

…and then getting them to spill what they knew:


DESCRIBE </person/bnode>

Pickings were slim. He tried a few obvious tacks:


PREFIX rel: <http://vocab.org/relationship/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE {
  ?suspect rel:enemyOf </person/bnode>.
  ?suspect foaf:name ?name.
  ?suspect foaf:mbox ?mbox.
}

But Bee had had few enemies and all of them had alibis. He widened his search through the social networks:


PREFIX rel: <http://vocab.org/relationship/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE {
  ?suspect rel:enemyOf </person/bnode>.
  ?suspect foaf:knows ?otherSuspect.
  ?otherSuspect foaf:name ?name.
  ?otherSuspect foaf:mbox ?mbox.
}

But everyone’s alibis were watertight. At this point he’d gone back to basics, gathering everything he could on the late lamented Bee Node. On a hunch he probed for more background on Bee’s social network. She’d been active in a number of forums and he’d figured that she might have unknowingly upset someone:


PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT {
  ?suspect a foaf:Person;
    foaf:name ?name;
    foaf:mbox ?mbox.
}
WHERE {
  {
    ?post a sioc:Post;
      sioc:has_creator ?bee;
      sioc:has_reply ?reply.
    ?bee sioc:email <mailto:bnode@example.com>.
    ?reply sioc:has_creator ?suspect.
    ?suspect sioc:name ?name;
      sioc:email ?mbox.
  }
  UNION
  {
    ?post a sioc:Post;
      sioc:has_creator ?suspect;
      sioc:has_reply ?reply.
    ?reply sioc:has_creator ?bee.
    ?bee sioc:email <mailto:bnode@example.com>.
    ?suspect sioc:name ?name;
      sioc:email ?mbox.
  }
}

Cross-referencing the email addresses on the short list of suspects with data taken from a contact at Nominet, he’d managed to gather some addresses:


PREFIX whois: <http://xml.nominet.org.uk/rdf/nom/domain#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX util: <http://www.example.org/sparql/util/>
SELECT
  ?name ?mbox ?line1 ?line2 ?postcode ?country
WHERE {
  ?suspect foaf:name ?name;
    foaf:mbox ?mbox.
  ?d a whois:domainName;
    whois:domainNameValue ?domainName;
    whois:hasRegistrant ?registrant.
  ?registrant whois:registrantAddress ?address.
  ?address whois:addressline1 ?line1;
    whois:addressline2 ?line2;
    whois:postcode ?postcode;
    whois:country ?country.
  FILTER ( ?domainName = util:ExtractMailDomain(?mbox) )
}
ORDER BY ?name

The rest had come down to old-fashioned legwork. He cursed himself softly as he finished his drink, pouring himself another slug of Absolut from the bottle in his desk drawer. In his haste he’d missed the obvious angles; hadn’t bothered to check out the family. After all, they’d all seemed so…anonymous at first glance.

The murderer? Her relative: Uri. He’d been masquerading under an alias.

Self-Description for Service Connection

I hate quoting myself, as I worry about it making me seem like a pompous ass, but I feel moved to do it in this instance after reading Danny’s posting about DataPortability Service Discovery, in which he discusses the current blueprint from the DataPortability group.
Danny rightly points out that FOAF already provides a means for listing all of the accounts that a person uses as part of their online activity. The vocabulary allows the service to be identified along with the person’s account username. This is typically sufficient information to start interacting with a service API to extract useful information about the user, e.g. for importing into another site.
Here’s the example that I included in an XTech paper I presented in 2005:

<foaf:Person>
<foaf:holdsAccount>
<foaf:OnlineAccount>
<foaf:accountName>ldodds</foaf:accountName>
<foaf:accountServiceHomepage
rdf:resource="http://del.icio.us"/>
</foaf:OnlineAccount>
</foaf:holdsAccount>
</foaf:Person>

With that bit of information you can easily get access to my del.icio.us bookmarks, for example. The limitation in this kind of approach, whether it’s implemented using FOAF or using the protocol outlined in the DataPortability blueprint, is that a third-party service wanting to extract data about the user needs some prior knowledge of the service it will be interacting with: it needs knowledge of the API (i.e. a client) and also of what kind of information the service holds about the user (i.e. does it contain relevant data?).
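To make that limitation concrete, here is a rough Python sketch of what a consumer of the account listing might do. It is only an illustration: it reads the RDF/XML as plain XML using the standard library (a real consumer would more likely use a proper RDF parser), the rdf:RDF wrapper is added so the fragment parses on its own, and `KNOWN_CLIENTS`, `list_accounts` and `usable_accounts` are hypothetical names invented for this example.

```python
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The FOAF fragment from the post, wrapped in an rdf:RDF root so that the
# namespace prefixes are declared and the document parses on its own.
PROFILE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:holdsAccount>
      <foaf:OnlineAccount>
        <foaf:accountName>ldodds</foaf:accountName>
        <foaf:accountServiceHomepage rdf:resource="http://del.icio.us"/>
      </foaf:OnlineAccount>
    </foaf:holdsAccount>
  </foaf:Person>
</rdf:RDF>
"""


def list_accounts(rdf_xml):
    """Return (service homepage, account name) pairs found in a FOAF document."""
    root = ET.fromstring(rdf_xml)
    accounts = []
    for account in root.iter(f"{{{FOAF}}}OnlineAccount"):
        name = account.findtext(f"{{{FOAF}}}accountName")
        homepage = account.find(f"{{{FOAF}}}accountServiceHomepage")
        if name is None or homepage is None:
            continue  # skip incomplete account descriptions
        accounts.append((homepage.get(f"{{{RDF}}}resource"), name))
    return accounts


# Hypothetical registry: one hand-written API client per service.
KNOWN_CLIENTS = {"http://del.icio.us": "delicious_client"}


def usable_accounts(rdf_xml):
    """Keep only the accounts whose service this consumer already knows how to call."""
    return [(svc, name) for svc, name in list_accounts(rdf_xml) if svc in KNOWN_CLIENTS]
```

Finding the accounts is the easy part. The `KNOWN_CLIENTS` lookup is where the prior knowledge creeps in: any service missing from that table is invisible to the consumer, however useful its data might be.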
And in my opinion this doesn’t scale. For truly distributed, ad hoc service integration, I think you need a slightly different approach to the problem: one that embraces a more RESTful style and ideally takes advantage of the flexibility of RDF.
Rather than simply providing a list of services, I should point to the data. Towards the end of my paper (see the section “Self-Description as Service Connectors”) I suggested that using rdfs:seeAlso to create RDF hyperlinks between documents, and appropriately typing the linked resources, brings two advantages. Firstly, it avoids the need to trawl through unnecessary services in order to get at the data that’s of interest: the user can explicitly point to it. Secondly, there’s no need for API-specific clients beyond the ability to make an HTTP GET request.
Here’s the example in the paper rewritten to address a particular DataPortability use case: “Aggregate your, and your friend’s, “Status” (eg Twitter) from all the “Status” systems you belong to.”
Firstly “my friends” can be those people listed in my FOAF document. FOAF provides the basic data substrate for glueing the services together. Secondly, I point to a web resource from which my “Status” message(s) can be retrieved:


<foaf:Person>
<eg:statuses>
<eg:Status
rdf:about="http://twitter.com/statuses/user_timeline/14813.atom"/>
</eg:statuses>
</foaf:Person>

So a third-party service that needs to find my current Status simply identifies the relevant resource and then takes that URI and does an HTTP GET on it.
Then let’s say I decide to move from Twitter and use some other service. Here’s what happens:


<foaf:Person>
<eg:statuses>
<eg:Status
rdf:about="http://example.com/status/ldodds"/>
</eg:statuses>
</foaf:Person>

See what I did? And guess what that Status aggregator has to do: nothing.
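Here’s a minimal Python sketch of such an aggregator, to show why it has nothing to do. Several things in it are assumptions made for the sake of the example: the namespace URI bound to the eg: prefix, the rdf:RDF wrapper, the function names, and the shortcut of reading RDF/XML as plain XML with the standard library. It accepts the Status URI from either rdf:about or rdf:resource.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EG = "http://example.org/status#"  # assumed namespace for the eg: prefix

# Profile template; only the Status URI differs between the two versions.
PROFILE_TEMPLATE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:eg="http://example.org/status#">
  <foaf:Person>
    <eg:statuses>
      <eg:Status rdf:about="{uri}"/>
    </eg:statuses>
  </foaf:Person>
</rdf:RDF>
"""

BEFORE = PROFILE_TEMPLATE.format(
    uri="http://twitter.com/statuses/user_timeline/14813.atom")
AFTER = PROFILE_TEMPLATE.format(uri="http://example.com/status/ldodds")


def status_uris(rdf_xml):
    """Return the URI of every eg:Status resource the profile points at."""
    root = ET.fromstring(rdf_xml)
    uris = []
    for node in root.iter(f"{{{EG}}}Status"):
        # Accept the URI from rdf:about, falling back to rdf:resource.
        uri = node.get(f"{{{RDF}}}about") or node.get(f"{{{RDF}}}resource")
        if uri:
            uris.append(uri)
    return uris


def fetch_statuses(rdf_xml, http_get=None):
    """GET each Status resource and return a {uri: body} mapping.

    http_get is injectable so the function can be exercised offline;
    by default it performs a real HTTP GET.
    """
    if http_get is None:
        def http_get(uri):
            with urlopen(uri) as response:
                return response.read()
    return {uri: http_get(uri) for uri in status_uris(rdf_xml)}
```

Calling `status_uris(BEFORE)` and `status_uris(AFTER)` runs exactly the same code and simply yields a different URI. Nothing in the aggregator mentions Twitter; all it needs is for whatever sits behind the URI to speak an agreed format.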
In my opinion this rightly shifts the emphasis away from the details of individual service APIs and encourages standardization on data formats. Surely this has to be the most important aspect of Data Portability? For example, it will encourage sites that produce Status messages to agree on how these will be published onto the web, whether that involves explicit standardization or simple adoption of a standard like Atom.
As I’ve written before, RDF does have some nice properties for enabling data integration and allowing for independent evolution of community specific vocabularies which are worth exploring in this context.
I really don’t see the need for intermediary services at all to create this kind of connection, beyond services that allow for maintenance of a FOAF profile. The other nice property of this form of interaction is that I don’t need to use any services. If I decide to manage my own online presence, manage my own OpenID, and publish all my public data as a collection of hand-crafted static data files on my own server, then that’s fine: it’s all just URIs.
If we want true ownership of our own data, and true portability, then the means of integration needs to support this at the most fundamental level.
Pompous ass mode off.
