Reading this interesting post by Ian Davis this morning, which contains lessons for ontology writers, it occurred to me that the key lesson applies to open data publishing in general and not just to ontology design.
Ian’s post describes some of the techniques introduced in the Taming the Open World session at SemTech. I won’t repeat them all here. Go and read the post. The majority of the techniques relate to schema (i.e. ontology) design, e.g. identifying what types of resource a property relates, whether two types of resource are completely unrelated, etc.
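To make those two examples concrete, here is a minimal sketch in Python of what such schema statements might look like when written out as N-Triples. It is not taken from Ian's post: the `http://example.org/schema#` namespace and the `author`, `Book` and `Person` names are hypothetical, and I am assuming that "what types of resource a property relates" maps onto `rdfs:domain` and `rdfs:range`, and "completely unrelated" onto `owl:disjointWith`.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "http://example.org/schema#"  # hypothetical namespace, for illustration only


def schema_statements():
    """Return (subject, predicate, object) IRI triples describing a tiny schema."""
    return [
        # What types of resource does the property relate?
        (EX + "author", RDFS + "domain", EX + "Book"),
        (EX + "author", RDFS + "range", EX + "Person"),
        # Nothing can be both a Book and a Person.
        (EX + "Book", OWL + "disjointWith", EX + "Person"),
    ]


def to_ntriples(triples):
    """Serialise IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(schema_statements()))
```

Three extra statements is not much to write, and each one is a fact the publisher already knew but would otherwise have left unsaid.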
I think these all boil down to a general principle: say everything. That is, if you know something is true, if you have a fact that you can share, then share it. Commonly in open data discussions we tend to focus on the basic facts: the data we want to see opened up and build cool stuff against. But we mustn’t forget the need to share the metadata too. All data has metadata, even metadata. And schemas are a form of metadata.
Some of the advice quoted in Ian’s post was new to me. It hadn’t occurred to me that there were some real benefits in the additional precision. And given I’ve moaned before about the performance of reasoners, I can now see where I’ve not been helping them out. As usual, it’s all obvious in hindsight. I’m sure there are other easy wins. For example, I rarely see RDF data that specifies its data types. Why not? If you know what type a literal is, then why not say so?
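As a rough illustration of how little effort saying so takes, here is a small Python sketch that attaches an explicit XSD datatype to a literal when writing N-Triples. The helper names `typed_literal` and `triple` and the `http://example.org/` IRIs are my own inventions for the example, and it only covers a handful of types.

```python
import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"


def typed_literal(value):
    """Serialise a Python value as an N-Triples literal, stating its datatype."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return f'"{str(value).lower()}"^^<{XSD}boolean>'
    if isinstance(value, int):
        return f'"{value}"^^<{XSD}integer>'
    # datetime must be tested before date, because datetime is a subclass of date.
    if isinstance(value, datetime.datetime):
        return f'"{value.isoformat()}"^^<{XSD}dateTime>'
    if isinstance(value, datetime.date):
        return f'"{value.isoformat()}"^^<{XSD}date>'
    if isinstance(value, str):
        # Escape the characters that N-Triples requires escaping in strings.
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'
    raise TypeError(f"no datatype mapping for {type(value).__name__}")


def triple(subject, predicate, value):
    """Build one N-Triples statement whose object is a typed literal."""
    return f"<{subject}> <{predicate}> {typed_literal(value)} ."


if __name__ == "__main__":
    # Hypothetical IRIs, for illustration only.
    print(triple("http://example.org/book/1", "http://example.org/schema#pages", 320))
    print(triple("http://example.org/book/1", "http://example.org/schema#published",
                 datetime.date(2010, 6, 28)))
```

With the datatype stated, a consumer can compare or sort the page count as a number rather than having to guess from a bare string.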
So remember, say everything. Speak to the machine.