The Web’s Rich Tapestry

This post was originally published on the Talis “Nodalities” blog.

We’ve all read books that linger in our memories, and there are any number of reasons why they might do so: a stirring tale or a thought-provoking argument, for example. One book that has stayed with me over the years is House of Leaves by Mark Z. Danielewski. It’s been described as “the Blair Witch” of haunted house tales, being the story of a house, the people who live there, and those who attempt to document the strange events and structure of the building. The book is quite a challenging read, as it is made up of overlapping narratives, documentary evidence from the investigators, and so on. As a reader you’re assembling a narrative out of the interlocking pieces of text that the author presents you with.

But, while the tale is one of those slow-burning horror stories that does linger at the back of the mind, that’s not the primary reason why the book has stayed with me. It was the actual structure of the text that I found so intriguing: the author plays with the printed form, right down to the basic layout of the print on the page, to further the mythology of the story and to help convey the labyrinthine nature of the house. For example, a typical page might contain several different blocks of text, and much of the story is told through footnotes, footnotes to those footnotes, and so on. Certain words are coloured differently throughout the text. There are even blocks of text embedded in the page that you have to read downwards through several pages before returning to your starting point. As a reader you’re physically exploring the text much as the characters are exploring the house.

The book is basically a hypertext novel, and while it is certainly not the first to play with the printed form in this way, it was the first I’d personally encountered. As a hypertext the book appeals to the technologist in me: I’ve given a number of talks over the past few years, and in many of them I’ve explored the evolution of hypertext systems. But I’ve also attempted to challenge people’s preconceptions about the medium of the web, just as House of Leaves challenged my preconceptions about the printed medium.

My most recent talk was at the ALPSP International Conference 2008, which took place last week in Old Windsor. The talk, titled “The Web’s Rich Tapestry”, discussed the link as the basic medium of the web and reviewed how the blurring of boundaries between websites, services and data (aka “Web 2.0”) is enabled by increasingly rich linking between resources. This is part of a move from old broadcast models of information publishing to a more web-like network of interconnected peers, each contributing to a dense information medium. The ultimate endpoint of this is inherent in the vision of the Semantic Web, and will complete the change from a document-centric to a data-centric world. The Semantic Web, which is just a layer on top of the existing web, is still based on linking, albeit linking of a more fine-grained and meaningful nature.
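To make that contrast a little more concrete, here is a minimal sketch (not part of the original talk) of the difference between an ordinary hyperlink and the kind of typed, meaningful link the Semantic Web is built on. It uses Python with the rdflib library, and the URIs are purely illustrative.

```python
# A plain HTML hyperlink says only "these two pages are related somehow":
#   <a href="http://example.org/people/jane-doe">Jane Doe</a>
#
# An RDF statement names the relationship itself, so machines can act on it.
from rdflib import Graph, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
g.bind("dcterms", DCTERMS)

article = URIRef("http://example.org/articles/123")     # illustrative URI
author = URIRef("http://example.org/people/jane-doe")   # illustrative URI

# "This article was created by this person" -- a typed link, not just an href.
g.add((article, DCTERMS.creator, author))

print(g.serialize(format="turtle"))
```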

The Semantic Web, just like the existing Web, will arrive through the actions of individuals, organizations and businesses, each contributing to the whole by sharing linked data sets; this process is already happening. And, like the Web, the more data is available, the more value there will be for everyone involved. I urged society publishers to begin sharing their metadata more openly and exploring the potential inherent in the Web of Data. I also attempted to do more than just evangelize the potential benefits of the Semantic Web, trying to provide a few pointers towards where those benefits might be realized.
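As a rough illustration of what “sharing linked data sets” might mean in practice for a publisher, the sketch below describes a single journal article using standard Dublin Core terms. The URIs, titles and identifiers are invented for the example; a real publisher would mint stable URIs for its own articles, journals and authors so that others can link to them.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS, XSD

g = Graph()
g.bind("dcterms", DCTERMS)

# Every entity gets its own URI, so each becomes a linkable resource in its own right.
article = URIRef("http://publisher.example/articles/10.1234/abc123")   # invented DOI-style URI
journal = URIRef("http://publisher.example/journals/acta-exemplaria")  # invented journal URI
author = URIRef("http://publisher.example/people/j-bloggs")            # invented author URI

g.add((article, DCTERMS.title, Literal("A Worked Example of Linked Bibliographic Data")))
g.add((article, DCTERMS.creator, author))
g.add((article, DCTERMS.isPartOf, journal))
g.add((article, DCTERMS.issued, Literal("2008-09-12", datatype=XSD.date)))

# Publishing is then a matter of exposing this graph at a stable URL,
# e.g. as Turtle or RDF/XML, for others to crawl, query and link to.
print(g.serialize(format="turtle"))
```

Because each article, journal and author is addressable in its own right, other data sets can point straight at them, which is precisely the kind of dense, interconnected information medium described above.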

One obvious benefit relates to the generation of more traffic to content and services. For many publishers a sizeable proportion, if not the majority, of their website traffic is driven by Google referrals. This is an inherently fragile situation, but one that I believe is ultimately temporary. The scale of this traffic generation is obviously due in major part to the popularity of the Google search engine, but it is enabled by Google’s ability to quickly and efficiently crawl websites in order to index content. This provides a large “surface area” to which Google can generate links. By publishing open data, information providers will be able to grow this surface area by at least an order of magnitude, thanks to the more fine-grained data publishing that the Semantic Web entails. All of this data can potentially generate new, highly relevant traffic to content and services.

The other area where the Semantic Web will pay off is in enabling much more sophisticated research and analysis tools, not just for academic researchers and students, but for all of us in our everyday consumption of information. In my view there is too much focus on search and not enough on information visualisation and analysis tools. I pointed towards some very recent experiments which I think illustrate some of this potential, including Ubiquity and Freebase Parallax. Talis’s own Project Xiphos is also exploring the innovation that can follow from re-purposing publishing metadata, a topic that was particularly relevant to the ALPSP audience. In my new role as Programme Manager for the Talis Platform, I’m excited to begin exploring how we can help businesses draw value from the rapidly growing Web of Data.