Dan Connolly's tinkering lab notebook

Linked Data at WWW2007: GRDDL, SPARQL, and Wikipedia, oh my!

Last Tuesday, TimBL started to gripe that the WWW2007 program had lots of stuff that he wanted to see all at the same time; we both realized pretty soon: that's a sign of a great conference.

That afternoon, Harry Halpin and I gave a GRDDL tutorial. Deploying Web-scale Mash-ups by Linking Microformats and the Semantic Web is the title Harry came up with... I was hesitant to be that sensationalist when we first started putting it together, but I think it actually lived up to the billing. It's too bad last-minute complications prevented Murray Maloney from being there to enjoy it with us.

For one thing, GRDDL implementations are springing up all over. I donated my list to the community as the GrddlImplementations wiki topic, and when I came back after the GRDDL spec went to Candidate Recommendation on May 2, several more had sprung up.

What's exciting about these new implementations is that they go beyond the basic "here's some RDF data from one web page" mechanism. They're integrated with RDF map/timeline browsers, and SPARQL engines, and so on.

The example from the GRDDL section of the semantic web client library docs (by Chris Bizer, Tobias Gauß, and Richard Cyganiak) is just "tell me about events on Dan's travel schedule" but that's just the tip of the iceberg: they have implemented the whole LinkedData algorithm (see the SWUI06 paper for details).

With all this great new stuff popping up all over, I felt I should include it in our tutorial materials. I'm not sure how long OpenLink Virtuoso has had GRDDL support (along with database integration, WEBDAV, RSS, Bugzilla support, and on and on), but it was news to me. But I also had to work through some bugs in the details of the GRDDL primer examples with Harry (not to mention dealing with some unexpected input on the HTML 5 decision). So the preparation involved some late nights...

I totally forgot to include the fact that Chime got the Semantic Technologies conference web site using microformats+GRDDL, and Edd did likewise with XTech.

But the questions from the audience showed they were really following along. I was a little worried when they didn't ask any questions about the recursive part of GRDDL; when I prompted them, they said they got it. I guess verbal explanations work; I'm still struggling to find an effective way to explain it in the spec. Harry followed up with some people in the halls about the spreadsheet example; as mnot said, Excel spreadsheets contain the bulk of the data in the enterprise.

One person was even followingn along closely enough to help me realize that the slide on monotonicity/partial understanding uses a really bad example.

The official LinkedData session was on Friday, but it spilled over to a few impromptu gatherings; on Wednesday evening, TimBL was browsing around with the tabulator, and he asked for some URIs from the audience, and in no time, we were browsing protiens and diseases, thanks to somebody who had re-packaged some LSID-based stuff as HTTP+RDF linked data.

Giovanni Tummarello showed a pretty cool back-link service for the Semantic Web. It included support for finding SPARQL endpoints relevant to various properties and classes, a contribution to the serviceDescription issue that the RDF Data Access Working Group postponed. I think I've seen a few other related ideas here and there; I'll try to put them in the ServiceDescription wiki topic when I remember the details...

Chris Bizer showed that dbpedia is the catalyst for an impressive federation of linked data. Back in March 2006, Toward Semantic Web data from Wikipedia was my wish into the web, and it's now granted. All those wikipedia infoboxes are now out there for SPARQLing. And other groups are hooking up musicbrainz and wordnet and so on. After such a long wait, it seems to be happening so fast! 

Speaking of fast, the Semantic MediaWiki project itself is starting to do performance testing with a full copy of wikipedia, Denny told us on Friday afternoon in the DevTrack.

Also speaking of fast, how did OpenLink go from not-on-my-radar to supporting every Semantic Web Technology I have ever heard of in about a year? I got part of the story in the halls... it started with ODBC drivers about a decade ago, which explains why their database integration is so good. Kingsley, here's hoping we get to play volleyball sometime. It's a shame we had just a few short moments together in the halls...

tags: (photos), grddl, www2007, travel