MadMode

Dan Connolly's tinkering lab notebook

Stitching the Semantic Web together with OWL at AAAI-06

I was pleased to find that AAAI '06 in Boston a couple weeks ago had a spectrum of people I know and don't know and work that's near and far from my own. The talk about the DARPA grand challenge was inspiring.

But closer to my work, I ran into Jeff Heflin, who I worked with on DAML and especially the OWL requirements document. Amid too many papers about ontologies for the sake of ontologies and threads like Is there real world RDF-S/OWL instance data?, his Investigation into the Feasibility of the Semantic Web is a breath of fresh air. The introduction sets out their approach this way:

Our approach is to use axioms of OWL, the de facto Semantic Web language, to describe a map for a set of ontologies. The axioms will relate concepts from one ontology to the other. ... There is a well-established body of research in the area of automated ontology alignment. This is not our focus. Instead we investigate the application of these alignments to provide an integrated view of the Semantic Web data.

(emphasis mine). The rest of the paper justifies this approach, leading up to:

We first query the knowledge base from the perspective of each of the 10 ontologies that define the concept Person. We now ask for all the instances of the concept Person. The results vary from 4 to 1,163,628. We then map the Person concept from all the ontologies to the Person concept defined in the FOAF ontology. We now issue the same query from the perspective of this map and we get 1,213,246 results. The results now encompass all the data sources that commit to these 10 ontologies. Note: a pair wise mapping would have taken 45 mapping axioms to establish this alignment instead of the 9 mapping axioms that we used. More importantly due to this network effect of the maps, by contributing just a single map, one will automatically get the benefit of all the data that is available in the network.

That's fantastic stuff.

We now pause for a word from Steve Lawrence; NEC Research Institute, to lament the lack of free online proceedings for AAAI: Articles freely available online are more highly cited. For greater impact and faster scientific progress, authors and publishers should aim to make research easy to access. OK, now back to the great paper...

Along the way, they give a definition of a knowledge function, K, that is remarkably similar to log:semantics from N3. They also define a commitment function that is basically the ontological closure pattern.

The approach to querying all this data is something they call DLDB, which comes from a paper they submitted to the ISWC Practical and Scalable Semantic Systems workshop. Darn! no full text proceedings online again. Ah... Jeff's pubs include a tech report version. To paraphrase: there's a table for each class and a table for each property that relates rows from the class tables. They use a DL reasoner to find subclass relationships, and they make views out of them. I have never seen this approach to before; it sure looks promising. I wonder if we can integrate it into our dbview work somehow and perhaps into our truth-maintenance system in the TAMI project.

This wasn't the only work at AAAI on scalable, practical knowledge representation. I caught just a glance at some other papers at the conference that exploit wikipedia as a dataset in various algorithms. I hope to study those more.

I also ran into Ben Kuipers, whose Algernon and Access-Limited Logic has long appealed to me as an approach to reasoning that might work well when scaled up to Semantic Web data sets. That work is mostly on hold; we started talking about getting it going again, but didn't get very far into the conversation. I hope to pick that up again soon.

I gather the 1.0 release of OpenCyc happened at the conference; there's a lot of great stuff in cyc, but only time will tell how well it will integrate with other Semantic Web stuff.

Meanwhile, a handy citation for Heflin's paper...

That's marked up using an XHTML/LaText/BibTex idiom that I'm working on so that we get BibTex for free:

@inproceedings{pan06a,
title = "{An Investigation into the Feasibility of the Semantic Web}",
    author = {Z. Pan and A. Qasem and J. Heflin},
    booktitle = {Proc. of the Twenty First  National Conference on Artificial Intelligence  (AAAI 2006)},
    year = {2006},
    address = {Boston, USA},
}