## Map and Territory in RDF APIs

RDF specs and APIs have made a bit of a mess out of a couple pretty basic tools of math and computing: graphs and logic formulas. With the RDF next steps workshop coming up and Pat Hayes re-thinking RDF semantics Sandro thinking out loud about RDF2, I'd like us to think about RDF in more traditional terms. The scala programming language seems to be an interesting framework to explore how they relate to RDF.

The Feb 1999 RDF spec wasnt very clear about the map and the territory. It said that statements are made out of parts in the territory, rather than features on the map, which doesnt make very much sense. RDF APIs seem to inherit this confusion; e.g. from an RDF::Value class for ruby:

''

## Examples:

Checking if a value is a resource (blank node or URI reference)

value.resource

Blank nodes and URI references are parts of the map; resources are in the territory.

Likewise in Package org.jrdf.graph:

ResourceA resource stands for either a Blank Node or a URI Reference.

The 2004 RDF specs take great pains to clarify these use/mention distinctions, but they also go on at great length.

Let's review Wikipedia on graphs:

In mathematics, a

graphis an abstract representation of a set of objects where some pairs of the objects are connected by links. ... The edges may be directed (asymmetric) or undirected (symmetric) ... and the edges are calleddirected edgesorarcs; ... graphs which have labeled edges are callededge-labeledgraphs.

With that in mind, in the swap-scala project, we summarize the RDF abstract syntax as an edge-labelled directed graph with just one or two wrinkles:

package org.w3.swap.rdf

trait RDFGraphParts {

type Arc = (SubjectNode, Label, Node)

type Node

type Literal <: Node

type SubjectNode <: Node

type BlankNode <: SubjectNode

type Label <: SubjectNode

}

The wrinkles are:

- Arcs can only start from BlankNodes or Labels, i.e. SubjectNodes
- Arcs labels may also appear as Nodes

We use another trait to relate concrete datatypes to these abstract types:

trait RDFNodeBuilder extends RDFGraphParts {

def uri(i: String): Label

type LanguageTag = Symbol

def plain(s: String, lang: Option[LanguageTag]): Literal

def typed(s: String, dt: String): Literal

def xmllit(content: scala.xml.NodeSeq): Literal

}

This doesn't pin down what a **Label** is, but in any concrete implementation, you can build one from a **String** using the **uri** method. The **RDFNodeBuilder** trait is used to implement RDF/XML, RDFa, and turtle parsers that are agnostic to the concrete implementation of an RDF graph.

Now let's look at terms of first order logic:

The set of

termsis inductively defined by the following rules:

Variables.Any variable is a term.Functions.Any expressionf(t_{1},...,t_{n}) ofnarguments (where each argumentt_{i}is a term andfis a function symbol of valencen) is a term.

This is represented straightforwardly in scala a la:

package org.w3.swap.logic1

/**

* A Term is either a Variable or an FunctionTerm.

*/

sealed abstract class Term { ... }

class Variable extends Term { ...}

abstract class FunctionTerm() extends Term {

def fun: Any

def args: List[Term]

}

The core RDF doesn't cover all of first order logic; it corresponds fairly closely to the conjunctive query fragment:

The conjunctive queries are simply the fragment of first-order logic given by the set of formulae that can be constructed from atomic formulae using conjunction and existential quantification , but not using disjunction , negation , or universal quantification .

We can then excerpt just the relevant parts of the definition of formulas:

The set of

formulasis inductively defined by the following rules:

Predicate symbols.IfPis ann-ary predicate symbol andt_{1}, ...,t_{n}are terms thenP(t_{1},...,t_{n}) is a formula.Binary connectives.If φ and ψ are formulas, then (φ ψ) is a formula. Similar rules apply to other binary logical connectives.Quantifiers.If φ is a formula andxis a variable, then and are formulas.

Our scala representation follows straightforwardly:

package org.w3.swap.logic1ec

sealed abstract class ECFormula

case class Exists(vars: Set[Variable], g: And) extends ECFormula

sealed abstract class Ground extends ECFormula

case class And(fmlas: Seq[Atomic]) extends Ground

case class Atomic(rel: Symbol, args: List[Term]) extends Ground

Now that we have scala representations for RDF graphs and conjunctive query formulas, how do we relate them? This is the fun part:

package org.w3.swap.rdflogic

import swap.rdf.RDFNodeBuilder

import swap.logic1.{Term, FunctionTerm, Variable}

import swap.logic1ec.{Exists, And, Atomic, ECProver, ECFormula}

/**

* RDF has only ground, 0-ary function terms.

*/

abstract class Ground extends FunctionTerm {

override def fun = this

override def args = Nil

}

case class Name(n: String) extends Ground

case class Plain(s: String, lang: Option[Symbol]) extends Ground

case class Data(lex: String, dt: Name) extends Ground

case class XMLLit(content: scala.xml.NodeSeq) extends Ground

/**

* Implement RDF Nodes (except BlankNode) using FOL function terms

*/

trait TermNode extends RDFNodeBuilder {

type Node = Term

type SubjectNode = Term

type Label = Name

def uri(i: String) = Name(i)

type Literal = Term

def plain(s: String, lang: Option[Symbol]) = Plain(s, lang)

def typed(s: String, dt: String): Literal = Data(s, Name(dt))

def xmllit(e: scala.xml.NodeSeq): Literal = XMLLit(e)

}

The abstract RDFGraphBuilder node types are implemented as first order logic terms. For formulas, we use a "holds" predicate:

object RDFLogic extends ... {

def atom(s: Term, p: Term, o: Term): Atomic = {

Atomic('holds, List(s, p, o))

}

def atom(arc: (Term, Term, Term)): Atomic = {

Atomic('holds, List(arc._1, arc._2, arc._3))

}

}

Then all the semantic machinery up to simple entailment between RDF graphs just falls out of conjunctive query.

I haven't done RDFS Entailment yet; the plan is to do basic rules first (N3rules or RIF BLD) and then use that for RDFS, OWL2-RL, and the like.