MadMode

Dan Connolly's tinkering lab notebook

Choosing flight itineraries using tabulator and data from Wikipedia

While planning a trip to Boston/Cambridge, I was faced with a blizzard of itinerary options from American Airlines. I really wanted to overlay them all on the same map or calendar or something. I pretty much got it to work:

That's a map view using the tabulator, which had another release today. The itinerary data in RDF is converted from HTML via grokOptions.xsl (and tidy).

I can, in fact, see all the itineraries on the same calendar view. Getting these views to be helpful in choosing between the itineraries is going to take some more work, but this is a start.

Getting a map view required getting latitude/longitude info for the airports. I think getting Semantic Web data from Wikipedia is a promising approach. A while back, I figured out how to get lat/long data for airports from Wikipedia. This week, I added a Kid template, aptinfo.kid, and figured out how to serve up that data live from the DIG/CSAIL web server. For example, http://dig.csail.mit.edu/2006/wikdt/airports?iata=ORD#item is a URI for the Chicago airport. When you GET it with HTTP, a little CGI script calls aptdata.py, which fetches the relevant page from Wikipedia (using an httplib2 cache), scrapes the lat/long and a few other details, and gives them back to you in RDF. Viewed with RDF/N3 glasses, it looks like:

#   Base was: http://dig.csail.mit.edu/2006/wikdt/airports?iata=ORD
@prefix : <#> .
@prefix apt: <http://www.daml.org/2001/10/html/airport-ont#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix go: <http://www.w3.org/2003/01/geo/go#> .
@prefix s: <http://www.w3.org/2000/01/rdf-schema#> .

:item a apt:Airport;
apt:iataCode "ORD";
s:comment "tz: -6";
s:label "O'Hare International Airport";
go:within_3_power_11_metres :nearbyCity;
geo:lat "41.9794444444";
geo:long "-87.9044444444";
foaf:homepage <http://en.wikipedia.org/wiki/O%27Hare_International_Airport> .

:nearbyCity foaf:homepage <http://en.wikipedia.org/wiki/wiki/Chicago%2C_Illinois>;
foaf:name "Chicago, Illinois" .
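
The decimal strings for geo:lat and geo:long above look like the result of converting degrees/minutes/seconds coordinates to decimal degrees. Here is a minimal sketch of such a conversion in Python; it is not the actual aptdata.py code. The function names, the 41 58 46 N / 87 54 16 W inputs (back-calculated from the decimals above, not taken from the scraper), and the 12-significant-digit formatting are all assumptions for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South latitudes and West longitudes are negative in WGS84 lat/long.
    return -value if hemisphere in ("S", "W") else value


def as_geo_literal(value):
    """Format as a plain string with 12 significant digits (an assumed format)."""
    return format(value, ".12g")


# Hypothetical inputs, chosen to match the O'Hare values in the listing above.
lat = as_geo_literal(dms_to_decimal(41, 58, 46, "N"))
long_ = as_geo_literal(dms_to_decimal(87, 54, 16, "W"))
print(lat, long_)
```

Keeping the results as plain strings matches the listing, where geo:lat and geo:long carry no datatype.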

In particular, notice that:

  • I use the swig geo vocabulary, which the new GEO XG is set up to refine. The use of strings rather than datatyped floating point numbers follows the schema for that vocabulary.
  • I use distinct URIs for the airport (http://dig.csail.mit.edu/2006/wikdt/airports?iata=ORD#item) and the page about the airport (http://dig.csail.mit.edu/2006/wikdt/airports?iata=ORD).
  • I use an owl:InverseFunctionalProperty, foaf:homepage, to connect the airport to its Wikipedia article, and another, apt:iataCode, to relate the airport to its IATA code.
  • I use the GeoOnion pattern to relate the airport to a/the city that it serves. I'm not sure I like that pattern, but the idea is to make a browseable web of linked cities, states, countries, and other geographical items.
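
The airport-versus-page distinction in the second bullet comes down to the fragment: an HTTP client strips #item before making the request, so the GET goes to the page URI. Here is a small sketch using Python's standard urllib.parse. The helper name airport_uri is hypothetical; the base URI, the iata query parameter, and the #item fragment are taken from the example above.

```python
from urllib.parse import urldefrag, urlencode

# Base taken from the example URI in this entry.
BASE = "http://dig.csail.mit.edu/2006/wikdt/airports"


def airport_uri(iata):
    """Build the URI for the airport itself (hypothetical helper)."""
    return BASE + "?" + urlencode({"iata": iata}) + "#item"


thing = airport_uri("ORD")

# urldefrag splits off the fragment; what remains is the URI of the
# page about the airport, which is what actually gets fetched over HTTP.
page, fragment = urldefrag(thing)

print("airport:", thing)
print("page:   ", page)
```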

Hmm... I use rdfs:label for the name of the airport but foaf:name for the name of the city. I don't think that was a conscious choice. I may change that.

The timezone info is in an rdfs:comment. I hope to refine that in future episodes. Stay tuned.