An idea/challenge that John had was to use semantic technologies to automate creating a Google Map visualization of scientific locations around a specific place; the examples showed things like notable natural places, labs, museums, etc. around a given city. I haven’t had much time to work on the idea, because my computer was in the repair shop most of last week and I am now catching up on the many projects/assignments that built up, so today was the first more in-depth look at it.

I decided to spend today looking at queries, to get an idea of how to reliably get actual location data for scientific locations. As a background guideline, my rough idea is for the end product to take the name of a place, figure out what kind of place it is (it will probably need user input to clarify), and then select a set of queries that work for that kind of place to get the results to use in the map.

So, with that in mind, the first thing I did was look at what sort of types are associated with locations, which is key both for finding what to center the map around and for finding and distinguishing between the different kinds of places around it.

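# Everything that has coordinates, with its label and type.
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?name ?lat ?long ?label ?type
where {
?name a ?type.
?name geo:lat ?lat.
?name geo:long ?long.
?name rdfs:label ?label.
} ORDER BY ?name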

This query basically grabs all URIs that have a latitude/longitude, along with their names/labels and their types. Using this, I can see the range of different types I can try to use later to find and differentiate locations.
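For the map step, the rows from a query like this would have to be turned into marker data. Here is a minimal Python sketch of that, assuming the endpoint returns the standard SPARQL JSON results format. The function names and the sample document are made up for illustration; the sample is not real DBpedia output.

```python
import json


def rows_from_sparql_json(text):
    """Turn a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(text)
    rows = []
    for binding in doc["results"]["bindings"]:
        # Each binding maps a variable name to {"type": ..., "value": ...};
        # variables that are unbound in a row are simply absent.
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows


def to_markers(rows):
    """Keep only rows with usable coordinates, as (label, lat, long) tuples."""
    markers = []
    for row in rows:
        try:
            markers.append((row["label"], float(row["lat"]), float(row["long"])))
        except (KeyError, ValueError):
            continue  # skip rows missing a field or with non-numeric coordinates
    return markers


# Made-up sample in the standard results layout (illustration only).
SAMPLE = json.dumps({
    "head": {"vars": ["name", "lat", "long", "label"]},
    "results": {"bindings": [
        {
            "name": {"type": "uri", "value": "http://example.org/place/A"},
            "lat": {"type": "literal", "value": "42.5"},
            "long": {"type": "literal", "value": "-73.25"},
            "label": {"type": "literal", "xml:lang": "en", "value": "Example Lab"},
        },
        {
            # No coordinates in this row, so to_markers drops it.
            "name": {"type": "uri", "value": "http://example.org/place/B"},
            "label": {"type": "literal", "xml:lang": "en", "value": "No Coordinates"},
        },
    ]},
})

if __name__ == "__main__":
    for marker in to_markers(rows_from_sparql_json(SAMPLE)):
        print(marker)
```

Keeping the parsing separate from the marker step means the same marker code could be reused for every query further down, whatever columns they return besides label, lat, and long.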

Next I tried to narrow down the results somewhat, using Place and Feature specifications, as well as limiting the results to ones with an English label.

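# Places and Features with coordinates and an English label.
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?name ?lat ?long ?label
where {
{?name a <http://dbpedia.org/ontology/Place>} UNION {?name a <http://dbpedia.org/ontology/Feature>}.
?name geo:lat ?lat.
?name geo:long ?long.
?name rdfs:label ?label.
FILTER langMatches( lang(?label), "EN" )
} ORDER BY ?name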

However, although I see the natural places/features I would expect, I don’t see labs or colleges, which I will definitely need. I suspect that the first query returned so many results that I might just be seeing a subset of results that just happen to not include what I’m looking for, so I changed it to be more specific and just get me the various types.

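# Just the types of things that have coordinates.
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
select distinct ?type
where {
?name a ?type.
?name geo:lat [].
?name geo:long [].
} ORDER BY ?type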

Looking at the sheer number of results, I realized that a ridiculous number of things apparently have a latitude and longitude, so I tried to narrow it down to just the overarching themes by using a regex to keep only types under http://dbpedia.org/ontology/, which looks like the top level of DBpedia objects.

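# Only types in the DBpedia ontology namespace.
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
select distinct ?type
where {
?name a ?type.
?name geo:lat [].
?name geo:long [].
# regex works on strings, so convert the type URI with str() first
FILTER regex(str(?type), "^http://dbpedia.org/ontology/")
} ORDER BY ?type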

This looks promising…I see themes for sites of special scientific interest, educational institutions, protected areas, historical sites, and a lot of other categories which I am hoping are included under the Places/Features from earlier.

I ran a brief query just to check that these things do, in fact, all fit under Places/Features:

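# Places and Features again, this time with every type they carry.
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?name ?lat ?long ?label ?type
where {
?name a ?type.
{?name a <http://dbpedia.org/ontology/Place>} UNION {?name a <http://dbpedia.org/ontology/Feature>}.
?name geo:lat ?lat.
?name geo:long ?long.
?name rdfs:label ?label.
FILTER langMatches( lang(?label), "EN" )
} ORDER BY ?name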

Next, I need to look at the properties of the different kinds of places to get an idea of which ones I can use to narrow down the full queries later. I’m thinking of using queries like this:

# A small sample of the properties used on educational organizations.
select ?subject ?property ?object
where {
?subject a <http://dbpedia.org/ontology/EducationalOrganization>.
?subject ?property ?object
} ORDER BY ?subject LIMIT 10

A few key issues will be finding a way to use these to make sure that the results are scientific locations, and keeping each query from taking too long. I am thinking of having the final code use several queries, each tailored to one kind of ‘scientific location’, and adding each query’s results to the map one by one. This will allow each query to be as small as possible for its case. Based on the original process of making the maps manually, the automated process needs to be able to plot museums, learned societies (would that be like the USGS headquarters or something?), universities/colleges, libraries, and historic sites relevant to science.
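The several-small-queries plan could be sketched in Python roughly as follows. This is only a sketch under assumptions: the mapping from kind to DBpedia class is my guess and would need checking against the type list from earlier, the bounding box around the center point is one possible way to express "around a given city", and all of the names (KIND_TO_CLASS, build_query, aggregate) are hypothetical. Learned societies are left out because I don't yet know which type would cover them.

```python
# Hypothetical mapping from a kind of 'scientific location' to a DBpedia class.
KIND_TO_CLASS = {
    "museum": "http://dbpedia.org/ontology/Museum",
    "university": "http://dbpedia.org/ontology/University",
    "library": "http://dbpedia.org/ontology/Library",
    "historic site": "http://dbpedia.org/ontology/HistoricPlace",
}

# One small query per kind; doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?name ?lat ?long ?label
WHERE {{
  ?name a <{cls}> .
  ?name geo:lat ?lat .
  ?name geo:long ?long .
  ?name rdfs:label ?label .
  FILTER langMatches(lang(?label), "EN")
  FILTER (?lat > {south} && ?lat < {north} && ?long > {west} && ?long < {east})
}}"""


def build_query(kind, lat, long, half_width=0.25):
    """Build the query for one kind, limited to a box around (lat, long).

    half_width is in degrees and is an arbitrary illustrative default.
    """
    return QUERY_TEMPLATE.format(
        cls=KIND_TO_CLASS[kind],
        south=lat - half_width,
        north=lat + half_width,
        west=long - half_width,
        east=long + half_width,
    )


def aggregate(result_sets):
    """Merge per-kind result rows, keeping the first row seen for each URI.

    result_sets maps a kind to a list of row dicts that have a "name" key.
    """
    merged = {}
    for kind, rows in result_sets.items():
        for row in rows:
            # Tag each row with the kind of query that found it first.
            merged.setdefault(row["name"], dict(row, kind=kind))
    return list(merged.values())


if __name__ == "__main__":
    print(build_query("museum", 42.0, -73.0))
```

Deduplicating by URI matters here because a single place could plausibly be typed as more than one of these kinds, and it should only get one marker on the map.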

I mostly ran general queries this time, to get an idea of the kinds of areas I’ll be able to use in searching for places, and did not get far on the specific location types and attributes, which I’ll probably look at next time.

On a side note, you can now go here to see the W3C Semantic Web Wiki page for the ISWC 2010 Data/Demos and if you scroll down to Browsers Developed for ISWC, my Filtered Browser is listed!
