Pelagios Gazetteer Interconnection Format
We no longer actively support this format. From now on, please refer to the Linked Places Format, which fully supersedes the Pelagios Gazetteer Interconnection Format, and is also the preferred and recommended format for all Pelagios Network members and tools.
The outdated documentation below remains here for reference and archival purposes only.
Gazetteers form the backbone of the Pelagios initiative. Through shared gazetteer references, we create connections between otherwise disconnected datasets.
There are many gazetteers out there, and there are good reasons for this diversity: geographical and temporal coverage, granularity, cultural focus, technical emphasis (e.g. emphasis on names vs. geometry), scholarly quality, community, and so on.
This is why Pelagios needs different gazetteers to interoperate with each other at a basic level, so that we can build tools and infrastructure that allow everyone to:
- search across different gazetteers
- find enough information in order to identify and disambiguate places
- annotate data with stable URIs to the most appropriate gazetteer
Our goal is not to define The One unified data model to represent gazetteers. What we aim for is simply a uniform way to build links between different gazetteers, along with just enough additional metadata to support the three requirements above.
A current reference implementation of cross-gazetteer search is part of the upcoming Peripleo API. (See screenshot above, which shows the overview page for Carnuntum, as covered by the different gazetteers linked to Pelagios.)
To publish a gazetteer to Pelagios, you need to create a summary of it in RDF, and publish it online as a dump file. The example below is a "dump file" with just a single place.
@prefix cito: <http://purl.org/spar/cito/> .
@prefix cnt: <http://www.w3.org/2011/content#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix geosparql: <http://www.opengis.net/ont/geosparql#> .
@prefix gn: <http://www.geonames.org/ontology#> .
@prefix lawd: <http://lawd.info/ontology/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<http://www.mygazetteer.org/place/Athens> a lawd:Place ;
# Don't think of label and description in terms of a
# 'primary name' or detailed abstract. Think of this in
# terms of UI: what do you want users to see about your
# place in a list of search results?
rdfs:label "Athens"@en ;
dcterms:description "A major Greek city-state"@en ;
# Optional: a present-day (ISO-3166 alpha2) country code
gn:countryCode "GR" ;
# Don't think of this in terms of 'how long your place
# existed'. Use it to specify the period your gazetteer
# is concerned with and/or provides attestations for.
# In terms of format, use ISO 8601 (YYYY[-MM-DD]) or a time
# interval (<start>/<end>).
dcterms:temporal "-750/640" ;
# Additionally, we encourage the use of (one or multiple)
# PeriodO identifiers to denote time periods
dcterms:temporal <http://n2t.net/ark:/99152/p03wskd389m> ; # Greco-Roman
# Use closeMatch to express 'vague' matches, e.g. to link
# to a modern-day town now located there
skos:closeMatch <http://sws.geonames.org/264371/> ;
# Use exactMatch to express (geographical, temporal, cultural)
# identity
skos:exactMatch <http://pleiades.stoa.org/places/579885> ;
# Attestations can apply to individual names (as in the example
# below). But they may also apply to the place as a whole.
# You can also provide variant names using lawd:variantForm.
# For language encoding, use RFC 5646 format.
lawd:hasName [ lawd:primaryForm "Athens"@en ];
lawd:hasName [ lawd:primaryForm "Athenae" ] ;
lawd:hasName [
lawd:primaryForm "Αθήνα"@el ;
lawd:hasAttestation <http://www.mygazetteer.org/att/0001>
] ;
# Optional: a representative point coordinate
geo:location [ geo:lat 52.05 ; geo:long 5.16 ] ;
# Optional: detail geometry as WKT string
# (alternatively, use osgeo:asGeoJSON for a GeoJSON string)
geosparql:hasGeometry [
geosparql:asWKT "LINESTRING (5.16 52.05, 5.17 52.05, 5.16 52.06)" ;
] ;
foaf:primaryTopicOf
<http://www.mygazetteer.org/place/Athens.html> ;
dcterms:isPartOf <http://www.mygazetteer.org/place/Greece> ;
.
<http://www.mygazetteer.org/att/0001> a lawd:Attestation ;
dcterms:publisher <http://www.mygazetteer.org/> ;
cito:citesAsEvidence
<http://www.mygazetteer.org/documents/01234> ;
cnt:chars "Αθήνα"
.
The example above shows just some of the very basics. If you want, you can pack in a lot more data about your places!
- Example: adding image links
- Example: publishing bibliographic references
- Adding timestamps to names or geometries
- Adding source information to names or geometries
- Vici.org publishes a Pelagios gazetteer dump of its places. The full dump file (~17 MB) is available at http://vici.org/vici/all/rdf.
- The Digital Atlas of the Roman Empire publishes its gazetteer in Pelagios format. The dump file (gzipped RDF/Turtle, 1.5 MB, ~27,000 places) is available at http://dare.ht.lu.se/export_pelagios3.ttl.gz
- We maintain a Python script that converts the native dump format of the iDAI gazetteer to Pelagios here. The script will not be re-usable directly, but should provide a good starting point for other conversions.
- An on-line service that validates your .ttl file, so you know whether you've done it right.
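For a quick look at a gzipped Turtle dump before running it through a validator, a rough local scan can be enough. The following Python sketch lists the URIs declared as `lawd:Place`. The function name `place_uris` is hypothetical. The sketch matches text line by line and is not a Turtle parser: it assumes each place is declared on a single line in the style of the example above (`<uri> a lawd:Place ;`) and that the dump uses the `lawd:` prefix. A full RDF library would be the more robust choice for anything beyond a sanity check.

```python
import gzip
import io
import re
from typing import List

# Naive pattern for a subject declaration such as
#   <http://www.mygazetteer.org/place/Athens> a lawd:Place ;
_PLACE = re.compile(r"^\s*<([^>]+)>\s+a\s+lawd:Place\b")


def place_uris(dump: bytes) -> List[str]:
    """Return the URIs declared as lawd:Place in a gzipped Turtle dump."""
    uris = []
    # gzip.open accepts a file object, so in-memory bytes work as well.
    with gzip.open(io.BytesIO(dump), mode="rt", encoding="utf-8") as lines:
        for line in lines:
            match = _PLACE.match(line)
            if match:
                uris.append(match.group(1))
    return uris
```

Given the bytes of a downloaded `.ttl.gz` file, `len(place_uris(data))` gives an approximate place count that can be compared with the figure a publisher reports.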
There has been discussion about whether we need to distinguish between modern and historical names. Most historical gazetteers will usually focus on historical names only; modern names would be considered 'finding aids' for users, rather than actual gazetteer data. One suggestion, therefore, has been to use dcterms:spatial for modern names:
dcterms:spatial "Athens"@en ;
or
dcterms:spatial [ rdfs:label "Athens"@en ] ;
Note that dcterms:spatial, by definition, mandates an object of type dcterms:Location. I.e. the former example may not be valid RDF (?). But it seems literal objects are widely used in the wild. (See e.g. use in Europeana.)
Our hierarchy model is deliberately kept as simple as possible (using dcterms:isPartOf or dcterms:hasPart). However, there is a clear use case for constraining the relation by time ("Place A has been part of Place B between year X and Y", e.g. in terms of administrative units). How to model this as simply and straightforwardly as possible is an open question.
Essentially, what's needed is something like this ("RDF pseudo-code") that carries both the parent resource and the time constraint as a payload.
dcterms:isPartOf "<http://maps.cga.harvard.edu/tgaz/placename/hvd_113652> part of Xihan 西汉 from -154 to -118 " ;
.
Note (RSi): compare qualified relations.