⚠️ Warning

This documentation is out of date and no longer maintained.

The reference, up-to-date documentation is available at: https://ec-jrc.github.io/datacite-to-dcat-ap/

Mappings defined in DataCite+DCAT-AP

Status of this document

This document is a draft meant to report work in progress concerning an exercise, carried out at the Joint Research Centre of the European Commission (Units B.6 & G.I.4), for the alignment of DataCite metadata with DCAT-AP.

As such, it may be updated at any time and must be considered unstable.

Abstract

This document illustrates the mappings defined in DataCite+DCAT-AP, as implemented in the datacite-to-dcat-ap.xsl XSLT.

The background and methodology for the design of DataCite+DCAT-AP are illustrated in a separate document:

DataCite+DCAT-AP: Background & methodology

| Prefix | Namespace URI | Schema & documentation |
|---|---|---|
| dc | http://purl.org/dc/elements/1.1/ | Dublin Core Metadata Element Set, Version 1.1 |
| dcat | http://www.w3.org/ns/dcat# | Data Catalog Vocabulary |
| dct | http://purl.org/dc/terms/ | DCMI Metadata Terms |
| duv | http://www.w3.org/ns/duv# | Dataset Usage Vocabulary |
| foaf | http://xmlns.com/foaf/0.1/ | FOAF Vocabulary |
| frapo | http://purl.org/cerif/frapo/ | FRAPO, the Funding, Research Administration and Projects Ontology |
| geo | http://www.w3.org/2003/01/geo/wgs84_pos# | W3C Basic Geo (WGS84 lat/long) vocabulary |
| gsp | http://www.opengis.net/ont/geosparql# | GeoSPARQL - A Geographic Query Language for RDF Data |
| locn | http://www.w3.org/ns/locn# | ISA Programme Core Location Vocabulary |
| org | http://www.w3.org/ns/org# | The Organization Ontology |
| owl | http://www.w3.org/2002/07/owl# | OWL Web Ontology Language Reference |
| prov | http://www.w3.org/ns/prov# | PROV-O: The PROV Ontology |
| rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | Resource Description Framework (RDF): Concepts and Abstract Syntax |
| rdfs | http://www.w3.org/2000/01/rdf-schema# | RDF Vocabulary Description Language 1.0: RDF Schema |
| schema | http://schema.org/ | schema.org |
| skos | http://www.w3.org/2004/02/skos/core# | SKOS Simple Knowledge Organization System - Reference |
| vcard | http://www.w3.org/2006/vcard/ns# | vCard Ontology |
| xsd | http://www.w3.org/2001/XMLSchema# | XML Schema Part 2: Datatypes Second Edition |
| wdrs | https://www.w3.org/2007/05/powder-s# | Protocol for Web Description Resources (POWDER): POWDER-S Vocabulary (WDRS) |

| DataCite metadata elements | Code list URI | Code lists | Status |
|---|---|---|---|
| Language | http://publications.europa.eu/resource/authority/language | Language register operated by the Metadata Registry of the Publications Office of the EU [MDR-LANG] | stable |
| Format | http://publications.europa.eu/resource/authority/file-type | File type register operated by the Metadata Registry of the Publications Office of the EU [MDR-FT] | stable |
| | http://www.iana.org/assignments/media-types | IANA MIME Media Types register | testing |

The following section summarises the alignments defined in DataCite+DCAT-AP.

The alignments are grouped as follows:

  • Alignments for 1st-level DataCite metadata elements
  • Alignments for 2nd-level DataCite metadata elements
  • Alignments for identifiers used in DataCite records

The alignments supported only in the extended profile of DataCite+DCAT-AP are in bold.

The mappings illustrated in this section concern the 1st-level elements in the DataCite metadata schema.

These elements specify properties / relationships that, in some cases, can be further specialised with an attribute denoting their sub-type (e.g., the "type" of resource, the "type" of contributor, the "type" of related resource). For this reason, elements having a "type" attribute have both a default mapping for the element, and a specific mapping for the type. The default mapping is used in the following cases:

  • The element "type" is not specified
  • No mapping is specified for a given element "type"
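The fallback rule above can be sketched as a dictionary lookup with a default. This is only an illustration, not the logic of the XSLT itself: the dictionary holds a small excerpt of the contributor mappings from the table in this section, the function name `contributor_property` is hypothetical, and the distinction between the core and extended profiles is ignored.

```python
# Excerpt of the Contributor mappings from the table in this section.
# Types marked "??:??" (TBD) in the table are deliberately left out,
# so they fall back to the default mapping.
CONTRIBUTOR_DEFAULT = "dct:contributor"
CONTRIBUTOR_BY_TYPE = {
    "ContactPerson": "dcat:contactPoint",
    "Editor": "schema:editor",
    "RightsHolder": "dct:rightsHolder",
}


def contributor_property(contributor_type=None):
    """Return the property for a contributor, using the default mapping
    when the type is missing or has no specific mapping."""
    if contributor_type is None:
        return CONTRIBUTOR_DEFAULT
    return CONTRIBUTOR_BY_TYPE.get(contributor_type, CONTRIBUTOR_DEFAULT)


print(contributor_property("Editor"))         # specific mapping
print(contributor_property("DataCollector"))  # no mapping defined: default
print(contributor_property())                 # type not specified: default
```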

As a rule, the domain of the mappings is the one corresponding to the ResourceType element (i.e., rdfs:Resource, dcat:Dataset, dctype:Service, or dctype:Event). However, "starred" elements - i.e., elements whose name is preceded by an asterisk ("*") - are those having as domain dcat:Distribution when the resource is modelled as a dcat:Dataset.
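As a rough illustration of the domain rule, the sketch below picks the class a property is attached to. The resource type excerpt and the set of "starred" properties are copied from the mapping table in this section; the function name `domain_of` is hypothetical, and the fallback to rdfs:Resource for unlisted types is an assumption made for the example.

```python
# Excerpt of the ResourceType -> rdf:type mappings from this section.
RDF_TYPE_BY_RESOURCE_TYPE = {
    "Dataset": "dcat:Dataset",
    "Software": "dcat:Dataset",
    "Service": "dctype:Service",
    "Event": "dctype:Event",
}

# Properties of "starred" elements in the table (Identifier, Size, Format, Rights).
STARRED_PROPERTIES = {
    "dcat:accessURL",
    "dct:extent",
    "dct:format",
    "dcat:mediaType",
    "dct:rights",
}


def domain_of(prop, resource_type):
    """Return the class that `prop` is attached to for a given resource type."""
    # Assumption: types not listed in the excerpt fall back to rdfs:Resource.
    resource_class = RDF_TYPE_BY_RESOURCE_TYPE.get(resource_type, "rdfs:Resource")
    # Starred properties move to the distribution only for datasets.
    if resource_class == "dcat:Dataset" and prop in STARRED_PROPERTIES:
        return "dcat:Distribution"
    return resource_class
```

For example, `domain_of("dct:rights", "Dataset")` yields dcat:Distribution, whereas `domain_of("dct:rights", "Service")` yields dctype:Service.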

| Element | Type | Property or RDF/XML attribute | Range | Mapping status | Comments |
|---|---|---|---|---|---|
| Identifier | | @rdf:about | rdfs:Resource (URI reference) | testing | |
| | | dct:identifier | xsd:anyURI | testing | |
| | | dcat:landingPage | rdfs:Resource (URI reference) | testing | If the resource is modelled as a dcat:Dataset |
| | | foaf:page | rdfs:Resource (URI reference) | testing | If the resource is not modelled as a dcat:Dataset |
| | | * dcat:accessURL | rdfs:Resource (URI reference) | testing | If the resource is modelled as a dcat:Dataset, the domain is dcat:Distribution |
| Creator | | dct:creator | foaf:Agent | testing | |
| Title | default | dct:title | rdf:PlainLiteral | testing | |
| | AlternativeTitle | dct:alternative | rdf:PlainLiteral | testing | |
| | Subtitle | ??:?? | rdf:PlainLiteral | unstable | TBD |
| | TranslatedTitle | dct:title | rdf:PlainLiteral | testing | |
| Publisher | | dct:publisher | foaf:Agent | testing | |
| PublicationYear | | dct:issued | xsd:gYear | testing | |
| Subject | | dct:subject | skos:Concept | testing | If the subject is associated with a subject scheme |
| | | dcat:keyword | rdf:PlainLiteral | testing | If the subject is not associated with a subject scheme |
| Contributor | default | dct:contributor | foaf:Agent | testing | Only for the extended profile |
| | ContactPerson | dcat:contactPoint | vcard:Individual | testing | |
| | DataCollector | ??:?? | foaf:Agent | unstable | TBD |
| | DataCurator | ??:?? | foaf:Agent | unstable | TBD |
| | DataManager | ??:?? | foaf:Agent | unstable | TBD |
| | Distributor | duv:hasDistributor | foaf:Agent | testing | Only for the extended profile |
| | Editor | schema:editor | foaf:Agent | testing | Only for the extended profile |
| | Funder | schema:funder | foaf:Agent | testing | Only for the extended profile. This element has been deprecated in DataCite 4.0, in favour of the new element FundingReference. |
| | HostingInstitution | ??:?? | foaf:Agent | unstable | TBD |
| | Producer | schema:producer | foaf:Agent | testing | Only for the extended profile |
| | ProjectLeader | ??:?? | foaf:Agent | unstable | TBD |
| | ProjectManager | ??:?? | foaf:Agent | unstable | TBD |
| | ProjectMember | dct:contributor | foaf:Agent | testing | Only for the extended profile |
| | | * foaf:member | | testing | Only for the extended profile. The domain of property foaf:member is class foaf:Project. The resource is linked to foaf:Project with property prov:wasGeneratedBy. |
| | RegistrationAgency | ??:?? | foaf:Agent | unstable | TBD |
| | RegistrationAuthority | ??:?? | foaf:Agent | unstable | TBD |
| | RelatedPerson | ??:?? | foaf:Agent | unstable | TBD |
| | Researcher | ??:?? | foaf:Agent | unstable | TBD |
| | ResearchGroup | ??:?? | foaf:Agent | unstable | TBD |
| | RightsHolder | dct:rightsHolder | foaf:Agent | testing | Only for the extended profile |
| | Sponsor | schema:sponsor | foaf:Agent | testing | Only for the extended profile |
| | Supervisor | ??:?? | foaf:Agent | unstable | TBD |
| | WorkPackageLeader | ??:?? | foaf:Agent | unstable | TBD |
| | Other | dct:contributor | foaf:Agent | testing | Only for the extended profile |
| Date | default | dct:date | xsd:date | testing | Only for the extended profile |
| | Accepted | dct:dateAccepted | xsd:date | testing | Only for the extended profile |
| | Available | dct:available | xsd:date | testing | Only for the extended profile |
| | Copyrighted | dct:dateCopyrighted | xsd:date | testing | Only for the extended profile |
| | Collected | dct:created | xsd:date | unstable | TBD |
| | Created | dct:created | xsd:date | testing | Only for the extended profile |
| | Issued | dct:issued | xsd:date | testing | |
| | Submitted | dct:dateSubmitted | xsd:date | testing | Only for the extended profile |
| | Updated | dct:modified | xsd:date | testing | |
| | Valid | dct:valid | xsd:date | testing | Only for the extended profile |
| Language | | dct:language | dct:LinguisticSystem | testing | |
| ResourceType | default | rdf:type | rdfs:Resource | unstable | TBD |
| | Audiovisual | rdf:type | dcat:Dataset | testing | |
| | | dct:type | dctype:MovingImage | testing | Only for the extended profile |
| | Collection | rdf:type | dcat:Dataset | testing | |
| | | dct:type | dctype:Collection | testing | Only for the extended profile |
| | DataPaper | rdf:type | dcat:Dataset | testing | Added in DataCite v4.1 |
| | | dct:type | ??:?? | unstable | TBD |
| | Dataset | rdf:type | dcat:Dataset | testing | |
| | | dct:type | dctype:Dataset | testing | Only for the extended profile |
| | Event | rdf:type | dctype:Event | testing | Only for the extended profile |
| | | dct:type | dctype:Event | testing | Only for the extended profile |
| | Image | rdf:type | dcat:Dataset | testing | |
| | | dct:type | dctype:Image | testing | Only for the extended profile |
| | InteractiveResource | rdf:type | dcat:Dataset | testing | |
| | | dct:type | dctype:InteractiveResource | testing | Only for the extended profile |
| | Model | rdf:type | dcat:Dataset | testing | |
| | | dct:type | ??:?? | unstable | TBD |
| | PhysicalObject | rdf:type | dctype:PhysicalObject | testing | Only for the extended profile |
| | | dct:type | dctype:PhysicalObject | testing | Only for the extended profile |
| | Service | rdf:type | dctype:Service | testing | Only for the extended profile |
| | | dct:type | dctype:Service | testing | Only for the extended profile |
| | Software | rdf:type | dcat:Dataset | testing | |
| | | dct:type | dctype:Software | testing | Only for the extended profile |
| | Sound | rdf:type | dcat:Dataset | testing | |
| | | dct:type | dctype:Sound | testing | Only for the extended profile |
| | Text | rdf:type | dcat:Dataset | testing | |
| | | dct:type | dctype:Text | testing | Only for the extended profile |
| | Workflow | rdf:type | dcat:Dataset | testing | |
| | | dct:type | ??:?? | unstable | TBD |
| | Other | rdf:type | rdfs:Resource | unstable | TBD |
| | | dct:type | ??:?? | unstable | TBD |
| AlternateIdentifier | | owl:sameAs | URI reference | testing | |
| | | adms:identifier | adms:Identifier | testing | |
| RelatedIdentifier | default | dct:relation | rdfs:Resource | testing | |
| | IsCitedBy | ??:?? | rdfs:Resource | unstable | TBD |
| | Cites | ??:?? | rdfs:Resource | unstable | TBD |
| | IsSupplementTo | ??:?? | rdfs:Resource | unstable | TBD |
| | IsSupplementedBy | ??:?? | rdfs:Resource | unstable | TBD |
| | IsContinuedBy | ??:?? | rdfs:Resource | unstable | TBD |
| | Continues | ??:?? | rdfs:Resource | unstable | TBD |
| | HasMetadata | foaf:isPrimaryTopicOf | dcat:CatalogRecord (URI reference) | testing | |
| | IsMetadataFor | foaf:primaryTopic | rdfs:Resource | testing | Only for the extended profile |
| | IsNewVersionOf | dct:isVersionOf | rdfs:Resource | testing | |
| | IsPreviousVersionOf | dct:hasVersion | rdfs:Resource | testing | |
| | IsPartOf | dct:isPartOf | rdfs:Resource | testing | Only for the extended profile |
| | HasPart | dct:hasPart | rdfs:Resource | testing | Only for the extended profile |
| | IsReferencedBy | dct:isReferencedBy | rdfs:Resource | testing | Only for the extended profile |
| | References | dct:references | rdfs:Resource | testing | Only for the extended profile |
| | IsDocumentedBy | foaf:page | rdfs:Resource | testing | |
| | Documents | ??:?? | rdfs:Resource | unstable | TBD |
| | IsCompiledBy | ??:?? | rdfs:Resource | unstable | TBD |
| | Compiles | ??:?? | rdfs:Resource | unstable | TBD |
| | IsVariantFormOf | schema:isVariantOf | rdfs:Resource | testing | Only for the extended profile |
| | IsOriginalFormOf | ??:?? | rdfs:Resource | unstable | TBD |
| | IsIdenticalTo | owl:sameAs | rdfs:Resource | testing | Only for the extended profile |
| | IsReviewedBy | schema:review | rdfs:Resource | testing | Only for the extended profile |
| | Reviews | schema:itemReviewed | rdfs:Resource | testing | Only for the extended profile |
| | IsDerivedFrom | dct:source | rdfs:Resource | testing | |
| | IsSourceOf | prov:hadDerivation | rdfs:Resource | testing | Only for the extended profile |
| | Describes | ??:?? | rdfs:Resource | unstable | TBD |
| | IsDescribedBy | wdrs:describedby | rdfs:Resource | testing | Only for the extended profile |
| | HasVersion | dct:hasVersion | rdfs:Resource | testing | |
| | IsVersionOf | dct:isVersionOf | rdfs:Resource | testing | |
| | Requires | dct:requires | rdfs:Resource | testing | Only for the extended profile |
| | IsRequiredBy | dct:isRequiredBy | rdfs:Resource | testing | Only for the extended profile |
| * Size | | dct:extent | dct:SizeOrDuration | testing | If the resource is modelled as a dcat:Dataset, the domain is dcat:Distribution. Only for the extended profile. |
| * Format | | dct:format | dct:MediaTypeOrExtent | testing | If not specified with an IANA media type. If the resource is modelled as a dcat:Dataset, the domain is dcat:Distribution. |
| | | dcat:mediaType | dct:MediaTypeOrExtent (URI reference) | testing | If specified with an IANA media type. If the resource is modelled as a dcat:Dataset, the domain is dcat:Distribution. |
| Version | | owl:versionInfo | rdf:PlainLiteral | testing | |
| * Rights | | dct:rights | dct:RightsStatement | testing | If the resource is modelled as a dcat:Dataset, the domain is dcat:Distribution. |
| Description | default | dct:description | rdf:PlainLiteral | testing | |
| | Abstract | dct:description | rdf:PlainLiteral | testing | |
| | Methods | dct:provenance | dct:ProvenanceStatement | testing | |
| | SeriesInformation | ??:?? | ??:?? | unstable | TBD |
| | TableOfContents | dct:tableOfContents | rdf:PlainLiteral | testing | Only for the extended profile. |
| | Other | rdfs:comment | rdf:PlainLiteral | testing | Only for the extended profile. |
| GeoLocation | | dct:spatial | dct:Location | testing | |
| FundingReference | | frapo:isFundedBy | foaf:Project | testing | Element added in DataCite 4.0. Only for the extended profile. |

The mappings illustrated in this section concern the 2nd-level elements in the DataCite metadata schema.

These elements, and the corresponding mappings, are grouped in the following classes:

  • Elements with child elements
  • Elements with attributes

Elements with child elements

| Element | Child elements | Domain | Property or RDF/XML attribute | Range | Mapping status | Comments |
|---|---|---|---|---|---|---|
| Creator | creatorName | foaf:Agent | foaf:name | rdf:PlainLiteral | testing | |
| | givenName | foaf:Agent | foaf:givenName | rdf:PlainLiteral | testing | |
| | familyName | foaf:Agent | foaf:familyName | rdf:PlainLiteral | testing | |
| | nameIdentifier | foaf:Agent | @rdf:about | URI reference | testing | |
| | affiliation | foaf:Agent | org:memberOf | foaf:Organization | testing | |
| Contributor | contributorName | foaf:Agent | foaf:name | rdf:PlainLiteral | testing | |
| | | vcard:Individual | vcard:fn | rdf:PlainLiteral | testing | If the contributor type is "ContactPerson" |
| | givenName | foaf:Agent | foaf:givenName | rdf:PlainLiteral | testing | |
| | | vcard:Individual | vcard:given-name | rdf:PlainLiteral | testing | If the contributor type is "ContactPerson" |
| | familyName | foaf:Agent | foaf:familyName | rdf:PlainLiteral | testing | |
| | | vcard:Individual | vcard:family-name | rdf:PlainLiteral | testing | If the contributor type is "ContactPerson" |
| | nameIdentifier | foaf:Agent | @rdf:about | URI reference | testing | |
| | | vcard:Individual | @rdf:about | URI reference | testing | If the contributor type is "ContactPerson" |
| | affiliation | foaf:Agent | org:memberOf | foaf:Organization | testing | |
| | | vcard:Individual | vcard:organization-name | rdf:PlainLiteral | testing | If the contributor type is "ContactPerson" |
| GeoLocation | geoLocationPoint | dct:Location | geo:lat_long | rdfs:Literal | testing | In DataCite 4.0, this information is specified by using 2 child elements - namely, pointLatitude and pointLongitude. Earlier versions of DataCite use a literal instead. |
| | | dct:Location | locn:geometry | gsp:gmlLiteral | testing | |
| | | dct:Location | locn:geometry | gsp:wktLiteral | testing | |
| | geoLocationBox | dct:Location | locn:geometry | gsp:wktLiteral | testing | In DataCite 4.0, this information is specified by using 4 child elements - namely, northBoundLatitude, eastBoundLongitude, southBoundLatitude, and westBoundLongitude. Earlier versions of DataCite use a literal instead. |
| | | dct:Location | locn:geometry | gsp:gmlLiteral | testing | |
| | | dct:Location | schema:box | rdfs:Literal | testing | |
| | geoLocationPolygon | dct:Location | locn:geometry | gsp:wktLiteral | testing | Element added in DataCite 4.0. The polygon vertices are specified by using child element geoPolygonPoint. The coordinates of each vertex are specified by using two child elements - respectively, pointLatitude and pointLongitude. |
| | | dct:Location | locn:geometry | gsp:gmlLiteral | testing | |
| | | dct:Location | schema:polygon | rdfs:Literal | testing | |
| FundingReference | awardNumber | foaf:Project | dct:identifier | xsd:string \| xsd:anyURI | testing | |
| | awardTitle | foaf:Project | dct:title | rdf:PlainLiteral | testing | |
| | * funderName | foaf:Organization | foaf:name | rdf:PlainLiteral | testing | The "funding project" (foaf:Project) is linked to the "funder" (foaf:Organization) by using property frapo:isAwardedBy. The domain is foaf:Organization. |
| | * funderIdentifier | foaf:Organization | dct:identifier | xsd:string \| xsd:anyURI | testing | |
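To make the geoLocationBox mapping more concrete, the sketch below builds the lexical form of a WKT polygon from the four bounding coordinates used in DataCite 4.0. It is an illustration only: the function name `box_to_wkt` is hypothetical, the longitude-latitude axis order is an assumption, and the output of the actual XSLT may differ (for instance in axis order or in an explicit CRS declaration).

```python
def box_to_wkt(west, east, south, north):
    """Build a WKT polygon for a bounding box.

    Assumption: coordinates are written as "longitude latitude" pairs,
    starting at the north-west corner and closing the ring on it.
    """
    corners = [
        (west, north),
        (east, north),
        (east, south),
        (west, south),
        (west, north),  # repeat the first vertex to close the ring
    ]
    coords = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({coords}))"


# westBoundLongitude=1, eastBoundLongitude=2,
# southBoundLatitude=3, northBoundLatitude=4
print(box_to_wkt(1, 2, 3, 4))
```

The resulting string would be the value of a gsp:wktLiteral attached to the dct:Location via locn:geometry.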

Elements with attributes

| Element | Textual content & attributes | Domain | Property or RDF/XML attribute | Range | Mapping status | Comments |
|---|---|---|---|---|---|---|
| Subject | textual content | skos:Concept | skos:prefLabel | rdf:PlainLiteral | testing | |
| | @schemeURI | skos:Concept | skos:inScheme | skos:ConceptScheme (URI reference) | testing | |
| | * @subjectScheme | skos:ConceptScheme | dct:title | rdf:PlainLiteral | testing | The domain is skos:ConceptScheme |
| Rights | textual content | dct:RightsStatement | rdfs:label | rdf:PlainLiteral | testing | |
| | @rightsURI | dct:RightsStatement | @rdf:about | URI reference | testing | |
| awardNumber | textual content | foaf:Project | dct:identifier | xsd:string \| xsd:anyURI | testing | |
| | @awardURI | foaf:Project | @rdf:about | URI reference | testing | |
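As an illustration of how the Subject mappings fit together, the following sketch emits triples as plain Python tuples. It is not the XSLT logic: the function name `subject_triples` and the blank node labels are hypothetical, and the use of a blank node for a scheme that has a name but no URI is an assumption.

```python
def subject_triples(resource, text, scheme_uri=None, subject_scheme=None):
    """Return (subject, predicate, object) tuples for a DataCite Subject."""
    # Without a subject scheme, the subject is a plain keyword.
    if not scheme_uri and not subject_scheme:
        return [(resource, "dcat:keyword", text)]

    concept = "_:concept"
    # Assumption: a scheme without a URI is represented by a blank node.
    scheme = scheme_uri or "_:scheme"
    triples = [
        (resource, "dct:subject", concept),
        (concept, "rdf:type", "skos:Concept"),
        (concept, "skos:prefLabel", text),
        (concept, "skos:inScheme", scheme),
    ]
    if subject_scheme:
        # The scheme name is a property of the concept scheme, not the concept.
        triples.append((scheme, "dct:title", subject_scheme))
    return triples
```

Calling `subject_triples("http://example.org/r", "geology")` returns a single dcat:keyword triple, while passing a scheme URI or scheme name yields a skos:Concept linked through dct:subject. The resource URI used here is a placeholder.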

DataCite supports the use of persistent identifiers to denote:

  • the described resource, and the related resources
  • resource creators and contributors
  • funders (i.e., the organisation funding the activity from which the described resource has been created)

In DataCite, such identifiers are specified as follows:

  • the identifier
  • the identifier type / scheme name (e.g., ORCID, ISNI, DOI)
  • optionally, the scheme URI (e.g., http://orcid.org/, http://www.isni.org/, https://doi.org/)

In DataCite+DCAT-AP, all these identifiers are mapped to URIs, by concatenating the identifier in the DataCite record with a URI prefix defined for each identifier type / scheme. Whenever possible, dereferenceable HTTP URIs/URLs are used; otherwise, URNs.

Notably, DataCite provides code lists for the types / schemes of identifiers used to denote resources and funders, but no code list is defined in DataCite for types / schemes of identifiers used to denote resource creators / contributors (the specification uses, as an example, "ORCID" and "ISNI").

However, DataCite does not specify a code list for scheme URIs. So, the mapping between identifier types / schemes and URI prefixes implemented in DataCite+DCAT-AP is based on the relevant registries and on the examples in the DataCite metadata schema specification. No URI prefix is used if the identifier is already a URI (as is the case for URLs and URNs).
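The concatenation rule can be sketched as follows. The prefixes are a subset of those listed in the table in this section; the function name `identifier_to_uri`, the way an existing URI is detected (by its leading `http://`, `https://` or `urn:`), and raising an error for unknown schemes are assumptions made for the example, not a description of the XSLT.

```python
# Subset of the URI prefixes listed in the table in this section.
URI_PREFIXES = {
    "ORCID": "http://orcid.org/",
    "ISNI": "http://www.isni.org/",
    "DOI": "https://doi.org/",
    "Handle": "http://hdl.handle.net/",
    "ISBN": "urn:isbn:",
    "ISSN": "urn:issn:",
}

# Assumption: an identifier starting with one of these is already a URI.
URI_STARTS = ("http://", "https://", "urn:")


def identifier_to_uri(identifier, scheme):
    """Map a DataCite identifier to a URI by prepending the scheme's prefix."""
    identifier = identifier.strip()
    if identifier.lower().startswith(URI_STARTS):
        return identifier  # URLs and URNs need no prefix
    prefix = URI_PREFIXES.get(scheme)
    if prefix is None:
        # Assumption: unknown schemes are reported rather than guessed.
        raise ValueError(f"No URI prefix defined for scheme {scheme!r}")
    return prefix + identifier


print(identifier_to_uri("0000-0002-7285-027X", "ORCID"))
print(identifier_to_uri("urn:nbn:de:101:1-201102033592", "URN"))
```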

The following table shows, for each identifier type / scheme, the URI prefix used in DataCite+DCAT-AP, along with examples of the results of such mappings. As mentioned above, all the identifier types / schemes in the table are defined as a code list in the DataCite metadata schema, with the exception of ORCID and ISNI (however, ISNI is defined in the code list for funder identifier types).

| Identifier type / scheme | Element(s) | URI prefix used in DataCite+DCAT-AP | Example (original) | Example (transformed) | Mapping status | Comments |
|---|---|---|---|---|---|---|
| ORCID | nameIdentifier | http://orcid.org/ | 0000-0002-7285-027X | http://orcid.org/0000-0002-7285-027X | testing | |
| ISNI | nameIdentifier, funderIdentifier | http://www.isni.org/ | 0000000121032683 | http://www.isni.org/0000000121032683 | testing | |
| GRID | funderIdentifier | https://www.grid.ac/institutes/ | grid.270680.b | https://www.grid.ac/institutes/grid.270680.b | testing | |
| CrossRef Funder ID | funderIdentifier | https://doi.org/ | 10.13039/501100000900 | https://doi.org/10.13039/501100000900 | testing | |
| DOI | Identifier, AlternateIdentifier, RelatedIdentifier | https://doi.org/ | 10.1016/j.epsl.2011.11.037 | https://doi.org/10.1016/j.epsl.2011.11.037 | testing | |
| ARK | AlternateIdentifier, RelatedIdentifier | http://n2t.net/ | ark:/67531/metapth346793/ | http://n2t.net/ark:/67531/metapth346793/ | testing | |
| arXiv | AlternateIdentifier, RelatedIdentifier | http://arxiv.org/abs/ | arXiv:0706.0001 | http://arxiv.org/abs/0706.0001 | testing | The URI prefix replaces the namespace prefix arXiv: in the original identifier |
| bibcode | AlternateIdentifier, RelatedIdentifier | http://adsabs.harvard.edu/abs/ | 2014Wthr...69...72C | http://adsabs.harvard.edu/abs/2014Wthr...69...72C | testing | |
| EAN13 | AlternateIdentifier, RelatedIdentifier | urn:ean-13: | 9783468111242 | urn:ean-13:9783468111242 | unstable | |
| EISSN | AlternateIdentifier, RelatedIdentifier | urn:issn: | 1562-6865 | urn:issn:1562-6865 | unstable | |
| Handle | AlternateIdentifier, RelatedIdentifier | http://hdl.handle.net/ | 10013/epic.10033 | http://hdl.handle.net/10013/epic.10033 | testing | |
| IGSN | AlternateIdentifier, RelatedIdentifier | http://hdl.handle.net/10273/ (https://doi.org/10273/) | SSH000SUA | http://hdl.handle.net/10273/SSH000SUA (https://doi.org/10273/SSH000SUA) | stable | Identifier type added in DataCite 4.0. |
| ISBN | AlternateIdentifier, RelatedIdentifier | urn:isbn: | 978-3-905673-82-1 | urn:isbn:978-3-905673-82-1 | unstable | |
| ISSN | AlternateIdentifier, RelatedIdentifier | urn:issn: | 0077-5606 | urn:issn:0077-5606 | unstable | |
| ISTC | AlternateIdentifier, RelatedIdentifier | http://istc-search-beta.peppertag.com/ptproc/IstcSearch?tFrame=IstcListing&esfIstc= | A12-2014-00013328-5 | http://istc-search-beta.peppertag.com/ptproc/IstcSearch?tFrame=IstcListing&tForceNewQuery=Yes&esfIstc=A12-2014-00013328-5 | testing | |
| LISSN | AlternateIdentifier, RelatedIdentifier | urn:issn: | 1188-1534 | urn:issn:1188-1534 | unstable | |
| LSID | AlternateIdentifier, RelatedIdentifier | | urn:lsid:ubio.org:namebank:11815 | urn:lsid:ubio.org:namebank:11815 | testing | LSIDs are implemented as URNs, following the pattern urn:lsid:authority:namespace:identifier:revision. URNs are URIs - no need for a URI prefix. |
| PMID | AlternateIdentifier, RelatedIdentifier | http://www.ncbi.nlm.nih.gov/pubmed/ | 12082125 | http://www.ncbi.nlm.nih.gov/pubmed/12082125 | testing | |
| PURL | AlternateIdentifier, RelatedIdentifier | | http://purl.org/dc/terms/ | http://purl.org/dc/terms/ | testing | PURLs are HTTP URIs - no need for a URI prefix. |
| UPC | AlternateIdentifier, RelatedIdentifier | urn:upc: | 123456789999 | urn:upc:123456789999 | unstable | |
| URL | AlternateIdentifier, RelatedIdentifier | | http://www.heatflow.und.edu/index2.html | http://www.heatflow.und.edu/index2.html | testing | URLs are URIs - no need for a URI prefix. |
| URN | AlternateIdentifier, RelatedIdentifier | | urn:nbn:de:101:1-201102033592 | urn:nbn:de:101:1-201102033592 | testing | URNs are URIs - no need for a URI prefix. |
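
The arXiv row is the one case in the table where the identifier is not simply appended to the prefix: the arXiv: namespace prefix is replaced. A minimal sketch of that special case follows; the function name `arxiv_to_uri` is hypothetical, and accepting identifiers that lack the arXiv: prefix, as well as matching the prefix case-insensitively, are assumptions made for the example.

```python
ARXIV_URI_PREFIX = "http://arxiv.org/abs/"
ARXIV_NAMESPACE = "arxiv:"


def arxiv_to_uri(identifier):
    """Replace the 'arXiv:' namespace prefix with the arXiv URI prefix."""
    local_id = identifier.strip()
    # Assumption: the namespace prefix is matched case-insensitively
    # and is optional in the input.
    if local_id.lower().startswith(ARXIV_NAMESPACE):
        local_id = local_id[len(ARXIV_NAMESPACE):]
    return ARXIV_URI_PREFIX + local_id


print(arxiv_to_uri("arXiv:0706.0001"))
```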