Drop the range of dcat:keyword #1585

Open
kvistgaard opened this issue Dec 6, 2023 · 6 comments
Labels
dcat future-work issue deferred to the next standardization round

Comments

@kvistgaard

Because the range of dcat:keyword is rdfs:Literal, application profile designers who want keywords from a controlled vocabulary fall back on alternatives such as dcterms:subject, which reduces interoperability with catalogues that use dcat:keyword.

A common SHACL shape in the EU is:

:Dataset-subject
  a sh:PropertyShape ;
  sh:path dcterms:subject ;
  sh:description "The value of this property is a keyword or tag describing the Data asset. It only allows values from the EuroVoc vocabulary http://eurovoc.europa.eu/"@en ;
  sh:name "subject"@en ;
  sh:node [
      a sh:NodeShape ;
      sh:property [
          sh:path skos:inScheme ;
          sh:hasValue <http://eurovoc.europa.eu/100141> ;
        ] ;
    ] ;
  sh:nodeKind sh:IRI .

It would be nicer to use the dedicated dcat:keyword.
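
For illustration only, this is roughly what the same constraint might look like if it targeted dcat:keyword instead. It is a sketch under the assumption that the rdfs:Literal range were dropped; the `ex:` namespace and the shape name are hypothetical and not taken from any published profile.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex:   <http://example.org/shapes#> .   # hypothetical namespace

# Hypothetical shape: only meaningful if dcat:keyword had no rdfs:Literal range.
ex:Dataset-keyword
  a sh:PropertyShape ;
  sh:path dcat:keyword ;
  sh:name "keyword"@en ;
  # Values must be IRIs ...
  sh:nodeKind sh:IRI ;
  # ... and each value must be in the same scheme as in the shape above.
  sh:node [
      a sh:NodeShape ;
      sh:property [
          sh:path skos:inScheme ;
          sh:hasValue <http://eurovoc.europa.eu/100141> ;
        ] ;
    ] .
```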

@jakubklimek
Contributor

Are you suggesting a mix of literals and resources as values of dcat:keyword, like this?

<dataset> dcat:keyword "Keyword literal"@en , <http://eurovoc.europa.eu/100141> .

If so, I do not think this will improve interoperability.

  1. Every implementation would have to change to expect both literals and resources, and for resources the names would have to be found somewhere else
  2. For your use case, there is dcat:theme, which can be used with controlled vocabularies. The difference from dcat:keyword is exactly that - keywords for free text (no controlled vocabularies) and themes for controlled vocabularies.

I think the current state is fine and we should not change that.
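
To make point 1 concrete, here is a small sketch of the data a consumer would face with mixed values. The `ex:` resources and the label are hypothetical placeholders, not real EuroVoc entries.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# Mixed usage: one literal keyword and one resource keyword.
ex:dataset dcat:keyword "Keyword literal"@en , ex:concept-1 .

# The display name of ex:concept-1 is not part of the dataset description.
# A consumer has to obtain it from the vocabulary's own data, for example:
ex:concept-1 a skos:Concept ;
  skos:prefLabel "Concept label"@en .
```

A catalogue that today only reads the literal value would need an extra lookup step for every resource-valued keyword.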

@kvistgaard
Author

kvistgaard commented Dec 6, 2023

No, I only suggest dropping the range (in fact, I would suggest dropping almost all ranges and leaving that to application profiles).
For dcat:theme, there is a dedicated NAL http://publications.europa.eu/resource/authority/data-theme, usually with one value. For keywords, there are always multiple values from EuroVoc, and that is what I apply and keep suggesting.

@jakubklimek
Contributor

Well, dropping the range effectively means supporting the case above, which in my opinion lowers interoperability.
For dcat:theme, the NAL is dedicated in DCAT-AP, not in DCAT. And, there are ongoing discussions about profiling dcat:theme in DCAT-AP:
SEMICeu/DCAT-AP#316
SEMICeu/DCAT-AP#314

@dr-shorthair
Contributor

The distinction between

  1. dcat:keyword - range rdfs:Literal (datatype property)
  2. dcat:theme - range skos:Concept (object property)

has been in place since DCAT v1.
If you need the value to be a term from a controlled vocabulary, denoted by a URI, use dcat:theme.
If you want a text term, use dcat:keyword.
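
As a rough sketch of that split, the following restates the two ranges listed above and shows a dataset using both properties side by side. The dataset, the keywords, and the `ex:environment` concept are hypothetical examples.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# The two ranges as listed above.
dcat:keyword rdfs:range rdfs:Literal .
dcat:theme   rdfs:range skos:Concept .

# Free-text terms go in dcat:keyword, controlled terms in dcat:theme.
ex:dataset a dcat:Dataset ;
  dcat:keyword "air quality"@en , "sensors"@en ;
  dcat:theme ex:environment .

# Hypothetical concept from some controlled vocabulary.
ex:environment a skos:Concept ;
  skos:prefLabel "Environment"@en .
```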

Bad habits developed in projects can't be fixed by modifying DCAT for everyone.

@kvistgaard
Author

@dr-shorthair I'm aware that the distinction dates from v1. The intention of raising this issue was to improve DCAT, not to make it suitable for a particular case. And speaking of bad habits, over-axiomatizing ontologies is definitely a bad habit in RDFS and OWL modelling in general, and not unique to DCAT. But there is hope. A handy recent example is the range of dcterms:type, which was dropped after being in place for much longer than the range of dcat:keyword. So, if anything, I might be raising this issue too early, not too late.

@bertvannuffelen

I support the reaction from @jakubklimek. In this case the usage situation is clear and clean, and not restrictive.

In short:

  • When there is a need to associate a term with a Dataset, where the term is not controlled by any list, is in some language, and there is no intent to add further metadata about that term, then I want to express it as a literal. Hence, I use dcat:keyword.
  • When there is a need for additional control on the term, e.g. legally certified translations or an agreement by a group, then I use a controlled vocabulary. Hence, I want skos:Concepts (or similar) and not a literal. Hence I use a subproperty of dct:subject.

In the last case, dcat:theme is a special subproperty: namely the theme with which the Dataset is associated in the Catalogue. In this special case, hopefully there is also no discussion about whether the value could be a Literal. And note that for one profile, the theme of another profile can be considered just another categorisation.

So instead of calling this a bad practice: in this case the Literal versus Concept range corresponds to a business need. Both nicely address two distinct levels of harmonisation in associating terms with datasets to make them easier to find in a catalogue through free-text search or faceted browsing.

By mixing, as illustrated by Jakub, DCAT would state that implementations must accept and be able to process both at the same time. That will create more implementation friction than gain. Lifting the distinction between data property and object property must be done with care. And in this case it will not create added value, but more confusion.

Maybe what trips you up is that the subproperty of dct:subject is not named 'keyword' when you use it in an implementation simply as a keyword: that is a different discussion.
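
A minimal sketch of the second bullet, assuming a profile defines its own subproperty of dct:subject for controlled terms. The property name `ex:controlledKeyword` and the concept are hypothetical and not part of DCAT or any published profile.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace

# Hypothetical profile-level property for controlled-vocabulary terms.
ex:controlledKeyword
  rdfs:subPropertyOf dcterms:subject ;
  rdfs:range skos:Concept .

# Free-text tags stay in dcat:keyword; controlled terms use the subproperty.
ex:dataset a dcat:Dataset ;
  dcat:keyword "free text tag"@en ;
  ex:controlledKeyword ex:concept-1 .

ex:concept-1 a skos:Concept .
```

With this pattern, dcat:keyword keeps its literal-only meaning, while consumers that understand dct:subject can still pick up the controlled terms through the subproperty relation.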


6 participants