Naturvårdsregistret Wikidata/Wikimedia Commons synchronization bot

This project was originally written to support Wiki Loves Earth 2020. It is a bot that compares data downloaded from Naturvårdsregistret (NVR), the database of Naturvårdsverket (the Swedish Environmental Protection Agency), with Wikidata for structured data and with Wikimedia Commons for geoshapes, and creates or updates entities that differ from NVR.

The code

  • Java 11, Maven.
  • wdtk-wikibaseapi for Wikidata, plus homegrown code for SPARQL.
  • jwbf for MediaWiki.
  • jts for geodata processing.

The bot keeps track of its progress state. If you abort the bot and run it again, items that were already processed are processed again only if they previously failed.

Statistics are kept in the state, with specific information about each entity.
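
The resume behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the class `ProgressState`, the enum `Outcome`, and the use of a string entity identifier are all hypothetical names chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a progress state; not the bot's actual classes. */
final class ProgressState {

    /** Outcome recorded for one entity (illustrative names). */
    enum Outcome { SUCCEEDED, FAILED }

    // Keyed by an entity identifier, assumed here to be a string such as an NVR id.
    private final Map<String, Outcome> outcomes = new LinkedHashMap<>();

    /** An entity is processed if it was never seen or if its last attempt failed. */
    boolean shouldProcess(String entityId) {
        Outcome previous = outcomes.get(entityId);
        return previous == null || previous == Outcome.FAILED;
    }

    /** Stores the latest outcome for an entity, replacing any earlier one. */
    void record(String entityId, Outcome outcome) {
        outcomes.put(entityId, outcome);
    }

    /** A simple statistic derived from the state: entities with the given outcome. */
    long count(Outcome outcome) {
        return outcomes.values().stream().filter(o -> o == outcome).count();
    }
}
```

A run loop would call `shouldProcess(id)` before touching an entity and `record(id, ...)` afterwards, so a restarted run skips successes and retries failures. Persisting the map between runs is left out of the sketch.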

Required environment variables

mwse-bot.username="Your bot's Wikimedia username"
mwse-bot.password="Your bot's Wikimedia password"
mwse-bot.email="Contact email address for this bot"
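
A start-up check for these variables might look like the sketch below. How the bot itself reads them is not shown in this README, so the class `BotEnvironment` and its method are hypothetical; only the three variable names come from the list above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; the bot's real configuration code may differ. */
final class BotEnvironment {

    // The three names listed in this README.
    static final List<String> REQUIRED =
            List.of("mwse-bot.username", "mwse-bot.password", "mwse-bot.email");

    /** Returns the required names that are absent or blank in the given environment. */
    static List<String> missing(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

Calling `BotEnvironment.missing(System.getenv())` before logging in would allow the bot to fail early with a list of the variables that still need to be set.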

External data sources

See http://mdp.vic-metria.nu/miljodataportalen/GetMetaDataById?UUID=8df63b07-46e5-45bd-aa06-3f43248617a3 for the CC0 license information.

Important notices

  • When updating data, make sure to also update download and publish dates in the bot source code!

  • Updating multi-point coordinate claims on Wikidata items (e.g. natural monument points) is a known issue. It is currently not a problem, because no such claims exist in Wikidata yet and they will thus only be added.

  • Updating categories on the Commons geoshape discussion page will overwrite any categories added by third parties. This is OK for now since we only add pages, but for future imports the existing categories need to be parsed and checked for a delta!

  • All items created by the bot prior to 2020-04-10 are missing descriptions!

  • Almost all items created by the bot prior to 2020-04-10 are missing labels!
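
The category delta mentioned in the notices above is not implemented yet. One possible approach, sketched here under assumptions, is to parse the `[[Category:...]]` links already on the page and add only those the bot wants that are not present. The class `CategoryDelta` and its methods are hypothetical, and the regular expression is a simplification: it does not normalise underscores or first-letter case the way MediaWiki does.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of a category delta check; not part of the bot today. */
final class CategoryDelta {

    // Matches [[Category:Name]] and [[Category:Name|sort key]], capturing Name.
    private static final Pattern CATEGORY = Pattern.compile(
            "\\[\\[\\s*Category\\s*:\\s*([^\\]|]+?)\\s*(?:\\|[^\\]]*)?\\]\\]",
            Pattern.CASE_INSENSITIVE);

    /** Extracts category names from wikitext, in order of appearance. */
    static Set<String> parse(String wikitext) {
        Set<String> categories = new LinkedHashSet<>();
        Matcher matcher = CATEGORY.matcher(wikitext);
        while (matcher.find()) {
            categories.add(matcher.group(1).trim());
        }
        return categories;
    }

    /** Returns the wanted categories that the page does not already contain. */
    static Set<String> missing(String wikitext, Set<String> wanted) {
        Set<String> delta = new LinkedHashSet<>(wanted);
        delta.removeAll(parse(wikitext));
        return delta;
    }
}
```

Appending only the result of `missing(...)` to the existing page text, instead of replacing the page, would leave categories added by third parties untouched.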