From 4d39594f659dce189bf8afe29047cf3a854b67c8 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Fri, 8 Mar 2024 14:09:51 +0100 Subject: [PATCH 01/57] Add OFFIS as partner to README --- README.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.rst b/README.rst index 8f28a180..c71fbb80 100644 --- a/README.rst +++ b/README.rst @@ -1,5 +1,5 @@ -.. image:: https://user-images.githubusercontent.com/14353512/199113556-4b53660f-c628-4138-8d01-3719595ecda1.png +.. image:: https://private-user-images.githubusercontent.com/74312290/311242144-1992d975-c410-4cb9-8a05-117731d37084.svg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MDk5MDM2MDMsIm5iZiI6MTcwOTkwMzMwMywicGF0aCI6Ii83NDMxMjI5MC8zMTEyNDIxNDQtMTk5MmQ5NzUtYzQxMC00Y2I5LThhMDUtMTE3NzMxZDM3MDg0LnN2Zz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDAzMDglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwMzA4VDEzMDgyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTA0MmE0YTE2ZmNiNGYzYjVhNDMwZTljMmU3NWM1YTkxMDM5MDJiYjA2YTUwMDE4ZDJjNDA5ZDc5NzgwYTIyNmUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.jtYbCYD9mbLe-U76BKpjY7p6jXKkAbvEdwp0oocx1Tw :align: left :target: https://github.com/OpenEnergyPlatform/open-MaStR :alt: MaStR logo @@ -129,7 +129,7 @@ Software | This repository is licensed under the **GNU Affero General Public License v3.0 or later** (AGPL-3.0-or-later). | See `LICENSE.md `_ for rights and obligations. | See the *Cite this repository* function or `CITATION.cff `_ for citation of this repository. 
-| Copyright: `open-MaStR `_ © `Reiner Lemoine Institut `_ © `fortiss `_ | `AGPL-3.0-or-later `_ +| Copyright: `open-MaStR `_ © `Reiner Lemoine Institut `_ © `fortiss `_ © `OFFIS `_ | `AGPL-3.0-or-later `_ Data ---- From 692f2eec039cc3279c90b5d31b229ff42d136f4d Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Fri, 10 May 2024 16:04:41 +0200 Subject: [PATCH 02/57] Add offis logo to readme #490 --- README.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.rst b/README.rst index c71fbb80..3f4811f9 100644 --- a/README.rst +++ b/README.rst @@ -1,5 +1,5 @@ -.. image:: https://private-user-images.githubusercontent.com/74312290/311242144-1992d975-c410-4cb9-8a05-117731d37084.svg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MDk5MDM2MDMsIm5iZiI6MTcwOTkwMzMwMywicGF0aCI6Ii83NDMxMjI5MC8zMTEyNDIxNDQtMTk5MmQ5NzUtYzQxMC00Y2I5LThhMDUtMTE3NzMxZDM3MDg0LnN2Zz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDAzMDglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwMzA4VDEzMDgyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTA0MmE0YTE2ZmNiNGYzYjVhNDMwZTljMmU3NWM1YTkxMDM5MDJiYjA2YTUwMDE4ZDJjNDA5ZDc5NzgwYTIyNmUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.jtYbCYD9mbLe-U76BKpjY7p6jXKkAbvEdwp0oocx1Tw +.. 
image:: https://github-production-user-asset-6210df.s3.amazonaws.com/74312290/329603097-11e37434-fd0c-44f6-a3f0-0954799d2d79.svg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20240510%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240510T140340Z&X-Amz-Expires=300&X-Amz-Signature=d34978545782f50965d33bdffbac21cfc4841c3e4b7a125fff8dba39ca69696e&X-Amz-SignedHeaders=host&actor_id=74312290&key_id=0&repo_id=203598131 :align: left :target: https://github.com/OpenEnergyPlatform/open-MaStR :alt: MaStR logo From d392e628bd87ae283988634d404fe68ef378d497 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Fri, 10 May 2024 16:06:48 +0200 Subject: [PATCH 03/57] Delete third party links in README #490 I think they are placed too prominently, especially since we have no influence on whether the stuff at those links is working. --- README.rst | 2 -- 1 file changed, 2 deletions(-) diff --git a/README.rst b/README.rst index 3f4811f9..3dc1151b 100644 --- a/README.rst +++ b/README.rst @@ -59,8 +59,6 @@ Documentation | Find the `documentation `_ hosted on ReadTheDocs. | The original API documentation can be found on the `Webhilfe des Marktstammdatenregisters `_. -| If you are interested in browsing the MaStR online, check out the privately hosted `Marktstammdatenregister.dev `_. -| Also see the `bundesAPI/Marktstammdaten-API `_ for another implementation. 
Installation From d8b70dcdad6aec089b6855ef50ebbadca04f64b5 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Fri, 10 May 2024 16:07:57 +0200 Subject: [PATCH 04/57] Add dashboard link to usage examples #490 --- README.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/README.rst b/README.rst index 3dc1151b..c9c13152 100644 --- a/README.rst +++ b/README.rst @@ -110,6 +110,7 @@ changes in a `Pull Request `_ - `EE-Status App `_ - `Digiplan Anhalt `_ +- `Data Quality Assessment of the MaStR `_ Collaboration From 8624abbda31b0835a2a66c8c39fd8eb9ca7e15a8 Mon Sep 17 00:00:00 2001 From: chrwm <54852694+chrwm@users.noreply.github.com> Date: Fri, 7 Jun 2024 11:40:30 +0200 Subject: [PATCH 05/57] Link actor properly #544 --- .github/workflows/extend_user_cff.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/extend_user_cff.yml b/.github/workflows/extend_user_cff.yml index 593b90ef..e7db64fb 100644 --- a/.github/workflows/extend_user_cff.yml +++ b/.github/workflows/extend_user_cff.yml @@ -85,4 +85,4 @@ jobs: Closes #${{ github.event.issue.number }} - Many thanks ${{ github.actor }}! + Many thanks @${{ github.actor }}! From 6cce34766b8ef5039efac085d72fb1c63326a119 Mon Sep 17 00:00:00 2001 From: chrwm <54852694+chrwm@users.noreply.github.com> Date: Fri, 7 Jun 2024 11:43:26 +0200 Subject: [PATCH 06/57] Fix formatting issue_template_user_kudos.md --- .github/ISSUE_TEMPLATE/issue_template_user_kudos.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/.github/ISSUE_TEMPLATE/issue_template_user_kudos.md b/.github/ISSUE_TEMPLATE/issue_template_user_kudos.md index 39fb7b5c..877d1e49 100644 --- a/.github/ISSUE_TEMPLATE/issue_template_user_kudos.md +++ b/.github/ISSUE_TEMPLATE/issue_template_user_kudos.md @@ -14,10 +14,12 @@ We will add you to the list of valued users. 
Please, insert your information between the double quotes below - fill out at minimum "affiliation" :purple_heart: -family-names: "" -given-names: "" -alias: "" -affiliation: "" -orcid: "" +:pencil2: **Spaces** and the following special characters are allowed: @ ? ! | . , : ; - _ [ / ( ) \ ] § $ % & = + < > + +family-names: +given-names: +alias: +affiliation: +orcid: Thank you! From eb0178f55fd0f52ce2a22a5c3e20f3ca6c70718a Mon Sep 17 00:00:00 2001 From: chrwm <54852694+chrwm@users.noreply.github.com> Date: Fri, 7 Jun 2024 11:49:18 +0200 Subject: [PATCH 07/57] Update CHANGELOG.md Introduce new unreleased section --- CHANGELOG.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index d6179261..7e095147 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,12 @@ For each version important additions, changes and removals are listed here. The format is inspired from [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/v2.0.0.html). 
+## [v0.XX.X] unreleased - 2024-XX-XX +### Added +### Changed +- Fix usercff workflow [#545](https://github.com/OpenEnergyPlatform/open-MaStR/issues/544) +### Removed + ## [v0.14.4] Release for the Journal of Open Source Software JOSS - 2024-06-07 ### Added - Extend documentation section `getting started` based on the JOSS Review [#523](https://github.com/OpenEnergyPlatform/open-MaStR/pull/523) From 7d615c0058ecd109a4e3627e5549162660d3c713 Mon Sep 17 00:00:00 2001 From: chrwm <54852694+chrwm@users.noreply.github.com> Date: Fri, 7 Jun 2024 12:08:37 +0200 Subject: [PATCH 08/57] Update issue_template_user_kudos.md --- .github/ISSUE_TEMPLATE/issue_template_user_kudos.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/ISSUE_TEMPLATE/issue_template_user_kudos.md b/.github/ISSUE_TEMPLATE/issue_template_user_kudos.md index 877d1e49..86890bed 100644 --- a/.github/ISSUE_TEMPLATE/issue_template_user_kudos.md +++ b/.github/ISSUE_TEMPLATE/issue_template_user_kudos.md @@ -12,7 +12,7 @@ It helps the project quite a bit! We will add you to the list of valued users. -Please, insert your information between the double quotes below - fill out at minimum "affiliation" :purple_heart: +Please, insert your information below - fill out at minimum affiliation :purple_heart: :pencil2: **Spaces** and the following special characters are allowed: @ ? ! | . , : ; - _ [ / ( ) \ ] § $ % & = + < > From 6dcf0d64b4c708b4d037df0e213a5f5869f8327e Mon Sep 17 00:00:00 2001 From: nesnoj Date: Thu, 18 Jul 2024 13:34:40 +0200 Subject: [PATCH 09/57] Fix docs on user-defined output path for csv, xml, database #549 --- docs/advanced.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/docs/advanced.md b/docs/advanced.md index 9f5589a7..90e1c9ff 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -28,6 +28,7 @@ The possible databases are: ### Project directory The directory `$HOME/.open-MaStR` is automatically created. 
It is used to store configuration files and save data. +You can change this default path, see [environment variables](#environment-variables). Default config files are copied to this directory which can be modified - but with caution. The project home directory is structured as follows (files and folders below `data/` just an example). @@ -87,6 +88,15 @@ The data can then be written to any sql database supported by [sqlalchemy](https For more information regarding the database see [Database settings](#database-settings). +### Environment variables + +There are some environment variables to customize open-MaStR: + +| Variable | Description | Example | +|------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------| +| `SQLITE_DATABASE_PATH` | Path to the SQLite file. This allows to use to use multiple instances of the MaStR database. The database instances exist in parallel and are independent of each other. | `/home/mastr-rabbit/.open-MaStR/data/sqlite/your_custom_instance_name.db` | +| `OUTPUT_PATH` | Path to user-defined output directory for CSV data, XML file and database. 
If not specified, output directory defaults to `$HOME/.open-MaStR/` | Linux: `/home/mastr-rabbit/open-mastr-user-defined-output-path`, Windows: `C:\\Users\\open-mastr-user-defined-output-path` | + ## Bulk download On the homepage [MaStR/Datendownload](https://www.marktstammdatenregister.de/MaStR/Datendownload) a zipped folder containing the whole From b032d851878fc6fbd7cb077aed1283cfaaa2cfca Mon Sep 17 00:00:00 2001 From: nesnoj Date: Thu, 18 Jul 2024 13:34:55 +0200 Subject: [PATCH 10/57] Update changelog --- CHANGELOG.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 7e095147..195ac049 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,6 +10,8 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ ### Added ### Changed - Fix usercff workflow [#545](https://github.com/OpenEnergyPlatform/open-MaStR/issues/544) +- Fix docs on user-defined output path for csv, xml, database + [#549](https://github.com/OpenEnergyPlatform/open-MaStR/issues/549) ### Removed ## [v0.14.4] Release for the Journal of Open Source Software JOSS - 2024-06-07 From db69d0e60c6d0fd0401b8111676d95b4157c8102 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Thu, 18 Jul 2024 13:37:39 +0200 Subject: [PATCH 11/57] Add nesnoj to contributors --- CITATION.cff | 5 +++++ pyproject.toml | 4 +++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/CITATION.cff b/CITATION.cff index 12415aa3..4ecc093a 100644 --- a/CITATION.cff +++ b/CITATION.cff @@ -25,6 +25,11 @@ authors: alias: "@deniztepe" affiliation: "fortiss" orcid: " https://orcid.org/0000-0002-7605-0173" + - family-names: "Amme" + given-names: "Jonathan" + alias: "@nesnoj" + affiliation: "Reiner Lemoine Institut" + orcid: " https://orcid.org/0000-0002-8563-5261" title: "open-MaStR" type: software license: AGPL-3.0 diff --git a/pyproject.toml b/pyproject.toml index 2387333b..580de653 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -25,13 +25,15 @@ authors = [ {name = "Muschner 
Christoph"}, {name = "Kotthoff Florian"}, {name = "Tepe Deniz"}, + {name = "Amme Jonathan"}, {name = "Open Energy Family"}, ] maintainers = [ {name = "Ludwig Hülk", email = "datenzentrum@rl-institut.de"}, {name = "Florian Kotthoff"}, - {name = "Christoph Muschner", email = "datenzentrum@rl-institut.de"} + {name = "Christoph Muschner", email = "datenzentrum@rl-institut.de"}, + {name = "Jonathan Amme", email = "jonathan.amme@rl-institut.de"} ] description = "A package that provides an interface for downloading and processing the data of the Marktstammdatenregister (MaStR)" readme = "README.rst" From da2afa96fbf811471b9f9522aa1d5faf9a574946 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Tue, 27 Aug 2024 12:05:41 +0200 Subject: [PATCH 12/57] Set min. pandas version to v2.2.2 --- pyproject.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pyproject.toml b/pyproject.toml index 580de653..17616f96 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -6,7 +6,7 @@ build-backend = "setuptools.build_meta" name = "open_mastr" version = "0.14.4" dependencies = [ - "pandas>=2.1", # pandas 2.1 is needed for dataframe.map() + "pandas>=2.2.2", "numpy", "sqlalchemy>=2.0", "psycopg2-binary", From 5920911cb0860f8999730ffda7ebce6167fa3594 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Tue, 27 Aug 2024 12:06:16 +0200 Subject: [PATCH 13/57] Update changelog --- CHANGELOG.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 195ac049..f06f2943 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,6 +12,8 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ - Fix usercff workflow [#545](https://github.com/OpenEnergyPlatform/open-MaStR/issues/544) - Fix docs on user-defined output path for csv, xml, database [#549](https://github.com/OpenEnergyPlatform/open-MaStR/issues/549) +- Set pandas version to >=2.2.2 for compatibility with numpy v2.0 + [#553](https://github.com/OpenEnergyPlatform/open-MaStR/issues/553) ### 
Removed ## [v0.14.4] Release for the Journal of Open Source Software JOSS - 2024-06-07 From 14c19a3638a8ef4639a4c7092d493789060bb934 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Tue, 27 Aug 2024 17:46:49 +0200 Subject: [PATCH 14/57] Make service_port an init param for MaStRAPI() --- open_mastr/soap_api/download.py | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/open_mastr/soap_api/download.py b/open_mastr/soap_api/download.py index 3d7a0b3f..0a52ca38 100644 --- a/open_mastr/soap_api/download.py +++ b/open_mastr/soap_api/download.py @@ -69,7 +69,7 @@ class MaStRAPI(object): wrapped SOAP queries. This is handled internally. """ - def __init__(self, user=None, key=None): + def __init__(self, user=None, key=None, service_port="Anlage"): """ Parameters ---------- @@ -80,10 +80,15 @@ def __init__(self, user=None, key=None): key : str , optional Access token of a role (Benutzerrolle). Might look like: "koo5eixeiQuoi'w8deighai8ahsh1Ha3eib3coqu7ceeg%ies..." + service_port : str , optional + Port/model to be used, e.g. "Anlage" or "Akteur", see docs for + full list: + https://www.marktstammdatenregister.de/MaStRHilfe/subpages/webdienst.html + Defaults to "Anlage". """ # Bind MaStR SOAP API functions as instance methods - client, client_bind = _mastr_bindings() + client, client_bind = _mastr_bindings(service_port=service_port) # First, all services of registered service_port (i.e. 'Anlage') for n, f in client_bind: @@ -140,19 +145,27 @@ def wrapper(*args, **kwargs): def _mastr_bindings( + service_port, + service_name="Marktstammdatenregister", + wsdl="https://www.marktstammdatenregister.de/MaStRAPI/wsdl/mastr.wsdl", max_retries=3, pool_connections=100, pool_maxsize=100, timeout=60, operation_timeout=600, - wsdl="https://www.marktstammdatenregister.de/MaStRAPI/wsdl/mastr.wsdl", - service_name="Marktstammdatenregister", - service_port="Anlage", ): """ Parameters ---------- + service_port : str + Port of service to be used. 
Parameters is passed to `zeep.Client.bind` + See :class:`MaStRAPI` for more information. + service_name : str + Service, defined in wsdl file, that is to be used. Parameters is + passed to zeep.Client.bind + wsdl : str + Url of wsdl file to be used. Parameters is passed to zeep.Client max_retries : int Maximum number of retries for a request. Parameters is passed to requests.adapters.HTTPAdapter @@ -168,14 +181,6 @@ def _mastr_bindings( operation_timeout : int Timeout for API requests (GET/POST in underlying requests package) in seconds. Parameter is passed to `zeep.transports.Transport`. - wsdl : str - Url of wsdl file to be used. Parameters is passed to zeep.Client - service_name : str - Service, defined in wsdl file, that is to be used. Parameters is - passed to zeep.Client.bind - service_port : str - Port of service to be used. Parameters is - passed to zeep.Client.bind Returns ------- From 81003b8e6f7565ae8697a98f2deb5ac3063430d7 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Tue, 27 Aug 2024 17:51:54 +0200 Subject: [PATCH 15/57] Update docs --- docs/advanced.md | 12 +++++++++--- open_mastr/soap_api/download.py | 3 ++- 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index 90e1c9ff..052bc342 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -205,10 +205,16 @@ if __name__ == "__main__": print(mastr_api.GetLokaleUhrzeit()) ``` -For API calls and their optional parameters refer to [API documentation](https://www.marktstammdatenregister. -de/MaStRHilfe/subpages/webdienst.html). +The MaStR API has different models to query from, the default are power units +("Anlage"). To change this, you can pass the desired model to +[`MaStRAPI`][open_mastr.soap_api.download.MaStRAPI]. +E.g. to query market actors instantiate it using +`MaStRAPI(service_port="Akteur")`. 
-???+ example "Example queries and their responses" +For API calls, models and optional parameters refer to the +[API documentation](https://www.marktstammdatenregister.de/MaStRHilfe/subpages/webdienst.html). + +???+ example "Example queries and their responses (for model 'Anlage')" === "mastr_api.GetLokaleUhrzeit()" diff --git a/open_mastr/soap_api/download.py b/open_mastr/soap_api/download.py index 0a52ca38..ff0537a1 100644 --- a/open_mastr/soap_api/download.py +++ b/open_mastr/soap_api/download.py @@ -39,7 +39,8 @@ class MaStRAPI(object): mastr_api = MaStRAPI( user="SOM123456789012", - key=""koo5eixeiQuoi'w8deighai8ahsh1Ha3eib3coqu7ceeg%ies..." + key="koo5eixeiQuoi'w8deighai8ahsh1Ha3eib3coqu7ceeg%ies...", + service_port="Anlage" ) ``` From bed6b989b4dcffbc4cbfaad754d838f626858d45 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Tue, 27 Aug 2024 17:52:00 +0200 Subject: [PATCH 16/57] Update changelog --- CHANGELOG.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index f06f2943..a0911fbc 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,6 +14,8 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ [#549](https://github.com/OpenEnergyPlatform/open-MaStR/issues/549) - Set pandas version to >=2.2.2 for compatibility with numpy v2.0 [#553](https://github.com/OpenEnergyPlatform/open-MaStR/issues/553) +- Allow to configure model/service port in `soap_api.download.MaStRAPI` + [#556](https://github.com/OpenEnergyPlatform/open-MaStR/issues/556) ### Removed ## [v0.14.4] Release for the Journal of Open Source Software JOSS - 2024-06-07 From c664a9bf5621fc77c045e2df6760c2cccc74d832 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Wed, 28 Aug 2024 09:32:24 +0200 Subject: [PATCH 17/57] Add Readme header with OFFIS #490 --- docs/images/README_HeaderThreePartners.svg | 127 +++++++++++++++++++++ 1 file changed, 127 insertions(+) create mode 100644 
docs/images/README_HeaderThreePartners.svg diff --git a/docs/images/README_HeaderThreePartners.svg b/docs/images/README_HeaderThreePartners.svg new file mode 100644 index 00000000..2382cf9f --- /dev/null +++ b/docs/images/README_HeaderThreePartners.svg @@ -0,0 +1,127 @@ + + + + + + + + + + + + + + + + From f1f1cf95b66aa1fd3911934184637971ba89dad9 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Wed, 28 Aug 2024 09:33:23 +0200 Subject: [PATCH 18/57] Add Header Image with repo url to README #490 --- README.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.rst b/README.rst index c9c13152..4921b6b3 100644 --- a/README.rst +++ b/README.rst @@ -1,5 +1,5 @@ -.. image:: https://github-production-user-asset-6210df.s3.amazonaws.com/74312290/329603097-11e37434-fd0c-44f6-a3f0-0954799d2d79.svg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20240510%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240510T140340Z&X-Amz-Expires=300&X-Amz-Signature=d34978545782f50965d33bdffbac21cfc4841c3e4b7a125fff8dba39ca69696e&X-Amz-SignedHeaders=host&actor_id=74312290&key_id=0&repo_id=203598131 +.. 
image:: https://raw.githubusercontent.com/OpenEnergyPlatform/open-MaStR/feature-490-add-offis-as-partner/docs/images/README_HeaderThreePartners.svg :align: left :target: https://github.com/OpenEnergyPlatform/open-MaStR :alt: MaStR logo From 8a79bbf9d034c91ba35a8db1f74e859a9eaf4725 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Wed, 28 Aug 2024 09:49:37 +0200 Subject: [PATCH 19/57] Update pre-commit python version #490 --- .pre-commit-config.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 764b0476..821c6237 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -3,4 +3,4 @@ repos: rev: 22.6.0 hooks: - id: black - language_version: python3.10 + language_version: python3.11 From c713d6db913341d0402ca855ba640478f796ae32 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Wed, 28 Aug 2024 09:49:46 +0200 Subject: [PATCH 20/57] Add OFFIS and delete year at footer #490 --- mkdocs.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mkdocs.yml b/mkdocs.yml index 76a21770..26d41af3 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -93,4 +93,4 @@ site_dir: _build copyright: | - © 2023 RLI and fortiss GmbH + © RLI and fortiss GmbH and OFFIS e.V. 
From 33f3f6453bd94d1bb516efa0d21f919774aabb54 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Mon, 16 Sep 2024 12:15:47 +0200 Subject: [PATCH 21/57] Update CHANGELOG.md #490 --- CHANGELOG.md | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index a0911fbc..cf2b2811 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,7 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ ## [v0.XX.X] unreleased - 2024-XX-XX ### Added +- Add OFFIS eV as partner organization [#493](https://github.com/OpenEnergyPlatform/open-MaStR/pull/493) ### Changed - Fix usercff workflow [#545](https://github.com/OpenEnergyPlatform/open-MaStR/issues/544) - Fix docs on user-defined output path for csv, xml, database From 6fd23571fc526feb5bc251ecc16c258617ecb744 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Tue, 17 Sep 2024 11:16:09 +0200 Subject: [PATCH 22/57] Make path of image relative #490 --- README.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.rst b/README.rst index 1392323f..178e766e 100644 --- a/README.rst +++ b/README.rst @@ -1,5 +1,5 @@ -.. image:: https://raw.githubusercontent.com/OpenEnergyPlatform/open-MaStR/feature-490-add-offis-as-partner/docs/images/README_HeaderThreePartners.svg +.. 
image:: docs/images/README_HeaderThreePartners.svg :align: left :target: https://github.com/OpenEnergyPlatform/open-MaStR :alt: MaStR logo From ca581fcd8e846f23284076800ba4a98f3d261e13 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Tue, 17 Sep 2024 11:54:51 +0200 Subject: [PATCH 23/57] Add external resources in README #490 --- README.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/README.rst b/README.rst index 178e766e..084c29b1 100644 --- a/README.rst +++ b/README.rst @@ -112,6 +112,11 @@ changes in a `Pull Request `_ - `Data Quality Assessment of the MaStR `_ +External Resources +=================== +Besides open-mastr, some other resources exist that ease the process of working with the Marktstammdatenregister: +- If you are interested in browsing the MaStR online, check out the github organisation `Marktstammdatenregister.dev `_. +- The `bundesAPI/Marktstammdaten-API `_ is another implementation to access data via an official API. Collaboration ============= From fa1c51637d3a69ed473e36c7585b1cbbe960a006 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Tue, 17 Sep 2024 15:14:55 +0200 Subject: [PATCH 24/57] Allow CSV export of storage_units #562 --- open_mastr/mastr.py | 2 +- open_mastr/utils/constants.py | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/open_mastr/mastr.py b/open_mastr/mastr.py index acd626b8..ca4ff91d 100644 --- a/open_mastr/mastr.py +++ b/open_mastr/mastr.py @@ -304,7 +304,7 @@ def to_csv( "balancing_area", "electricity_consumer", "gas_consumer", "gas_producer", "gas_storage", "gas_storage_extended", "grid_connections", "grids", "market_actors", "market_roles", - "locations_extended, 'permit', 'deleted_units' ] + "locations_extended", "permit", "deleted_units", "storage_units"] chunksize: int Defines the chunksize of the tables export. Default value is 500.000 rows to include in each chunk. 
diff --git a/open_mastr/utils/constants.py b/open_mastr/utils/constants.py index 18afb2c0..80bb2307 100644 --- a/open_mastr/utils/constants.py +++ b/open_mastr/utils/constants.py @@ -18,6 +18,7 @@ "deleted_units", "retrofit_units", "changed_dso_assignment", + "storage_units", ] # Possible values for parameter 'data' with API download method @@ -64,6 +65,7 @@ "deleted_units", "retrofit_units", "changed_dso_assignment", + "storage_units", ] # Possible data types for API download @@ -181,6 +183,7 @@ "deleted_units": "DeletedUnits", "retrofit_units": "RetrofitUnits", "changed_dso_assignment": "ChangedDSOAssignment", + "storage_units": "StorageUnits", } UNIT_TYPE_MAP = { From 5171ff1c12c8cc4fc762a2ecdf42f5b7c23b22b6 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Tue, 17 Sep 2024 15:15:21 +0200 Subject: [PATCH 25/57] Fix some typos and formatting --- docs/advanced.md | 2 +- open_mastr/utils/constants.py | 5 ++++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/advanced.md b/docs/advanced.md index 052bc342..a0db651f 100644 --- a/docs/advanced.md +++ b/docs/advanced.md @@ -1,5 +1,5 @@ For most users, the functionalites described in [Getting Started](getting_started.md) are sufficient. If you want -to examine how you can configure the package's behavior for your own needs, check out [Cofiguration](#configuration). Or you can explore the two main functionalities of the package, namely the [Bulk Download](#bulk-download) +to examine how you can configure the package's behavior for your own needs, check out [Configuration](#configuration). Or you can explore the two main functionalities of the package, namely the [Bulk Download](#bulk-download) or the [SOAP API download](#soap-api-download). 
## Configuration diff --git a/open_mastr/utils/constants.py b/open_mastr/utils/constants.py index 80bb2307..8405f54a 100644 --- a/open_mastr/utils/constants.py +++ b/open_mastr/utils/constants.py @@ -162,7 +162,10 @@ "eeg_data": "HydroEeg", "permit_data": "Permit", }, - "nuclear": {"unit_data": "NuclearExtended", "permit_data": "Permit"}, + "nuclear": { + "unit_data": "NuclearExtended", + "permit_data": "Permit" + }, "storage": { "unit_data": "StorageExtended", "eeg_data": "StorageEeg", From 8a026daa0368597d37ad85320e048fcd234324b6 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Tue, 17 Sep 2024 15:20:58 +0200 Subject: [PATCH 26/57] Fix error message in CSV export --- open_mastr/utils/helpers.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/open_mastr/utils/helpers.py b/open_mastr/utils/helpers.py index 1ac061bd..ad4f4dd8 100644 --- a/open_mastr/utils/helpers.py +++ b/open_mastr/utils/helpers.py @@ -222,7 +222,7 @@ def validate_parameter_data(method, data) -> None: ) if method == "csv_export" and value not in TECHNOLOGIES + ADDITIONAL_TABLES: raise ValueError( - "Allowed values for parameter data with API method are " + "Allowed values for CSV export are " f"{TECHNOLOGIES} or {ADDITIONAL_TABLES}" ) From d95b1823baa21267ab8e0f31a9652313876d550d Mon Sep 17 00:00:00 2001 From: nesnoj Date: Tue, 17 Sep 2024 15:28:16 +0200 Subject: [PATCH 27/57] Update changelog --- CHANGELOG.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index cf2b2811..da080b24 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -17,6 +17,8 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ [#553](https://github.com/OpenEnergyPlatform/open-MaStR/issues/553) - Allow to configure model/service port in `soap_api.download.MaStRAPI` [#556](https://github.com/OpenEnergyPlatform/open-MaStR/issues/556) +- Allow CSV export of table `storage_units` + [#562](https://github.com/OpenEnergyPlatform/open-MaStR/issues/562) ### Removed ## 
[v0.14.4] Release for the Journal of Open Source Software JOSS - 2024-06-07 From 40ffd197fe4cb8789803375b7ec0c7b38ddc7daa Mon Sep 17 00:00:00 2001 From: nesnoj Date: Wed, 18 Sep 2024 13:33:51 +0200 Subject: [PATCH 28/57] Split mapping of bulk data to xml file names for storages #562 --- open_mastr/mastr.py | 1 + open_mastr/utils/constants.py | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/open_mastr/mastr.py b/open_mastr/mastr.py index ca4ff91d..47661c49 100644 --- a/open_mastr/mastr.py +++ b/open_mastr/mastr.py @@ -142,6 +142,7 @@ def download( | "nuclear" | Yes | Yes | | "gas" | Yes | Yes | | "storage" | Yes | Yes | + | "storage_units" | Yes | Yes | | "electricity_consumer"| Yes | No | | "location" | Yes | Yes | | "market" | Yes | No | diff --git a/open_mastr/utils/constants.py b/open_mastr/utils/constants.py index 8405f54a..8bc85d58 100644 --- a/open_mastr/utils/constants.py +++ b/open_mastr/utils/constants.py @@ -91,7 +91,8 @@ ], "combustion": ["anlagenkwk", "einheitenverbrennung"], "nuclear": ["einheitenkernkraft"], - "storage": ["anlageneegspeicher", "anlagenstromspeicher", "einheitenstromspeicher"], + "storage": ["anlageneegspeicher", "einheitenstromspeicher"], + "storage_units": ["anlagenstromspeicher"], "gas": [ "anlagengasspeicher", "einheitengaserzeuger", From f76208111a101cac96e48f65f326b7699f3b4d10 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Wed, 18 Sep 2024 16:18:52 +0200 Subject: [PATCH 29/57] Fix string to trigger test --- open_mastr/utils/constants.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/open_mastr/utils/constants.py b/open_mastr/utils/constants.py index 8bc85d58..88661fbf 100644 --- a/open_mastr/utils/constants.py +++ b/open_mastr/utils/constants.py @@ -79,7 +79,7 @@ "location_gas_consumption", ] -# Map bulk data to bulk download tables (xml file names) +# Map bulk data to bulk download tables (XML file names) BULK_INCLUDE_TABLES_MAP = { "wind": ["anlageneegwind", "einheitenwind"], "solar": 
["anlageneegsolar", "einheitensolar"], From 7d65140e68ec485b4c37265a249e230b5270896c Mon Sep 17 00:00:00 2001 From: nesnoj Date: Wed, 18 Sep 2024 16:40:12 +0200 Subject: [PATCH 30/57] Update changelog --- CHANGELOG.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index da080b24..0b75fbe3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -18,7 +18,7 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ - Allow to configure model/service port in `soap_api.download.MaStRAPI` [#556](https://github.com/OpenEnergyPlatform/open-MaStR/issues/556) - Allow CSV export of table `storage_units` - [#562](https://github.com/OpenEnergyPlatform/open-MaStR/issues/562) + [#565](https://github.com/OpenEnergyPlatform/open-MaStR/pull/565) ### Removed ## [v0.14.4] Release for the Journal of Open Source Software JOSS - 2024-06-07 From c678821b75f6db4a157b0f3c0b359ebf02cbec05 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Wed, 18 Sep 2024 18:07:18 +0200 Subject: [PATCH 31/57] Change GH actions CI dev condition Add 'ready_for_review' --- .github/workflows/ci-develop.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/ci-develop.yml b/.github/workflows/ci-develop.yml index 6a7a6457..560dac97 100644 --- a/.github/workflows/ci-develop.yml +++ b/.github/workflows/ci-develop.yml @@ -11,7 +11,7 @@ jobs: # Jobs definition runs-on: ${{ matrix.os }} - if: ${{ !github.event.pull_request.draft }} + if: ${{ (!github.event.pull_request.draft) || (github.event.pull_request.ready_for_review) }} strategy: matrix: os: [macos-latest, ubuntu-latest, windows-latest] From 0f7f6d998e5f323e0ecac7c6794bbbe480c25dd0 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Wed, 18 Sep 2024 18:08:30 +0200 Subject: [PATCH 32/57] Revert "Change GH actions CI dev condition" This reverts commit c678821b75f6db4a157b0f3c0b359ebf02cbec05. 
--- .github/workflows/ci-develop.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/ci-develop.yml b/.github/workflows/ci-develop.yml index 560dac97..6a7a6457 100644 --- a/.github/workflows/ci-develop.yml +++ b/.github/workflows/ci-develop.yml @@ -11,7 +11,7 @@ jobs: # Jobs definition runs-on: ${{ matrix.os }} - if: ${{ (!github.event.pull_request.draft) || (github.event.pull_request.ready_for_review) }} + if: ${{ !github.event.pull_request.draft }} strategy: matrix: os: [macos-latest, ubuntu-latest, windows-latest] From e880a644926123efa59989c1beaeabdb022eceeb Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Thu, 26 Sep 2024 07:58:33 +0200 Subject: [PATCH 33/57] Explain cleansing process #567 --- open_mastr/mastr.py | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/open_mastr/mastr.py b/open_mastr/mastr.py index 47661c49..24e81a40 100644 --- a/open_mastr/mastr.py +++ b/open_mastr/mastr.py @@ -77,7 +77,6 @@ class Mastr: """ def __init__(self, engine="sqlite", connect_to_translated_db=False) -> None: - validate_parameter_format_for_mastr_init(engine) self.output_dir = get_output_dir() @@ -164,8 +163,10 @@ def download( Default to `None`. bulk_cleansing : bool, optional - If True, data cleansing is applied after the download (which is recommended). Default - to True. + If set to True, data cleansing is applied after the download (which is recommended). + In its original format, many entries in the MaStR are encoded with IDs. Columns like + `state` or `fueltype` do not contain entries such as "Hessen" or "Braunkohle", but instead + only contain IDs. Cleansing replaces these IDs with their corresponding original entries. api_processes : int or None or "max", optional Number of parallel processes used to download additional data. Defaults to `None`. 
If set to "max", the maximum number of possible processes From cd113a1eb726dfd63a70448624d3c4bcb8b3c969 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Thu, 26 Sep 2024 08:00:32 +0200 Subject: [PATCH 34/57] Update Changelog #567 --- CHANGELOG.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0b75fbe3..870a1ce8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,8 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ ## [v0.XX.X] unreleased - 2024-XX-XX ### Added - Add OFFIS eV as partner organization [#493](https://github.com/OpenEnergyPlatform/open-MaStR/pull/493) +- Extended documentation of data cleansing process for bulk download + [#568](https://github.com/OpenEnergyPlatform/open-MaStR/pull/568) ### Changed - Fix usercff workflow [#545](https://github.com/OpenEnergyPlatform/open-MaStR/issues/544) - Fix docs on user-defined output path for csv, xml, database From 9b3c4ba79482671b12bd51cc8d87f824e5fe752b Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Wed, 9 Oct 2024 11:55:24 +0200 Subject: [PATCH 35/57] Catch error of missing table #574 Now it will not crash anymore if new tables are introduced by BNetzA. 
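For illustration, the guard introduced by this patch can be sketched as below. This is a minimal standalone sketch, not the project's implementation: `tablename_mapping` is a stand-in dictionary with made-up entries (the real mapping lives in `open_mastr/utils/orm.py`), and the final `return` expression is an assumption about how the remainder of the function combines the two checks.

```python
# Stand-in for the real mapping in open_mastr/utils/orm.py (entries are hypothetical).
tablename_mapping = {
    "einheitenwind": {"__class__": object},  # table backed by an ORM class
    "katalogwerte": {"__class__": None},  # only needed for cleansing, never written
}


def is_table_relevant(xml_tablename: str, include_tables: list) -> bool:
    """Return True if the XML table should be written to the SQL database."""
    try:
        has_orm_class = tablename_mapping[xml_tablename]["__class__"] is not None
    except KeyError:
        # A table unknown to this version is reported and skipped instead of crashing.
        print(f"Table {xml_tablename} is not part of your current open-mastr version.")
        return False
    # Assumption: a table is written only if the user also requested it.
    return has_orm_class and xml_tablename in include_tables
```

With this guard, a table name that is missing from the mapping yields `False` rather than an unhandled `KeyError`, so the remaining tables of a bulk download can still be processed.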
--- open_mastr/xml_download/utils_write_to_database.py | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/open_mastr/xml_download/utils_write_to_database.py b/open_mastr/xml_download/utils_write_to_database.py index 9dba5027..4ee6ef48 100644 --- a/open_mastr/xml_download/utils_write_to_database.py +++ b/open_mastr/xml_download/utils_write_to_database.py @@ -73,9 +73,13 @@ def is_table_relevant(xml_tablename: str, include_tables: list) -> bool: have it in the database.""" # few tables are only needed for data cleansing of the xml files and contain no # information of relevance - boolean_write_table_to_sql_database = ( - tablename_mapping[xml_tablename]["__class__"] is not None - ) + try: + boolean_write_table_to_sql_database = ( + tablename_mapping[xml_tablename]["__class__"] is not None + ) + except KeyError: + print(f"Table {xml_tablename} is not part of your current open-mastr version.") + return False # check if the table should be written to sql database (depends on user input) include_count = include_tables.count(xml_tablename) From 550ba3793596f5aa2c4da8009bec508972825884 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Wed, 9 Oct 2024 14:45:36 +0200 Subject: [PATCH 36/57] Add deleted_market_actors to data model #574 --- open_mastr/mastr.py | 2 +- open_mastr/utils/constants.py | 10 ++++++---- open_mastr/utils/orm.py | 13 +++++++++++++ open_mastr/xml_download/colums_to_replace.py | 2 ++ tests/test_helpers.py | 9 +++++++-- 5 files changed, 29 insertions(+), 7 deletions(-) diff --git a/open_mastr/mastr.py b/open_mastr/mastr.py index 47661c49..d97c17e4 100644 --- a/open_mastr/mastr.py +++ b/open_mastr/mastr.py @@ -77,7 +77,6 @@ class Mastr: """ def __init__(self, engine="sqlite", connect_to_translated_db=False) -> None: - validate_parameter_format_for_mastr_init(engine) self.output_dir = get_output_dir() @@ -150,6 +149,7 @@ def download( | "balancing_area" | Yes | No | | "permit" | Yes 
| Yes | | "deleted_units" | Yes | No | + | "deleted_market_actors"| Yes | No | | "retrofit_units" | Yes | No | date : None or `datetime.datetime` or str, optional diff --git a/open_mastr/utils/constants.py b/open_mastr/utils/constants.py index 88661fbf..e5cc476b 100644 --- a/open_mastr/utils/constants.py +++ b/open_mastr/utils/constants.py @@ -16,6 +16,7 @@ "balancing_area", "permit", "deleted_units", + "deleted_market_actors", "retrofit_units", "changed_dso_assignment", "storage_units", @@ -63,6 +64,7 @@ "market_roles", "permit", "deleted_units", + "deleted_market_actors", "retrofit_units", "changed_dso_assignment", "storage_units", @@ -106,6 +108,7 @@ "balancing_area": ["bilanzierungsgebiete"], "permit": ["einheitengenehmigung"], "deleted_units": ["geloeschteunddeaktivierteeinheiten"], + "deleted_market_actors": ["geloeschteunddeaktiviertemarktakteure"], "retrofit_units": ["ertuechtigungen"], "changed_dso_assignment": ["einheitenaenderungnetzbetreiberzuordnungen"], } @@ -125,6 +128,7 @@ "balancing_area": ["balancing_area"], "permit": ["permit"], "deleted_units": ["deleted_units"], + "deleted_market_actors": ["deleted_market_actors"], "retrofit_units": ["retrofit_units"], "changed_dso_assignment": ["changed_dso_assignment"], } @@ -163,10 +167,7 @@ "eeg_data": "HydroEeg", "permit_data": "Permit", }, - "nuclear": { - "unit_data": "NuclearExtended", - "permit_data": "Permit" - }, + "nuclear": {"unit_data": "NuclearExtended", "permit_data": "Permit"}, "storage": { "unit_data": "StorageExtended", "eeg_data": "StorageEeg", @@ -185,6 +186,7 @@ "balancing_area": "BalancingArea", "permit": "Permit", "deleted_units": "DeletedUnits", + "deleted_market_actors": "DeletedMarketActors", "retrofit_units": "RetrofitUnits", "changed_dso_assignment": "ChangedDSOAssignment", "storage_units": "StorageUnits", diff --git a/open_mastr/utils/orm.py b/open_mastr/utils/orm.py index cedbef47..b8db7f43 100644 --- a/open_mastr/utils/orm.py +++ b/open_mastr/utils/orm.py @@ -780,6 +780,14 @@ 
class DeletedUnits(ParentAllTables, Base): EinheitBetriebsstatus = Column(String) +class DeletedMarketActors(ParentAllTables, Base): + __tablename__ = "deleted_market_actors" + + DatumLetzteAktualisierung = Column(DateTime(timezone=True)) + MastrNummer = Column(String, primary_key=True) + MarktakteurStatus = Column(String) + + class RetrofitUnits(ParentAllTables, Base): __tablename__ = "retrofit_units" @@ -1006,6 +1014,11 @@ class ChangedDSOAssignment(ParentAllTables, Base): "__class__": DeletedUnits, "replace_column_names": None, }, + "geloeschteunddeaktiviertemarktakteure": { + "__name__": DeletedMarketActors.__tablename__, + "__class__": DeletedMarketActors, + "replace_column_names": None, + }, "marktrollen": { "__name__": MarketRoles.__tablename__, "__class__": MarketRoles, diff --git a/open_mastr/xml_download/colums_to_replace.py b/open_mastr/xml_download/colums_to_replace.py index 334f2a05..e35acf6e 100644 --- a/open_mastr/xml_download/colums_to_replace.py +++ b/open_mastr/xml_download/colums_to_replace.py @@ -57,6 +57,8 @@ "Pumpspeichertechnologie", "Einsatzort", # geloeschteunddeaktivierteEinheiten + # geloeschteunddeaktivierteMarktAkteure + "MarktakteurStatus", # lokationen # marktakteure "Personenart", diff --git a/tests/test_helpers.py b/tests/test_helpers.py index 71fbaa14..f67046d1 100644 --- a/tests/test_helpers.py +++ b/tests/test_helpers.py @@ -66,6 +66,7 @@ def parameter_dict_working_list(): "balancing_area", "permit", "deleted_units", + "deleted_market_actors", "retrofit_units", None, ["wind", "solar"], @@ -250,7 +251,12 @@ def test_validate_parameter_format_for_mastr_init(db): def test_transform_data_parameter(): - (data, api_data_types, api_location_types, harm_log,) = transform_data_parameter( + ( + data, + api_data_types, + api_location_types, + harm_log, + ) = transform_data_parameter( method="API", data=["wind", "location"], api_data_types=["eeg_data"], @@ -369,7 +375,6 @@ def test_db_query_to_csv(tmpdir, engine): os.remove(csv_path) for 
addit_table in addit_tables: - csv_path = join( get_data_version_dir(), f"bnetza_mastr_{addit_table}_raw.csv", From 70c9fc4830e1dbb0bd8942c51560f7ba5ffcb523 Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Wed, 9 Oct 2024 14:58:39 +0200 Subject: [PATCH 37/57] Update Changelog #574 --- CHANGELOG.md | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0b75fbe3..372ec61c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,7 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ ## [v0.XX.X] unreleased - 2024-XX-XX ### Added +- Add `deleted_market_actors` to data model [#575](https://github.com/OpenEnergyPlatform/open-MaStR/pull/575) - Add OFFIS eV as partner organization [#493](https://github.com/OpenEnergyPlatform/open-MaStR/pull/493) ### Changed - Fix usercff workflow [#545](https://github.com/OpenEnergyPlatform/open-MaStR/issues/544) From db7be306935b048562d6cbc60f3484688c3fa4bb Mon Sep 17 00:00:00 2001 From: Florian Kotthoff <74312290+FlorianK13@users.noreply.github.com> Date: Wed, 9 Oct 2024 17:22:30 +0200 Subject: [PATCH 38/57] Add new tables to dataset docs #574 --- docs/dataset.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/dataset.md b/docs/dataset.md index 8e102511..83e0b905 100644 --- a/docs/dataset.md +++ b/docs/dataset.md @@ -81,6 +81,8 @@ After downloading the MaStR, you will find a database with a large number of tab | permit | | | storage_units | | | kwk | *short for: Combined heat and power (CHP)* | + | deleted_units | Units from all technologies that were deleted or deactivated | + | deleted_market_actors | Actors that were deleted. Downloading and parsing this table can result in a problem since v0.14.5, as it has no unique keys. 
| ### MaStR data model From 849430ba54e75d4e32afbced4e1286d14fb91a58 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Thu, 10 Oct 2024 11:15:01 +0200 Subject: [PATCH 39/57] Rename primary key DB column to DeletedMarketActors.MarktakteurMastrNummer #574 --- open_mastr/utils/orm.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/open_mastr/utils/orm.py b/open_mastr/utils/orm.py index b8db7f43..d84708f2 100644 --- a/open_mastr/utils/orm.py +++ b/open_mastr/utils/orm.py @@ -784,7 +784,7 @@ class DeletedMarketActors(ParentAllTables, Base): __tablename__ = "deleted_market_actors" DatumLetzteAktualisierung = Column(DateTime(timezone=True)) - MastrNummer = Column(String, primary_key=True) + MarktakteurMastrNummer = Column(String, primary_key=True) MarktakteurStatus = Column(String) From 50b75526489624caac6287a93b4d80dfe755b153 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Thu, 10 Oct 2024 11:15:32 +0200 Subject: [PATCH 40/57] Change column order #574 --- open_mastr/utils/orm.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/open_mastr/utils/orm.py b/open_mastr/utils/orm.py index d84708f2..d0d3a218 100644 --- a/open_mastr/utils/orm.py +++ b/open_mastr/utils/orm.py @@ -783,9 +783,9 @@ class DeletedUnits(ParentAllTables, Base): class DeletedMarketActors(ParentAllTables, Base): __tablename__ = "deleted_market_actors" - DatumLetzteAktualisierung = Column(DateTime(timezone=True)) MarktakteurMastrNummer = Column(String, primary_key=True) MarktakteurStatus = Column(String) + DatumLetzteAktualisierung = Column(DateTime(timezone=True)) class RetrofitUnits(ParentAllTables, Base): From 2fd11c1a3cef16398094a74ad5e1923339f8e1d1 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Thu, 10 Oct 2024 11:24:52 +0200 Subject: [PATCH 41/57] Amend error message for non-existing table #574 --- open_mastr/xml_download/utils_write_to_database.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/open_mastr/xml_download/utils_write_to_database.py 
b/open_mastr/xml_download/utils_write_to_database.py index 4ee6ef48..4917e9d9 100644 --- a/open_mastr/xml_download/utils_write_to_database.py +++ b/open_mastr/xml_download/utils_write_to_database.py @@ -78,7 +78,10 @@ def is_table_relevant(xml_tablename: str, include_tables: list) -> bool: tablename_mapping[xml_tablename]["__class__"] is not None ) except KeyError: - print(f"Table {xml_tablename} is not part of your current open-mastr version.") + print( + f"Table '{xml_tablename}' is not supported by your open-mastr version and " + f"will be skipped." + ) return False # check if the table should be written to sql database (depends on user input) include_count = include_tables.count(xml_tablename) From 9e927cf94f64fd32f88bb9bbad238bb293af62e3 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Thu, 10 Oct 2024 11:41:13 +0200 Subject: [PATCH 42/57] Change table description in docs #574 --- docs/dataset.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/dataset.md b/docs/dataset.md index 83e0b905..2063cdf0 100644 --- a/docs/dataset.md +++ b/docs/dataset.md @@ -82,7 +82,7 @@ After downloading the MaStR, you will find a database with a large number of tab | storage_units | | | kwk | *short for: Combined heat and power (CHP)* | | deleted_units | Units from all technologies that were deleted or deactivated | - | deleted_market_actors | Actors that were deleted. Downloading and parsing this table can result in a problem since v0.14.5, as it has no unique keys. 
| + | deleted_market_actors | Market actors that were deleted or deactivated | ### MaStR data model From acce0138a494f5bad59dab93a5762b69e66ced0d Mon Sep 17 00:00:00 2001 From: nesnoj Date: Thu, 10 Oct 2024 14:41:35 +0200 Subject: [PATCH 43/57] Update readme --- CHANGELOG.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 372ec61c..f3581f88 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,10 +8,13 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ ## [v0.XX.X] unreleased - 2024-XX-XX ### Added -- Add `deleted_market_actors` to data model [#575](https://github.com/OpenEnergyPlatform/open-MaStR/pull/575) -- Add OFFIS eV as partner organization [#493](https://github.com/OpenEnergyPlatform/open-MaStR/pull/493) +- Add `deleted_market_actors` to data model and prevent crash on unknown tables + [#575](https://github.com/OpenEnergyPlatform/open-MaStR/pull/575) +- Add OFFIS eV as partner organization + [#493](https://github.com/OpenEnergyPlatform/open-MaStR/pull/493) ### Changed -- Fix usercff workflow [#545](https://github.com/OpenEnergyPlatform/open-MaStR/issues/544) +- Fix usercff workflow + [#545](https://github.com/OpenEnergyPlatform/open-MaStR/issues/544) - Fix docs on user-defined output path for csv, xml, database [#549](https://github.com/OpenEnergyPlatform/open-MaStR/issues/549) - Set pandas version to >=2.2.2 for compatibility with numpy v2.0 From 648661779a3bb608d221a64e8812a9625cdf06ba Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 10:17:23 +0200 Subject: [PATCH 44/57] Run bump2version for v0.14.5 #578 --- .bumpversion.cfg | 2 +- .github/workflows/ci-production.yml | 2 +- CITATION.cff | 2 +- pyproject.toml | 4 ++-- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/.bumpversion.cfg b/.bumpversion.cfg index a9d9ae3e..5b335316 100644 --- a/.bumpversion.cfg +++ b/.bumpversion.cfg @@ -1,5 +1,5 @@ [bumpversion] -current_version = 0.14.4 
+current_version = 0.14.5 parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)((?P<release>(a|na))+(?P<build>\d+))? serialize = {major}.{minor}.{patch}{release}{build} diff --git a/.github/workflows/ci-production.yml b/.github/workflows/ci-production.yml index 4ba916bf..6300f75a 100644 --- a/.github/workflows/ci-production.yml +++ b/.github/workflows/ci-production.yml @@ -32,7 +32,7 @@ jobs: - name: create package run: python -m build --sdist - name: import open-mastr - run: python -m pip install ./dist/open_mastr-0.14.4.tar.gz + run: python -m pip install ./dist/open_mastr-0.14.5.tar.gz - name: Create credentials file env: MASTR_TOKEN: ${{ secrets.MASTR_TOKEN }} diff --git a/CITATION.cff b/CITATION.cff index 4ecc093a..c20944c8 100644 --- a/CITATION.cff +++ b/CITATION.cff @@ -33,7 +33,7 @@ authors: title: "open-MaStR" type: software license: AGPL-3.0 -version: 0.14.4 +version: 0.14.5 doi: date-released: 2024-06-07 url: "https://github.com/OpenEnergyPlatform/open-MaStR/" diff --git a/pyproject.toml b/pyproject.toml index 17616f96..de0a2198 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "open_mastr" -version = "0.14.4" +version = "0.14.5" dependencies = [ "pandas>=2.2.2", "numpy", @@ -80,4 +80,4 @@ open_mastr = [ include = ["open_mastr", "open_mastr.soap_api", "open_mastr.soap_api.metadata", "open_mastr.utils", "open_mastr.utils.config", "open_mastr.xml_download"] # package names should match these glob patterns (["*"] by default) # from setup.py - not yet included in here -# download_url="https://github.com/OpenEnergyPlatform/open-MaStR/archive""/refs/tags/v0.14.4.tar.gz", +# download_url="https://github.com/OpenEnergyPlatform/open-MaStR/archive""/refs/tags/v0.14.5.tar.gz", From 7b08d43da123db84fe9ef93ceb56eef17a2a9e6d Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 10:20:01 +0200 Subject: [PATCH 45/57] Update changelog #578 --- CHANGELOG.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git 
a/CHANGELOG.md b/CHANGELOG.md index 99c2e35d..2317f04e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,7 +6,7 @@ For each version important additions, changes and removals are listed here. The format is inspired from [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/v2.0.0.html). -## [v0.XX.X] unreleased - 2024-XX-XX +## [v0.14.5] New MaStR data model, battery export, various fixes - 2024-10-11 ### Added - Add `deleted_market_actors` to data model and prevent crash on unknown tables [#575](https://github.com/OpenEnergyPlatform/open-MaStR/pull/575) @@ -25,7 +25,6 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ [#556](https://github.com/OpenEnergyPlatform/open-MaStR/issues/556) - Allow CSV export of table `storage_units` [#565](https://github.com/OpenEnergyPlatform/open-MaStR/pull/565) -### Removed ## [v0.14.4] Release for the Journal of Open Source Software JOSS - 2024-06-07 ### Added @@ -39,7 +38,6 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ - Fixed missing call to gen_url in case first bulk download fails as xml file for today is not yet available [#534](https://github.com/OpenEnergyPlatform/open-MaStR/pull/534) - Repair links in the documentation page [#536](https://github.com/OpenEnergyPlatform/open-MaStR/pull/536) - ## [v0.14.3] Fix Pypi Release - 2024-04-24 ### Added - Add new table `changed_dso_assignment` [#510](https://github.com/OpenEnergyPlatform/open-MaStR/pull/510) From 9b9ce57cda7037a4d3479bf1b9b0491da321133b Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 10:21:10 +0200 Subject: [PATCH 46/57] Update release date in citation.cff #578 --- CITATION.cff | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CITATION.cff b/CITATION.cff index c20944c8..591a84ea 100644 --- a/CITATION.cff +++ b/CITATION.cff @@ -35,5 +35,5 @@ type: software license: AGPL-3.0 version: 0.14.5 
doi: -date-released: 2024-06-07 +date-released: 2024-10-11 url: "https://github.com/OpenEnergyPlatform/open-MaStR/" From 6d7b3f7acdfb608e49c37c725e8e815b95af689f Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 10:24:50 +0200 Subject: [PATCH 47/57] Update maintainers list #578 --- pyproject.toml | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/pyproject.toml b/pyproject.toml index de0a2198..718771b7 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -30,10 +30,9 @@ authors = [ ] maintainers = [ - {name = "Ludwig Hülk", email = "datenzentrum@rl-institut.de"}, {name = "Florian Kotthoff"}, - {name = "Christoph Muschner", email = "datenzentrum@rl-institut.de"}, - {name = "Jonathan Amme", email = "jonathan.amme@rl-institut.de"} + {name = "Jonathan Amme", email = "jonathan.amme@rl-institut.de"}, + {name = "Ludwig Hülk", email = "datenzentrum@rl-institut.de"}, ] description = "A package that provides an interface for downloading and processing the data of the Marktstammdatenregister (MaStR)" readme = "README.rst" From f47ea22da53c4b01cb95abfad5575a7bbab71d9e Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 10:31:02 +0200 Subject: [PATCH 48/57] Fix typo in release procedure #578 --- RELEASE_PROCEDURE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE_PROCEDURE.md b/RELEASE_PROCEDURE.md index 521b3717..c432385a 100644 --- a/RELEASE_PROCEDURE.md +++ b/RELEASE_PROCEDURE.md @@ -51,7 +51,7 @@ It always has the format `YYYY-MM-DD`, e.g. `2022-05-16`. ### 5. 💠 Create a `release` branch * Checkout `develop` and branch with `git checkout -b release-v0.12.1` * Update version for test release with `bump2version --current-version --new-version patch` -* Commit version update with `git commit -am "version update v0.12.1a1"` +* Commit version update with `git commit -am "version update v0.12.1"` * Push branch with `git push --set-upstream origin release-v0.12.1` ### 6. 
📝 Update the version files From 73e460d20264f07d32e9415bee4557d71d447784 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 10:44:49 +0200 Subject: [PATCH 49/57] Update maintainers list #578 --- pyproject.toml | 1 + 1 file changed, 1 insertion(+) diff --git a/pyproject.toml b/pyproject.toml index 718771b7..35a6c1e9 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -33,6 +33,7 @@ maintainers = [ {name = "Florian Kotthoff"}, {name = "Jonathan Amme", email = "jonathan.amme@rl-institut.de"}, {name = "Ludwig Hülk", email = "datenzentrum@rl-institut.de"}, + {name = "Christoph Muschner"}, ] description = "A package that provides an interface for downloading and processing the data of the Marktstammdatenregister (MaStR)" readme = "README.rst" From 8bde6afae0b21fae09ab104ae46d0698550f832b Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 10:56:52 +0200 Subject: [PATCH 50/57] Fix deprecations warnings in tests #578 --- open_mastr/soap_api/download.py | 2 +- open_mastr/soap_api/mirror.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/open_mastr/soap_api/download.py b/open_mastr/soap_api/download.py index ff0537a1..dc96266c 100644 --- a/open_mastr/soap_api/download.py +++ b/open_mastr/soap_api/download.py @@ -466,7 +466,7 @@ def __init__(self, parallel_processes=None): multiprocessing package) choose False. Defaults to number of cores (including hyperthreading). """ - log.warn( + log.warning( """ The `MaStRDownload` class is deprecated and will not be maintained in the future. To get a full table of the Marktstammdatenregister, use the open_mastr.Mastr.download diff --git a/open_mastr/soap_api/mirror.py b/open_mastr/soap_api/mirror.py index 9dda3c6e..ad8e9722 100644 --- a/open_mastr/soap_api/mirror.py +++ b/open_mastr/soap_api/mirror.py @@ -99,7 +99,7 @@ def __init__( Number of parallel processes used to download additional data. Defaults to `None`. 
""" - log.warn( + log.warning( """ The `MaStRMirror` class is deprecated and will not be maintained in the future. To get a full table of the Marktstammdatenregister, use the open_mastr.Mastr.download From c93b8f4e2c643c19366499b216a679db4bded670 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 11:01:04 +0200 Subject: [PATCH 51/57] Apply black #578 --- docs/conf.py | 33 ++-- open_mastr/soap_api/metadata/description.py | 70 ++++++--- open_mastr/utils/config.py | 24 +-- open_mastr/utils/credentials.py | 80 ++++++---- postprocessing/helpers.py | 39 +++-- postprocessing/orm.py | 24 ++- postprocessing/postprocessing.py | 129 ++++++++++------ postprocessing/turbine_match.py | 143 +++++++++++------- scripts/mirror_mastr_csv_export.py | 6 +- scripts/mirror_mastr_dump.py | 4 +- scripts/mirror_mastr_update_latest.py | 32 ++-- tests/preparation.py | 7 +- tests/test_helpers.py | 7 +- .../xml_download/test_utils_cleansing_bulk.py | 1 + .../xml_download/test_utils_download_bulk.py | 31 +++- 15 files changed, 399 insertions(+), 231 deletions(-) diff --git a/docs/conf.py b/docs/conf.py index 42a70728..5e578719 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -12,14 +12,15 @@ # import os import sys -sys.path.insert(0, os.path.abspath('../open_mastr')) + +sys.path.insert(0, os.path.abspath("../open_mastr")) # -- Project information ----------------------------------------------------- -project = 'open-MaStR' -copyright = '2022 Reiner Lemoine Institut and fortiss' -author = '' +project = "open-MaStR" +copyright = "2022 Reiner Lemoine Institut and fortiss" +author = "" # -- General configuration --------------------------------------------------- @@ -28,22 +29,22 @@ # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. 
extensions = [ - 'sphinx.ext.autosectionlabel', - 'sphinx.ext.autodoc', - 'sphinx.ext.napoleon', - 'sphinx_tabs.tabs', - 'm2r2', + "sphinx.ext.autosectionlabel", + "sphinx.ext.autodoc", + "sphinx.ext.napoleon", + "sphinx_tabs.tabs", + "m2r2", ] source_suffix = [".rst", ".md"] # Add any paths that contain templates here, relative to this directory. -templates_path = ['_templates'] +templates_path = ["_templates"] # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This pattern also affects html_static_path and html_extra_path. -exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store'] +exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] # -- Options for HTML output ------------------------------------------------- @@ -51,13 +52,13 @@ # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # -html_theme = 'sphinx_rtd_theme' +html_theme = "sphinx_rtd_theme" # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". 
-html_static_path = ['_static'] -html_css_files = ['custom.css'] +html_static_path = ["_static"] +html_css_files = ["custom.css"] -# Autodoc config -autoclass_content = 'both' +# Autodoc config +autoclass_content = "both" diff --git a/open_mastr/soap_api/metadata/description.py b/open_mastr/soap_api/metadata/description.py index 8fc55526..a4986959 100644 --- a/open_mastr/soap_api/metadata/description.py +++ b/open_mastr/soap_api/metadata/description.py @@ -33,19 +33,19 @@ def __init__(self, xml=None): self.xml = fh.read() else: # If no XML file is given, the file is read from an URL - zipurl = 'https://www.marktstammdatenregister.de/MaStRHilfe/files/' \ - 'webdienst/Dienstbeschreibung_1_2_39_Produktion.zip' + zipurl = ( + "https://www.marktstammdatenregister.de/MaStRHilfe/files/" + "webdienst/Dienstbeschreibung_1_2_39_Produktion.zip" + ) with urlopen(zipurl) as zipresp: with ZipFile(BytesIO(zipresp.read())) as zfile: - self.xml = zfile.read('xsd/mastrbasetypes.xsd') - - + self.xml = zfile.read("xsd/mastrbasetypes.xsd") # Parse XML and extract relevant data parsed = xmltodict.parse(self.xml, process_namespaces=False) - self.complex_types = parsed['schema']["complexType"] - self.simple_types = parsed['schema']["simpleType"] + self.complex_types = parsed["schema"]["complexType"] + self.simple_types = parsed["schema"]["simpleType"] # Prepare parsed data for documentational purposes abstract_types, parameters, responses, types = self._filter_type_descriptions() @@ -78,13 +78,17 @@ def _filter_type_descriptions(self): raise ValueError("Ohh...") else: # Filter all functions - if item["@name"].startswith(("Get", "Set", "Erneute", "Verschiebe", "Delete")): + if item["@name"].startswith( + ("Get", "Set", "Erneute", "Verschiebe", "Delete") + ): functions.append(item) # Further split the list of functions into paramters and responses if item["@name"].endswith("Parameter"): if "complexContent" in item.keys(): - parameters[item["@name"]] = item["complexContent"]["extension"] + 
parameters[item["@name"]] = item["complexContent"][ + "extension" + ] else: parameters[item["@name"]] = item elif item["@name"].endswith("Antwort"): @@ -111,12 +115,14 @@ def prepare_simple_type(self): for simple_type in self.simple_types: if "enumeration" in simple_type["restriction"]: - possible_values = [_["@value"] for _ in simple_type["restriction"]["enumeration"]] + possible_values = [ + _["@value"] for _ in simple_type["restriction"]["enumeration"] + ] else: possible_values = [] simple_types_doc[simple_type["@name"]] = { "type": simple_type["restriction"]["@base"], - "values": possible_values + "values": possible_values, } return simple_types_doc @@ -140,7 +146,9 @@ def functions_data_documentation(self): if "annotation" in fcn["sequence"]["element"]: fcn_data = [fcn["sequence"]["element"]] else: - fcn_data = self.types[fcn["sequence"]["element"]["@type"].split(":")[1]]["sequence"]["element"] + fcn_data = self.types[ + fcn["sequence"]["element"]["@type"].split(":")[1] + ]["sequence"]["element"] else: print(type(fcn["sequence"])) print(fcn["sequence"]) @@ -148,41 +156,51 @@ def functions_data_documentation(self): # Add data for inherited columns from base types if "@base" in fcn: - if not fcn["@base"] == 'mastr:AntwortBasis': - fcn_data = _collect_columns_of_base_type(self.types, fcn["@base"].split(":")[1], fcn_data) + if not fcn["@base"] == "mastr:AntwortBasis": + fcn_data = _collect_columns_of_base_type( + self.types, fcn["@base"].split(":")[1], fcn_data + ) function_docs[fcn_name] = {} for column in fcn_data: # Replace MaStR internal types with more general ones if column["@type"].startswith("mastr:"): try: - column_type = self.simple_types_prepared[column["@type"].split(":")[1]]["type"] + column_type = self.simple_types_prepared[ + column["@type"].split(":")[1] + ]["type"] except KeyError: column_type = column["@type"] else: column_type = column["@type"] if "annotation" in column.keys(): - description = column["annotation"]["documentation"].get("#text", 
None) + description = column["annotation"]["documentation"].get( + "#text", None + ) if description: - description = re.sub(" +", " ", description.replace("\n", "")) + description = re.sub( + " +", " ", description.replace("\n", "") + ) function_docs[fcn_name][column["@name"]] = { - "type": column_type, - "description": description, - "example": column["annotation"]["documentation"].get("m-ex", None) + "type": column_type, + "description": description, + "example": column["annotation"]["documentation"].get( + "m-ex", None + ), } else: function_docs[fcn_name][column["@name"]] = { "type": column_type, # TODO: insert information from simple type here "description": None, - "example": None + "example": None, } # Hack in a descrition for a column that gets created after download while flattening data function_docs["GetEinheitWind"]["HerstellerId"] = { "type": "str", "description": "Id des Herstellers der Einheit", - "example": 923 + "example": 923, } return function_docs @@ -193,7 +211,11 @@ def _collect_columns_of_base_type(base_types, base_type_name, fcn_data): fcn_data += type_description["extension"]["sequence"]["element"] if "@base" in type_description["extension"]: - if not type_description["extension"]["@base"] == 'mastr:AntwortBasis': - fcn_data = _collect_columns_of_base_type(base_types, type_description["extension"]["@base"].split(":")[1], fcn_data) + if not type_description["extension"]["@base"] == "mastr:AntwortBasis": + fcn_data = _collect_columns_of_base_type( + base_types, + type_description["extension"]["@base"].split(":")[1], + fcn_data, + ) return fcn_data diff --git a/open_mastr/utils/config.py b/open_mastr/utils/config.py index b1146269..40f67ec8 100644 --- a/open_mastr/utils/config.py +++ b/open_mastr/utils/config.py @@ -2,7 +2,6 @@ # -*- coding: utf-8 -*- - """ Service functions for logging @@ -26,7 +25,11 @@ import logging import logging.config -from open_mastr.utils.constants import TECHNOLOGIES, API_LOCATION_TYPES, ADDITIONAL_TABLES +from 
open_mastr.utils.constants import ( + TECHNOLOGIES, + API_LOCATION_TYPES, + ADDITIONAL_TABLES, +) log = logging.getLogger(__name__) @@ -57,7 +60,7 @@ def get_output_dir(): """ if "OUTPUT_PATH" in os.environ: - return os.environ.get('OUTPUT_PATH') + return os.environ.get("OUTPUT_PATH") return get_project_home_dir() @@ -76,7 +79,7 @@ def get_data_version_dir(): data_version = get_data_config() if "OUTPUT_PATH" in os.environ: - return os.path.join(os.environ.get('OUTPUT_PATH'), "data", data_version) + return os.path.join(os.environ.get("OUTPUT_PATH"), "data", data_version) return os.path.join(get_project_home_dir(), "data", data_version) @@ -230,9 +233,7 @@ def _filenames_generator(): } # Add file names of processed data - filenames["postprocessed"] = { - tech: f"{prefix}_{tech}.csv" for tech in TECHNOLOGIES - } + filenames["postprocessed"] = {tech: f"{prefix}_{tech}.csv" for tech in TECHNOLOGIES} # Add filenames for location data filenames["raw"].update( @@ -240,8 +241,13 @@ def _filenames_generator(): ) # Add filenames for additional tables - filenames["raw"].update({"additional_table": - {addit_table: f"{prefix}_{addit_table}_raw.csv" for addit_table in ADDITIONAL_TABLES}} + filenames["raw"].update( + { + "additional_table": { + addit_table: f"{prefix}_{addit_table}_raw.csv" + for addit_table in ADDITIONAL_TABLES + } + } ) # Add metadata file diff --git a/open_mastr/utils/credentials.py b/open_mastr/utils/credentials.py index c00495f4..ee818828 100644 --- a/open_mastr/utils/credentials.py +++ b/open_mastr/utils/credentials.py @@ -20,12 +20,13 @@ import keyring import logging + log = logging.getLogger(__name__) def _load_config_file(): - config_file = os.path.join(get_project_home_dir(), 'config', 'credentials.cfg') + config_file = os.path.join(get_project_home_dir(), "config", "credentials.cfg") cfg = cp.ConfigParser() # if not os.path.isdir(open_mastr_home): @@ -35,7 +36,7 @@ def _load_config_file(): cfg.read(config_file) return cfg else: - with open(config_file, 
'w') as configfile: + with open(config_file, "w") as configfile: cfg.write(configfile) return cfg @@ -53,7 +54,7 @@ def get_mastr_user(): """ cfg = _load_config_file() section = "MaStR" - cfg_path = os.path.join(get_project_home_dir(), 'config', 'credentials.cfg') + cfg_path = os.path.join(get_project_home_dir(), "config", "credentials.cfg") try: user = cfg.get(section, "user") @@ -66,10 +67,12 @@ def get_mastr_user(): # except cp.NoOptionError: # raise cp.Error(f"The option 'user' could not by found in the section " # f"{section} in file {cfg_path}.") - log.warning(f"The option 'user' could not by found in the section " - f"{section} in file {cfg_path}. " - f"You might run into trouble when downloading data via the MaStR API." - f"\n Bulk download works without option 'user'.") + log.warning( + f"The option 'user' could not by found in the section " + f"{section} in file {cfg_path}. " + f"You might run into trouble when downloading data via the MaStR API." + f"\n Bulk download works without option 'user'." 
+ ) return None @@ -79,15 +82,19 @@ def check_and_set_mastr_user(): user = get_mastr_user() if not user: - credentials_file = os.path.join(get_project_home_dir(), 'config', 'credentials.cfg') + credentials_file = os.path.join( + get_project_home_dir(), "config", "credentials.cfg" + ) cfg = _load_config_file() - user = input('\n\nCannot not find a MaStR user name in {config_file}.\n\n' - 'Please enter MaStR-ID (pattern: SOM123456789012): ' - ''.format(config_file=credentials_file)) + user = input( + "\n\nCannot not find a MaStR user name in {config_file}.\n\n" + "Please enter MaStR-ID (pattern: SOM123456789012): " + "".format(config_file=credentials_file) + ) cfg["MaStR"] = {"user": user} - with open(credentials_file, 'w') as configfile: + with open(credentials_file, "w") as configfile: cfg.write(configfile) return user @@ -115,7 +122,7 @@ def get_mastr_token(user): # Retrieving password from keyring does currently fail on headless systems # Prevent from breaking program execution with following try/except clause section = "MaStR" - cfg_path = os.path.join(get_project_home_dir(), 'config', 'credentials.cfg') + cfg_path = os.path.join(get_project_home_dir(), "config", "credentials.cfg") try: password = keyring.get_password(section, user) except: @@ -127,10 +134,12 @@ def get_mastr_token(user): try: password = cfg.get(section, "token") except (cp.NoSectionError, cp.NoOptionError): - log.warning(f"The option 'token' could not by found in the section " - f"{section} in file {cfg_path}. " - f"You might run into trouble when downloading data via the MaStR API." - f"\n Bulk download works without option 'token'.") + log.warning( + f"The option 'token' could not by found in the section " + f"{section} in file {cfg_path}. " + f"You might run into trouble when downloading data via the MaStR API." + f"\n Bulk download works without option 'token'." 
+ ) password = None return password @@ -142,17 +151,21 @@ def check_and_set_mastr_token(user): if not password: cfg = _load_config_file() - credentials_file = os.path.join(get_project_home_dir(), 'config', 'credentials.cfg') + credentials_file = os.path.join( + get_project_home_dir(), "config", "credentials.cfg" + ) # If also no password in credentials file, ask the user to input password # Two options: (1) storing in keyring; (2) storing in config file - password = input('\n\nCannot not find a MaStR password, neither in keyring nor in {config_file}.\n\n' - "Please enter a valid access token of a role (Benutzerrolle) " - "associated to the user {user}.\n" - "The token might look like: " - "koo5eixeiQuoi'w8deighai8ahsh1Ha3eib3coqu7ceeg%ies...\n".format( - config_file=credentials_file, - user=user)) + password = input( + "\n\nCannot not find a MaStR password, neither in keyring nor in {config_file}.\n\n" + "Please enter a valid access token of a role (Benutzerrolle) " + "associated to the user {user}.\n" + "The token might look like: " + "koo5eixeiQuoi'w8deighai8ahsh1Ha3eib3coqu7ceeg%ies...\n".format( + config_file=credentials_file, user=user + ) + ) # let the user decide where to store the password # (1) keyring @@ -160,10 +173,15 @@ def check_and_set_mastr_token(user): # (0) don't store, abort # Wait for correct input while True: - choice = int(input("Where do you want to store your password?\n" - "\t(1) Keyring (default, hit ENTER to select)\n" - "\t(2) Config file (credendials.cfg)\n" - "\t(0) Abort. Don't store password\n") or "1\n") + choice = int( + input( + "Where do you want to store your password?\n" + "\t(1) Keyring (default, hit ENTER to select)\n" + "\t(2) Config file (credendials.cfg)\n" + "\t(0) Abort. 
Don't store password\n" + ) + or "1\n" + ) # check if choice is valid input if choice in [0, 1, 2]: break @@ -175,7 +193,7 @@ def check_and_set_mastr_token(user): keyring.set_password("MaStR", user, password) elif choice == 2: cfg["MaStR"] = {"user": user, "token": password} - with open(credentials_file, 'w') as configfile: + with open(credentials_file, "w") as configfile: cfg.write(configfile) else: log.error("No clue what happened here!?") @@ -199,4 +217,4 @@ def get_zenodo_token(): user = cfg.get(section, "token") return user except (cp.NoSectionError, cp.NoOptionError): - return None \ No newline at end of file + return None diff --git a/postprocessing/helpers.py b/postprocessing/helpers.py index 0cccf27f..5084c7f3 100644 --- a/postprocessing/helpers.py +++ b/postprocessing/helpers.py @@ -1,4 +1,5 @@ from bokeh.palettes import Category10_10 as palette + # import geoviews as gv import bokeh @@ -9,9 +10,9 @@ def plotPowerPlants(df): # size marker according to gross power output iMaxSize = 30 iMinSize = 10 - df["size"] = (df["Bruttoleistung"] - df["Bruttoleistung"].min()) / \ - (df["Bruttoleistung"].max() - df["Bruttoleistung"].min()) * \ - (iMaxSize - iMinSize) + iMinSize + df["size"] = (df["Bruttoleistung"] - df["Bruttoleistung"].min()) / ( + df["Bruttoleistung"].max() - df["Bruttoleistung"].min() + ) * (iMaxSize - iMinSize) + iMinSize # convert datetime to string df["date"] = df["Inbetriebnahmedatum"].dt.strftime("%Y-%m-%d") @@ -41,17 +42,35 @@ def plotPowerPlants(df): for group in groups: df_group = df.loc[ df["Einheittyp"] == group, - ["Name", "Standort", "Bundesland", "Land", "date", - "Einheittyp", "Bruttoleistung", "Laengengrad", "Breitengrad", "size"] + [ + "Name", + "Standort", + "Bundesland", + "Land", + "date", + "Einheittyp", + "Bruttoleistung", + "Laengengrad", + "Breitengrad", + "size", + ], ] - points = gv.Points(df_group, ["Laengengrad", "Breitengrad"], label=group).options( - aspect=2, responsive=True, tools=[hover_tool], size="size", 
active_tools=['wheel_zoom'], - fill_alpha=0.6, fill_color=colors[group], line_color="white", + points = gv.Points( + df_group, ["Laengengrad", "Breitengrad"], label=group + ).options( + aspect=2, + responsive=True, + tools=[hover_tool], + size="size", + active_tools=["wheel_zoom"], + fill_alpha=0.6, + fill_color=colors[group], + line_color="white", ) - overlay = (overlay * points) + overlay = overlay * points # hide group when clicking on legend overlay.options(click_policy="hide", clone=False) # return figure - return overlay \ No newline at end of file + return overlay diff --git a/postprocessing/orm.py b/postprocessing/orm.py index f8612ee4..5a98ed07 100644 --- a/postprocessing/orm.py +++ b/postprocessing/orm.py @@ -1,7 +1,18 @@ from geoalchemy2 import Geometry from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.schema import MetaData -from sqlalchemy import Column, Integer, String, Float, Sequence, DateTime, Boolean, func, Date, JSON +from sqlalchemy import ( + Column, + Integer, + String, + Float, + Sequence, + DateTime, + Boolean, + func, + Date, + JSON, +) from sqlalchemy.dialects.postgresql import JSONB cleaned_schema = "model_draft" @@ -30,7 +41,6 @@ class BasicUnit(object): StatisikFlag_basic = Column(String) - class Extended(object): EinheitMastrNummer_extended = Column(String) @@ -91,7 +101,7 @@ class Extended(object): Einspeisungsart = Column(String) PraequalifiziertFuerRegelenergie = Column(Boolean) GenMastrNummer_extended = Column(String) - geom = Column(Geometry('POINT')) + geom = Column(Geometry("POINT")) comment = Column(String) @@ -175,6 +185,7 @@ class HydroEeg(Eeg): class StorageEeg(Eeg): pass + class Kwk(object): KwkMastrNummer_kwk = Column(String) @@ -205,7 +216,7 @@ class Permit(object): class WindCleaned(Permit, WindEeg, Extended, BasicUnit, Base): - __tablename__ = 'bnetza_mastr_wind_clean' + __tablename__ = "bnetza_mastr_wind_clean" # wind specific attributes NameWindpark = Column(String) @@ -231,8 +242,7 @@ class 
WindCleaned(Permit, WindEeg, Extended, BasicUnit, Base): Kuestenentfernung = Column(Float) EegMastrNummer_extended = Column(String) tags = Column(JSONB) - geom_3035 = Column(Geometry('POINT', srid=3035)) - + geom_3035 = Column(Geometry("POINT", srid=3035)) class SolarCleaned(Permit, SolarEeg, Extended, BasicUnit, Base): @@ -288,7 +298,7 @@ class CombustionCleaned(Permit, Kwk, Extended, BasicUnit, Base): AnteiligNutzungsberechtigte = Column(String) Notstromaggregat = Column(Boolean) Einsatzort = Column(String) - KwkMastrNummer_extended = Column(String) # changed here + KwkMastrNummer_extended = Column(String) # changed here Technologie = Column(String) diff --git a/postprocessing/postprocessing.py b/postprocessing/postprocessing.py index e860dbd2..cbb86e26 100644 --- a/postprocessing/postprocessing.py +++ b/postprocessing/postprocessing.py @@ -16,25 +16,13 @@ log = setup_logger() -BKG_VG250 = { - "schema": "boundaries", - "table": "bkg_vg250_1_sta_union_mview" -} +BKG_VG250 = {"schema": "boundaries", "table": "bkg_vg250_1_sta_union_mview"} -OSM_PLZ = { - "schema": "boundaries", - "table": "osm_postcode" -} +OSM_PLZ = {"schema": "boundaries", "table": "osm_postcode"} -OFFSHORE = { - "schema": "model_draft", - "table": "rli_boundaries_offshore" -} +OFFSHORE = {"schema": "model_draft", "table": "rli_boundaries_offshore"} -OSM_WINDPOWER = { - "schema": "model_draft", - "table": "mastr_osm_deu_point_windpower" -} +OSM_WINDPOWER = {"schema": "model_draft", "table": "mastr_osm_deu_point_windpower"} OEP_QUERY_PATTERN = "https://openenergy-platform.org/api/v0/schema/{schema}/tables/{table}/rows?form=csv" @@ -43,7 +31,16 @@ MASTR_RAW_SCHEMA = "model_draft" OPEN_MASTR_SCHEMA = "model_draft" -TECHNOLOGIES = ["wind", "hydro", "solar", "biomass", "combustion", "nuclear", "gsgk", "storage"] +TECHNOLOGIES = [ + "wind", + "hydro", + "solar", + "biomass", + "combustion", + "nuclear", + "gsgk", + "storage", +] orm_map = { "wind": { @@ -113,15 +110,17 @@ def table_to_db(csv_data, 
table, schema, conn, geom_col="geom", srid=4326): query = "CREATE SCHEMA IF NOT EXISTS {schema}".format(schema=schema) conn.execute(query) - csv_data.to_sql(table, - con=conn, - schema=schema, - dtype={ - geom_col: Geometry(srid=srid), - "plz": String(), - }, - chunksize=100000, - if_exists="replace") + csv_data.to_sql( + table, + con=conn, + schema=schema, + dtype={ + geom_col: Geometry(srid=srid), + "plz": String(), + }, + chunksize=100000, + if_exists="replace", + ) def table_to_db_orm(mapper, data, chunksize=10000): @@ -144,6 +143,7 @@ def table_to_db_orm(mapper, data, chunksize=10000): # Commit each chunk separately session.commit() + def import_boundary_data_csv(schema, table, index_col="id", srid=4326): """ Import additional data for post-processing @@ -166,32 +166,43 @@ def import_boundary_data_csv(schema, table, index_col="id", srid=4326): with db_engine().connect() as con: # Check if table already exists - table_query = "SELECT to_regclass('{schema}.{table}');".format(schema=schema, table=table) + table_query = "SELECT to_regclass('{schema}.{table}');".format( + schema=schema, table=table + ) table_name = "{schema}.{table}".format(schema=schema, table=table) table_exists = table_name in con.execute(table_query).first().values() if not table_exists: # Download CSV file if it does not exist if not csv_file_exists: - log.info("Downloading table {schema}.{table} from OEP".format(schema=schema, table=table)) + log.info( + "Downloading table {schema}.{table} from OEP".format( + schema=schema, table=table + ) + ) urlretrieve( - OEP_QUERY_PATTERN.format(schema=schema, table=table), - csv_file) + OEP_QUERY_PATTERN.format(schema=schema, table=table), csv_file + ) else: log.info("Found {} locally.".format(csv_file)) # Read CSV file - csv_data = pd.read_csv(csv_file, - index_col=index_col) + csv_data = pd.read_csv(csv_file, index_col=index_col) # Prepare geom data for DB upload - csv_data["geom"] = csv_data["geom"].apply(lambda x: WKTElement(wkb_loads(x, 
hex=True).wkt, srid=srid)) + csv_data["geom"] = csv_data["geom"].apply( + lambda x: WKTElement(wkb_loads(x, hex=True).wkt, srid=srid) + ) # Insert to db table_to_db(csv_data, table, schema, con, srid=srid) log.info("Data from {} successfully imported to database.".format(csv_file)) else: - log.info("Table '{schema}.{table}' already exists in local database".format(schema=schema, table=table)) + log.info( + "Table '{schema}.{table}' already exists in local database".format( + schema=schema, table=table + ) + ) def add_geom_col(df, lat_col="Breitengrad", lon_col="Laengengrad", srid=4326): @@ -219,17 +230,21 @@ def add_geom_col(df, lat_col="Breitengrad", lon_col="Laengengrad", srid=4326): df_with_coords = df.loc[~(df["Breitengrad"].isna() | df["Laengengrad"].isna())] # Just select data with lat/lon in range [(-90,90), (-180,180)] - df_with_coords = df_with_coords[~((df_with_coords["Breitengrad"] < -90) - | (df_with_coords["Breitengrad"] > 90) - | (df_with_coords["Laengengrad"] < -180) - | (df_with_coords["Laengengrad"] > 180)) + df_with_coords = df_with_coords[ + ~( + (df_with_coords["Breitengrad"] < -90) + | (df_with_coords["Breitengrad"] > 90) + | (df_with_coords["Laengengrad"] < -180) + | (df_with_coords["Laengengrad"] > 180) + ) ] df_no_coords = df.loc[~df.index.isin(df_with_coords.index)] - gdf = gpd.GeoDataFrame( - df_with_coords, geometry=gpd.points_from_xy(df_with_coords[lon_col], df_with_coords[lat_col]), - crs="EPSG:{}".format(srid)) + df_with_coords, + geometry=gpd.points_from_xy(df_with_coords[lon_col], df_with_coords[lat_col]), + crs="EPSG:{}".format(srid), + ) gdf["geom"] = gdf["geometry"].apply(lambda x: WKTElement(x.wkt, srid=srid)) gdf.drop(columns=["geometry"], inplace=True) @@ -271,9 +286,15 @@ def run_sql_postprocessing(): if tech_name not in ["gsgk", "storage", "nuclear"]: log.info(f"Run post-processing on {tech_name} data") # Read SQL query from file - with open(os.path.join(os.path.dirname(__file__), - "db-cleansing", - 
"rli-mastr-{tech_name}-cleansing.sql".format(tech_name=tech_name))) as file: + with open( + os.path.join( + os.path.dirname(__file__), + "db-cleansing", + "rli-mastr-{tech_name}-cleansing.sql".format( + tech_name=tech_name + ), + ) + ) as file: escaped_sql = text(file.read()) # Execute query @@ -334,21 +355,29 @@ def to_csv(limit=None): with session_scope() as session: orm_tech = getattr(orm, orm_map[tech]["cleaned"]) query = session.query(orm_tech).limit(limit) - df = pd.read_sql(query.statement, query.session.bind, index_col="EinheitMastrNummer") + df = pd.read_sql( + query.statement, query.session.bind, index_col="EinheitMastrNummer" + ) csv_file = os.path.join(data_path, filenames["postprocessed"][tech]) - df.to_csv(csv_file, index=True, index_label="EinheitMastrNummer", encoding='utf-8') + df.to_csv( + csv_file, index=True, index_label="EinheitMastrNummer", encoding="utf-8" + ) if df["DatumLetzteAktualisierung"].max() > newest_date: newest_date = df["DatumLetzteAktualisierung"].max() # Save metadata along with data metadata_file = os.path.join(data_path, filenames["metadata"]) - metadata = create_datapackage_meta_json(newest_date, TECHNOLOGIES, data=["raw", "cleaned", "postprocessed"], - json_serialize=False) - - with open(metadata_file, 'w', encoding='utf-8') as f: + metadata = create_datapackage_meta_json( + newest_date, + TECHNOLOGIES, + data=["raw", "cleaned", "postprocessed"], + json_serialize=False, + ) + + with open(metadata_file, "w", encoding="utf-8") as f: json.dump(metadata, f, ensure_ascii=False, indent=4) diff --git a/postprocessing/turbine_match.py b/postprocessing/turbine_match.py index 8b400e4e..caacc537 100644 --- a/postprocessing/turbine_match.py +++ b/postprocessing/turbine_match.py @@ -17,68 +17,109 @@ import pandas as pd import os + def read_csv_turbine(csv_name): - turbines = pd.read_csv(csv_name, header=0, encoding='utf-8', sep=',', error_bad_lines=True, index_col=False, - dtype={'index': int, 'id': int,'turbine_id':int, 'manufacturer': 
str, 'name': str, 'turbine_type': str, - 'nominal_power': str, 'rotor_diamter': str,'rotor_area': str, 'hub_height': str, - 'max_speed_drive': str, 'wind_class_iec':str, 'wind_zone_dibt': str, - 'power_density': str, 'power_density_2': str,'calculated': str, - 'has_power_curve': str, 'power_curve_wind_speeds': str, 'power_curve_values': str, 'has_cp_curve': str, - 'power_coefficient_curve_wind_speeds': str, 'power_coefficient_curve_values': str, - 'has_ct_curve': str, 'thrust_coefficient_curve_wind_speeds': str, 'thrust_coefficient_curve_values': str, 'source': str}, + turbines = pd.read_csv( + csv_name, + header=0, + encoding="utf-8", + sep=",", + error_bad_lines=True, + index_col=False, + dtype={ + "index": int, + "id": int, + "turbine_id": int, + "manufacturer": str, + "name": str, + "turbine_type": str, + "nominal_power": str, + "rotor_diamter": str, + "rotor_area": str, + "hub_height": str, + "max_speed_drive": str, + "wind_class_iec": str, + "wind_zone_dibt": str, + "power_density": str, + "power_density_2": str, + "calculated": str, + "has_power_curve": str, + "power_curve_wind_speeds": str, + "power_curve_values": str, + "has_cp_curve": str, + "power_coefficient_curve_wind_speeds": str, + "power_coefficient_curve_values": str, + "has_ct_curve": str, + "thrust_coefficient_curve_wind_speeds": str, + "thrust_coefficient_curve_values": str, + "source": str, + }, ) return turbines + def create_dataset(df): - types = [] - for i,r in df.iterrows(): - types.append(prepare_turbine_type(r)) - df.insert(6,'turbine_type_v2',types) - write_to_csv(df, 'turbine_library_t.csv') + types = [] + for i, r in df.iterrows(): + types.append(prepare_turbine_type(r)) + df.insert(6, "turbine_type_v2", types) + write_to_csv(df, "turbine_library_t.csv") def write_to_csv(df, path): - with open(path, mode='a', encoding='utf-8') as file: - df.to_csv(file, sep=',', - mode='a', - header=file.tell() == 0, - line_terminator='\n', - encoding='utf-8') + with open(path, mode="a", 
encoding="utf-8") as file: + df.to_csv( + file, + sep=",", + mode="a", + header=file.tell() == 0, + line_terminator="\n", + encoding="utf-8", + ) def prepare_turbine_type(turbine): - nom_pow = turbine.nominal_power - diam = turbine.rotor_diameter - man = get_manufacturer_short(turbine.manufacturer, nom_pow, diam) - type_name = man+'-'+str(diam)+'_'+str(int(nom_pow)) - return type_name + nom_pow = turbine.nominal_power + diam = turbine.rotor_diameter + man = get_manufacturer_short(turbine.manufacturer, nom_pow, diam) + type_name = man + "-" + str(diam) + "_" + str(int(nom_pow)) + return type_name def get_manufacturer_short(manufacturer, nom_pow, diam): - man = '' - if manufacturer == 'Nordex': - man = 'N' - if int(nom_pow) == 3000 or int(nom_pow) == 1500: - if int(diam) == 140 or int(diam) ==132 or int(diam) ==125 or int(diam) ==116 or int(diam) ==100 or int(diam) == 82 or int(diam) == 77 or int(diam) == 70: - man = 'AW' - elif manufacturer == 'Adwen/Areva': - man = 'AD' - elif manufacturer == 'Senvion/REpower': - man = 'S' - if int(nom_pow) == 2050 or int(nom_pow) == 2000: - man = 'MM' - elif manufacturer == 'Enercon': - man = 'E' - elif manufacturer == 'Siemens': - man = 'SWT' - elif manufacturer == 'Vestas': - man = 'V' - elif manufacturer == 'Vensys': - man = 'VS' - elif manufacturer == 'GE Wind': - man = 'GE' - elif manufacturer == 'Eno': - man = 'ENO' - elif manufacturer == 'aerodyn': - man = 'SCD' - return man \ No newline at end of file + man = "" + if manufacturer == "Nordex": + man = "N" + if int(nom_pow) == 3000 or int(nom_pow) == 1500: + if ( + int(diam) == 140 + or int(diam) == 132 + or int(diam) == 125 + or int(diam) == 116 + or int(diam) == 100 + or int(diam) == 82 + or int(diam) == 77 + or int(diam) == 70 + ): + man = "AW" + elif manufacturer == "Adwen/Areva": + man = "AD" + elif manufacturer == "Senvion/REpower": + man = "S" + if int(nom_pow) == 2050 or int(nom_pow) == 2000: + man = "MM" + elif manufacturer == "Enercon": + man = "E" + elif 
manufacturer == "Siemens": + man = "SWT" + elif manufacturer == "Vestas": + man = "V" + elif manufacturer == "Vensys": + man = "VS" + elif manufacturer == "GE Wind": + man = "GE" + elif manufacturer == "Eno": + man = "ENO" + elif manufacturer == "aerodyn": + man = "SCD" + return man diff --git a/scripts/mirror_mastr_csv_export.py b/scripts/mirror_mastr_csv_export.py index 2596d429..00cf6812 100644 --- a/scripts/mirror_mastr_csv_export.py +++ b/scripts/mirror_mastr_csv_export.py @@ -1,4 +1,4 @@ -from open_mastr.utils.helpers import (reverse_fill_basic_units, create_db_query) +from open_mastr.utils.helpers import reverse_fill_basic_units, create_db_query technology = [ @@ -24,6 +24,4 @@ reverse_fill_basic_units() # to csv per tech -create_db_query( - technology=technology, additional_data=data_types, limit=None -) +create_db_query(technology=technology, additional_data=data_types, limit=None) diff --git a/scripts/mirror_mastr_dump.py b/scripts/mirror_mastr_dump.py index ca2a0b66..69a3a3d2 100644 --- a/scripts/mirror_mastr_dump.py +++ b/scripts/mirror_mastr_dump.py @@ -2,8 +2,8 @@ import datetime # Dump data -now = datetime.datetime.now().strftime('%Y-%m-%d_%H%M%S') +now = datetime.datetime.now().strftime("%Y-%m-%d_%H%M%S") dump_file = f"{now}_open-mastr-mirror.backup" mastr_refl = MaStRMirror() -mastr_refl.dump(dump_file) \ No newline at end of file +mastr_refl.dump(dump_file) diff --git a/scripts/mirror_mastr_update_latest.py b/scripts/mirror_mastr_update_latest.py index 0db0b234..40c61681 100644 --- a/scripts/mirror_mastr_update_latest.py +++ b/scripts/mirror_mastr_update_latest.py @@ -2,16 +2,27 @@ import datetime limit = None -technology = ["wind", "biomass", "combustion", "gsgk", "hydro", "nuclear", "storage", "solar"] +technology = [ + "wind", + "biomass", + "combustion", + "gsgk", + "hydro", + "nuclear", + "storage", + "solar", +] data_types = ["unit_data", "eeg_data", "kwk_data", "permit_data"] -location_types = ["location_elec_generation", 
"location_elec_consumption", "location_gas_generation", - "location_gas_consumption"] +location_types = [ + "location_elec_generation", + "location_elec_consumption", + "location_gas_generation", + "location_gas_consumption", +] processes = 12 mastr_mirror = MaStRMirror( - empty_schema=False, - parallel_processes=processes, - restore_dump=None + empty_schema=False, parallel_processes=processes, restore_dump=None ) # Download basic unit data @@ -21,13 +32,12 @@ for tech in technology: # mastr_mirror.create_additional_data_requests(tech) for data_type in data_types: - mastr_mirror.retrieve_additional_data(tech, data_type, chunksize=1000, limit=limit) + mastr_mirror.retrieve_additional_data( + tech, data_type, chunksize=1000, limit=limit + ) # Download basic location data -mastr_mirror.backfill_locations_basic( - limit=limit, - date="latest" -) +mastr_mirror.backfill_locations_basic(limit=limit, date="latest") # Download extended location data for location_type in location_types: diff --git a/tests/preparation.py b/tests/preparation.py index 12d34823..0f58bd3f 100644 --- a/tests/preparation.py +++ b/tests/preparation.py @@ -1,20 +1,19 @@ import os from open_mastr.utils.config import get_project_home_dir + def create_credentials_file(): """Use token and user stored in GitHub secrets for creating credentials file This is used to allow test workflow to access MaStR database. 
""" - credentials_file = os.path.join(get_project_home_dir(), 'config', 'credentials.cfg') + credentials_file = os.path.join(get_project_home_dir(), "config", "credentials.cfg") token = os.getenv("MASTR_TOKEN") user = os.getenv("MASTR_USER") section_title = "[MaStR]" - file_content = f"{section_title}\n" \ - f"user = {user}\n" \ - f"token = {token}\n" + file_content = f"{section_title}\n" f"user = {user}\n" f"token = {token}\n" with open(credentials_file, "w") as credentials_fh: credentials_fh.write(file_content) diff --git a/tests/test_helpers.py b/tests/test_helpers.py index f67046d1..4a19f4fb 100644 --- a/tests/test_helpers.py +++ b/tests/test_helpers.py @@ -251,12 +251,7 @@ def test_validate_parameter_format_for_mastr_init(db): def test_transform_data_parameter(): - ( - data, - api_data_types, - api_location_types, - harm_log, - ) = transform_data_parameter( + (data, api_data_types, api_location_types, harm_log,) = transform_data_parameter( method="API", data=["wind", "location"], api_data_types=["eeg_data"], diff --git a/tests/xml_download/test_utils_cleansing_bulk.py b/tests/xml_download/test_utils_cleansing_bulk.py index 38b8e41b..9a29ad76 100644 --- a/tests/xml_download/test_utils_cleansing_bulk.py +++ b/tests/xml_download/test_utils_cleansing_bulk.py @@ -30,6 +30,7 @@ def capture_wrap(): sys.stdout.close = lambda *args: None yield + @pytest.fixture(scope="module") def con(): con = sqlite3.connect(_sqlite_file_path) diff --git a/tests/xml_download/test_utils_download_bulk.py b/tests/xml_download/test_utils_download_bulk.py index 3fe351f6..b4cc0b7d 100644 --- a/tests/xml_download/test_utils_download_bulk.py +++ b/tests/xml_download/test_utils_download_bulk.py @@ -1,33 +1,52 @@ import time from open_mastr.xml_download.utils_download_bulk import gen_url + def test_gen_url(): when = time.strptime("2024-01-01", "%Y-%m-%d") url = gen_url(when) assert type(url) == str - assert url == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20240101_23.2.zip" 
+ assert ( + url + == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20240101_23.2.zip" + ) when = time.strptime("2024-04-01", "%Y-%m-%d") url = gen_url(when) assert type(url) == str - assert url == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20240401_23.2.zip" + assert ( + url + == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20240401_23.2.zip" + ) when = time.strptime("2024-04-02", "%Y-%m-%d") url = gen_url(when) assert type(url) == str - assert url == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20240402_24.1.zip" + assert ( + url + == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20240402_24.1.zip" + ) when = time.strptime("2024-10-01", "%Y-%m-%d") url = gen_url(when) assert type(url) == str - assert url == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20241001_24.1.zip" + assert ( + url + == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20241001_24.1.zip" + ) when = time.strptime("2024-10-02", "%Y-%m-%d") url = gen_url(when) assert type(url) == str - assert url == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20241002_24.2.zip" + assert ( + url + == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20241002_24.2.zip" + ) when = time.strptime("2024-12-31", "%Y-%m-%d") url = gen_url(when) assert type(url) == str - assert url == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20241231_24.2.zip" + assert ( + url + == "https://download.marktstammdatenregister.de/Gesamtdatenexport_20241231_24.2.zip" + ) From ec619f326b35b82364ff3a7b31acb33fc57c0016 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 11:04:21 +0200 Subject: [PATCH 52/57] Update copyright in docs conf.py #578 --- docs/conf.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/conf.py b/docs/conf.py index 5e578719..93b52ff0 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -19,7 +19,7 @@ # -- Project information 
-----------------------------------------------------
 project = "open-MaStR"
-copyright = "2022 Reiner Lemoine Institut and fortiss"
+copyright = "2024 Reiner Lemoine Institut gGmbH and fortiss GmbH and OFFIS e.V."
 author = ""

From a744ddafe8fb90df04cb5cf46f703c16048e7b78 Mon Sep 17 00:00:00 2001
From: nesnoj
Date: Fri, 11 Oct 2024 11:18:06 +0200
Subject: [PATCH 53/57] Fix deprecations warning from PyPI publish test #578

---
 .github/workflows/test-pypi-publish.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/test-pypi-publish.yml b/.github/workflows/test-pypi-publish.yml
index 24afe432..2abdf735 100644
--- a/.github/workflows/test-pypi-publish.yml
+++ b/.github/workflows/test-pypi-publish.yml
@@ -35,4 +35,4 @@ jobs:
 uses: pypa/gh-action-pypi-publish@release/v1
 with:
 password: ${{ secrets.PYPI_TEST }}
- repository_url: https://test.pypi.org/legacy/
\ No newline at end of file
+ repository-url: https://test.pypi.org/legacy/
\ No newline at end of file

From 9598db99e96cfaa8aaac01353f04f491025c3853 Mon Sep 17 00:00:00 2001
From: nesnoj
Date: Fri, 11 Oct 2024 11:48:28 +0200
Subject: [PATCH 54/57] Add testing and linting to release procedure #578

---
 RELEASE_PROCEDURE.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/RELEASE_PROCEDURE.md b/RELEASE_PROCEDURE.md
index c432385a..5cd7684b 100644
--- a/RELEASE_PROCEDURE.md
+++ b/RELEASE_PROCEDURE.md
@@ -48,13 +52,17 @@ It always has the format `YYYY-MM-DD`, e.g. `2022-05-16`.
 * On release day, start the release early to ensure sufficient time for reviews
 * Merge everything on the `develop` branch
-### 5. 💠 Create a `release` branch
+### 5. Run tests and apply code linting
+* Run tests locally with `pytest` and fix errors
+* Apply linting with `pre-commit run -a` and fix errors
+
+### 6. 💠 Create a `release` branch
 * Checkout `develop` and branch with `git checkout -b release-v0.12.1`
 * Update version for test release with `bump2version --current-version --new-version patch`
 * Commit version update with `git commit -am "version update v0.12.1"`
 * Push branch with `git push --set-upstream origin release-v0.12.1`
-### 6. 📝 Update the version files
+### 7. 📝 Update the version files
 * `📝CHANGELOG.md`
 * All Pull Request are included
 * Add a new section with correct version number
@@ -62,8 +66,7 @@ It always has the format `YYYY-MM-DD`, e.g. `2022-05-16`.
 * `📝CITATION.cff`
 * Update `date-released`
-### 7. Optional: Check release on Test-PyPI
-
+### 8. Optional: Check release on Test-PyPI
 * Check if the release it correctly displayed on [Test-PyPI](https://test.pypi.org/project/open-mastr/#history)
 * You can trigger the release manually within github actions using the `run workflow` button on branch `release-v0.12.1` on the workflow `Build and release on pypi tests`
 * Note: Pre-releases on Test-PyPI are only shown under `Release history` in the navigation bar.
@@ -72,7 +75,7 @@ It always has the format `YYYY-MM-DD`, e.g. `2022-05-16`.
 * Note: The release on Test-PyPI might fail, but it will be the correct release version for the PyPI server.
 * Push commits to the `release-*` branch
-### 8. 🐙 Create a `Release Pull Request`
+### 9. 🐙 Create a `Release Pull Request`
 * Use `📝PR_TEMPLATE_RELEASE` (❗ToDo❗)
 * Merge `release` into `production` branch
 * Assign reviewers to check the release
@@ -81,7 +84,7 @@ It always has the format `YYYY-MM-DD`, e.g. `2022-05-16`.
 * Wait for reviews and tests
 * Merge PR
-### 9. 💠 Set the `Git Tag`
+### 10. 💠 Set the `Git Tag`
 * Checkout `production` branch and pull
 * Check existing tags `git tag -n`
 * Create new tag: `git tag -a v0.12.1 -m "open-mastr release v0.12.1 with PyPI"`
@@ -91,7 +94,7 @@ It always has the format `YYYY-MM-DD`, e.g. `2022-05-16`.
 * Delete local tag: `git tag -d v0.12.1`
 * Delete remote tag: `git push --delete origin v0.12.1`
-### 10. 🐙 Publish `Release` on GitHub and PyPI
+### 11. 🐙 Publish `Release` on GitHub and PyPI
 * Navigate to your [releases](https://github.com/OpenEnergyPlatform/open-MaStR/releases/) on GitHub and open your draft release.
 * Summarize key changes in the description
 * Use the `generate release notes` button provided by github (This only works after the release branch is merged on production)
@@ -103,7 +106,7 @@ It always has the format `YYYY-MM-DD`, e.g. `2022-05-16`.
 ▶️ In the background the GitHub workflow (pypi-publish.yml) will publish the package 📦 on PyPI!
-### 11. 🐙 Set up new development
+### 12. 🐙 Set up new development
 * Create a Pull request from `release-*` to `develop`
 * Create a new **unreleased section** in the `📝CHANGELOG.md`
 ```

From ada296cb99cdb874ae8f54baf3331af916630406 Mon Sep 17 00:00:00 2001
From: nesnoj
Date: Fri, 11 Oct 2024 11:49:03 +0200
Subject: [PATCH 55/57] Add linting to contributing docs #578

---
 CONTRIBUTING.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index a998de13..8c023f6a 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -129,6 +129,13 @@ git status
 ```
 #### 2.3. Commit your changes
+First, make sure you have the pre-commit hooks installed to have your code
+automatically checked on commit for programmatic and stylistic errors:
+```bash
+pre-commit install
+```
+
+Now, let's add some file.
If the file does not exist on the remote server yet, use: ```bash git add filename.md From 555e5ae9585e63843aa56648175e3c017adc0a9a Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 13:54:33 +0200 Subject: [PATCH 56/57] Replace values in NetzbetreiberpruefungStatus with their entries from Katalogwerte #582 --- open_mastr/xml_download/colums_to_replace.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/open_mastr/xml_download/colums_to_replace.py b/open_mastr/xml_download/colums_to_replace.py index e35acf6e..f35a30a9 100644 --- a/open_mastr/xml_download/colums_to_replace.py +++ b/open_mastr/xml_download/colums_to_replace.py @@ -110,4 +110,6 @@ "Seelage", "ClusterNordsee", "ClusterOstsee", + # various tables + "NetzbetreiberpruefungStatus", ] From 536eab954fa409edeac4178e25e454affe18bad6 Mon Sep 17 00:00:00 2001 From: nesnoj Date: Fri, 11 Oct 2024 13:56:26 +0200 Subject: [PATCH 57/57] Update changelog #582 --- CHANGELOG.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 99c2e35d..0cc8ce29 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,9 @@ and the versioning aims to respect [Semantic Versioning](http://semver.org/spec/ ## [v0.XX.X] unreleased - 2024-XX-XX ### Added +- Replace values in NetzbetreiberpruefungStatus with their entries from + Katalogwerte + [#583](https://github.com/OpenEnergyPlatform/open-MaStR/pull/583) - Add `deleted_market_actors` to data model and prevent crash on unknown tables [#575](https://github.com/OpenEnergyPlatform/open-MaStR/pull/575) - Extended documentation of data cleansing process for bulk download
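
Reviewer note on PATCH 56/57: adding `NetzbetreiberpruefungStatus` to the column list implies that coded values in that column are swapped for their Katalogwerte entries during cleansing. The sketch below only illustrates that idea with plain dictionaries. The function name `replace_catalog_ids`, the shortened column list, and the catalog ids and labels are all assumptions made up for this example; they are not taken from the open-MaStR code base or from real MaStR data.

```python
# Hypothetical subset of the list in colums_to_replace.py (names from the patch).
COLUMNS_TO_REPLACE = [
    "Seelage",
    "ClusterNordsee",
    "ClusterOstsee",
    "NetzbetreiberpruefungStatus",
]


def replace_catalog_ids(rows, catalog, columns=COLUMNS_TO_REPLACE):
    """Return copies of `rows` with catalog ids in `columns` replaced by labels.

    `rows` is a list of dicts (one per record), `catalog` maps an id to its
    label. Ids missing from the catalog are left untouched, and the input
    rows are not modified.
    """
    replaced = []
    for row in rows:
        new_row = dict(row)  # shallow copy so the caller's data stays intact
        for column in columns:
            value = new_row.get(column)
            if value in catalog:
                new_row[column] = catalog[value]
        replaced.append(new_row)
    return replaced


if __name__ == "__main__":
    # Made-up ids and labels, for illustration only.
    example_catalog = {1: "label-a", 2: "label-b"}
    example_rows = [
        {"Name": "unit-1", "NetzbetreiberpruefungStatus": 1},
        {"Name": "unit-2", "NetzbetreiberpruefungStatus": 99},
    ]
    for row in replace_catalog_ids(example_rows, example_catalog):
        print(row)
```

Leaving unknown ids unchanged, rather than raising, is a design choice of this sketch; whether the project does the same is not stated in the patch.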