Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New DataRequest #81

Merged
merged 57 commits into from
Dec 11, 2024
Merged

New DataRequest #81

merged 57 commits into from
Dec 11, 2024

Conversation

pgierz
Copy link
Member

@pgierz pgierz commented Nov 29, 2024

Rework of DataRequest

This introduces new DataRequest, DataRequestVariable, and DataRequestTable classes. The old ones are copies directly from Ruby and are not very nice to work with.

Here, we implement a generic abstract base class (ABC) for DataRequestVariable. This allows us to define how a DataRequestVariable should look like from the pymorize Python side, without caring about how the information is ultimately extracted from the JSON tables. CMIP6DataRequestVariable is the first direct implementation used for JSON tables. This has the key feature of allowing us to be flexible when new CMIP standards inevitably come out, for example with CMIP7.

A DataRequestVariable therefore must implement certain properties and methods to be correctly recognised, otherwise you get Python TypeErrors when subclassing.

TODO List

  • DataRequestVariable (and concrete CMIP6DataRequestVariable)
  • DataRequestTable (and concrete CMIP6DataRequestTable, specifically CMIP6JSONDataRequestTable -- yes, I want to build two subclass levels here, since it might happen in the future that the source of the DataRequestTable isn't a JSON file. Maybe it's a HTTP request. Maybe it's a post-it note on the coffee machine. I don't know)
  • DataRequest (maybe named DataRequestCollection?) -- some gathering of various Variables and/or Tables into a bundle.
  • Rework all the tests

Still to come

  • Integration of @siligam's FlatTable design (those parts that fit together with this)

Copilot Summary

This pull request introduces a new abstract base class DataRequestVariable and its concrete implementation CMIP6DataRequestVariable in the src/pymorize/data_request/variable.py file. Additionally, it includes a unit test for the CMIP6DataRequestVariable class in tests/unit/test_drv.py.

Key changes include:

New Class Definitions:

  • src/pymorize/data_request/variable.py:
    • Added DataRequestVariable abstract base class with various properties and methods to define a generic data request variable.
    • Added CMIP6DataRequestVariable class as a concrete implementation of DataRequestVariable for CMIP6 variables, utilizing the dataclass decorator for automatic method generation.
    • Implemented class methods for constructing instances from dictionaries and JSON files, and methods for converting instances to dictionary representations.

Unit Testing:

  • tests/unit/test_drv.py:
    • Added a test case to verify the creation of a CMIP6DataRequestVariable instance from a CMIP6 JSON table file.

This commit introduces the `DataRequestVariable` abstract base class and
its concrete implementation `CMIP6DataRequestVariable`. The new classes
use the `dataclass` decorator to automatically generate special methods
like `__init__` and `__repr__`.

The `DataRequestVariable` class outlines the necessary properties and
methods that any variable class should implement. The
`CMIP6DataRequestVariable` class is a concrete implementation for CMIP6
variables.

Additionally, this commit includes methods for constructing
`DataRequestVariable` instances from dictionaries and JSON files, as
well as converting instances to dictionary representations.

Unit tests for `CMIP6DataRequestVariable` have been added to ensure
correct functionality.
@pgierz pgierz requested a review from siligam November 29, 2024 10:28
@pgierz
Copy link
Member Author

pgierz commented Dec 2, 2024

Tests now fail at the integration level, which is what I had expected.

@pgierz pgierz marked this pull request as ready for review December 6, 2024 15:05
@pgierz pgierz requested a review from mandresm December 6, 2024 15:05
@pgierz
Copy link
Member Author

pgierz commented Dec 6, 2024

This pull request includes significant updates to the testing configuration, dependency management, and the CMORizer class in the pymorize module. The most important changes are grouped by theme and summarized below:

Testing Configuration:

  • Modified the pytest command in the CI workflow to include all Python files in the tests/unit directory. (.github/workflows/CI-test.yaml)
  • Commented out unused imports and pytest hooks in conftest.py to clean up the code. (conftest.py) [1] [2]
  • Added a warning filter to pytest.ini to ignore specific matplotlib-related warnings. (pytest.ini)

Dependency Management:

  • Added semver to the list of required dependencies in setup.py. (setup.py)
  • Added pytest-mock to the list of testing dependencies in setup.py. (setup.py)

CMORizer Class Enhancements:

  • Introduced _SUPPORTED_CMOR_VERSIONS to the CMORizer class to specify supported CMOR versions. (src/pymorize/cmorizer.py)
  • Added a destructor method to the CMORizer class to ensure the cluster is closed gracefully. (src/pymorize/cmorizer.py)
  • Updated the _post_init_read_bare_tables method to use CMIP6IgnoreTableFiles instead of IgnoreTableFiles. (src/pymorize/cmorizer.py)
  • Added validation for cmor_version in the _post_init_create_data_request method and updated the data request creation process. (src/pymorize/cmorizer.py)

Data Request Module:

  • Introduced a new collection.py file with the DataRequest and CMIP6DataRequest classes, including methods for creating data requests from tables, directories, and git repositories. (src/pymorize/data_request/collection.py)
  • Added a new factory.py file to implement the MetaFactory metaclass and a factory creation function for managing data request subclasses. (src/pymorize/data_request/factory.py)

These changes collectively enhance the functionality, maintainability, and testing capabilities of the pymorize module.

@pgierz pgierz enabled auto-merge December 6, 2024 15:07
@pgierz pgierz mentioned this pull request Dec 9, 2024
1 task
mandresm
mandresm previously approved these changes Dec 10, 2024
Copy link
Contributor

@mandresm mandresm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome @pgierz. I have a headache but it's very nice. I cannot say I understood everything but I could more or less follow. There are no many suggestions from my side, more questions, than anything. Once the suggestions are addressed feel free to merge right away.

tests/unit/data_request/__init__.py Outdated Show resolved Hide resolved
src/pymorize/cmorizer.py Outdated Show resolved Hide resolved
src/pymorize/cmorizer.py Outdated Show resolved Hide resolved
src/pymorize/data_request/collection.py Show resolved Hide resolved
src/pymorize/data_request/factory.py Show resolved Hide resolved
src/pymorize/data_request/collection.py Show resolved Hide resolved
src/pymorize/data_request/collection.py Show resolved Hide resolved
src/pymorize/data_request/collection.py Show resolved Hide resolved
src/pymorize/data_request/table.py Show resolved Hide resolved
src/pymorize/rule.py Show resolved Hide resolved
Co-authored-by: Miguel <[email protected]>
Co-authored-by: Miguel <[email protected]>
@pgierz pgierz mentioned this pull request Dec 11, 2024
2 tasks
@pgierz pgierz disabled auto-merge December 11, 2024 13:10
@pgierz pgierz merged commit 8cac4c0 into main Dec 11, 2024
5 checks passed
@mandresm mandresm deleted the feat/dataclasses-for-data-request branch January 10, 2025 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants