Auxiliary Files for Rule #97

Merged
merged 8 commits into from
Jan 10, 2025
2 changes: 1 addition & 1 deletion .github/workflows/CI-test.yaml
@@ -1,7 +1,7 @@
 # For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

 name: Run Basic Tests
 on:
   push:
     branches: ["main"]
   pull_request:

Warnings from GitHub Actions / check_format (3.9) on .github/workflows/CI-test.yaml:
1:81 [line-length] line too long (118 > 80 characters)
3:1 [document-start] missing document start "---"
4:1 [truthy] truthy value should be one of [false, true]
@@ -28,10 +28,10 @@
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
           if ${{ matrix.python-version == '3.12' }}; then pip install --upgrade setuptools; fi
       - name: Install package
         run: |
-          python -m pip install .[dev]
+          python -m pip install .[dev,fesom]
       - name: Test if data will work (Meta-Test)
         run: |
           export HDF5_DEBUG=1

Warning from GitHub Actions / check_format (3.9) on .github/workflows/CI-test.yaml:
31:81 [line-length] line too long (94 > 80 characters)
1 change: 1 addition & 0 deletions doc/index.rst
@@ -22,6 +22,7 @@ Contents
    including_custom_steps
    including_subcommand_plugins
    pymorize_configuration
+   pymorize_aux_files
    developer_guide
    developer_setup
    API
45 changes: 45 additions & 0 deletions doc/pymorize_aux_files.rst
@@ -0,0 +1,45 @@
==================================
``pymorize`` Using auxiliary files
==================================

At times, your post-processing will require additional files beyond the actual data.
For example, say you are analyzing FESOM output and need to know the computational mesh
in order to calculate transport across a particular edge. In Python, a common way to do this
is to load the mesh with the ``pyfesom2`` library. To make a ``Rule`` aware of the mesh, you
can use auxiliary files.


You can add additional files to your ``Rule`` objects by specifying them in the
``aux`` element of the rule. These files are loaded when the ``Rule`` object is
initialized, and can be accessed in your steps.

For example, consider the following YAML configuration::


  rules:
    - name: My First Rule
      aux:
        - name: My Aux Data
          path: /path/to/aux/data.csv


You can then access this in a step like so::

  def my_step(data, rule):
      aux_data = rule.aux["My Aux Data"]
      print(aux_data)
      return data

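By default, an auxiliary file is opened as plain text and its contents are
returned as a string. To load the file in a different way, set the ``loader``
key to the dotted import path of a callable; it is called with the file path as
its first argument, followed by any optional ``loader_args`` and
``loader_kwargs``. Here is how you can load a FESOM mesh with ``pyfesom2``::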


  rules:
    - name: My Other Rule
      aux:
        - name: mesh
          path: /some/path/to/a/mesh
          loader: pyfesom2.load_mesh_data.load_mesh

In your steps, ``rule.aux["mesh"]`` is then the already loaded mesh object.
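The loader mechanism described above can be sketched with a small custom loader. This is only an illustration: it assumes, based on ``AuxiliaryFile.load`` in this PR, that a loader is called as ``loader(path, *loader_args, **loader_kwargs)``. The function ``load_csv_rows``, the ``delimiter`` keyword, and the ``stations`` entry are hypothetical names, not part of ``pymorize``.

```python
import csv
import os
import tempfile


def load_csv_rows(path, delimiter=","):
    """Hypothetical custom loader: read a delimited text file into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f, delimiter=delimiter))


# Stand-in for one entry of a rule's ``aux`` list after the YAML has been parsed.
# In the YAML, ``loader`` would be a dotted import path; the function object is
# used directly here so the sketch runs on its own.
aux_spec = {
    "name": "stations",
    "path": None,  # filled in below
    "loader": load_csv_rows,
    "loader_kwargs": {"delimiter": ";"},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    aux_spec["path"] = os.path.join(tmp_dir, "stations.csv")
    with open(aux_spec["path"], "w", newline="") as f:
        f.write("name;lat\nA;54.1\nB;60.0\n")

    # Assumed call convention: path first, then positional and keyword arguments.
    rows = aux_spec["loader"](
        aux_spec["path"],
        *aux_spec.get("loader_args", []),
        **aux_spec.get("loader_kwargs", {}),
    )

print(rows)
```

In a real configuration the same loader would presumably be referenced by its import path, with ``loader_kwargs`` given as a YAML mapping next to ``path``.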
4 changes: 4 additions & 0 deletions setup.py
@@ -81,6 +81,10 @@ def read(filename):
             "yamllint",
         ],
         "doc": docs_require,
+        "fesom": [
+            # FIXME(PG): We should talk with Nikolay, this is not optimal...
+            "pyfesom2 @ git+https://github.com/fesom/[email protected]",
+        ],
     },
     entry_points={
         "console_scripts": [
116 changes: 116 additions & 0 deletions src/pymorize/aux_files.py
@@ -0,0 +1,116 @@
"""
Auxiliary files that can be attached to a Rule
"""

from .utils import get_callable


class AuxiliaryFile:
    """
    A class to represent an auxiliary file.

    Attributes
    ----------
    name : str
        The name of the file.
    path : str
        The path to the file.
    loader : callable, optional
        A callable to load the file.
    loader_args : list, optional
        Arguments to pass to the loader.
    loader_kwargs : dict, optional
        Keyword arguments to pass to the loader.

    Methods
    -------
    load():
        Loads the file using the specified loader or reads the file content.
    from_dict(d):
        Creates an AuxiliaryFile instance from a dictionary.
    """

    def __init__(self, name, path, loader=None, loader_args=None, loader_kwargs=None):
        """
        Constructs all the necessary attributes for the AuxiliaryFile object.

        Parameters
        ----------
        name : str
            The name of the file.
        path : str
            The path to the file.
        loader : callable, optional
            A callable to load the file.
        loader_args : list, optional
            Arguments to pass to the loader.
        loader_kwargs : dict, optional
            Keyword arguments to pass to the loader.
        """
        self.name = name
        self.path = path
        self.loader = loader
        if loader_args is None:
            loader_args = []
        self.loader_args = loader_args
        if loader_kwargs is None:
            loader_kwargs = {}
        self.loader_kwargs = loader_kwargs

    def load(self):
        """
        Loads the file using the specified loader or reads the file content.

        Returns
        -------
        str
            The content of the file if no loader is specified.
        object
            The result of the loader if a loader is specified.
        """
        if self.loader is None:
            with open(self.path, "r") as f:
                return f.read()
        else:
            loader = get_callable(self.loader)
            return loader(self.path, *self.loader_args, **self.loader_kwargs)

    @classmethod
    def from_dict(cls, d):
        """
        Creates an AuxiliaryFile instance from a dictionary.

        Parameters
        ----------
        d : dict
            A dictionary containing the attributes of the AuxiliaryFile.

        Returns
        -------
        AuxiliaryFile
            An instance of AuxiliaryFile.
        """
        return cls(
            d["name"],
            d["path"],
            d.get("loader"),
            d.get("loader_args"),
            d.get("loader_kwargs"),
        )


# NOTE(PG): Think about this...maybe it should be a method of Rule...
def attach_files_to_rule(rule):
    """
    Attaches extra files to the rule

    Mutates
    -------
    rule :
        The Rule object is modified to include the loaded auxiliary files
    """
    loaded_aux = {}
    for aux_file_spec in rule.get("aux", []):
        aux_file = AuxiliaryFile.from_dict(aux_file_spec)
        loaded_aux[aux_file.name] = aux_file.load()
    rule.aux = loaded_aux
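The ``get_callable`` helper imported from ``.utils`` is not part of this diff. The following is a minimal sketch of how a dotted string such as ``pyfesom2.load_mesh_data.load_mesh`` could be resolved into a callable, assuming plain ``importlib`` semantics. ``resolve_dotted_name`` is a hypothetical name used for illustration only, and the real helper may behave differently.

```python
import importlib


def resolve_dotted_name(dotted_name):
    """Hypothetical helper: import ``pkg.module.attr`` and return ``attr``."""
    module_name, _, attr_name = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'module.attribute', got {dotted_name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# A standard-library target is used so the sketch runs without pyfesom2 installed.
loader = resolve_dotted_name("json.loads")
result = loader('{"mesh": "pi"}')
print(result)
```

With a resolver of this kind, a misspelled module path would only fail when the auxiliary file is loaded, which in this PR happens while the rules are being populated.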
8 changes: 8 additions & 0 deletions src/pymorize/cmorizer.py
@@ -15,6 +15,7 @@
 from prefect.futures import wait
 from rich.progress import track

+from .aux_files import attach_files_to_rule
 from .cluster import (
     CLUSTER_ADAPT_SUPPORT,
     CLUSTER_MAPPINGS,
@@ -123,6 +124,7 @@ def __init__(
         self._post_init_create_data_request()
         self._post_init_populate_rules_with_tables()
         self._post_init_populate_rules_with_dimensionless_unit_mappings()
+        self._post_init_populate_rules_with_aux_files()
         self._post_init_populate_rules_with_data_request_variables()
         logger.debug("...post-init done!")
         ################################################################################
@@ -249,6 +251,11 @@ def _post_init_populate_rules_with_data_request_variables(self):
         self._rules_expand_drvs()
         self._rules_depluralize_drvs()

+    def _post_init_populate_rules_with_aux_files(self):
+        """Attaches auxiliary files to the rules"""
+        for rule in self.rules:
+            attach_files_to_rule(rule)
+
     def _post_init_populate_rules_with_dimensionless_unit_mappings(self):
         """
         Reads the dimensionless unit mappings from a configuration file and
@@ -489,6 +496,7 @@ def from_dict(cls, data):
         instance._post_init_create_data_request()
         instance._post_init_populate_rules_with_data_request_variables()
         instance._post_init_populate_rules_with_dimensionless_unit_mappings()
+        instance._post_init_populate_rules_with_aux_files()
         logger.debug("Object creation done!")
         return instance

43 changes: 43 additions & 0 deletions tests/unit/test_aux_files.py
@@ -0,0 +1,43 @@
# import pytest
from pyfesom2.load_mesh_data import fesom_mesh

from pymorize.aux_files import attach_files_to_rule


def test_aux_files_attach_without_aux(pi_uxarray_temp_rule):
    rule = pi_uxarray_temp_rule
    attach_files_to_rule(rule)
    assert rule.aux == {}


def test_aux_files_attach_simple_file(pi_uxarray_temp_rule, tmp_path):
    # Create a temporary file
    temp_file = tmp_path / "temp_file.txt"
    temp_file.write_text("Hello, pytest!")

    rule = pi_uxarray_temp_rule
    rule.aux = [
        {
            "name": "aux1",
            "path": str(temp_file),
        },
    ]
    attach_files_to_rule(rule)
    assert rule.aux == {"aux1": "Hello, pytest!"}


def test_aux_files_attach_fesom_mesh(
    fesom_2p6_esmtools_temp_rule, fesom_2p6_pimesh_esm_tools_data
):
    mesh = fesom_2p6_pimesh_esm_tools_data / "input/fesom/mesh/pi"
    rule = fesom_2p6_esmtools_temp_rule
    rule.aux = [
        {
            "name": "mesh",
            "path": str(mesh),
            "loader": "pyfesom2.load_mesh_data.load_mesh",
        },
    ]
    attach_files_to_rule(rule)
    print(f'PG DEBUG >>> {rule.aux["mesh"]}')
    assert isinstance(rule.aux["mesh"], fesom_mesh)
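The tests above cover the no-aux, plain-text, and FESOM mesh cases. Two further cases, a loader with keyword arguments and a missing file, could be checked along the following lines. This is a sketch under stated assumptions: ``FakeRule`` and ``attach_files_sketch`` are hypothetical stand-ins that mirror the logic of ``attach_files_to_rule`` so the example runs without ``pymorize`` or its fixtures, and loaders are passed as function objects rather than dotted strings.

```python
import json
import os
import tempfile


class FakeRule:
    """Hypothetical stand-in for a pymorize Rule: attribute access plus ``get``."""

    def __init__(self, aux=None):
        if aux is not None:
            self.aux = aux

    def get(self, key, default=None):
        return getattr(self, key, default)


def attach_files_sketch(rule):
    """Simplified mirror of attach_files_to_rule; loaders are callables here."""
    loaded = {}
    for spec in rule.get("aux", []):
        loader = spec.get("loader")
        if loader is None:
            # Default behaviour: return the file content as a string.
            with open(spec["path"], "r") as f:
                loaded[spec["name"]] = f.read()
        else:
            loaded[spec["name"]] = loader(
                spec["path"],
                *spec.get("loader_args", []),
                **spec.get("loader_kwargs", {}),
            )
    rule.aux = loaded


def load_json(path, encoding="utf-8"):
    """Hypothetical loader that parses a JSON file."""
    with open(path, encoding=encoding) as f:
        return json.load(f)


def test_custom_loader_with_kwargs():
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "basins.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"atlantic": 1}, f)
        rule = FakeRule(
            aux=[
                {
                    "name": "basins",
                    "path": path,
                    "loader": load_json,
                    "loader_kwargs": {"encoding": "utf-8"},
                }
            ]
        )
        attach_files_sketch(rule)
    assert rule.aux == {"basins": {"atlantic": 1}}


def test_missing_file_raises():
    missing = os.path.join("no", "such", "file.txt")
    rule = FakeRule(aux=[{"name": "gone", "path": missing}])
    try:
        attach_files_sketch(rule)
    except FileNotFoundError:
        return
    raise AssertionError("expected FileNotFoundError")


test_custom_loader_with_kwargs()
test_missing_file_raises()
```

Because auxiliary files are loaded while the rules are populated, a missing path in this sketch fails at attach time rather than later inside a step.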