Workflow enhancements #43

Open
1 of 7 tasks
orianac opened this issue Nov 23, 2021 · 1 comment

Comments

@orianac
Member

orianac commented Nov 23, 2021

Below is a list of things we (might) need to do to improve the workflows. Not every improvement will be required for every downscaling method.

  • Deciding how to pass around connection credentials (e.g. connection_string or stores)
  • Improving functionality for working with multiple variables at once. This is connected to the question of whether we want to be working in Datasets or DataArrays, since Datasets allow us to work on multiple variables at once.
  • Updating rechunk_zarr_array to process multiple variables at once. To do this we need to pass it a list of variables instead of a single string.
  • Figuring out how much of the prep is generalizable across downscaling methods
  • Changing the regridding obs step so that it is keyed to a specific grid rather than to a specific GCM (and including a test to make sure that the grids actually have the same coordinates, not just the same number of coordinates)
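
As a rough illustration of two of the items above (accepting a list of variables instead of a string, and checking that two grids share coordinate values rather than just coordinate counts), here is a minimal plain-Python sketch. The names `as_variable_list` and `grids_match` are hypothetical and are not part of the existing code; in the real workflow the coordinates would presumably come from xarray objects rather than plain lists.

```python
from __future__ import annotations

import math
from typing import Iterable, Mapping, Sequence


def as_variable_list(variables: str | Iterable[str]) -> list[str]:
    """Accept one variable name or an iterable of names; always return a list."""
    if isinstance(variables, str):
        return [variables]
    return list(variables)


def grids_match(
    grid_a: Mapping[str, Sequence[float]],
    grid_b: Mapping[str, Sequence[float]],
    abs_tol: float = 1e-6,
) -> bool:
    """Return True only if both grids have the same coordinate *values*.

    Each grid is a mapping such as {"lat": [...], "lon": [...]}.
    (Hypothetical layout, chosen for illustration.)
    """
    if set(grid_a) != set(grid_b):
        return False
    for name in grid_a:
        a, b = grid_a[name], grid_b[name]
        # Same length is necessary but not sufficient.
        if len(a) != len(b):
            return False
        # Compare value by value with an absolute tolerance.
        if not all(
            math.isclose(x, y, rel_tol=0.0, abs_tol=abs_tol) for x, y in zip(a, b)
        ):
            return False
    return True
```

Two grids with the same shape but shifted by half a cell would pass a length-only check and fail `grids_match`, which is the case the test mentioned above is meant to catch.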

Prefect features to add

  • Add in caching routines.
  • Setting up the workflow to read in a config file and set those variables as the context, then using those context variables instead of passing them around as strings.
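
The config-file idea could look something like the sketch below. It is independent of Prefect's own context mechanism and only shows the "read once, pass one object around" pattern; the `RunConfig` class, its field names, and the JSON layout are all assumptions made for illustration.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical run settings; the field names are illustrative only."""

    gcm: str
    variables: tuple[str, ...]
    output_store: str


def load_config(path: str | Path) -> RunConfig:
    """Read a JSON config file once and return an immutable settings object."""
    raw = json.loads(Path(path).read_text())
    return RunConfig(
        gcm=raw["gcm"],
        variables=tuple(raw["variables"]),
        output_store=raw["output_store"],
    )
```

Tasks would then take a single `RunConfig` argument (or read it from whatever context object the workflow engine provides) instead of several separate string parameters.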
@orianac
Member Author

orianac commented Nov 23, 2021

A few more to-dos from other comments in PR:

  • Rename observations.py to training.py
  • Add testing. A starting point would be to create a dummy one-year dataset, run it through the workflow, and assert that the output matches a saved reference. We could start with a sample output that I have for one year. This would be a longer-running test case, but it would be a good check that everything is working as expected before we enter production.
  • Add clean-up routines (connected to issue "Data organization" #45)
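
A possible shape for the one-year regression test is sketched below using only the standard library. `run_workflow` is a stand-in for the real downscaling workflow, and all names here are hypothetical; the real test would presumably build an xarray dataset and compare against the saved sample output.

```python
from __future__ import annotations

import math
from datetime import date, timedelta


def make_dummy_year(year: int = 2001) -> dict[date, float]:
    """Build a deterministic one-year daily series (a smooth seasonal cycle)."""
    start = date(year, 1, 1)
    n_days = (date(year + 1, 1, 1) - start).days
    return {
        start + timedelta(days=i): 10.0 + 5.0 * math.sin(2 * math.pi * i / n_days)
        for i in range(n_days)
    }


def run_workflow(series: dict[date, float]) -> dict[date, float]:
    """Placeholder for the real workflow (here: a constant shift of +1.0)."""
    return {day: value + 1.0 for day, value in series.items()}


def assert_matches_reference(
    output: dict[date, float],
    reference: dict[date, float],
    abs_tol: float = 1e-9,
) -> None:
    """Fail if the time axes differ or any value is outside the tolerance."""
    assert output.keys() == reference.keys(), "time axes differ"
    for day, expected in reference.items():
        assert math.isclose(
            output[day], expected, rel_tol=0.0, abs_tol=abs_tol
        ), f"mismatch on {day}"
```

Because the dummy input is deterministic, the reference output only needs to be generated and stored once; later runs compare against it.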
