Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

anomalies function modifying original cube in-place #2223

Closed
rebeccaherman1 opened this issue Oct 9, 2023 · 11 comments · Fixed by #2191
Closed

anomalies function modifying original cube in-place #2223

rebeccaherman1 opened this issue Oct 9, 2023 · 11 comments · Fixed by #2191
Labels
bug Something isn't working

Comments

@rebeccaherman1
Copy link
Contributor

rebeccaherman1 commented Oct 9, 2023

Description
The preprocessor functions generally do not modify the cubes in-place, and instead return the new cube. But in the most up-to-date development version of ESMValCore, calling anomalies on a cube adds a time_weights coordinate to the original cube, preventing future applications of anomalies to that cube by causing a duplicate coordinate error.

@valeriupredoi this discovered while trying to run my old code with my new environment. I'm not sure if the source of the change is ESMValCore or Iris.

Traceback

---------------------------------------------------------------------------
CannotAddError                            Traceback (most recent call last)
Cell In[8], line 1
----> 1 pp.anomalies(zPacific_all[-1], period='full')

File ~/ESMValCore/esmvalcore/preprocessor/_time.py:828, in anomalies(cube, period, reference, standardize, seasons)
    826 else:
    827     reference_cube = extract_time(cube, **reference)
--> 828 reference = climate_statistics(reference_cube,
    829                                period=period,
    830                                seasons=seasons)
    831 if period in ['full']:
    832     metadata = copy.deepcopy(cube.metadata)

File ~/ESMValCore/esmvalcore/preprocessor/_time.py:755, in climate_statistics(cube, operator, period, seasons)
    749 if operator_accept_weights(operator):
    750     time_weights_coord = AuxCoord(
    751         get_time_weights(cube),
    752         long_name='time_weights',
    753         units=cube.coord('time').units,
    754     )
--> 755     cube.add_aux_coord(time_weights_coord, cube.coord_dims('time'))
    756     clim_cube = cube.collapsed(
    757         'time',
    758         operator_method,
    759         weights=time_weights_coord,
    760     )
    761 else:

File /work/bd1083/b309244/conda/envs/dev/lib/python3.10/site-packages/iris/cube.py:1191, in Cube.add_aux_coord(self, coord, data_dims)
   1169 """
   1170 Adds a CF auxiliary coordinate to the cube.
   1171 
   (...)
   1188 
   1189 """
   1190 if self.coords(coord):  # TODO: just fail on duplicate object
-> 1191     raise iris.exceptions.CannotAddError(
   1192         "Duplicate coordinates are not permitted."
   1193     )
   1194 self._add_unique_aux_coord(coord, data_dims)

CannotAddError: Duplicate coordinates are not permitted.

Minimum Reproducible Code

  • start with a new environment using the most recent version of ESMValTool/ESMValCore and python 3.10.
  • take any cube (like the small example from before)
  • examine the cube C -- there will be no Auxiliary coordinate called time_weights
  • run the following code:
import esmvalcore.preprocessor as pp
pp.anomalies(C, period='full')
  • examine C again; there will now be an Auxiliary coordinate called time_weights
  • run pp.anomalies(C, period='full') again
@valeriupredoi
Copy link
Contributor

@schlunma this was the thing I was dreading and was on about just an hour ago during the review of your PR - you got a lot more experience with actually running these preprocessors - could you help here please? 🍺

@valeriupredoi valeriupredoi added the bug Something isn't working label Oct 9, 2023
@rebeccaherman1
Copy link
Contributor Author

rebeccaherman1 commented Oct 9, 2023

@valeriupredoi if you had to guess whether this was due to using iris version 3.7.0 (instead of 3.6.1) or to updates to ESMValCore, what would you guess? If the latter, any chance you could tell me which commit you used when you made the environment for me so I can go back to it while this is being sorted?

@rebeccaherman1
Copy link
Contributor Author

Update: I downgraded my iris and the problem persists, so I do believe it is on the ESMValCore side @schlunma
@valeriupredoi a commit number would be greatly appreciated 🙏 maybe I can guess by looking at the timing and commit history?

@valeriupredoi
Copy link
Contributor

@rebeccaherman1 - have a look at this 4baa8aa#diff-ec2da1a859c8d6359e23421a3de9d444e455ed3008152eaa67bbdd01f4cadfb8 - you can downgrade to esmvalcore=2.8 and see if you still hit the issue

@rebeccaherman1
Copy link
Contributor Author

@valeriupredoi does esmvalcore=2.8 have the patch you made for me for the iris problem?

anyway another UPDATE:
Editable installs are very cool. This is probably not the most efficient fix to the problem, but I added cube = copy.deepcopy(cube) to the beginning of the climate_statistics function in _time.py and it seems to solve the problem!

If it seems like a reasonable solution to the problem, then perhaps it's time for me to learn how to do a pull request

Though it seems from your first comment that maybe you have reason to believe this is a larger issue than just this one function?

@rebeccaherman1
Copy link
Contributor Author

actually a further update (sorry for writing before I've investigated much):
my first attempt at a hack does solve the problem of modifying the original cube in-place, but it still throws what seems to me to be an unnecessary error if you try to take the anomalies twice, i.e. climate_statistics(climate_statistics(cube))
Seems like the aux coord should simply not be added if it is already present

@valeriupredoi
Copy link
Contributor

valeriupredoi commented Oct 9, 2023

TBF I've not dug deep enough yet to identify it as a real issue - so prob best for you to use your patch locally (well done btw) and then if you got time to open a PR, then we can spend time to investigate if it's a real bug, and if so, what's the best fix 👍

@schlunma
Copy link
Contributor

schlunma commented Oct 9, 2023

This should be fixed with PR #2191, so hopefully in time for v2.10:

https://github.com/ESMValGroup/ESMValCore/pull/2191/files#diff-ec2da1a859c8d6359e23421a3de9d444e455ed3008152eaa67bbdd01f4cadfb8R817-R818

@rebeccaherman1
Copy link
Contributor Author

@schlunma that's an exciting update!

My full patch for consideration in the meantime:

#prevents modifying cube in-place
cube = copy.deepcopy(cube)
#allows repeated applications of climate statistics functions
if 'time_weights' in [c.name() for c in cube.aux_coords]:
    cube.remove_coord('time_weights')

I have been using the preprocessor functions in a notebook and do occasionally apply anomalies again after further analysis; I'm not sure that's even possible in the official ESMValTool preprocessor. Nevertheless I think it should be possible in a python script.

@valeriupredoi
Copy link
Contributor

@schlunma cheers! If it's allowed and (scientifically relevant) to call anomalies after other anomalies have been created beforehand, we should probably include the info in the preprocessor's API docs - I didn't think it'd make much sense to call them twice, but that's why I am not a climate scientist 😁

@rebeccaherman1
Copy link
Contributor Author

@valeriupredoi calling anomalies twice in a row is theoretically not harmful but entirely unnecessary, as the second call would make no change. However I think what I did in my notebook was call anomalies, then do further processing (like finding the max and min over an area) and then I wanted to call anomalies again on the resulting data. This second call to anomalies has different logic than the first, and thus is still scientifically-relevant. I could imagine doing such a thing in pre-processing if I overrode the default order and was allowed to use a function more than once.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants