Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the "state of ground" to bring in the extra snow depth reports #1368

Draft
wants to merge 17 commits into
base: develop
Choose a base branch
from

Conversation

jiaruidong2017
Copy link
Collaborator

@jiaruidong2017 jiaruidong2017 commented Nov 13, 2024

This PR uses the python BUFR2IODA API to read the GTS bufr snow observations and adds to read the state of ground (SOGR) for bringing in the extra snow depth reports. If the state of the ground says there is no snow, but the snow depth observation is missing, we should change the missing values to be 0 so that the reports can be assimilated.

This PR also incorporates the latest changes from submodule from NOAA-EMC/bufr-query.

This PR contributes to NOAA-EMC/global-workflow#3093

@@ -2,3 +2,4 @@ mkdir:
- '{{ DATA }}/obs'
copy:
- ['{{PARMgfs}}/gdas/snow/obs/config/bufr_sfcsno_mapping.yaml', '{{ DATA }}/obs/']
- ['{{PARMgfs}}/gdas/snow/obs/config/ioda_bufr_python_encoder.py', '{{ DATA }}']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the python script need copied? and should it go here? we might need to talk to @emilyhcliu about the progress of SPOC

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CoryMartin-NOAA for your suggestions. Do you agree if I put this python code to sorc/gdas.cd/parm/snow/, so that this code can be used directly?
@emilyhcliu Any suggestions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this py code to the parm/gdas/snow/ directory for a temporary solution. @emilyhcliu Please let me know if you have any other suggestions. Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scripts shouldn't be in parm/, until we move to SPOC, it should probably go here: https://github.com/NOAA-EMC/GDASApp/tree/develop/ush/ioda/bufr2ioda

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. Wait for @emilyhcliu suggestions.

Copy link
Collaborator

@ClaraDraper-NOAA ClaraDraper-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this @jiaruidong2017 . I have requested one small change.


sogr = container.get('variables/groundState')
snod = container.get('variables/totalSnowDepth')
snod[(sogr <= 11.0) | (sogr == 15.0)] = 0.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without having seen the data, I think there should be an if statement here - so that we only replace the snod when it was undefined / missing.

i.e., if snod > 0. (or some, slighty higher threshold), and sogr indictaes no snow cover, I assume we would want to use the snod value, not 0.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above (with an if statement) is all consistent with how the PR description.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestions. I will investigate this.

@rmclaren Do you have any suggestions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for the good findings. I modified to only replace the filled/missing snod.

- bounding:
variable: totalSnowDepth
lowerBound: 0
upperBound: 10000000
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding:

  • is this introducing a 10 m limit on snow depth obs?
  • if they are above 10 m, are they discarded, or set to 10.?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I previously put this large number (10000m) here for removing the missing values. Here, we don't need this any more because we need to set the missing snod values to 0 when sogr satisfies the defined conditions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might still have missing values though wouldn't we?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The missing values will be removed after applying sogr conditions.


return new_container

def create_obs_group(input_path):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments here would be helpful to explain what we're doing - specifically what snogr is, and why we're selecting out the values that we are.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent suggestions. I will add the comments here later. Thanks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the comments to explain the steps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiaruidong2017 move this to https://github.com/NOAA-EMC/GDASApp/tree/develop/ush/ioda/bufr2ioda but change its name to something more descriptive/specific until we can move things to SPOC

rmclaren
rmclaren previously approved these changes Nov 14, 2024
Comment on lines 40 to 41
encoder = Encoder(YAML_PATH)
data = next(iter(encoder.encode(masked_container).values()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be 1 liner data = next(iter(Encoder(YAML_PATH).encode(masked_container).values())). Up to you :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rmclaren for your suggestions. Will change it.

Comment on lines 11 to 12
input_path: '{{ DATA }}/obs/{{ OPREFIX }}sfcsno.tm00.bufr_d'
obsfile: '{{ DATA }}/obs/{{ OPREFIX }}sfcsno.tm00.bufr_d'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add an issue so they do not assume the formatting of the YAML file (obsfile variable shoould not be necessary). Reading the backend engines YAML breaks encapsulation (external modules should not be dependent on the private configuration for a backend type. Some backends may not have any kind of file associated with them (like a db...).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes thanks @rmclaren I'm aware of that issue, I/we/others need to figure out how best to handle it on the workflow side.

type: script
script file: "{{ USHgfs }}/bufr2ioda_sfcsno_bufr_encoder.py"
args:
input_path: '{{ DATA }}/obs/{{ OPREFIX }}sfcsno.tm00.bufr_d'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:
Make the mapping file as input to the Python script for create_obs_group

args:
 input_path:  '{{ DATA }}/obs/{{ OPREFIX }}sfcsno.tm00.bufr_d'
 mapping_path: = "./obs/bufr_sfcsno_mapping.yaml"

This will make it clear that the script backend requires two inputs: bufr file and the mapping file

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @emilyhcliu for the suggestions. I made the changes as suggested.

@emilyhcliu
Copy link
Collaborator

@CoryMartin-NOAA For v17 global-workflow, do we want to read the bufr as the backend (this PR), or do we want to convert the bufr info IODA format netCDF in the obsproc step?

@CoryMartin-NOAA
Copy link
Contributor

@emilyhcliu I think my preference would be backend for this (and other) case where the input file is BUFR.

@emilyhcliu
Copy link
Collaborator

@jiaruidong2017 Have you run a cycle in global-workflow with the changes from this PR?

@emilyhcliu
Copy link
Collaborator

@jiaruidong2017 Could you point me where to get a sfcsno bufr data for testing? Thanks.

@jiaruidong2017
Copy link
Collaborator Author

@jiaruidong2017 Could you point me where to get a sfcsno bufr data for testing? Thanks.

@emilyhcliu You can find the data on hera from the dump directory at: /scratch1/NCEPDEV/global/glopara/dump/gdas.$YYYY$MM$DD/$HH/atmos/gdas.t${HH}z.sfcsno.tm00.bufr_d for the period from 2021-07-01 to 2022-05-31.

@jiaruidong2017
Copy link
Collaborator Author

@jiaruidong2017 Have you run a cycle in global-workflow with the changes from this PR?

Yes, it worked as expected.

@emilyhcliu
Copy link
Collaborator

emilyhcliu commented Nov 24, 2024

@jiaruidong2017 Could you point me where to get a sfcsno bufr data for testing? Thanks.

@emilyhcliu You can find the data on hera from the dump directory at: /scratch1/NCEPDEV/global/glopara/dump/gdas.$YYYY$MM$DD/$HH/atmos/gdas.t${HH}z.sfcsno.tm00.bufr_d for the period from 2021-07-01 to 2022-05-31.

Oh, no wonder I could not find the near-real-time data. Thanks!!

@emilyhcliu
Copy link
Collaborator

@jiaruidong2017 I tested the Python script for converter for 2021080100 sfcsno bufr

  • The totalSnowDepth data count` for all data (missing + valid) is 138359
  • The totalSnowDepth data count` for the masked data (based on missing + state of the ground condition) is 3685

The final IODA output file has data account = 3685

This is the result of my test. Do those data count values look right to you?

@jiaruidong2017
Copy link
Collaborator Author

@jiaruidong2017 Could you point me where to get a sfcsno bufr data for testing? Thanks.

@emilyhcliu You can find the data on hera from the dump directory at: /scratch1/NCEPDEV/global/glopara/dump/gdas.$YYYY$MM$DD/$HH/atmos/gdas.t${HH}z.sfcsno.tm00.bufr_d for the period from 2021-07-01 to 2022-05-31.

Oh, no wonder I could not find the near-real-time data. Thanks!!

The real-time data will be available soon (expected in next January) after the next release of obsproc v1.3.

@jiaruidong2017
Copy link
Collaborator Author

@jiaruidong2017 I tested the Python script for converter for 2021080100 sfcsno bufr

  • The totalSnowDepth data count` for all data (missing + valid) is 138359
  • The totalSnowDepth data count` for the masked data (based on missing + state of the ground condition) is 3685

The final IODA output file has data account = 3685

This is the result of my test. Do those data count values look right to you?

Yeah, it looks good to me. Thanks.

@jiaruidong2017
Copy link
Collaborator Author

@emilyhcliu You can find the near-real-time sfcsno dump file on wcoss2 at below:

/lfs/h2/emc/stmp/iliana.genkova/CRON/R13/com/obsproc/v1.3/gdas.20241124/00/atmos/gdas.t00z.sfcsno.tm00.bufr_d

emilyhcliu
emilyhcliu previously approved these changes Nov 26, 2024
ush/ioda/bufr2ioda/bufr2ioda_sfcsno_bufr_encoder.py Outdated Show resolved Hide resolved
@@ -65,11 +62,18 @@ encoder:
coordinates: "longitude latitude"
source: variables/stationIdentification
longName: "Identification of Observing Location"
units: "m"
units: "index"

# ObsValue
- name: "ObsValue/totalSnowDepth"
coordinates: "longitude latitude"
source: variables/totalSnowDepth
longName: "Total Snow Depth"
units: "mm"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

@jiaruidong2017 jiaruidong2017 Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to keep using the mm here in the observation files to match the unit used in the UFS model for the snow depth.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiaruidong2017 Is it possible to read in the totalSnowDepth in m unit from the IODA/NetCDF file and then convert it to mm in the UFS model?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible, but will require additional efforts. We can discuss this issue with the physics team.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiaruidong2017 I guess only a one-line change is necessary for UFS. You just need to convert totalSnowDepth from m to mm where it is read from IODA obs file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue (mismatch between units proscribed by JEDI and those used in the UFS) is bigger than this PR - as all other snow depth IODA files are in mm too. If we change these observations to mm, we'll need to change the others. I suggest that we leave this in mm for now, and create a separate issue to address the unit mismatch (by introducing a unit transform somewhere).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Let's keep it as "mm" for now.

Comment on lines 27 to 28
transforms:
- scale: 1000.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on IODA data convention, the unit of totalSnowDepth is m. We should remove the conversion of it from meter to millimeter.

Copy link
Collaborator Author

@jiaruidong2017 jiaruidong2017 Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above to keep use the conversion.

@@ -65,11 +62,18 @@ encoder:
coordinates: "longitude latitude"
source: variables/stationIdentification
longName: "Identification of Observing Location"
units: "m"
units: "index"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiaruidong2017 The unit of stationIdentification is not defined (empty) in the IODA convention table, and it is left for users to decide. Have you discussed the unit for stationIdentification with JCSDA (Ben Ruston)?
Let's leave it as "index" for now and I will open an issue in IODA repository and check if we can use "index" as unit for this variable.

Comment on lines 36 to 37
snod[(sogr <= 11.0) & snod.mask] = 0.0
snod[(sogr == 15.0) & snod.mask] = 0.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was your intention to remove these values (treat them as missing) or just to set them to 0 as you have done here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to update the mask you should do something like this (instead of zeroing things out):

snod.mask = (sogr <= 11.0) & snod.mask
snod.mask = (sogr == 15.0) & snod.mask 

Copy link
Collaborator Author

@jiaruidong2017 jiaruidong2017 Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to set the snod to 0 here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just FYI I don't think this code does exactly what you think it does. The result of the container get returns a numpy masked array. snod.mask == True values indicate those values that should be masked out (invalid/missing values).

I think the following might be more correct...

snod[(sogr <= 11.0) & (~snod.mask)] = 0.0

I don't think you really need to manually mask out the masked values like this (since you are removing them later anyways). You could probably just do this:

snod[(sogr <= 11.0) | (sogr == 15.0)] = 0.0

Copy link
Collaborator Author

@jiaruidong2017 jiaruidong2017 Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rmclaren
We just want to change the snod to zero for the missing snod values. If the snod has the valid values, we will ignore the sogr conditions and keep to use the snod values (probably is 0 or some other numbers).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants