This is Jenna Baughman and Sara Stoudt's project for UC Berkeley's Data Science for the 21st Century Training program.
Our clients are researchers who are interested in incorporating climate models into their work but are daunted by the many climate products available. Our tool will be most useful to researchers whose expertise lies outside climate science and data science. We envision researchers in other fields using our visualizations to compare and choose appropriate climate models for their work and to more easily extract time series for a single location of interest.
Scientists who do not focus on climate may want to incorporate future climate projections into their work but find themselves overwhelmed by the volume of data and the many climate model choices. These researchers may not know which of the many models to choose and may want to better understand the differences between them for their geographic location of interest. The process of synthesizing all 21+ models of historical and future projected climate into a digestible format is time-consuming and computationally intensive. Furthermore, learning the techniques to perform these preliminary data analyses might not be within the researcher’s scope of interest. Our strategy is to make this “pseudo-optimization” process easier and more accessible, so that a researcher can make a targeted choice of models instead of merely a convenient one while avoiding the overhead discussed above.
NASA Earth Exchange (NEX) has consolidated 21 climate products that contain historical minimum and maximum temperature and precipitation, as well as projections of these variables under two climate scenarios. NEX has downscaled all of these products onto a common global grid (0.25 degrees, roughly 25 km x 25 km) at daily resolution from 1950 to 2100. Our pre-processing pipeline consists of scraping the data from NEX, aggregating it into digestible snapshots, and visualizing the consolidated output. We provide reproducible code to scrape data from NEX, including code to check that the download is complete and to re-attempt files that failed the first time. We suspect that time or size limits on connections to the NEX server prevent all downloads from succeeding in a single sitting.
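The completeness check and re-attempt step can be sketched as follows. This is a minimal illustration in Python, not the project's actual scraping code: the function names, the size-based completeness test, and the `fetch` callable (which stands in for whatever actually transfers a file from NEX) are all assumptions made for the example.

```python
import os
from typing import Callable, Dict, List


def missing_files(expected: Dict[str, int], directory: str) -> List[str]:
    """Return the names that are absent or smaller than their expected size in bytes."""
    missing = []
    for name, size in expected.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or os.path.getsize(path) < size:
            missing.append(name)
    return sorted(missing)


def download_all(
    expected: Dict[str, int],
    directory: str,
    fetch: Callable[[str, str], None],  # hypothetical: fetch(name, destination_path)
    max_rounds: int = 3,
) -> List[str]:
    """Fetch every incomplete file, repeating for up to max_rounds passes.

    Returns the names that are still incomplete after the last pass.
    """
    for _ in range(max_rounds):
        todo = missing_files(expected, directory)
        if not todo:
            break
        for name in todo:
            try:
                fetch(name, os.path.join(directory, name))
            except OSError:
                # Dropped connection or timeout: leave the file for the next pass.
                continue
    return missing_files(expected, directory)
```

Because the check is re-run at the start of every pass, the same function can also be called again in a later session to pick up where an interrupted download left off.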
We expect our user to first look through the climate projections to pick one or a few that are suitable for their question of interest in their location of interest. Climate products are known not to be equally suitable for every location on the globe, and suitability can be defined by many measures. For example, a user might want to test an extreme scenario such as a maximum increase in annual precipitation for a country. We choose two simple metrics to start: trend and variance. A user could easily adapt our code to define their own metrics of interest, but they would have to re-run much of the pre-processing code to do so. We create a global snapshot of trend in future scenarios by calculating the yearly mean at each grid cell and fitting a robust linear model to obtain a slope. These slopes are then visualized via a heatmap. Similarly, a variance heatmap is created by calculating the variance of the yearly means at each location. This information can indicate predicted climates that have extreme weather, such as colder winters and hotter summers with no change in annual average temperature and no directional trend over time.
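As a rough sketch of the two metrics, the snippet below computes a per-cell slope and variance from an array of yearly means. It assumes the yearly means are already available as a NumPy array of shape `(n_years, n_lat, n_lon)`. The text does not say which robust linear model is used, so the Theil-Sen estimator (the median of all pairwise slopes) is swapped in here purely as one example of a robust fit; the function names and the use of the sample variance (`ddof=1`) are likewise assumptions.

```python
import numpy as np


def theil_sen_slope(x: np.ndarray, y: np.ndarray) -> float:
    """Median of the slopes between every pair of points (robust to outliers)."""
    i, j = np.triu_indices(len(x), k=1)
    return float(np.median((y[j] - y[i]) / (x[j] - x[i])))


def trend_and_variance(years: np.ndarray, yearly_means: np.ndarray):
    """Return (slope, variance) grids, each of shape (n_lat, n_lon).

    years: distinct year labels, shape (n_years,)
    yearly_means: shape (n_years, n_lat, n_lon)
    """
    n_years, n_lat, n_lon = yearly_means.shape
    flat = yearly_means.reshape(n_years, -1)
    slopes = np.array(
        [theil_sen_slope(years, flat[:, k]) for k in range(flat.shape[1])]
    )
    # Sample variance of the yearly means at each grid cell.
    variance = yearly_means.var(axis=0, ddof=1)
    return slopes.reshape(n_lat, n_lon), variance
```

Looping over cells in Python is slow on a full 0.25 degree global grid; the loop is kept here only because it makes the per-cell fit easy to read and to replace with a different metric.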
Once a user has chosen climate models based on these scenario snapshots, they can apply their domain knowledge of the area of interest: checking some simple metrics of the historical data shows whether the models match their intuition and/or knowledge of the area. We calculate a yearly average and standard deviation at each location and stitch together heatmaps over time for this purpose.
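The yearly aggregation could look something like the sketch below, which assumes the daily values for one variable and one model are held in memory as a NumPy array of shape `(n_days, n_lat, n_lon)` with a matching array of calendar years. The function name, the in-memory layout, and the choice of the sample standard deviation (`ddof=1`) are assumptions for illustration.

```python
import numpy as np


def yearly_mean_and_std(daily: np.ndarray, years: np.ndarray):
    """Aggregate daily grids into one mean grid and one std grid per year.

    daily: shape (n_days, n_lat, n_lon)
    years: calendar year of each day, shape (n_days,)
    Returns (labels, mean, std); mean and std have shape (n_years, n_lat, n_lon).
    """
    labels = np.unique(years)
    mean = np.stack([daily[years == y].mean(axis=0) for y in labels])
    std = np.stack([daily[years == y].std(axis=0, ddof=1) for y in labels])
    return labels, mean, std
```

Each slice `mean[k]` or `std[k]` is then one frame of the heatmap sequence for year `labels[k]`.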
Our primary deliverable is a pair of climate-related data visualization tools built on the simple analyses described above, together with the code used to create them. The first visualization is an interactive map of downscaled historical climate models. It lets researchers compare models of historical climate in their region of interest and select among them based on their own understanding of past climate for their study area. The user will be able to scroll through yearly maps of either precipitation or temperature (maximum and minimum) over the world map. The user will also be able to select which models to view and compare them visually, side by side. Ideally, the researcher would be able to choose from all 21 models but, due to limited computational power, we will only be able to provide visualizations for two models. We will, however, provide the code to build the rest.
Our second visualization depicts the time trend for the climate projections as a static global heatmap. Again, the user will be able to choose from a drop-down list of models. Ideally, a user would be able to click on a grid cell of this map and see the time series that led to the particular trend coefficient. We can do this for a few grid cells as a proof of concept, but doing this for every grid cell is beyond our computational resources. A future direction would be to incorporate these time series into the Berkeley Tree Database (BTrDB) framework, which allows large numbers of multi-scale time series to be easily stored and displayed. We are talking with David Culler about this, but this aspect will not be a deliverable in our time frame.
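For the proof-of-concept grid cells, pulling out the time series behind a single cell amounts to finding the cell nearest a point of interest and slicing the yearly array there. The sketch below assumes 1-D arrays of cell-center latitudes and longitudes and a yearly array of shape `(n_years, n_lat, n_lon)`; the function names are hypothetical, and the longitude wrap-around is included so that a point given in the -180 to 180 convention still matches a grid stored as 0 to 360.

```python
import numpy as np


def nearest_cell(lats: np.ndarray, lons: np.ndarray, lat: float, lon: float):
    """Return (lat_index, lon_index) of the grid cell center closest to a point."""
    # Wrap longitude differences into [-180, 180) so -122 and 238 are treated alike.
    lon_diff = np.abs((lons - lon + 180.0) % 360.0 - 180.0)
    return int(np.argmin(np.abs(lats - lat))), int(np.argmin(lon_diff))


def cell_time_series(yearly, lats, lons, lat, lon):
    """Extract the yearly series for the cell nearest (lat, lon)."""
    i, j = nearest_cell(lats, lons, lat, lon)
    return yearly[:, i, j]
```

For example, with cell centers assumed to run from -89.875 to 89.875 degrees latitude and 0.125 to 359.875 degrees longitude in 0.25 degree steps, `nearest_cell(lats, lons, 37.87, -122.27)` would locate the cell covering Berkeley.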
These visualizations and preliminary analyses will provide researchers with a streamlined, coherent way to quickly scan and compare the many downscaled models under different climate scenarios. We hope that access to these comparisons will allow researchers to confidently and efficiently incorporate future climate projections into their projects.
See presentation.pdf for slides about this project.