Study looking at how climate change may impact loss rates for insurance by simulating maize outcomes via a neural network and Monte Carlo. This includes interactive tools to understand results and a pipeline to build a paper discussing findings.
This repository contains three components for a study looking at how crop insurance claims rates may change in the future within the US Corn Belt using SCYM and CHC-CMIP6.
- Pipeline: Contained within the root of this repository, this Luigi-based pipeline trains neural networks and runs Monte Carlo simulations to project future insurance claims under a range of parameters, outputting data to a workspace directory.
- Tools: Within the `paper/viz` subdirectory, the source code for an explorable explanation built using Sketchingpy both creates the static visualizations for the paper and offers web-based interactive tools released to ag-adaptation-study.pub which allow users to iteratively engage with these results.
- Paper: Within the `paper` subdirectory, a manuscript is built from the output data from the pipeline. This describes these experiments in detail with visualizations.
These are described in detail below.
The easiest way to engage with these results is through the web-based interactive explorable explanation which is housed for the public at ag-adaptation-study.pub. The paper preprint can also be found at https://arxiv.org/abs/2408.02217. We also publish our raw pipeline output. Otherwise, see local setup.
For those wishing to extend this work, you can execute this pipeline locally by checking out this repository (`git clone [email protected]:SchmidtDSE/maize-loss-climate-experiment.git`).
First, get access to the SCYM and CHC-CMIP6 datasets and download all of the geotiffs to an AWS S3 Bucket or another location which can be accessed via the file system. This will allow you to choose from two execution options:
- Setup for AWS: This will execute if the `USE_AWS` environment variable is set to 1. This assumes data are hosted remotely in an AWS bucket defined by the `SOURCE_DATA_LOC` environment variable and uses Coiled to execute the computation. After setting the environment variables for access credentials (`AWS_ACCESS_KEY` and `AWS_ACCESS_SECRET`) and setting up Coiled, simply execute the Luigi pipeline as described below.
- Setup for local: If the `USE_AWS` environment variable is set to 0, this will run using a local Dask cluster. This assumes that `SOURCE_DATA_LOC` is a path to the directory housing the input geotiffs. After setting up Coiled, simply execute the Luigi pipeline as described below.
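
The two execution options above are selected purely through environment variables, so it can help to check them before starting a long run. The following is a minimal sketch and not part of the pipeline: `resolve_execution_mode` is a hypothetical helper, and treating an unset `USE_AWS` as 0 is an assumption rather than documented behavior.

```python
import os


def resolve_execution_mode(env=None):
    """Summarize which execution option the environment variables select.

    Hypothetical helper for illustration; the pipeline's own handling of
    these variables may differ.
    """
    env = os.environ if env is None else env

    # Assumption: an unset USE_AWS is treated the same as USE_AWS=0.
    use_aws = env.get("USE_AWS", "0") == "1"

    source = env.get("SOURCE_DATA_LOC")
    if not source:
        raise ValueError("SOURCE_DATA_LOC must be set for either option.")

    # Credentials are only needed for the AWS option.
    missing = []
    if use_aws:
        required = ("AWS_ACCESS_KEY", "AWS_ACCESS_SECRET")
        missing = [name for name in required if not env.get(name)]

    return {
        "mode": "aws" if use_aws else "local",
        "source": source,
        "missing": missing,
    }
```

Calling `resolve_execution_mode()` with no arguments inspects the current process environment; a non-empty `missing` list names the credential variables that still need to be set.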
You can then execute in either of two ways:
- Run directly: First, install the Python requirements (`pip install -r requirements.txt`), optionally within a virtual environment. Then, simply execute `bash run.sh` to execute the pipeline from start to finish. See also `breakpoint_tasks.py` for Luigi targets for running subsets of the pipeline.
- Run through Docker: Simply execute `bash run_docker.sh` to execute the pipeline from start to finish. See also `breakpoint_tasks.py` for Luigi targets for running subsets of the pipeline and update `run.sh`, which is executed within the container. Note that this will operate on the `workspace` directory.
A summary of the pipeline is created in `stats.json`. See local package below for use in other repository components such as the interactive tools or paper rendering. Users may optionally skip some expensive steps by placing the files from https://zenodo.org/records/14533227 into the `workspace` directory.
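
For a quick look at the summary after a run, something like the following may be enough. This is a sketch under assumptions: it assumes `stats.json` is written inside the `workspace` directory, and it makes no assumption about the keys the file contains. `load_summary` is a hypothetical name.

```python
import json
import pathlib


def load_summary(workspace_dir="workspace"):
    """Load the pipeline summary as a Python object.

    Assumption: stats.json sits directly inside the workspace directory.
    """
    path = pathlib.Path(workspace_dir) / "stats.json"
    with path.open() as f:
        return json.load(f)
```

For example, `sorted(load_summary().keys())` lists the top-level entries if the file holds a JSON object.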
Written in Sketchingpy, the tools can be executed locally on your computer, in a static context for building the paper, or through a web browser. First, one needs to get data from the pipeline or download prior results:
- Download prior results: Retrieve the latest results and move them into the viz directory (`paper/viz/data`). Simply use wget to gather model outputs when in the `paper/viz` directory like so: `wget https://ag-adaptation-study.pub/archive/data.zip; unzip data.zip`. If using prior sweep results, download full sweep information like so: `cd data; wget http://ag-adaptation-study.pub/data/sweep_ag_all.csv; cd ..`.
- Use your own results: Update the output data per instructions regarding local package below.
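
Where wget or unzip are unavailable, the download step above could be approximated with the Python standard library. This is a rough sketch rather than a supported workflow: `extract_archive` and `fetch_prior_results` are hypothetical helpers, and the sketch assumes the archive unpacks into a `data` directory as the wget instructions imply.

```python
import pathlib
import urllib.request
import zipfile

# URL taken from the wget instructions above.
ARCHIVE_URL = "https://ag-adaptation-study.pub/archive/data.zip"


def extract_archive(zip_path, target_dir):
    """Unzip an archive into target_dir and list what target_dir holds."""
    target = pathlib.Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return sorted(entry.name for entry in target.iterdir())


def fetch_prior_results(target_dir="paper/viz", url=ARCHIVE_URL):
    """Download data.zip into target_dir and unpack it in place."""
    target = pathlib.Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    zip_path = target / "data.zip"
    urllib.request.urlretrieve(url, str(zip_path))
    return extract_archive(zip_path, target)
```

Running `fetch_prior_results()` from the repository root mirrors `wget ...; unzip data.zip` executed inside `paper/viz`.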
There are two options for executing the tools:
- Docker: You can run the web-based visualizations through a simple Docker file in the `paper/viz` directory (`bash run_docker.sh`).
- Local apps: You can execute the visualizations manually by running them directly as Python scripts. The entry points are `hist_viz.py`, `history_viz.py`, `results_viz_entry.py`, and `sweep_viz.py`. Simply run them without any command line arguments for defaults. Note you may need to install Python dependencies (`pip install -r requirements.txt`).
Note that the visualizations are also invoked through `paper/viz/render_images.sh` for the paper.
Due to the complexities of the software install, the only officially supported way to build the paper is through the Docker image. First update the data:
- Download prior results: Retrieve the latest results and move them into the paper directory (`paper/outputs`).
- Use your own results: Update the output data per instructions regarding local package below.
Then, execute `render_docker.sh` to drop the results into the `paper_rendered` directory.
Instead of retrieving data from https://ag-adaptation-study.pub, you can use your own pipeline data outputs by running `bash package.sh`. This will produce the `data` and `outputs` sub-directories inside of a new `package` directory which can be used for the interactive tools and paper rendering respectively.
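
Once `package.sh` has finished, the packaged files still need to reach the locations the tools and paper read from. The sketch below shows one way this might be done; it assumes `package/data` belongs in `paper/viz/data` and `package/outputs` in `paper/outputs` (the directories named in the sections above), and `install_package` is a hypothetical helper, not a script in this repository.

```python
import pathlib
import shutil


def install_package(package_dir="package", repo_root="."):
    """Copy packaged pipeline outputs to where the tools and paper expect them.

    Assumed mapping (not confirmed by the repository scripts):
    package/data -> paper/viz/data and package/outputs -> paper/outputs.
    """
    package = pathlib.Path(package_dir)
    root = pathlib.Path(repo_root)
    destinations = {
        package / "data": root / "paper" / "viz" / "data",
        package / "outputs": root / "paper" / "outputs",
    }
    for source, destination in destinations.items():
        # dirs_exist_ok (Python 3.8+) merges into an existing directory,
        # overwriting files that share a name.
        shutil.copytree(source, destination, dirs_exist_ok=True)
    return [str(destination) for destination in destinations.values()]
```

Because existing files with the same name are overwritten, it may be worth backing up any previously downloaded results first.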
As part of CI / CD and for local development, the following are required to pass for both the pipeline in the root of this repository and the interactives written in Python at `paper/viz`:
- pyflakes: Run `pyflakes *.py` to check for likely non-style code issues.
- pycodestyle: Run `pycodestyle *.py` to enforce project coding style guidelines.
The pipeline also offers unit tests (`nose2` in root). For the visualizations, tests happen by running the interactives headless (`bash render_images.sh; bash script/check_images.sh`).
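
For local development, the two static checks can be bundled into a single call. This is only an illustrative sketch and not the project's CI configuration: `build_check_commands` and `run_checks` are hypothetical helpers, and the sketch assumes `pyflakes` and `pycodestyle` are installed and on the `PATH`.

```python
import glob
import os
import subprocess


def build_check_commands(directory="."):
    """Build the pyflakes and pycodestyle commands for *.py in a directory."""
    files = sorted(glob.glob(os.path.join(directory, "*.py")))
    if not files:
        return []
    return [["pyflakes", *files], ["pycodestyle", *files]]


def run_checks(directory="."):
    """Run both checks and report whether every one of them passed."""
    # A list (not a generator) so both tools run even if the first fails.
    codes = [
        subprocess.run(command).returncode
        for command in build_check_commands(directory)
    ]
    return all(code == 0 for code in codes)
```

Calling `run_checks(".")` covers the pipeline in the repository root and `run_checks("paper/viz")` covers the interactives.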
To deploy changes to production, merge them to `main`: CI / CD will then automatically release to ag-adaptation-study.pub.
Where possible, please follow the Python Google Style Guide unless an override is provided in `setup.cfg`. Docstrings and type hints are required for all top-level or public members but not currently enforced for private members. JSDoc is required for top-level members. Docstrings and JSDoc are not required for "utility" code.
The pipeline, interactives, and paper can be executed independently and have segregated dependencies. We thank all of our open source dependencies for their contribution.
The pipeline uses the following open source dependencies:
- bokeh under the BSD 3-Clause License.
- boto3 under the Apache v2 License.
- dask under the BSD 3-Clause License.
- fiona under the BSD License.
- geolib under the MIT License.
- geotiff under the LGPL License.
- imagecodecs under the BSD 3-Clause License.
- keras under the Apache v2 License.
- libgeohash under the MIT License.
- Luigi under the Apache v2 License.
- NumPy under the BSD License.
- Pandas under the BSD License.
- Pathos under the BSD License.
- requests under the Apache v2 License.
- scipy under the BSD License.
- shapely by Sean Gillies, Casper van der Wel, and Shapely Contributors under the BSD License.
- tensorflow under the Apache v2 License.
- toolz under the BSD License.
Use of Coiled is optional.
Both the interactives and static visualization generation use the following:
- Jinja under the BSD 3-Clause License.
- Matplotlib under the PSF License.
- NumPy under the BSD License.
- Pandas under the BSD License.
- Pillow under the HPND License.
- pygame-ce under the LGPL License.
- Sketchingpy under the BSD 3-Clause License.
- toolz under the BSD License.
The web version also uses:
- es.js under the ISC License (Andrea Giammarchi).
- micropip under the MPL 2.0 License.
- packaging under the BSD License.
- Popper under the MIT License.
- Pyodide under the MPL 2.0 License.
- Pyscript under the Apache v2 License.
- Sketchingpy under the BSD 3-Clause License.
- Tabby under the MIT License.
- Tippy.js under the MIT License.
- toml (Jak Wings) under the MIT License.
- ua-parser 1.0.36 under the MIT License.
The paper uses the following open source dependencies to build the manuscript:
- Jinja under the BSD 3-Clause License.
- Matplotlib under the PSF License.
- NumPy under the BSD License.
- Pandas under the BSD License.
- Pillow under the HPND License.
- Sketchingpy under the BSD License including the packages included in its stand alone hosting archive.
- toolz under the BSD License.
Users may optionally leverage Pandoc as an executable (not linked) under the GPL but any tool converting markdown to other formats is acceptable or the paper can be built as Markdown only without Pandoc. That said, for those using Pandoc, scripts may also use pandoc-fignos under the GPL License and pandoc-tablenos under the GPL License.
Some executions may also use:
- Docker under the Apache v2 License.
- Docker Compose under the Apache v2 License.
- Nginx under a BSD-like License.
- OpenRefine under the BSD License.
We also use:
- Color Brewer under the Apache v2 License.
- Public Sans under the CC0 License.
Code is released under BSD 3-Clause and data under CC-BY-NC 4.0 International. Please see `LICENSE.md`.