First draft of output schema for the Rt postprocessing #1
base: main
Conversation
> [data.cdc.gov](), and to [DCIPHER](). By convention, the pipeline always
> generates outputs for publication, even if the input data is from an experiment,
> test run, or backfill exercise that is not intended for release. The publication
> outputs are not costly to generate and our person-driven releae process ensures
Suggested change:
- outputs are not costly to generate and our person-driven releae process ensures
+ outputs are not costly to generate and our person-driven release process ensures
I feel like R would be easier, since a lot of the post-processing is currently in R, but I have no strong preference.
I like the idea of running in VAP and writing to/from blob. When Kingsley or I recreate the output figures (to process state exclusions, for example), ~75% of the time it takes is spent waiting for the nodes to spin up in Azure Batch.
My initial thought is that this should start off as a repo or two, and then perhaps become a package later in its life cycle, once a clear path for its use in other pipelines/teams has been established.
Here's my attempt; let's see how close I am to the consensus.
Regarding a potential dashboard, I'm wondering if we might want to create some of the anomaly figures, or create RDS files encoding those figures, within the post-processing pipeline? The current anomaly report takes a while to generate because of the computational demand of reading and processing the input files (model RDS files, gold parquet files, latent.csv files). Any dynamic dashboard would benefit from reading in already-processed figures/data so we don't have to wait.
@natemcintosh, I think that the more up-to-date version is in @zsusswein's issue: #2. @zsusswein, could you update this README to reflect the final plan and submit it for review?
Here is the rendered README with a proposed output structure. Please review and comment! https://github.com/CDCgov/cfa-rt-postprocessing/tree/output-structure?tab=readme-ov-file#cfa-r_t-postprocesing
Things we need to decide in order to move forward:

In scope
- v1: Proposed scope of this project: refactor to generate the basic files as outlined here

Out of scope (for now)
- v2: Add a dashboard
- v3: Rewrite the JS so that the website can also read from merged_release.csv