Merged upstream changes #7

Merged
merged 22 commits into from
Aug 2, 2024

davereinhart
Collaborator

Merging with upstream repo to capture changes related to clade 24C.

joverlee521 and others added 22 commits April 18, 2024 17:05
Bumps [conda-incubator/setup-miniconda](https://github.com/conda-incubator/setup-miniconda) from 2 to 3.
- [Release notes](https://github.com/conda-incubator/setup-miniconda/releases)
- [Changelog](https://github.com/conda-incubator/setup-miniconda/blob/main/CHANGELOG.md)
- [Commits](conda-incubator/setup-miniconda@v2...v3)

---
updated-dependencies:
- dependency-name: conda-incubator/setup-miniconda
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v2...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
This reduces the threshold for a clade to be included (and not wrapped into "other") from 5000 total sequences to 2000 total sequences. This causes clade 24B to be broken out in the current data.
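
As a rough illustration of the inclusion rule described above, the sketch below folds any clade under the total-sequence threshold into "other". The function name, the constant name, the inclusive `>=` comparison, and the example counts are all assumptions made for illustration; they are not taken from the repository.

```python
from collections import Counter

# Hypothetical constant mirroring the new threshold from the commit message.
CLADE_MIN_SEQ = 2000


def collapse_small_clades(counts, min_seq=CLADE_MIN_SEQ):
    """Fold clades with fewer than `min_seq` total sequences into "other"."""
    collapsed = Counter()
    for clade, n in counts.items():
        # Assumption: a clade exactly at the threshold is kept.
        key = clade if n >= min_seq else "other"
        collapsed[key] += n
    return dict(collapsed)


# Made-up counts, for illustration only.
example = {"23I": 40000, "24A": 12000, "24B": 3100, "24C": 900}
result = collapse_small_clades(example)
```

With these made-up counts, 24B stays a separate entry at the 2000 threshold, whereas passing `min_seq=5000` would fold it into "other".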
This commit drops the location count threshold (i.e., the number of sequences collected in the past 30 days) from 100 to 50 for clade-level analysis and from 300 to 150 for lineage-level analysis.

With current data, this increases the number of locations included for clades from 8 to 11.

With current data, this increases the number of locations included for lineages from 5 to 7.

To support these thresholds, I looked at location counts for the countries analyzed in https://bedford.io/papers/abousamra-ncov-forecasting-fit/. We see:
- Trinidad and Tobago with 2.3k sequences collected in 2022 and a median 30-day sequence count of 43 with a mean absolute forecasting error of 12%
- Vietnam with 6k sequences collected in 2022 and a median 30-day sequence count of 30 with a mean absolute forecasting error of 11%
- South Africa with 16k sequences collected in 2022 and a median 30-day sequence count of 170 with a mean absolute forecasting error of 7%

I believe this suggests that a threshold of 50 sequences in the previous 30 days should be roughly consistent with a ~10% forecasting error. This seems like an okay threshold for public display.

It's less certain what count threshold to use for lineages, where we have a significantly larger number of labels than we do for clades. Keeping a 3x ratio here for now.
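
The location filter described in this commit can be sketched roughly as follows: sum each location's sequences over the trailing 30 days and keep the locations that meet the threshold. All names, the record layout, the exact window boundaries, and the inclusive comparison are assumptions for illustration; only the values 50 and 150 and the 3x ratio come from the commit message.

```python
from datetime import date, timedelta

# Hypothetical names; the values are the thresholds stated in the commit message.
CLADE_MIN_RECENT = 50
LINEAGE_MIN_RECENT = 3 * CLADE_MIN_RECENT  # 150, keeping the 3x ratio


def recent_counts(records, today, window_days=30):
    """Sum sequence counts per location over the trailing window."""
    cutoff = today - timedelta(days=window_days)
    totals = {}
    for location, day, n in records:
        # Assumption: the window excludes the cutoff day and includes today.
        if cutoff < day <= today:
            totals[location] = totals.get(location, 0) + n
    return totals


def included_locations(records, today, min_recent):
    """Return locations whose recent sequence count meets the threshold."""
    totals = recent_counts(records, today)
    return sorted(loc for loc, n in totals.items() if n >= min_recent)


# Made-up records of (location, collection date, sequence count).
records = [
    ("A", date(2024, 7, 20), 60),
    ("A", date(2024, 6, 1), 500),  # outside the 30-day window
    ("B", date(2024, 7, 25), 160),
    ("C", date(2024, 7, 30), 40),
]
today = date(2024, 8, 1)
clade_locations = included_locations(records, today, CLADE_MIN_RECENT)
lineage_locations = included_locations(records, today, LINEAGE_MIN_RECENT)
```

In this made-up example, locations A and B pass the clade-level threshold, while only B passes the stricter lineage-level threshold.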
Update the numbers in the viz app to match the threshold numbers
changed in the previous commit.
… sessions

I'll also be removing the corresponding repository secrets.  Both that
and this commit are required to effect the switch.

Related-to: <nextstrain/private#110>
The case counts scripts used csvtk before csvtk was officially
added to nextstrain/docker-base so I worked around this by just running
them directly in the GH Action workflow.

Our push to use short-lived AWS credentials has finally pushed me to
put this into a proper Snakemake workflow.
Refactored to use the shared `pathogen-repo-build` GH Action workflow
so that it can use the short-lived AWS credentials that are
automatically set up within the workflow.
Follow up on new Nextstrain clades added in
nextstrain/ncov#1117

Updated clade colors. Rather than changing and expanding the color gradient,
I just removed the last clade in the list (23A) to keep colors consistent.
@davereinhart merged commit d3b0471 into main on Aug 2, 2024
1 check passed
6 participants