update methodology vignettes #919

Open: sbfnk wants to merge 5 commits into main

@sbfnk sbfnk commented Jan 10, 2025

Description

This PR closes #916.

Initial submission checklist

  • My PR is based on a package issue and I have explicitly linked it.
  • I have tested my changes locally (using devtools::test() and devtools::check()).
  • I have added or updated unit tests where necessary.
  • I have updated the documentation if required and rebuilt docs if yes (using devtools::document()).
  • I have followed the established coding standards (and checked using lintr::lint_package()).
  • I have added a news item linked to this PR.

After the initial Pull Request

  • I have reviewed Checks for this PR and addressed any issues as far as I am able.

@@ -33,6 +33,7 @@
- Brought the docs on `alpha_sd` up to date with the code change from prior PR #853. By @zsusswein in #862 and reviewed by @jamesmbaazam.
- The `...` argument in `estimate_secondary()` has been removed because it was not used. By @jamesmbaazam in #894 and reviewed by @.
- All examples now use the natural parameters of distributions rather than the mean and standard deviation when specifying uncertain distributions. This is to eliminate warnings and encourage best practice. By @jamesmbaazam in #893 and reviewed by @sbfnk.
- Updated the methodology vignettes, By @sbfnk in #919 and reviewed by.

Suggested change
- Updated the methodology vignettes, By @sbfnk in #919 and reviewed by.
- Updated the methodology vignettes, By @sbfnk in #919 and reviewed by @seabbs.

\end{align}

where $I_{t}$ is the number of latent infections on day $t$, $r$ is the estimate of the initial growth rate, and $I_\mathrm{obs}$ and $r_\mathrm{obs}$ are estimated from the first week of observed data, respectively, as as the point estimates of intercept and slope from fitting a linear regression model to the first 7 days of data (or all data if fewer than 7 days of data are given),
where $I_{t}$ is the number of latent infections on day $t$, $r$ is the estimate of the initial growth rate, $\xi$ is the proportoin reported (see [Delays and scaling]) and $I_\mathrm{init}$ and $r_\mathrm{init}$ are estimated, respectively, as the point estimates of intercept and slope from fitting a linear regression model to the first 7 days of data (or all data if fewer than 7 days of data are given),

Suggested change
where $I_{t}$ is the number of latent infections on day $t$, $r$ is the estimate of the initial growth rate, $\xi$ is the proportoin reported (see [Delays and scaling]) and $I_\mathrm{init}$ and $r_\mathrm{init}$ are estimated, respectively, as the point estimates of intercept and slope from fitting a linear regression model to the first 7 days of data (or all data if fewer than 7 days of data are given),
where $I_{t}$ is the number of latent infections on day $t$, $r$ is the estimate of the initial growth rate, $\xi$ is the proportion reported (see [Delays and scaling]) and $I_\mathrm{init}$ and $r_\mathrm{init}$ are estimated, respectively, as the point estimates of intercept and slope from fitting a linear regression model to the first 7 days of data (or all data if fewer than 7 days of data are given),
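
For readers following the thread, the point estimates described in this line can be sketched as below. This is an illustrative Python sketch, not the package's R or Stan code. It assumes that the regression is of log counts on the day index (the text only says "linear regression"), that days are indexed from 0, and that all counts are strictly positive. The function name `seeding_point_estimates` and its `window` argument are hypothetical.

```python
import math


def seeding_point_estimates(counts, window=7):
    """Least-squares fit of log(counts) on the day index.

    Uses the first `window` days, or all data if fewer are given.
    Returns (intercept, slope), standing in for I_init and r_init.
    Assumes at least two strictly positive counts.
    """
    y = [math.log(c) for c in counts[:window]]
    n = len(y)
    t = list(range(n))  # day index, starting at 0 (an assumption)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return intercept, slope


# Made-up example counts, for illustration only
example_counts = [12, 15, 14, 19, 23, 26, 31, 40, 44]
i_init, r_init = seeding_point_estimates(example_counts)
print(f"intercept (log scale): {i_init:.3f}, slope per day: {r_init:.3f}")
```

With fewer than seven observations the slice simply uses everything available, matching the "or all data if fewer than 7 days of data are given" clause.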

r &\sim \mathrm{Normal}(r_\mathrm{obs}, 0.2)\\
I_{0 < t \leq t_\mathrm{seed}} &= I_0 \exp \left(r t \right)
I_0 &\sim \mathrm{LogNormal}(I_\mathrm{init}, \sqrt{I_\mathrm{init}}) \\
r &\sim r_\mathrm{init} + (I_\mathrm{init} - I_0) \\

Suggested change
r &\sim r_\mathrm{init} + (I_\mathrm{init} - I_0) \\
r &\sim r_\mathrm{init} + (I_\mathrm{init} - I_0) \\

Looking at this reminds me to ask: did you try this both with and without normalising by the standard deviation?

Contributor Author (@sbfnk):

I initially had it divided by the seeding time, but that was fairly poorly motivated. I assumed that we'd replace this with the R->r solution anyway, so I didn't dwell on it too much. If you can think of a more appropriate scaling factor here, it would probably be a good thing to include.

Contributor:

My suggestion was the standard deviation, so that the scaling is the same regardless of the magnitude of the initial infections. I don't think we expect it to scale with the count magnitude, do we?

Contributor Author (@sbfnk):

Perhaps the best thing is to just discard the approach and go straight with #920 (comment) rather than trying to come up with something good here.
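
The scaling question in this thread can be made concrete with a small numerical sketch. It is illustrative Python only, not code from the package. It assumes the standard deviation is `sqrt(I_init)`, as in the proposed prior on `I_0`, and it treats `I_init` and `I_0` as being on the same scale. The function `growth_rate_shift` is a hypothetical helper.

```python
import math


def growth_rate_shift(i_init, i_0, normalise=False):
    """Adjustment added to r_init: (I_init - I_0), optionally divided by the sd.

    The sd is taken to be sqrt(I_init), an assumption borrowed from the
    proposed prior on I_0.
    """
    diff = i_init - i_0
    if normalise:
        return diff / math.sqrt(i_init)
    return diff


# Place I_0 one standard deviation below I_init at several magnitudes
for i_init in (4.0, 9.0, 16.0):
    i_0 = i_init - math.sqrt(i_init)
    raw = growth_rate_shift(i_init, i_0)
    scaled = growth_rate_shift(i_init, i_0, normalise=True)
    print(f"I_init={i_init}: raw shift={raw}, normalised shift={scaled}")
```

Under these assumptions, a one-standard-deviation departure of `I_0` shifts `r` by `sqrt(I_init)` in the unnormalised form, so the shift grows with magnitude, while the normalised form gives the same shift at every magnitude.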

\end{equation}

where $g(\tau|\mu_{g}, \sigma_{g})$ is the distribution of generation times (with discretised gamma or discretised log normal distributions available as options) with mean (or log mean in the case of lognormal distributions) $\mu_g$, standard deviation (or log standard deviation in the case of lognormal distributions) $\sigma_g$ and maximum $g_\mathrm{max}$.
Generation times can either be specified as coming from a distribution with uncertainty by giving mean and standard deviations of normal priors, weighted by default by the number of observations (although this can be changed by the user) and truncated to be positive where relevant for the given distribution; or they can be specified as the parameters of a fixed distribution, or as fixed values.
where $g(\tau|\mu_{g}, \sigma_{g})$ is the discretised distribution of generation times with parameters $\theta_g$ and maximum $g_\mathrm{max}$.

Suggested change
where $g(\tau|\mu_{g}, \sigma_{g})$ is the discretised distribution of generation times with parameters $\theta_g$ and maximum $g_\mathrm{max}$.
where $g(\tau | \theta_g)$ is the discretised distribution of generation times with parameters $\theta_g$ and maximum $g_\mathrm{max}$.
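
To illustrate what a discretised generation time distribution with maximum $g_\mathrm{max}$ might look like, here is a minimal Python sketch. It is an assumption-laden illustration, not the package's discretisation scheme: it takes a lognormal distribution (one of the options named in the removed text), assigns to each day $\tau$ the probability mass on $(\tau - 1, \tau]$, truncates at `g_max`, and renormalises. The function names and the parameters `meanlog` and `sdlog` are hypothetical.

```python
import math


def lognormal_cdf(x, meanlog, sdlog):
    """CDF of a lognormal distribution, using the error function."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - meanlog) / (sdlog * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def discretised_generation_time(meanlog, sdlog, g_max):
    """Probability mass for tau = 1, ..., g_max.

    Each day gets the continuous mass on (tau - 1, tau]; the result is
    truncated at g_max and renormalised to sum to one.
    """
    raw = [
        lognormal_cdf(tau, meanlog, sdlog) - lognormal_cdf(tau - 1, meanlog, sdlog)
        for tau in range(1, g_max + 1)
    ]
    total = sum(raw)
    return [p / total for p in raw]


# Made-up parameter values, for illustration only
pmf = discretised_generation_time(meanlog=1.0, sdlog=0.5, g_max=15)
for tau, p in enumerate(pmf, start=1):
    print(tau, round(p, 4))
```

Here `(meanlog, sdlog)` plays the role of $\theta_g$; swapping in a different family would only change the CDF function.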

Successfully merging this pull request may close these issues.

update methodology vignettes
3 participants