Collection of functions and scripts to investigate how clinical features can be used for prediction.
- Fit a BART propensity score model (function bartMachine::bartMachine) with all available variables.
- Perform BART variable selection (function bartMachine::var_selection_by_permute) to choose variables for the PS model.
- Re-fit a BART propensity score model (function bartMachine::bartMachine) with the selected variables.
- Fit a SparseBCF model (function SparseBCF::SparseBCF) on complete data and perform variable selection:
    - Only include variables with up to ~20% missingness (discard those with more).
    - Select variables with an inclusion proportion above 1/n (n = number of variables used).
- Fit a BCF model (function bcf::bcf) on complete data with the selected variables.
    - Fit two versions of the model:
        - With propensity scores included in the "control" function, mu(x) (use include_pi = "control").
        - Without propensity scores included in the model (use include_pi = "none").
    - Compare individual predictions from both models in order to decide on the use of propensity scores.
- Check model fit for the outcomes:
    - Plot standardised residuals to check for any remaining structure.
- Check model fit for the treatment effects (model fitted in observational data):
    - Plot predicted CATE vs ATE for several ntiles of predicted treatment effect:
        - Propensity score matching 1:1 (check whether matched individuals are well balanced).
        - Propensity score matching 1:1 whilst adjusting for all variables used in the BCF model.
        - Adjust for all variables used in the BCF model.
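
The two SparseBCF selection rules listed above (a missingness cut-off, then an inclusion-proportion threshold of 1/n) can be sketched in base R as follows. This is a minimal illustration, not the repository's code: the column names, the simulated data and the inclusion proportions are all made up, and "~20%" is read here as "at most 20%". In practice the inclusion proportions would come from the fitted SparseBCF model.

```r
# Toy data with hypothetical column names; missing values injected at known rates.
set.seed(1)
n_obs <- 200
dat <- data.frame(
  age  = rnorm(n_obs, 60, 10),
  bmi  = rnorm(n_obs, 30, 5),
  egfr = rnorm(n_obs, 80, 15),
  alt  = rnorm(n_obs, 30, 8)
)
dat$bmi[1:20]  <- NA   # 10% missing: kept
dat$alt[1:100] <- NA   # 50% missing: discarded

# Rule 1: keep variables whose missingness does not exceed the cut-off.
max_missing  <- 0.20                      # assumed reading of "~20%"
prop_missing <- colMeans(is.na(dat))
kept_vars    <- names(prop_missing)[prop_missing <= max_missing]

# Restrict to complete cases on the kept variables before model fitting.
complete_dat <- dat[complete.cases(dat[, kept_vars]), kept_vars]

# Rule 2: select variables with an inclusion proportion above 1/n,
# where n is the number of variables given to the model.
# These proportions are invented for illustration only.
incl_prop     <- c(age = 0.45, bmi = 0.40, egfr = 0.15)
threshold     <- 1 / length(incl_prop)
selected_vars <- names(incl_prop)[incl_prop > threshold]
```

With three candidate variables the threshold is 1/3, so only variables that are used more often than they would be under a uniform split across variables are carried forward to the BCF model.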
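
The treatment-effect check listed above compares the mean predicted CATE with an estimated ATE within ntiles of predicted effect. The sketch below shows only the regression-adjustment variant, on simulated data, using base R. Everything in it is an assumption made for illustration: the variable names, the data-generating process, and `cate_hat`, which stands in for predictions from a fitted BCF model. The propensity score matching variants are not shown.

```r
# Simulated observational data (hypothetical; not from CPRD).
set.seed(42)
n   <- 5000
x1  <- rnorm(n)
x2  <- rnorm(n)
trt <- rbinom(n, 1, plogis(0.5 * x1))     # treatment assignment depends on x1
tau <- 1 + 2 * x2                         # true individual treatment effect
y   <- x1 + x2 + trt * tau + rnorm(n)     # observed outcome
cate_hat <- tau + rnorm(n, sd = 0.2)      # stand-in for model-predicted CATE

# Split individuals into ntiles (here quintiles) of predicted treatment effect.
n_tiles <- 5
breaks  <- quantile(cate_hat, probs = seq(0, 1, length.out = n_tiles + 1))
tile    <- cut(cate_hat, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Within each ntile, estimate the ATE by adjusting for the model covariates
# and pair it with the mean predicted CATE in that ntile.
calib <- do.call(rbind, lapply(seq_len(n_tiles), function(k) {
  idx <- tile == k
  sub <- data.frame(y = y[idx], trt = trt[idx], x1 = x1[idx], x2 = x2[idx])
  fit <- lm(y ~ trt + x1 + x2, data = sub)
  data.frame(
    tile           = k,
    mean_pred_cate = mean(cate_hat[idx]),
    ate            = unname(coef(fit)["trt"]),
    ate_se         = summary(fit)$coefficients["trt", "Std. Error"]
  )
}))

print(calib)
```

Calling `plot(calib$mean_pred_cate, calib$ate)` followed by `abline(0, 1, lty = 2)` gives the calibration plot: points close to the dashed identity line suggest that predicted effects agree with the effects estimated within each ntile.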
(Developed in CPRD: GOLD download)
Files:
- 0.0: Functions
    - .1: Functions for plotting and some calculations.
    - .2: Functions for bartMachine tree analysis. A slightly modified version of the bartMan R package (not kept up to date; package requirements were modified).
- 1.0: Detailed explanation of steps taken in the selection of patients for our cohorts.
- 2.0: Descriptive analysis of data
    - .1: Collection of plots demonstrating specific details/quirks of the dataset.
    - .2: Table description of Development and Validation data
    - .3: Treatment Effects/Variable description table/plot (Model 4.5)
- 3.0: R packages to model causal treatment effect.
    - .1: Fitting of a causal model using the grf R package. This includes an evaluation of model fit.
    - .2: Fitting of a causal model using the bcf R package. This includes an evaluation of model fit.
- 4.0: bartMachine models for treatment heterogeneity.
    - .1: Fitting a BART model with variable selection for propensity score and outcome model. This includes an evaluation of model fit.
    - .2: Fitting a BART model with routine variables in propensity score model and biomarkers in outcome model. This includes an evaluation of model fit.
    - .3: Fitting a BART model with variable selection for propensity score and variable selection using BART + grf for the outcome model. This includes an evaluation of model fit.
    - .4: Fitting a BART model with variable selection using BART + grf for the outcome model. This includes an evaluation of model fit.
    - .5: Fitting a BART model with variable selection using BART + grf for the outcome model. This includes an evaluation of model fit. (Change from 4.4: 'score.excl.mi' is used instead of 'score'.)
    - .6: Fitting a BART propensity score model, variable selection, matching individuals, BART model with all variables.
    - .7: Fitting a BART propensity score model, variable selection, refit propensity score model, BART HbA1c model + propensity score as covariate, variable selection, refit BART HbA1c model.
- 5.0: bartMachine models using no methodical procedure.
    - .1: Fitting a collection of naive BART models for HbA1c outcome using routine clinical variables / all variables / propensity scores, alternating between them.
- 6.0: Comparing models.
    - .1: Collection of plots comparing naive BART models in 5.0.
    - .2: Collection of plots comparing bcf and bartMachine with the same variables. Head-to-head comparisons of treatment effect for 3.2 vs 5.1 model 1 (Complete/Routine).
    - .3: Collection of plots comparing models 4.1-4.4.
    - .4: Differential treatment effect.
- 7.0: Sensitivity analysis.
    - .1: Exclusion of GLP1 patients before 2013.
    - .2: Variable importance for model 4.4.
    - .3: Modelling predicted treatment effect against model variables.
- 8.0: Presentations/Slides
    - .1: MRC: 29th September London
    - .2: SGLT2 vs GLP1 paper for publication
- 9.0: Shiny App
    - .1: Model 4.4 - probability of achieving target HbA1c.
- 10.0: Modelling accompanying data
    - .1: Weight reduction
    - .2: Discontinuation
(Developed in CPRD: Aurum download)
- 11.00: Aurum download modelling
    - .01: Functions used specifically for this portion.
    - .02: Detailed explanation of the selection of cohorts.
    - .03: Descriptive analysis of datasets.
    - .04: Propensity score model.
    - .05: Model heterogeneity.
    - .06: Risks/Benefits: HbA1c change, weight change, eGFR change, discontinuation, CVD/HF/CKD outcomes, microvascular complications.
    - .07: Differential treatment effects.
    - .08: Paper plots.
        - .1: Main plots of paper
        - .2: Supplementary plots of paper
        - .3: Plots for DUK.
    - .09: Comparison of SGLT2vsGLP1 BCF model to SGLT2vsDPP4 linear model (John Dennis).
    - .10: Validation of treatment effects splitting by ethnicity.
    - .11: Validation of the excluded individuals that were prescribed semaglutide.
    - .12: Validation of treatment effects in those insulin treated.
    - .13: Validation of treatment effects in those with/without baseline CVD.