-
Notifications
You must be signed in to change notification settings - Fork 17
/
Copy path11-tmle3survival.Rmd
144 lines (117 loc) · 4.55 KB
/
11-tmle3survival.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
# One-Step TMLE for Time-to-Event Outcomes
Based on Cai W, van der Laan MJ. One-step targeted maximum likelihood estimation
for time-to-event outcomes [published online ahead of print, 2019 Nov 15].
Biometrics. 2019;10.1111/biom.13172. doi:10.1111/biom.13172
Moore, K. and van der Laan, M.J. (2009) Application of time-to-event methods in
the assessment of safety in clinical trials. Design and Analysis of Clinical
Trials with Time-to-Event Endpoints. London: Taylor & Francis, pp. 455–482.
## Learning Objectives
1. Use tmle3 to estimate counterfactual marginal survival functions
## Introduction
The first step in the estimation procedure is an initial estimate of the
conditional survival function of failure, conditional survival function of
censoring and the propensity scores. For this initial estimation, we use the
super learner [@vdl2007super], as described in previous sections.
With the initial estimate of relevant parts of the data-generating distribution
necessary to evaluate the target parameter, we are ready to construct the
one-step TMLE.
## Example
We'll go through an example to illustrate how to estimate the treatment specific
survival curve using tmle3.
### Load the Data
```{r}
library(tmle3)
library(sl3)
```
```{r}
vet_data <- read.csv(
paste0(
"https://raw.githubusercontent.com/tlverse/deming2019-workshop/",
"master/data/veteran.csv"
)
)
vet_data$trt <- vet_data$trt - 1
# make fewer times for illustration
vet_data$time <- ceiling(vet_data$time / 20)
k_grid <- seq_len(max(vet_data$time))
head(vet_data)
```
The variables in vet_data are * trt: treatment type (0 = standard, 1 = test), *
celltype: histological type of the tumor, * time: time to death or censoring in
days, * status: censoring status, * karno: Karnofsky performance score that
describes overall patient status at the beginning of the study (100 = good), *
diagtime: months from diagnosis to randomization, * age: age of patient in
years, and * prior: prior therapy (0 = no, 10 = yes).
### Specify right-censored survival data arguments
First, we use function tmle_survival to create a tmle_spec specific for the
survival data estimation. To estimate the conditional survival function, we need
to estimate the conditional hazard of the failure event, and then transform it
into the conditional survival function. The definition of the conditional hazard
is
\[\lambda_{N}(t|A, W)=P(N(t)=1|N(t-1)=0, A_c(t-1)=0, A, W)\]
where $N(t)=I(\widetilde{T}\le t, \Delta = 1), A_c(t)=I(\widetilde{T}\le t,
\Delta = 0)$. We will transform the original data into the required long format
by calling transform_data function of the survival_spec. You will need to
speficy W (covariates), A (treatment/intervention), T_tilde (time-to-event
vector), Delta (censoring indicator vector), and row ids for the nodelist.
```{r}
var_types <- list(
T_tilde = Variable_Type$new("continuous"),
t = Variable_Type$new("continuous"),
Delta = Variable_Type$new("binomial")
)
survival_spec <- tmle_survival(
treatment_level = 1, control_level = 0,
target_times = intersect(1:10, k_grid),
variable_types = var_types
)
node_list <- list(
W = c("celltype", "karno", "diagtime", "age", "prior"),
A = "trt", T_tilde = "time", Delta = "status", id = "X"
)
long_data_tuple <- survival_spec$transform_data(vet_data, node_list)
df_long <- long_data_tuple$long_data
long_node_list <- long_data_tuple$long_node_list
```
### Obtain initial estimates
We define the learners and tmle_task as described in chapter 5.
```{r}
lrnr_mean <- make_learner(Lrnr_mean)
lrnr_glm <- make_learner(Lrnr_glm)
lrnr_gam <- make_learner(Lrnr_gam)
sl_A <- Lrnr_sl$new(learners = list(lrnr_mean, lrnr_glm, lrnr_gam))
learner_list <- list(A = sl_A, N = sl_A, A_c = sl_A)
```
```{r}
tmle_task <- survival_spec$make_tmle_task(df_long, long_node_list)
initial_likelihood <- survival_spec$make_initial_likelihood(
tmle_task,
learner_list
)
```
### Perform TMLE adjustment of the initial conditional survival estimate
For the updater, we create a tmle3_Update_survival object. By setting the
fit_method to l2, it would execute the one-step TMLE using a logistic ridge
regression.
```{r}
up <- tmle3_Update_survival$new(
maxit = 3e1,
cvtmle = TRUE,
convergence_type = "scaled_var",
delta_epsilon = 1e-2,
fit_method = "l2",
use_best = TRUE,
verbose = FALSE
)
targeted_likelihood <- Targeted_Likelihood$new(initial_likelihood,
updater = up
)
tmle_params <- survival_spec$make_params(tmle_task, targeted_likelihood)
tmle_fit_manual <- fit_tmle3(
tmle_task, targeted_likelihood, tmle_params,
targeted_likelihood$updater
)
```
```{r}
print(tmle_fit_manual)
```