A function to simulate time-to-event data with multiple competing causes of failure and one or multiple confounders. The user can specify both the relationship between the covariates and the cause-specific survival time and the relationship between the covariates and the treatment assignment probability. Random censoring based on a custom function may also be introduced. Can be used for simulation studies or to showcase the usage of the adjusted CIF methodology presented in this package.

Usage

sim_confounded_crisk(n=500, lcovars=NULL, outcome_betas=NULL,
                     group_beta=c(1, 0), gamma=c(1.8, 1.8),
                     lambda=c(2, 2), treatment_betas=NULL,
                     intercept=-0.5, gtol=0.001,
                     cens_fun=function(n){stats::rweibull(n, 1, 2)},
                     cens_args=list(), max_t=1.7)

Arguments

n

An integer specifying the sample size of the simulated data set.

lcovars

A named list to specify covariates. Each list element should be a vector containing information on the desired covariate distribution. See details.

outcome_betas

A list of numeric vectors of beta coefficients for the cause-specific time-to-event outcome. The list has to be of the same length as the lcovars list and every entry of it has to be a numeric vector with one entry for each cause of failure. See details.

group_beta

A numeric vector specifying the beta coefficients of the grouping variable on the cause-specific survival time. Should contain one entry for every cause of failure.

gamma

A numeric parameter for the simulation of the survival time using a Weibull distribution. See details.

lambda

A numeric parameter for the simulation of the survival time using a Weibull distribution. See details.

treatment_betas

A named numeric vector of beta coefficients for the treatment assignment model.

intercept

The intercept of the treatment assignment model.

gtol

Tolerance at which estimated treatment assignment probabilities are truncated.

cens_fun

A function to generate censoring times. The function needs to take at least one argument called n. Additional arguments are allowed and can be supplied using the cens_args argument.

cens_args

A list of named arguments passed to cens_fun.

max_t

A number denoting the maximum follow-up time. Every event time larger than this threshold is censored. In contrast to the single-event survival simulation, this value has to be supplied because it is used in the numerical inversion step. Theoretically Inf can be used, but this might not work in practice.

Details

The simulation of the confounded competing risks data has five main steps: (1) generating the covariates, (2) assigning the treatment variable, (3) generating a cause-specific survival time, (4) generating the corresponding cause of failure, and (5) introducing censoring.

First, covariates are generated by taking n independent random samples from the distributions defined in lcovars.
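The covariate step can be illustrated in base R. This is only a minimal sketch, assuming each lcovars entry holds a distribution function name followed by its parameters (as in the Examples section); the helper draw_covariate is hypothetical and is not part of the package.

```r
set.seed(1)
n <- 5

# Covariate specifications in the style of the Examples section:
# the function name comes first, then its parameters (stored as character)
lcovars <- list(x1=c("rbinom", 1, 0.3),
                x4=c("rnorm", 0, 1))

# Hypothetical helper: draw n values from one specification
draw_covariate <- function(spec, n) {
  params <- as.list(as.numeric(spec[-1]))
  do.call(spec[1], c(list(n), params))
}

# One independent column per covariate
covars <- as.data.frame(lapply(lcovars, draw_covariate, n=n))
```

Each column is drawn separately, so the covariates in this sketch are independent of each other.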

In the second step, the generated covariates are used to estimate the probability of receiving treatment (the propensity score) for each simulated person in the dataset. This is done using a logistic regression model, with the values in treatment_betas as coefficients and intercept as the intercept. By changing the intercept, the user can vary the proportion of cases that end up in each treatment group on average. The estimated probabilities are then used to generate the treatment variable ("group"), making the treatment assignment dependent on the covariates.
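A minimal base R sketch of this step is shown here. It assumes that gtol truncates the probabilities to the interval [gtol, 1 - gtol]; the variable names lin_pred and ps are illustrative and the package may implement the step differently.

```r
set.seed(2)
n <- 1000

# Two illustrative covariates and their treatment coefficients
covars <- data.frame(x2=rbinom(n, 1, 0.7), x6=rnorm(n, 0, 0.9))
treatment_betas <- c(x2=log(3), x6=log(1.4))
intercept <- -0.5
gtol <- 0.001

# Linear predictor of the logistic model
lin_pred <- intercept +
  as.matrix(covars[, names(treatment_betas)]) %*% treatment_betas

# Propensity score via the inverse logit
ps <- as.vector(plogis(lin_pred))

# Assumed meaning of gtol: keep probabilities away from 0 and 1
ps <- pmin(pmax(ps, gtol), 1 - gtol)

# Treatment assignment depends on the covariates through ps
group <- rbinom(n, 1, ps)
```

Lowering intercept in this sketch shifts all propensity scores down, so fewer simulated individuals end up with group equal to 1.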

Next, survival times are generated based on the method described in Beyersmann et al. (2009) using the causal coefficients defined in outcome_betas and group_beta. After a survival time has been generated, a corresponding cause of failure is drawn from a multinomial distribution with probabilities defined by the all-cause hazard and the cause-specific hazards. More details can be found in the cited literature. Both the independently generated covariates and the covariate-dependent treatment variable are used in this step. This introduces confounding.
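The general idea of this approach can be sketched in base R. The Weibull form of the cumulative cause-specific hazard used here, lambda * t^gamma * exp(linear predictor), is an assumption made for illustration and may differ from the parameterization used internally. Only the group effect is included to keep the sketch short.

```r
set.seed(3)
n <- 200
gamma <- c(1.8, 1.8)
lambda <- c(2, 2)
group_beta <- c(1, 0)
max_t <- 1.7
group <- rbinom(n, 1, 0.5)

# Hazard multiplier per person (rows) and cause (columns)
mult <- exp(outer(group, group_beta))

# Assumed Weibull hazards: all-cause cumulative hazard and
# cause-specific hazards for person i at time t
cum_haz_all <- function(t, i) sum(lambda * t^gamma * mult[i, ])
haz_cause <- function(t, i) lambda * gamma * t^(gamma - 1) * mult[i, ]

time <- numeric(n)
cause <- integer(n)
for (i in seq_len(n)) {
  # Inverse transform: solve cum_haz_all(t) = -log(U) for t
  target <- -log(runif(1))
  if (cum_haz_all(max_t, i) < target) {
    # No event before max_t: administratively censored
    time[i] <- max_t
    cause[i] <- 0L
  } else {
    # Numerical inversion on the bounded interval [0, max_t]
    time[i] <- uniroot(function(t) cum_haz_all(t, i) - target,
                       lower=0, upper=max_t, tol=1e-8)$root
    # Cause drawn with probability hazard_k / all-cause hazard
    h <- haz_cause(time[i], i)
    cause[i] <- sample.int(2, size=1, prob=h / sum(h))
  }
}
```

The root search needs a finite upper bound, which is why a finite max_t is required in this kind of numerical inversion.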

Independent right-censoring is introduced by taking n independent random draws from some distribution defined by cens_fun and censoring every individual whose censoring time is smaller than its simulated survival time. The whole process is based on work from Chatton et al. (2020).
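The censoring step can be sketched as follows. The stand-in survival times, the coding of censored observations as event 0, and the way cens_args is combined with n are assumptions made for illustration only.

```r
set.seed(4)
n <- 10

# Stand-ins for the simulated survival times and causes of failure
surv_time <- rweibull(n, shape=1.8, scale=0.7)
cause <- sample(1:2, n, replace=TRUE)

# A censoring function with extra arguments supplied through a list
cens_fun <- function(n, shape, scale) stats::rweibull(n, shape, scale)
cens_args <- list(shape=1, scale=2)

# Draw n independent censoring times
cens_time <- do.call(cens_fun, c(list(n=n), cens_args))

# Censor everyone whose censoring time precedes the event time
censored <- cens_time < surv_time
time <- pmin(surv_time, cens_time)
event <- ifelse(censored, 0L, cause)
```

Because cens_time is drawn without reference to the covariates or the treatment, the censoring in this sketch is independent of the event process.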

The function currently supports only binary treatments and does not allow dependent censoring.

Value

Returns a data.frame object containing the simulated covariates, the event indicator ("event"), the survival/censoring time ("time") and the group variable ("group").

References

Jan Beyersmann, Aurélien Latouche, Anika Buchholz, and Martin Schumacher (2009). "Simulating Competing Risks Data in Survival Analysis". In: Statistics in Medicine 28, pp. 956-971

D. Morina and A. Navarro (2017). "Competing Risks Simulation with the survsim R Package". In: Communications in Statistics: Simulation and Computation 46.7, pp. 5712-5722

Arthur Chatton, Florent Le Borgne, Clémence Leyrat, and Yohann Foucher (2020). G-Computation and Inverse Probability Weighting for Time-To-Event Outcomes: A Comparative Study. arXiv:2006.16859v1

Author

The code for steps (3) and (4) described in the details was taken from the survsim R package, written by David Morina Soler (with slight modifications). The rest of the function was written by Robin Denz.

Examples

library(adjustedCurves)

set.seed(42)

# simulate data with default values
sim_dat <- sim_confounded_crisk(n=10)

# set group betas to 0
sim_dat <- sim_confounded_crisk(n=10, group_beta=c(0, 0))

# set some custom values
outcome_betas <- list(c(0.03, 0.4),
                      c(1.1, 0.8),
                      c(0, 0),
                      c(-0.2, -0.4),
                      c(log(1.3), log(1.3)/3),
                      c(0, 0))

treatment_betas <- c(x1=0, x2=log(3), x3=log(1.2),
                     x4=0, x5=log(1.1), x6=log(1.4))

lcovars <- list(x1=c("rbinom", 1, 0.3),
                x2=c("rbinom", 1, 0.7),
                x3=c("rbinom", 1, 0.5),
                x4=c("rnorm", 0, 1),
                x5=c("rnorm", 0, 1.1),
                x6=c("rnorm", 0, 0.9))

sim_dat <- sim_confounded_crisk(n=10,
                                treatment_betas=treatment_betas,
                                outcome_betas=outcome_betas,
                                lcovars=lcovars)