Add an indicator of whether an event was occurring at baseline to a match_time object

Some relevant events may occur before the time at which a case was included during matching with the match_time function. Such an event might be an unrelated treatment or any other time-dependent variable with a duration. An indicator of whether this type of "event" was ongoing at the inclusion time might be required for further confounder adjustment or other analyses. If these variables are already included in the start-stop data supplied to match_time, they will be included in the matched data automatically, in which case this function is not needed. Otherwise, this function offers an easy and fast way to add such an indicator to the matched data.

Usage

add_previous_event(x, data, id=x$id, time=x$time,
                   duration, include_same_t=FALSE,
                   units="auto", name=".prev_event")

Arguments

x: A match_time object created using the match_time function.
data: A data.table-like object including exactly two columns: id (the unique case identifier) and time (the time at which an "event" occurred). May also be any object that can be coerced to a data.table, such as a data.frame or a tibble. If multiple events per person exist, they should be supplied in long format (multiple rows per id). If any of the supplied events is ongoing at the inclusion time, the added indicator will be TRUE, otherwise it will be FALSE. Whether an event is still ongoing is defined by the duration argument.
id: A single character string naming the column in data that contains the unique case identifier. By default, the same name that was used in the original match_time call is used here.
time: A single character string naming the column in data that contains the event times. By default, the same name that was used in the original match_time call is used here.
duration: A single positive number or other scalar value specifying the duration of the events listed in data. If the time is a Date object or something similar, this duration should usually be given in days. Internally, whether an event lies within the duration is determined using the + S3 method, which defaults to days for Date objects. Users may use functions from packages like lubridate to use other units by specifying this argument accordingly.
units: Corresponds to the argument of the same name in the difftime function. This argument is only used when the time column corresponds to a Date (or similar) variable. It should be used to indicate the time-scale of the duration (seconds, days, years, ...).
name: A single character string specifying the name of the indicator column that will be added to the data object contained in x. Defaults to .prev_event. If a column with this name is already present, an error is raised instead.
include_same_t: Either TRUE or FALSE (default), specifying whether the time of inclusion (.treat_time in x$data) should be included when adding the indicator. If TRUE, an event starting or ending exactly at the time of inclusion is considered a previous event, so the added indicator is TRUE. If FALSE, an event starting or ending exactly at the time of inclusion is not considered a previous event.
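
The interplay of duration and include_same_t described above can be illustrated with a small base R sketch. The helper is_ongoing() is hypothetical and not part of MatchTime. It only mirrors the documented rule (an event counts if the inclusion time falls between its start and its start plus duration, with the boundaries included only if include_same_t=TRUE) and makes no claim about how add_previous_event() is implemented internally.

```r
# Hypothetical helper, NOT part of MatchTime: checks whether any event is
# ongoing at a single inclusion time, following the documented rule only.
is_ongoing <- function(event_times, inclusion_time, duration,
                       include_same_t = FALSE) {
  # for Date objects, `+` shifts the date by `duration` days
  event_end <- event_times + duration
  if (include_same_t) {
    # events starting or ending exactly at the inclusion time count
    any(event_times <= inclusion_time & inclusion_time <= event_end)
  } else {
    # events starting or ending exactly at the inclusion time do not count
    any(event_times < inclusion_time & inclusion_time < event_end)
  }
}

# numeric time scale: events at t = 5 and t = 40, each lasting 20 time units
events <- c(5, 40)
inside   <- is_ongoing(events, inclusion_time = 10, duration = 20)
between  <- is_ongoing(events, inclusion_time = 30, duration = 20)
at_end   <- is_ongoing(events, inclusion_time = 25, duration = 20)
at_end_t <- is_ongoing(events, inclusion_time = 25, duration = 20,
                       include_same_t = TRUE)
print(c(inside = inside, between = between,
        at_end = at_end, at_end_t = at_end_t))

# Date time scale: a numeric duration is interpreted as days
event_dates <- as.Date("2020-01-01")
on_date <- is_ongoing(event_dates, as.Date("2020-01-15"), duration = 20)
print(on_date)
```

Only the inclusion at t = 25 changes with include_same_t, because it coincides exactly with the end of the first event (5 + 20).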

Details

In most cases it is easier and cleaner to just add all variables to the start-stop data supplied to the data argument in match_time, regardless of whether matching should be performed on these variables or not. This way, they will be present in the output data without any further function calls. In some cases, however, the dataset may be too large to allow all variables to be present. This might be the case when the variable changes at many points in time, requiring many rows per id. In these cases it might be necessary to add them later to make the matching process possible.

Value

Returns a modified match_time object. It is essentially the same object as the supplied x, but its data additionally contains one new column, named according to the name argument, which indicates whether an event was ongoing at the inclusion time.

Author

Robin Denz

Examples

library(data.table)
library(MatchTime)

# only execute if packages are available
if (requireNamespace("survival") && requireNamespace("MatchIt")) {

library(survival)
library(MatchIt)

# set random seed to make the output replicable
set.seed(1234)

# load "heart" data from survival package
data("heart")
heart <- heart[, c("id", "start", "stop", "transplant", "age", "surgery")]

# suppose we had an extra dataset with events that looks like this
# NOTE: these are not actual events in the real "heart" data and are merely used
#       for showcasing the functionality of add_previous_event()
d_events <- data.table(id=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                       time=c(5, 2, 12, 39, 2, 665, 675, 4, 1, 23))

# perform nearest neighbor time-dependent matching on "age" and "surgery"
# (plus exact matching on time)
m_obj <- match_time(transplant ~ age + surgery, data=heart, id="id",
                    match_method="nearest")

# add the previous event indicator to match_time object,
# assuming they always have a duration of 20
m_obj <- add_previous_event(m_obj, data=d_events, time="time", duration=20)
head(m_obj$data)
}
#> Warning: glm.fit: algorithm did not converge
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: algorithm did not converge
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Key: <id>
#>       id .id_new .id_pair .treat .treat_time .next_treat_time .fully_matched
#>    <num>   <int>   <char> <lgcl>       <num>            <num>         <lgcl>
#> 1:     1      72       36  FALSE          27               NA           TRUE
#> 2:     3       1        1   TRUE           1               NA           TRUE
#> 3:     4      10        5  FALSE           2               36           TRUE
#> 4:     4      89       45   TRUE          36               NA           TRUE
#> 5:     7      86       43  FALSE          32               51           TRUE
#> 6:     7     105       54   TRUE          51               NA          FALSE
#>    .weights        age surgery .prev_event
#>       <num>      <num>   <num>      <lgcl>
#> 1:        1 -17.155373       0       FALSE
#> 2:        1   6.297057       0       FALSE
#> 3:        1  -7.737166       0       FALSE
#> 4:        1  -7.737166       0       FALSE
#> 5:        1   2.869268       0       FALSE
#> 6:        0   2.869268       0       FALSE
