
Add an indicator of whether an event was occurring at baseline to a match_time object
add_previous_event.Rd
In some cases, relevant events occur before the time at which a case was included during matching with the match_time function. Such an "event" might be an unrelated treatment or any other time-dependent variable with a duration. An indicator of whether this type of event was ongoing at the inclusion time might be required for further confounder adjustment or other analyses. If these variables are already included in the start-stop data supplied to match_time, they will be included in the matched data automatically, and this function is not needed. Otherwise, this function offers an easy and fast way to add such an indicator to the matched data.
Usage
add_previous_event(x, data, id=x$id, time=x$time,
duration, include_same_t=FALSE,
units="auto", name=".prev_event")
Arguments
- x
A match_time object created using the match_time function.
- data
A data.table-like object including exactly two columns: id (the unique case identifier) and time (the time at which an "event" occurred). May also be any object that can be coerced to a data.table, such as a data.frame or a tibble. If multiple events per person exist, they should be included in long format (multiple rows per id). If any of the supplied events is ongoing at inclusion time, the added indicator will be TRUE, otherwise it will be FALSE. Whether an event is still ongoing is defined by the duration argument.
- id
A single character string specifying the column in data that contains the unique case identifier. By default, the same name that was used in the original match_time call is used here.
- time
A single character string specifying the column in data that contains the event times. By default, the same name that was used in the original match_time call is used here.
- duration
A single positive number or other scalar value specifying the duration of the events listed in data. If the time is a Date object or something similar, this duration should usually be given in days. Internally, whether an event lies within the duration is determined using the + S3 method, which defaults to days for Date objects. Users may use functions from packages like lubridate to use other units by specifying this argument accordingly.
- units
Corresponds to the argument of the same name in the difftime function. This argument is only used when the time column is a Date (or similar) variable. It should be used to indicate the time scale of the duration (seconds, days, years, ...).
- name
A single character string specifying the name of the indicator column that will be added to the data object contained in x. Defaults to .prev_event. If the name is already present, an error message is returned instead.
- include_same_t
Either TRUE or FALSE (default), specifying whether the time of inclusion (.treat_time in x$data) should be included when adding the previous event indicator. If TRUE, an event starting or ending exactly at the time of inclusion will be considered a previous event, resulting in the added indicator being TRUE. If FALSE, an event starting or ending exactly at the time of inclusion will not be considered a previous event.
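The interplay of time, duration and include_same_t described above can be illustrated with a small base R sketch. The helper is_ongoing() below is hypothetical and not part of MatchTime. It only mirrors the rule as stated in the argument descriptions (an event runs from its time to its time plus duration, and the boundaries count only if include_same_t=TRUE), and it ignores the units argument, so the package's internal handling may differ.

```r
# Hypothetical helper (NOT part of MatchTime): TRUE if an event starting at
# `event_time` and lasting `duration` is ongoing at `inclusion_time`.
is_ongoing <- function(event_time, inclusion_time, duration,
                       include_same_t = FALSE) {
  event_end <- event_time + duration
  if (include_same_t) {
    # events starting or ending exactly at inclusion count as ongoing
    event_time <= inclusion_time & inclusion_time <= event_end
  } else {
    # boundaries are excluded
    event_time < inclusion_time & inclusion_time < event_end
  }
}

# numeric time scale: event at t = 5 lasting 20 time units (ends at t = 25)
is_ongoing(5, 10, 20)                         # TRUE, inside the interval
is_ongoing(5, 25, 20)                         # FALSE, ends exactly at inclusion
is_ongoing(5, 25, 20, include_same_t = TRUE)  # TRUE, boundary now counts

# Date time scale: adding a plain number to a Date adds that many days
is_ongoing(as.Date("2020-01-01"), as.Date("2020-01-15"), 20)  # TRUE
```

With several events per id, the indicator described above would correspond to applying such a check to every event of that id and combining the results with any().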
Details
In most cases it is easier and cleaner to just add all variables to the start-stop data supplied to the data argument in match_time, regardless of whether matching should be performed on these variables or not. This way, they will be present in the output data without any further function calls. In some cases, however, the dataset may be too large to allow all variables to be present. This might be the case when the variable changes at many points in time, requiring many rows per id. In these cases it might be necessary to add them later to make the matching process possible.
Value
Returns a modified match_time object. It is essentially the same object as the supplied x, but it also contains a new column: name (the indicator of whether an event was ongoing at inclusion time).
Examples
library(data.table)
library(MatchTime)
# only execute if packages are available
if (requireNamespace("survival") && requireNamespace("MatchIt")) {
  library(survival)
  library(MatchIt)
  # set random seed to make the output replicable
  set.seed(1234)
  # load "heart" data from survival package
  data("heart")
  heart <- heart[, c("id", "start", "stop", "transplant", "age", "surgery")]
  # suppose we had an extra dataset with events that looks like this
  # NOTE: these are not actual events in the real "heart" data and are merely
  # used for showcasing the functionality of add_previous_event()
  d_events <- data.table(id=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                         time=c(5, 2, 12, 39, 2, 665, 675, 4, 1, 23))
  # perform nearest neighbor time-dependent matching on "age" and "surgery"
  # (plus exact matching on time)
  m_obj <- match_time(transplant ~ age + surgery, data=heart, id="id",
                      match_method="nearest")
  # add the previous event indicator to match_time object,
  # assuming they always have a duration of 20
  m_obj <- add_previous_event(m_obj, data=d_events, time="time", duration=20)
  head(m_obj$data)
}
#> Warning: glm.fit: algorithm did not converge
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: algorithm did not converge
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Key: <id>
#>       id .id_new .id_pair .treat .treat_time .next_treat_time .fully_matched
#>    <num>   <int>   <char> <lgcl>       <num>            <num>         <lgcl>
#> 1:     1      72       36  FALSE          27               NA           TRUE
#> 2:     3       1        1   TRUE           1               NA           TRUE
#> 3:     4      10        5  FALSE           2               36           TRUE
#> 4:     4      89       45   TRUE          36               NA           TRUE
#> 5:     7      86       43  FALSE          32               51           TRUE
#> 6:     7     105       54   TRUE          51               NA          FALSE
#>    .weights        age surgery .prev_event
#>       <num>      <num>   <num>      <lgcl>
#> 1:        1 -17.155373       0       FALSE
#> 2:        1   6.297057       0       FALSE
#> 3:        1  -7.737166       0       FALSE
#> 4:        1  -7.737166       0       FALSE
#> 5:        1   2.869268       0       FALSE
#> 6:        0   2.869268       0       FALSE
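The six .prev_event values printed above can be checked by hand. The base R sketch below copies the ids and .treat_time values from the printed rows and the matching event times from d_events, then applies the rule described for duration with the default include_same_t=FALSE. The object names shown, events and chk are illustrative only, and the comparison is an assumption about the rule rather than the package's internal code.

```r
# ids and inclusion times (.treat_time) copied from the six printed rows
shown <- data.frame(id = c(1, 3, 4, 4, 7, 7),
                    treat_time = c(27, 1, 2, 36, 32, 51))

# event times for the same ids, copied from d_events in the example
events <- data.frame(id = c(1, 3, 4, 7),
                     event_time = c(5, 12, 39, 675))

duration <- 20

# attach each id's event time to its inclusion times
chk <- merge(shown, events, by = "id")

# ongoing if the event started before inclusion and had not yet ended
chk$ongoing <- chk$event_time < chk$treat_time &
  chk$treat_time < chk$event_time + duration

chk
```

For id 1 the event runs from 5 to 25 and has ended before inclusion at 27. For ids 3, 4 and 7 the event starts only after every listed inclusion time. All six checks are therefore FALSE, in agreement with the printed column.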