
Transform a start-stop dataset into the long-format
Given a data.table-like object in the start-stop format, this function returns a data.table in the long-format.
Usage
start_stop2long(data, id, events=NULL, start="start",
                stop="stop", fill_gaps=FALSE,
                include_last_t=FALSE, time_name="time",
                ...)
Arguments
- data
A data.table-like object including at least three columns: id (the unique case identifier), start (the beginning of the time-interval) and stop (the end of the time-interval). May also be any object that can be coerced to a data.table, such as a data.frame or a tibble. Intervals should be right-open (coded as [start, stop)), so the coded values overlap: the stop value of one interval may equal the start value of the next. May contain any number of additional columns.
- id
A single character string naming the column in data that contains the unique case identifier.
- events
Either NULL (default) or a character vector specifying variable names in data. The columns specified by this argument should be logical and are considered to be event indicators, meaning that they are not coded as time-varying variables. Instead, they should be coded as occurring exactly on stop and have no duration themselves. In the long-format output, these columns will only be TRUE at the time of occurrence, not during the interval in which they were coded.
- start
A single character string naming the column in data that contains the beginning of a time-interval. Defaults to "start".
- stop
A single character string naming the column in data that contains the end of a time-interval. Defaults to "stop".
- fill_gaps
Either TRUE or FALSE (default), specifying whether intervals that are missing from data should still be present in the output. If set to TRUE, the fill_gaps_start_stop function is called on the input data first.
- include_last_t
Either TRUE or FALSE (default), specifying whether to include the last value of stop per id in the output. Whether this is appropriate depends on how the intervals are coded.
- time_name
A single character string, specifying the name of the "time" column in the output.
- ...
Further arguments passed to fill_gaps_start_stop if fill_gaps=TRUE, ignored otherwise.
Value
Returns a single data.table containing the long-format data. The start and stop columns from the input are replaced by a single time_name column.
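As a rough illustration of what the long-format means, the expansion of right-open intervals into one row per time point can be sketched with plain data.table. This is only a sketch that assumes integer-valued times; it is not the implementation used by start_stop2long, and the toy data as well as the names d, n_rep and long are made up for the illustration.

```r
library(data.table)

# hypothetical toy data in the start-stop format
d <- data.table(id=c(1, 1, 2),
                start=c(0, 2, 0),
                stop=c(2, 5, 3),
                A=c("x", "y", "z"))

# number of time points covered by each right-open interval [start, stop)
n_rep <- d$stop - d$start

# repeat every input row once per covered time point
long <- d[rep(seq_len(nrow(d)), times=n_rep)]

# within each original interval, count upwards from its start value
long[, time := start + seq_len(.N) - 1, by=.(id, start)]

# drop the interval columns, leaving a single time column
long[, c("start", "stop") := NULL]
setkey(long, id, time)
```

In this sketch the row for id 1 at time 2 carries A = "y", because time 2 falls into [2, 5) rather than [0, 2), and the final stop value of each id is not part of the result.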
Examples
library(MatchTime)
library(data.table)
# define some example start-stop data
data <- data.table(id=c(1, 1, 1, 2, 2, 3),
                   start=c(0, 14, 26, 0, 18, 0),
                   stop=c(14, 26, 30, 18, 32, 51),
                   A=c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6),
                   B=c(1L, 1L, 2L, 3L, 5L, 6L),
                   C=c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
                   D=c("A", "B", "C", "D", "E", "F"))
# transform to long-format
out <- start_stop2long(data, id="id")
head(out)
#> Key: <id, time>
#>       id     A     B      C      D  time
#>    <num> <num> <int> <lgcl> <char> <int>
#> 1:     1   0.1     1   TRUE      A     0
#> 2:     1   0.1     1   TRUE      A     1
#> 3:     1   0.1     1   TRUE      A     2
#> 4:     1   0.1     1   TRUE      A     3
#> 5:     1   0.1     1   TRUE      A     4
#> 6:     1   0.1     1   TRUE      A     5
# if C was coded as an event instead, we would want to use:
out <- start_stop2long(data, id="id", events="C")
head(out)
#> Key: <id, time>
#>       id  time     A     B      D      C
#>    <num> <int> <num> <int> <char> <lgcl>
#> 1:     1     0   0.1     1      A  FALSE
#> 2:     1     1   0.1     1      A  FALSE
#> 3:     1     2   0.1     1      A  FALSE
#> 4:     1     3   0.1     1      A  FALSE
#> 5:     1     4   0.1     1      A  FALSE
#> 6:     1     5   0.1     1      A  FALSE
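The include_last_t and time_name arguments can be used in the same way. The call below is a sketch based only on the argument descriptions on this page; the column name "day" is an arbitrary choice for illustration, and no output is printed here because it has not been verified against the package.

```r
library(MatchTime)
library(data.table)

# same example start-stop data as above
data <- data.table(id=c(1, 1, 1, 2, 2, 3),
                   start=c(0, 14, 26, 0, 18, 0),
                   stop=c(14, 26, 30, 18, 32, 51),
                   A=c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6),
                   B=c(1L, 1L, 2L, 3L, 5L, 6L),
                   C=c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
                   D=c("A", "B", "C", "D", "E", "F"))

# also keep the last stop value per id and call the time column "day"
# ("day" is an arbitrary, illustrative name)
out <- start_stop2long(data, id="id", include_last_t=TRUE,
                       time_name="day")

# list the columns of the result
colnames(out)
```

Whether include_last_t=TRUE is the right choice depends on how the intervals in the input were coded, as noted in the argument description.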