The two data classes id_tbl and ts_tbl, used by ricu to represent ICU patient data, consist of a data.table alongside some metadata. This metadata marks columns that have special meaning and, for data representing measurements ordered in time, records the step size. The following utility functions can be used to extract columns and column names with special meaning, and to query a ts_tbl object for its time-series-related metadata.

id_vars(x)

id_var(x)

id_col(x)

index_var(x)

index_col(x)

dur_var(x)

dur_col(x)

dur_unit(x)

meta_vars(x)

data_vars(x)

data_var(x)

data_col(x)

interval(x)

time_unit(x)

time_step(x)

time_vars(x)

Arguments

x

Object to query

Value

Mostly column names as character vectors: id_var(), index_var(), data_var() and time_unit() return vectors of length 1, the others vectors of variable length. Functions id_col(), index_col() and data_col() return table columns as vectors, while interval() returns a scalar-valued difftime object and time_step() a number.
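
How a scalar difftime, its unit string and its numeric value relate can be sketched in base R alone, without ricu. This is only an illustration: the variable ivl is a made-up stand-in for the kind of object interval() returns, and the snippet does not claim to show how ricu computes these values internally.

```r
# Hypothetical stand-in for the scalar difftime returned by interval()
ivl <- as.difftime(1, units = "hours")

# The unit string of the difftime, the kind of value time_unit() reports
unit <- units(ivl)

# The bare number in that unit, the kind of value time_step() reports
step <- as.numeric(ivl)

# Changing the unit rescales the number: 1 hour corresponds to 60 mins
units(ivl) <- "mins"
step_mins <- as.numeric(ivl)
```

The step size is only meaningful together with its unit, which is why the two are reported by separate functions.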

Details

The following functions can be used to query an object for columns or column names that represent a distinct aspect of the data:

  • id_vars(): ID variables are one or more column names with the interaction of corresponding columns identifying a grouping of the data. Most commonly this is some sort of patient identifier.

  • id_var(): This function either fails or returns a string and can therefore be used in case only a single column provides grouping information.

  • id_col(): Again, in case only a single column provides grouping information, this column can be extracted using this function.

  • index_var(): A column suitable for use as index variable encodes a temporal ordering of observations as a difftime vector. Only a single column can be marked as index variable, and this function queries a ts_tbl object for its name.

  • index_col(): Similarly to id_col(), this function extracts the column with the given designation. As a ts_tbl object is required to have exactly one column marked as index, this function always succeeds for ts_tbl objects (and fails for id_tbl objects).

  • dur_var(): For win_tbl objects, this returns the name of the column encoding the data validity interval.

  • dur_col(): Similarly to index_col(), this returns the difftime vector corresponding to the dur_var().

  • meta_vars(): For ts_tbl objects, meta variables represent the union of ID and index variables (for win_tbl, this also includes the dur_var()), while for id_tbl objects meta variables consist of ID variables.

  • data_vars(): Data variables on the other hand are all columns that are not meta variables.

  • data_var(): Similarly to id_var(), this function either returns the name of a single data variable or fails.

  • data_col(): Building on data_var(), this function returns the data column if only a single data variable is present and throws an error if multiple data columns exist.

  • time_vars(): Time variables are all columns in an object inheriting from data.frame that are of type difftime. Therefore, in a ts_tbl object the index column is one of (potentially) several time variables. For a win_tbl, however, the dur_var() is not among the time_vars().

  • interval(): The time series interval length, represented as a scalar-valued difftime object.

  • time_unit(): The time unit of the time series interval, represented by a string such as "hours" or "mins" (see difftime).

  • time_step(): The time series step size represented by a numeric value in the unit as returned by time_unit().
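
The split into meta and data variables described above can be checked on a small ts_tbl. The following is a minimal sketch that reuses the constructor call from the Examples section, with the ID column named explicitly; the helper names meta, dat, covers_all and no_overlap are introduced here purely for illustration.

```r
library(ricu)

# Small ts_tbl: "a" is the ID column, "b" (difftime) serves as index
tbl <- ts_tbl(a = rep(1:2, each = 5), b = hours(rep(1:5, 2)), c = rnorm(10),
              id_vars = "a")

meta <- meta_vars(tbl)  # ID and index variables
dat  <- data_vars(tbl)  # all remaining columns

# Together, the two sets should cover every column, without overlap
covers_all <- setequal(c(meta, dat), colnames(tbl))
no_overlap <- length(intersect(meta, dat)) == 0L
```

For an id_tbl the same check should hold, with meta containing only the ID variables.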

Examples

tbl <- id_tbl(a = rep(1:2, each = 5), b = rep(1:5, 2), c = rnorm(10),
              id_vars = c("a", "b"))

id_vars(tbl)
#> [1] "a" "b"
tryCatch(id_col(tbl), error = function(...) "no luck")
#> [1] "no luck"
data_vars(tbl)
#> [1] "c"
data_col(tbl)
#>  [1] -1.67729715  3.20206629  1.19129155  0.82700078  0.67559457 -0.50015815
#>  [7] -1.01176763  2.06611666  0.92338063  0.01363021

tmp <- as_id_tbl(tbl, id_vars = "a")
id_vars(tmp)
#> [1] "a"
id_col(tmp)
#>  [1] 1 1 1 1 1 2 2 2 2 2

tbl <- ts_tbl(a = rep(1:2, each = 5), b = hours(rep(1:5, 2)), c = rnorm(10))
index_var(tbl)
#> [1] "b"
index_col(tbl)
#> Time differences in hours
#>  [1] 1 2 3 4 5 1 2 3 4 5

identical(index_var(tbl), time_vars(tbl))
#> [1] TRUE

interval(tbl)
#> Time difference of 1 hours
time_unit(tbl)
#> [1] "hours"
time_step(tbl)
#> [1] 1