During concept loading, item callback functions handle item-specific post-processing steps, such as converting measurement units, mapping one set of values to another, or performing more involved data transformations, like turning absolute drug administration rates into rates relative to body weight. Item callback functions are called by load_concepts() with arguments x (the data), a variable number of name/string pairs specifying roles of columns for the given item, followed by env, the data source environment as a src_env object. Item callback functions can be specified by name or by using function factories such as transform_fun(), apply_map() or convert_unit().

transform_fun(fun, ...)

binary_op(op, y)

comp_na(op, y)

set_val(val)

apply_map(map, var = "val_var")

convert_unit(fun, new, rgx = NULL, ignore_case = TRUE, ...)

Arguments

fun

Function(s) used for transforming matching values

...

Further arguments passed to downstream function

op

Function taking two arguments, such as +

y

Value passed as second argument to function op

val

Value to replace every element of x with

map

Named atomic vector used for mapping a set of values (the names of map) to a different set (the values of map)

var

Argument which is used to determine the column the mapping is applied to

new

Name(s) of transformed units

rgx

Regular expression(s) used for identifying observations based on their current unit of measurement, NULL means everything

ignore_case

Forwarded to base::grep()

Value

Callback function factories such as transform_fun(), apply_map() or convert_unit() return functions suitable as item callback functions, while transform function generators such as binary_op() and comp_na() return functions that apply a transformation to a vector.

Details

The most straightforward setting is where a function is simply referred to by its name. For example, in eICU, age is available as a character vector because ages 90 and above are represented by the string "> 89". A function such as the following turns this into a numeric vector, replacing occurrences of "> 89" with the number 90.

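eicu_age <- function(x, val_var, ...) {
  data.table::set(
    # inner call: replace "> 89" entries by 90 (by reference)
    data.table::set(x, which(x[[val_var]] == "> 89"), j = val_var,
                    value = 90),
    # outer call: convert the entire column to numeric
    j = val_var,
    value = as.numeric(x[[val_var]])
  )
}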

This function is then specified as the item callback function for items corresponding to eICU data sources of the age concept:

item(src = "eicu_demo", table = "patient", val_var = "age",
     callback = "eicu_age", class = "col_itm")

The string passed as callback argument is evaluated, meaning that an expression can be passed which evaluates to a function that in turn can be used as callback. Several function factories are provided which return functions suitable for use as item callbacks: transform_fun() creates a function that transforms the val_var column using the function supplied as fun argument, apply_map() can be used to map one set of values to another (again using the val_var column) and convert_unit() is intended for converting a subset of rows (identified by matching rgx against the unit_var column) by applying fun to the val_var column and setting new as the transformed unit name (arguments are not limited to scalar values).

As transformations require unary functions, two utility functions, binary_op() and comp_na(), are provided which can be used to fix the second argument of binary functions such as * or ==. Taking all this together, an item callback function for dividing the val_var column by 2 could be specified as "transform_fun(binary_op(`/`, 2))". The supplied function factories create functions that operate on the data using by-reference semantics.

Furthermore, during concept loading, progress is reported by a progress::progress_bar. In order to signal a message without disrupting the current loading status, see msg_progress().
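How a binary function is turned into a unary one can be illustrated with a minimal base R sketch. The names binary_op_sketch() and comp_na_sketch() are hypothetical stand-ins, written under the assumption that binary_op() simply fixes the second argument and that comp_na() additionally maps missing values to FALSE (as the output in the Examples section suggests). The package's actual implementation may differ.

```r
# Hypothetical stand-ins for illustration, not the package's own code.

# Fix the second argument of a binary function, yielding a unary function.
binary_op_sketch <- function(op, y) {
  function(x) op(x, y)
}

# Same idea, but missing values yield FALSE instead of NA (assumed behavior).
comp_na_sketch <- function(op, y) {
  function(x) !is.na(x) & op(x, y)
}

# Operators have to be quoted with backticks to be passed as functions.
halve <- binary_op_sketch(`/`, 2)
halve(c(2, 4, NA))

not_b <- comp_na_sketch(`!=`, "b")
not_b(c("a", "b", NA))
```

Because the generated functions are unary, they can be handed to a factory such as transform_fun() as its fun argument.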

Examples

dat <- ts_tbl(x = rep(1:2, each = 5), y = hours(rep(1:5, 2)), z = 1:10)

subtract_3 <- transform_fun(binary_op(`-`, 3))
subtract_3(data.table::copy(dat), val_var = "z")
#>     x       y  z
#>  1: 1 1 hours -2
#>  2: 1 2 hours -1
#>  3: 1 3 hours  0
#>  4: 1 4 hours  1
#>  5: 1 5 hours  2
#>  6: 2 1 hours  3
#>  7: 2 2 hours  4
#>  8: 2 3 hours  5
#>  9: 2 4 hours  6
#> 10: 2 5 hours  7

gte_4 <- transform_fun(comp_na(`>=`, 4))
gte_4(data.table::copy(dat), val_var = "z")
#>     x       y     z
#>  1: 1 1 hours FALSE
#>  2: 1 2 hours FALSE
#>  3: 1 3 hours FALSE
#>  4: 1 4 hours  TRUE
#>  5: 1 5 hours  TRUE
#>  6: 2 1 hours  TRUE
#>  7: 2 2 hours  TRUE
#>  8: 2 3 hours  TRUE
#>  9: 2 4 hours  TRUE
#> 10: 2 5 hours  TRUE

map_letters <- apply_map(setNames(letters[1:9], 1:9))
res <- map_letters(data.table::copy(dat), val_var = "z")
res
#>     x       y    z
#>  1: 1 1 hours    a
#>  2: 1 2 hours    b
#>  3: 1 3 hours    c
#>  4: 1 4 hours    d
#>  5: 1 5 hours    e
#>  6: 2 1 hours    f
#>  7: 2 2 hours    g
#>  8: 2 3 hours    h
#>  9: 2 4 hours    i
#> 10: 2 5 hours <NA>

not_b <- transform_fun(comp_na(`!=`, "b"))
not_b(res, val_var = "z")
#>     x       y     z
#>  1: 1 1 hours  TRUE
#>  2: 1 2 hours FALSE
#>  3: 1 3 hours  TRUE
#>  4: 1 4 hours  TRUE
#>  5: 1 5 hours  TRUE
#>  6: 2 1 hours  TRUE
#>  7: 2 2 hours  TRUE
#>  8: 2 3 hours  TRUE
#>  9: 2 4 hours  TRUE
#> 10: 2 5 hours FALSE
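
The examples above do not cover convert_unit(). The following base R sketch illustrates the row selection and conversion logic described in the Details section. The function convert_unit_sketch(), the temps data and the column names val and unit are hypothetical and chosen for illustration only. Unlike the package's factory, which produces callbacks operating by reference, this sketch works on a plain data.frame and returns a modified copy.

```r
# Hypothetical illustration of the logic behind unit conversion;
# not the package's own implementation.
convert_unit_sketch <- function(x, fun, new, rgx = NULL, ignore_case = TRUE,
                                val_var = "val", unit_var = "unit") {
  # rgx = NULL is taken to mean that every row is converted
  hits <- if (is.null(rgx)) {
    rep(TRUE, nrow(x))
  } else {
    grepl(rgx, x[[unit_var]], ignore.case = ignore_case)
  }
  # apply the conversion to matching rows and relabel their unit
  x[[val_var]][hits] <- fun(x[[val_var]][hits])
  x[[unit_var]][hits] <- new
  x
}

temps <- data.frame(val = c(98.6, 37, 100.4), unit = c("F", "C", "f"),
                    stringsAsFactors = FALSE)

# Convert Fahrenheit readings to Celsius, leaving Celsius rows untouched
f_to_c <- function(x) (x - 32) * 5 / 9
res <- convert_unit_sketch(temps, f_to_c, new = "C", rgx = "^f$")
res
```

With ignore_case = TRUE, both "F" and "f" are matched by the pattern, so only the row already recorded in Celsius is left unchanged.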