callback_itm.Rd
For concept loading, item callback functions are used to handle
item-specific post-processing steps, such as converting measurement units,
mapping one set of values to another, or more involved data
transformations, like turning absolute drug administration rates into rates
that are relative to body weight. Item callback functions are called by
load_concepts() with arguments x (the data), a variable number of name/string
pairs specifying roles of columns for the given item, followed by env, the
data source environment as a src_env object.
Item callback functions can be specified by their name or using function
factories such as transform_fun(), apply_map() or convert_unit().
transform_fun(fun, ...)
binary_op(op, y)
comp_na(op, y)
set_val(val)
apply_map(map, var = "val_var")
convert_unit(fun, new, rgx = NULL, ignore_case = TRUE, ...)
fun: Function(s) used for transforming matching values
...: Further arguments passed to downstream function
op: Function taking two arguments, such as +
y: Value passed as second argument to function op
val: Value to replace every element of x with
map: Named atomic vector used for mapping a set of values (the names of map) to a different set (the values of map)
var: Argument which is used to determine the column the mapping is applied to
new: Name(s) of transformed units
rgx: Regular expression(s) used for identifying observations based on their current unit of measurement, NULL means everything
ignore_case: Forwarded to base::grep()
Callback function factories such as transform_fun(), apply_map() or
convert_unit() return functions suitable as item callback functions,
while transform function generators such as binary_op() and comp_na()
return functions that apply a transformation to a vector.
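The difference between the two kinds of return value can be sketched in base R. The helpers below (fix_second, na_as_false and as_column_callback) are hypothetical stand-ins written for illustration, not the package's own code. The NA handling in na_as_false is an assumption chosen to agree with the example output on this page, where a missing value compares as FALSE. Unlike the package's factories, this sketch works on a plain data.frame and therefore copies rather than modifying by reference.

```r
# Hypothetical stand-ins for illustration only (not the package's code).

# Vector-level generator: fix the second argument of a binary function.
fix_second <- function(op, y) function(x) op(x, y)

# Vector-level generator: as above, but NA inputs yield FALSE (assumption).
na_as_false <- function(op, y) function(x) !is.na(x) & op(x, y)

# Column-level wrapper: turn a vector function into a function that has the
# shape of an item callback, i.e. takes the data and the value column name.
as_column_callback <- function(fun) {
  function(x, val_var, ...) {
    x[[val_var]] <- fun(x[[val_var]])
    x
  }
}

minus_3 <- fix_second(`-`, 3)
minus_3(1:5)

ge_4 <- na_as_false(`>=`, 4)
ge_4(c(2, 4, NA))

halve_col <- as_column_callback(fix_second(`/`, 2))
halve_col(data.frame(z = c(2, 4)), val_var = "z")
```

The vector-level functions know nothing about tables or column roles; only the wrapper does, which is why the two layers can be combined freely.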
The most straightforward setting is where a function is simply referred to by its name. For example, in eICU, age is available as a character vector because ages 90 and above are represented by the string "> 89". A function such as the following turns this into a numeric vector, replacing occurrences of "> 89" by the number 90.
eicu_age <- function(x, val_var, ...) {
  data.table::set(
    data.table::set(x, which(x[[val_var]] == "> 89"), j = val_var,
                    value = 90),
    j = val_var,
    value = as.numeric(x[[val_var]])
  )
}
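To see the effect outside of concept loading, the function can be tried on a small table. This is a sketch only: the table and its values are made up for illustration, and the function definition is repeated so that the snippet runs on its own with just data.table installed.

```r
library(data.table)

# Same function as described in the text, repeated to keep this self-contained.
eicu_age <- function(x, val_var, ...) {
  data.table::set(
    data.table::set(x, which(x[[val_var]] == "> 89"), j = val_var,
                    value = 90),
    j = val_var,
    value = as.numeric(x[[val_var]])
  )
}

# Made-up toy data: age stored as character, with one censored entry.
dat <- data.table(id = 1:3, age = c("42", "> 89", "67"))

# The table is modified by reference; no assignment of the result is needed.
eicu_age(dat, val_var = "age")

dat$age
```

After the call, the age column holds numeric values, with the "> 89" entry replaced by 90.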
This function is then specified as item callback function for items
corresponding to eICU data sources of the age concept as
item(src = "eicu_demo", table = "patient", val_var = "age",
callback = "eicu_age", class = "col_itm")
The string passed as callback argument is evaluated, meaning that an
expression can be passed which evaluates to a function that in turn can be
used as callback. Several function factories are provided which return
functions suitable for use as item callbacks: transform_fun() creates a
function that transforms the val_var column using the function supplied
as fun argument, apply_map() can be used to map one set of values to
another (again using the val_var column) and convert_unit() is intended
for converting a subset of rows (identified by matching rgx against the
unit_var column) by applying fun to the val_var column and setting
new as the transformed unit name (arguments are not limited to scalar
values). As transformations require unary functions, two utility functions,
binary_op() and comp_na(), are provided which can be used to fix the
second argument of binary functions such as * or ==. Taking all this
together, an item callback function for dividing the val_var column by 2
could be specified as "transform_fun(binary_op(`/`, 2))".
The supplied function factories create functions that operate on the data
using by-reference semantics. Furthermore, during concept loading, progress
is reported by a progress::progress_bar. In order to signal a message
without disrupting the current loading status, see msg_progress().
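The row-subset conversion described for convert_unit() can be illustrated with plain data.table calls. The helper convert_rows below is hypothetical and is not the package's implementation; it only mirrors the description: select rows by matching a regular expression against the unit column, apply a function to the value column for those rows, and overwrite the unit. The column names and values are made up for illustration.

```r
library(data.table)

# Hypothetical helper (not part of the package): convert the rows whose unit
# matches `rgx` by applying `fun` to the values and setting the unit to `new`.
convert_rows <- function(x, val_var, unit_var, fun, new, rgx,
                         ignore_case = TRUE) {
  rows <- grep(rgx, x[[unit_var]], ignore.case = ignore_case)
  data.table::set(x, i = rows, j = val_var, value = fun(x[[val_var]][rows]))
  data.table::set(x, i = rows, j = unit_var, value = new)
  x
}

# Made-up heights, recorded partly in inches and partly in centimetres.
dat <- data.table(val = c(70, 180, 65), unit = c("in", "cm", "IN"))

# set() modifies its input by reference, so work on a copy to keep `dat`.
res <- convert_rows(data.table::copy(dat), val_var = "val", unit_var = "unit",
                    fun = function(v) v * 2.54, new = "cm", rgx = "^in$")
res
```

Because the update happens by reference, calling the helper directly on dat would change dat itself; data.table::copy() is used here for the same reason as in the examples on this page.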
dat <- ts_tbl(x = rep(1:2, each = 5), y = hours(rep(1:5, 2)), z = 1:10)
subtract_3 <- transform_fun(binary_op(`-`, 3))
subtract_3(data.table::copy(dat), val_var = "z")
#> x y z
#> 1: 1 1 hours -2
#> 2: 1 2 hours -1
#> 3: 1 3 hours 0
#> 4: 1 4 hours 1
#> 5: 1 5 hours 2
#> 6: 2 1 hours 3
#> 7: 2 2 hours 4
#> 8: 2 3 hours 5
#> 9: 2 4 hours 6
#> 10: 2 5 hours 7
gte_4 <- transform_fun(comp_na(`>=`, 4))
gte_4(data.table::copy(dat), val_var = "z")
#> x y z
#> 1: 1 1 hours FALSE
#> 2: 1 2 hours FALSE
#> 3: 1 3 hours FALSE
#> 4: 1 4 hours TRUE
#> 5: 1 5 hours TRUE
#> 6: 2 1 hours TRUE
#> 7: 2 2 hours TRUE
#> 8: 2 3 hours TRUE
#> 9: 2 4 hours TRUE
#> 10: 2 5 hours TRUE
map_letters <- apply_map(setNames(letters[1:9], 1:9))
res <- map_letters(data.table::copy(dat), val_var = "z")
res
#> x y z
#> 1: 1 1 hours a
#> 2: 1 2 hours b
#> 3: 1 3 hours c
#> 4: 1 4 hours d
#> 5: 1 5 hours e
#> 6: 2 1 hours f
#> 7: 2 2 hours g
#> 8: 2 3 hours h
#> 9: 2 4 hours i
#> 10: 2 5 hours <NA>
not_b <- transform_fun(comp_na(`!=`, "b"))
not_b(res, val_var = "z")
#> x y z
#> 1: 1 1 hours TRUE
#> 2: 1 2 hours FALSE
#> 3: 1 3 hours TRUE
#> 4: 1 4 hours TRUE
#> 5: 1 5 hours TRUE
#> 6: 2 1 hours TRUE
#> 7: 2 2 hours TRUE
#> 8: 2 3 hours TRUE
#> 9: 2 4 hours TRUE
#> 10: 2 5 hours FALSE