concept_dictionary.Rd
Data concepts can be specified in JSON format as a concept dictionary, which
can be read and parsed into concept/item objects. Dictionary loading can
either be performed on the default included dictionary or on a user-specified
custom dictionary. Furthermore, a mechanism is provided for adding concepts
and/or data sources to the existing dictionary (see the Details section).
load_dictionary(
src = NULL,
concepts = NULL,
name = "concept-dict",
cfg_dirs = NULL
)
concept_availability(dict = NULL, include_rec = FALSE, ...)
explain_dictionary(
dict = NULL,
cols = c("name", "category", "description"),
...
)
src: NULL or the name of one or several data sources
concepts: A character vector used to subset the concept dictionary, or NULL, indicating no subsetting
name: Name of the dictionary to be read
cfg_dirs: File name of the dictionary
dict: A dictionary (concept object) or NULL
include_rec: Logical flag indicating whether to include rec_cncpt concepts as well
...: Forwarded to load_dictionary() in case NULL is passed as the dict argument
cols: Columns to include in the output of explain_dictionary()
A concept object containing several data concepts as cncpt objects.
A default dictionary is provided at
system.file(
file.path("extdata", "config", "concept-dict.json"),
package = "ricu"
)
and can be loaded into an R session by calling get_config("concept-dict").
The default dictionary can be extended by adding a file concept-dict.json to
the path specified by the environment variable RICU_CONFIG_PATH. New concepts
can be added to this file and existing concepts can be extended (by adding
new data sources). Alternatively, load_dictionary() can be called on
non-default dictionaries using the name and cfg_dirs arguments.
A concept is specified as a JSON object. For example, the numeric concept for glucose is given by
{"glu": {
"unit": "mg/dL",
"min": 0,
"max": 1000,
"description": "glucose",
"category": "chemistry",
"sources": {
"mimic_demo": [
{"ids": [50809, 50931],
"table": "labevents",
"sub_var": "itemid"
}
]
}
} }
Using such a specification, constructors for cncpt and itm objects are called
either using default arguments or as specified by the JSON object, with the
above corresponding to a call like
concept(
name = "glu",
items = item(
src = "mimic_demo", table = "labevents", sub_var = "itemid",
ids = list(c(50809L, 50931L))
),
description = "glucose", category = "chemistry",
unit = "mg/dL", min = 0, max = 1000
)
The arguments src and concepts can be used to load only a subset of a
dictionary by specifying a character vector of data sources and/or concept
names.
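As a rough sketch of the extension mechanism described above, the following writes a small concept-dict.json to a temporary directory, points RICU_CONFIG_PATH at it, and then loads only the new concept for a single data source. The concept name glu_demo and the directory name ricu-config are made up for illustration. Whether the file is picked up exactly this way (and whether the environment variable must be set before ricu is loaded) is an assumption that should be checked against the installed ricu version.

```r
# Hypothetical config directory; any writable location would do.
cfg_dir <- file.path(tempdir(), "ricu-config")
dir.create(cfg_dir, showWarnings = FALSE, recursive = TRUE)

# A made-up concept ("glu_demo") that reuses the glucose item definition
# shown on this page. The name is an assumption, chosen so that it does not
# clash with the "glu" concept of the default dictionary.
custom <- '{
  "glu_demo": {
    "unit": "mg/dL",
    "min": 0,
    "max": 1000,
    "description": "glucose (custom copy)",
    "category": "chemistry",
    "sources": {
      "mimic_demo": [
        {"ids": [50809, 50931], "table": "labevents", "sub_var": "itemid"}
      ]
    }
  }
}'

# The file name must be concept-dict.json to extend the default dictionary.
dict_file <- file.path(cfg_dir, "concept-dict.json")
writeLines(custom, dict_file)

# Tell ricu where to look for additional configuration files.
Sys.setenv(RICU_CONFIG_PATH = cfg_dir)

# Only attempt the load when both ricu and the demo data package are present.
if (requireNamespace("ricu", quietly = TRUE) &&
    requireNamespace("mimic.demo", quietly = TRUE)) {
  # src and concepts restrict the result to one source and one concept.
  print(ricu::load_dictionary(src = "mimic_demo", concepts = "glu_demo"))
}
```

Keeping the custom file in a separate directory leaves the packaged dictionary untouched, so removing the environment variable restores the default behaviour.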
A summary of item availability for a set of concepts can be created using
concept_availability(). This produces a logical matrix with TRUE entries
corresponding to concepts where, for the given data source, at least a single
item has been defined. If data is loaded for a combination of concept and
data source where the corresponding entry is FALSE, this will yield either a
zero-row id_tbl object or an object inheriting from id_tbl where the column
corresponding to the concept is NA throughout, depending on whether the
concept was loaded alongside other concepts where data is available or not.
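To make the meaning of the availability matrix concrete, here is a minimal base R sketch that derives such a matrix from a toy dictionary stored as a nested list. It only mirrors the rule stated above (TRUE if at least one item is defined for a concept and data source). It is not ricu's implementation, and the concepts lact_toy and empty_toy, the source other_src, and the placeholder item ID are invented for illustration.

```r
# Toy dictionary mirroring the JSON layout: concept -> sources -> items.
# Only "glu" uses IDs taken from this page; everything else is made up.
dict <- list(
  glu = list(sources = list(
    mimic_demo = list(list(ids = c(50809L, 50931L)))
  )),
  lact_toy = list(sources = list(
    mimic_demo = list(list(ids = 1L)),
    other_src  = list(list(ids = 2L))
  )),
  empty_toy = list(sources = list())
)

srcs <- c("mimic_demo", "other_src")

# For every source, check per concept whether at least one item is defined.
# Rows are concepts, columns are data sources.
avail <- vapply(
  srcs,
  function(src) {
    vapply(dict, function(x) length(x$sources[[src]]) > 0L, logical(1L))
  },
  logical(length(dict))
)

print(avail)
```

With ricu installed, the corresponding summary for real dictionaries is produced by concept_availability().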
Whether to include rec_cncpt concepts in the overview produced by
concept_availability() can be controlled via the logical flag include_rec. A
recursive concept is considered available simply if all its building blocks
are available. This can, however, lead to slightly confusing output, as a
recursive concept might not strictly depend on one of its sub-concepts but
handle such missingness by design. In such a scenario, the availability
summary might report FALSE even though data can still be produced.
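The "all building blocks available" rule can be illustrated with a small base R sketch. The matrix, the concept names a and b, the recursive concept ab, and the sources src1 and src2 are all hypothetical. The code only restates the rule from the paragraph above and makes no claim about how ricu computes it internally.

```r
# Toy availability matrix: rows are concepts, columns are data sources.
avail <- matrix(
  c(TRUE, TRUE,
    TRUE, FALSE),
  nrow = 2L, byrow = TRUE,
  dimnames = list(c("a", "b"), c("src1", "src2"))
)

# Hypothetical recursive concept "ab" built from "a" and "b": it counts as
# available for a source only if every building block is available there.
rec_avail <- apply(avail[c("a", "b"), , drop = FALSE], 2L, all)

# Append the recursive concept as an extra row of the summary.
avail <- rbind(avail, ab = rec_avail)

print(avail)
```

Here ab is reported as unavailable for src2 because b is missing there, even if the concept could in practice tolerate a missing b. This is the potentially confusing case mentioned above.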
if (require(mimic.demo)) {
head(load_dictionary("mimic_demo"))
load_dictionary("mimic_demo", c("glu", "lact"))
}
#> <concept[2]>
#> glu lact
#> glucose <num_cncpt[1]> lactate <num_cncpt[1]>