src_cfg.Rd
Data source configuration objects store information on data sources used
throughout ricu. This includes URLs for data set downloading, column
specifications used for data set importing, default values per table for
important columns such as index columns when loading data, and how different
patient identifiers used throughout a dataset relate to one another. Per
dataset, a src_cfg object is created from a JSON file (see load_src_cfg()),
consisting of several helper classes compartmentalizing the pieces of
information outlined above. Alongside constructors for the various classes,
several utilities are provided, such as inheritance checks, coercion
functions, and functions to extract pieces of information from these objects.
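As a minimal sketch of how such an object might be obtained and inspected, the following assumes that the ricu package is installed and that a configuration named mimic_demo ships with it (the choice of mimic_demo is an assumption made for illustration). The calls are guarded so that nothing runs if ricu is unavailable:

```r
if (requireNamespace("ricu", quietly = TRUE)) {
  # load_src_cfg() is expected to return a list of src_cfg objects,
  # one per requested data source
  cfgs <- ricu::load_src_cfg("mimic_demo")

  # Pick out the configuration object of a single data source
  cfg <- cfgs[["mimic_demo"]]

  # Inheritance check returning a logical flag
  print(ricu::is_src_cfg(cfg))

  # Name of the data source described by the configuration
  print(ricu::src_name(cfg))
}
```

Going through load_src_cfg() rather than calling new_src_cfg() directly keeps the example short, since the helper objects (id_cfg, col_cfg, tbl_cfg) are then assembled from the JSON file.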
new_src_cfg(name, id_cfg, col_cfg, tbl_cfg, ..., class_prefix = name)
new_id_cfg(
src,
name,
id,
pos = seq_along(name),
start = NULL,
end = NULL,
table = NULL,
class_prefix = src
)
new_col_cfg(src, table, ..., class_prefix = src)
new_tbl_cfg(
src,
table,
files = NULL,
cols = NULL,
num_rows = NULL,
partitioning = NULL,
...,
class_prefix = src
)
is_src_cfg(x)
as_src_cfg(x)
is_id_cfg(x)
as_id_cfg(x)
is_col_cfg(x)
as_col_cfg(x)
is_tbl_cfg(x)
as_tbl_cfg(x)
src_name(x)
tbl_name(x)
src_extra_cfg(x)
src_prefix(x)
src_url(x)
id_var_opts(x)
default_vars(x, type)
name: Name of the data source
id_cfg: An id_cfg object for the given data source
col_cfg: A list of col_cfg objects representing column defaults for all
tables of the data source
tbl_cfg: A list of tbl_cfg objects containing information on how tables are
organized (may be NULL)
...: Further objects to add (such as a URL specification)
class_prefix: A character vector of class prefixes that are added to the
instantiated classes
src: Data source name
id, start, end: Name(s) of ID column(s), as well as respective start and end
timestamps
pos: Integer valued position, ordering IDs by their cardinality
table: Table name
cols: List containing a list per column, each holding string valued entries
name (column name as used by ricu), col (column name as used in the raw data)
and spec (name of readr::cols() column specification). Further entries will
be passed as arguments to the respective readr column specification
num_rows: A count indicating the expected number of rows
partitioning: A table partitioning is defined by a column name and a vector
of numeric values that are passed as vec argument to base::findInterval()
x: Object to coerce/query
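To illustrate the partitioning argument, the following base R sketch shows how a vector of break points passed as vec to findInterval() assigns values of a column to partitions. The column name, break points and ID values are hypothetical and are not taken from any ricu configuration:

```r
# Hypothetical partitioning: a column name and numeric break points
partitioning <- list(col = "patient_id", breaks = c(0, 1000, 2000, 3000))

# Hypothetical values of the partitioning column
ids <- c(5, 999, 1000, 1500, 2999, 3000)

# Intervals are closed on the left, so 1000 falls into partition 2
# and 3000 opens a fourth partition
part <- findInterval(ids, vec = partitioning$breaks)
print(part)
# 1 1 2 2 3 4

# Group the values by the partition they were assigned to
print(split(ids, part))
```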
Constructors new_*() as well as coercion functions as_*() return the
respective objects, while inheritance tester functions is_*() return a
logical flag.
src_url(): string valued data source URL
id_var_opts(): character vector of ID variable options
src_name(): string valued data source name
tbl_name(): string valued table name
The following classes are used to represent data source configuration objects:
src_cfg: wraps objects id_cfg, col_cfg and optionally tbl_cfg
id_cfg: contains information on ID systems and is created from id_cfg
entries in config files
col_cfg: contains column default settings represented by defaults entries in
table configuration blocks
tbl_cfg: used when importing data and therefore encompasses the information
in files, num_rows and cols entries of table configuration blocks
Through its col_cfg, a table can have some of its columns marked as default
columns for the following concepts, and further column meanings can be
specified via ...:
id_col: column will be used as ID for icu_tbl objects
index_col: column represents a timestamp variable and will be used as such
for ts_tbl objects
val_col: column contains the measured variable of interest
unit_col: column specifies the unit of measurement in the corresponding
val_col
Alongside constructors (new_*()), inheritance checking functions (is_*()),
as well as coercion functions (as_*()), relevant utility functions include:
src_url(): retrieve the URL of a data source
id_var_opts(): column name(s) corresponding to ID systems
src_name(): name of the data source
tbl_name(): name of a table
Coercion between objects can, under some circumstances, yield list-of-object
return types. For example, coercing a src_cfg to tbl_cfg results in a list of
tbl_cfg objects, as multiple tables typically correspond to a data source.
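The coercion described above might look as follows. This is a sketch under the assumption that ricu is installed and that a mimic_demo configuration is available. The exact return type of as_tbl_cfg() may differ between ricu versions, so only its length is inspected here:

```r
if (requireNamespace("ricu", quietly = TRUE)) {
  cfg <- ricu::load_src_cfg("mimic_demo")[["mimic_demo"]]

  # A single src_cfg describes several tables, so coercion is expected
  # to yield one tbl_cfg entry per table
  tbls <- ricu::as_tbl_cfg(cfg)

  # Number of table configurations obtained from the data source
  print(length(tbls))
}
```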