attach_src.Rd
Making a dataset available to ricu consists of three steps: downloading
(download_src()), importing (import_src()) and attaching (attach_src()).
While downloading and importing are one-time procedures, attaching is
repeated every time the package is loaded. Briefly, downloading fetches the
raw dataset from the internet (most likely in .csv format), importing
preprocesses the data so that it can be accessed more efficiently, and
attaching sets up the data for use by the package.
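The three steps above can be sketched as a small wrapper. This is an illustration rather than part of the package: make_available() is a hypothetical helper, and calling download_src() and import_src() with only a character source name is assumed here by analogy with the character method of attach_src().

```r
# Hypothetical helper (not part of ricu): runs the three steps in order.
make_available <- function(src) {
  # One-time: fetch the raw files; full-scale datasets need credentials.
  ricu::download_src(src)
  # One-time: preprocess the raw files for more efficient access.
  ricu::import_src(src)
  # Per session: set up the source for use by the package.
  ricu::attach_src(src)
  invisible(src)
}

# Not run here, since it needs ricu and network access:
# make_available("mimic_demo")
```

Defining the helper has no side effects; the ricu functions are only looked up when it is called.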
attach_src(x, ...)
# S3 method for src_cfg
attach_src(x, assign_env = NULL, data_dir = src_data_dir(x), ...)
# S3 method for character
attach_src(x, assign_env = NULL, data_dir = src_data_dir(x), ...)
detach_src(x)
setup_src_env(x, ...)
# S3 method for src_cfg
setup_src_env(x, data_dir = src_data_dir(x), link_env = NULL, ...)
x: Data source to attach
...: Forwarded to further calls to attach_src()
assign_env, link_env: Environment in which the data source will become available
data_dir: Directory used to look for fst::fst() files; NULL calls data_dir() using the source name as subdir argument
Both attach_src() and setup_src_env() are called for side effects and
therefore return invisibly. While attach_src() returns NULL,
setup_src_env() returns the newly created src_env object.
Attaching a dataset sets up objects of two S3 classes: a single src_env
object, containing one src_tbl object per table associated with the
dataset. A src_env is an environment with an id_cfg attribute, as well as
sub-classes as specified by the data source class_prefix configuration
setting (see load_src_cfg()). All src_env objects created by calling
attach_src() represent environments that are direct descendants of the data
environment and are bound to the respective dataset name within that
environment. For more information on src_env and src_tbl objects, refer to
new_src_tbl().
If set up correctly, the user does not need to call attach_src() directly.
When the package is loaded, the default data sources (see
auto_attach_srcs()) are attached automatically. This default can be
controlled by setting the environment variable RICU_SRC_LOAD to a
comma-separated list of data source names before loading the library.
Setting this environment variable as
Sys.setenv(RICU_SRC_LOAD = "mimic_demo,eicu_demo")
will change the default of loading both MIMIC-III and eICU, alongside the
respective demo datasets, as well as HiRID and AUMC, to just the two demo
datasets. For setting an environment variable upon startup of the R
session, refer to base::.First.sys().
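As a rough illustration of the RICU_SRC_LOAD setting, the base R snippet below sets the variable and shows the source names it encodes. How the package itself reads the variable is not specified here; splitting on commas is an assumption made for this sketch.

```r
# Remember the current value so the example leaves the session unchanged.
old <- Sys.getenv("RICU_SRC_LOAD", unset = NA)

# This has to happen before library(ricu) for the setting to take effect.
Sys.setenv(RICU_SRC_LOAD = "mimic_demo,eicu_demo")

# Assumption: the list is split on commas (the actual parsing may differ).
srcs <- strsplit(Sys.getenv("RICU_SRC_LOAD"), ",", fixed = TRUE)[[1L]]
print(srcs)

# Restore the previous state.
if (is.na(old)) {
  Sys.unsetenv("RICU_SRC_LOAD")
} else {
  Sys.setenv(RICU_SRC_LOAD = old)
}
```

To make the setting persistent across sessions, the usual R mechanism is a line such as RICU_SRC_LOAD=mimic_demo,eicu_demo in an .Renviron file.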
Attaching a dataset during package namespace loading will both instantiate
a corresponding src_env in the data environment and, for convenience, also
assign this object into the package namespace, such that for example the
MIMIC-III demo dataset is available not only as ricu::data::mimic_demo, but
also as ricu::mimic_demo (or, if the package namespace is attached, simply
as mimic_demo). Dataset attaching using attach_src() does not need to
happen during namespace loading, but can be triggered by the user at any
time. If such a convenience link as described above is desired by the user,
an environment such as .GlobalEnv has to be passed as assign_env to
attach_src().
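A minimal sketch of attaching by hand with a convenience link follows. It uses a fresh environment instead of .GlobalEnv to avoid cluttering the workspace; link_env is a name chosen for this example, and the block does nothing if ricu is not installed.

```r
# Hypothetical target environment for the convenience link.
link_env <- new.env()

# Runs only when ricu is installed; otherwise this is a no-op.
if (requireNamespace("ricu", quietly = TRUE)) {
  ricu::attach_src("mimic_demo", assign_env = link_env)
  # If attaching succeeded, the source is expected to be bound by name.
  print(exists("mimic_demo", envir = link_env, inherits = FALSE))
}
```

Passing .GlobalEnv as assign_env instead would make the source reachable simply as mimic_demo at the prompt.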
Data sets are set up as src_env
objects irrespective of whether all (or
any) of the required data is available. If some (or all) data is missing,
the user is asked for permission to download in interactive sessions and an
error is thrown in non-interactive sessions. Downloading demo datasets
requires no further information, but access to full-scale datasets (even
though they are publicly available) is guarded by access credentials (see
download_src()).
While attach_src() provides the main entry point, src_env objects are
instantiated by the S3 generic function setup_src_env(). The wrapping
function serves to catch errors that might be caused by config file parsing
issues, so as not to break attaching of the package namespace. Apart from
this, attach_src() also provides the convenience linking into the package
namespace (or a user-specified environment) described above.
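The error-catching role of the wrapper can be illustrated in base R. The sketch below is not the package's implementation: attach_safely() and failing_setup() are hypothetical names, and turning the error into a warning is an assumption about what a reasonable wrapper might do.

```r
# Hypothetical setup step that fails, standing in for a config parsing error.
failing_setup <- function(src) {
  stop("cannot parse config for '", src, "'")
}

# Hypothetical wrapper: report the failure as a warning instead of aborting,
# so that the surrounding load process can continue.
attach_safely <- function(src, setup) {
  tryCatch(
    {
      setup(src)
      TRUE
    },
    error = function(e) {
      warning("could not attach '", src, "': ", conditionMessage(e),
              call. = FALSE)
      FALSE
    }
  )
}

ok <- suppressWarnings(attach_safely("broken_src", failing_setup))
print(ok)
```

The design point is that a single misconfigured source only costs that source, not the whole namespace load.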
A src_env object created by setup_src_env() does not directly contain
src_tbl objects bound to names, but rather an active binding (see
base::makeActiveBinding()) per table. These active bindings check for
availability of required files and evaluate to the corresponding src_tbl
objects if these checks pass, and ask for user input otherwise. As src_tbl
objects are intended to be read-only, assignment is not possible, except
for the value NULL, which resets the internally cached src_tbl that is
created on first successful access.
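The caching and reset behavior described above can be mimicked with a plain active binding. The following is a simplified sketch in base R, not the package's code: load_table() and make_tbl_binding() are hypothetical, a data.frame stands in for a src_tbl, and the file availability check and user prompt are left out.

```r
# Hypothetical stand-in for reading a table from disk; counts its calls.
n_loads <- 0L
load_table <- function() {
  n_loads <<- n_loads + 1L
  data.frame(id = 1:3, value = c(0.1, 0.2, 0.3))
}

# Hypothetical helper: install a cached, read-only binding called `name`.
make_tbl_binding <- function(name, env, loader) {
  cache <- NULL
  makeActiveBinding(name, function(value) {
    if (missing(value)) {
      # Read access: load on first use, then serve the cached object.
      if (is.null(cache)) cache <<- loader()
      return(cache)
    }
    # Write access: only NULL is accepted, and it clears the cache.
    if (!is.null(value)) {
      stop("table bindings are read-only; assign NULL to reset")
    }
    cache <<- NULL
    invisible(NULL)
  }, env)
}

src <- new.env()
make_tbl_binding("admissions", src, load_table)

first  <- src$admissions   # triggers load_table()
second <- src$admissions   # served from the cache
src$admissions <- NULL     # resets the cache
third  <- src$admissions   # triggers load_table() again
```

After these calls the loader has run twice, once for the first access and once after the reset, while any assignment other than NULL raises an error.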