file_utils.Rd
Determine where to place data that is meant to persist between individual sessions.
data_dir(subdir = NULL, create = TRUE)
src_data_dir(srcs)
auto_attach_srcs()
config_paths()
get_config(name, cfg_dirs = config_paths(), combine_fun = c, ...)
set_config(x, name, dir = file.path("inst", "extdata", "config"), ...)
A string specifying a directory below the data directory whose existence will be ensured.
Logical flag indicating whether to create the specified directory
Character vector of data source names, an object for which a src_name() method is defined, or an arbitrary-length list thereof.
File name of the configuration file (.json will be appended)
Character vector of directories searched for config files
If multiple files are found, a function for combining returned lists
Passed to jsonlite::read_json() or jsonlite::write_json()
Object to be written
Directory to write the file to (created if non-existent)
Functions data_dir(), src_data_dir() and config_paths() return file paths as
character vectors, auto_attach_srcs() returns a character vector of data
source names, src_data_avail() returns a data.frame describing availability
of data sources and is_data_avail() a named logical vector. Configuration
utilities get_config() and set_config() read and write list objects to/from
JSON format.
For data, the default location depends on the operating system as follows:
Platform | Location |
Linux | ~/.local/share/ricu |
macOS | ~/Library/Application Support/ricu |
Windows | %LOCALAPPDATA%/ricu |
If the default storage directory does not exist, it will only be created upon user consent (requiring an interactive session).
The environment variable RICU_DATA_PATH can be used to override the default
location. If desired, this variable can be set in an R startup file to make
it apply to all R sessions. For example, it could be set within:
A project-local .Renviron;
The user-level .Renviron;
A file at $(R RHOME)/etc/Renviron.site.
Any directory specified via the environment variable will be created recursively.
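The lookup order described above can be sketched roughly as follows. The helper resolve_data_dir() is hypothetical and not part of ricu; it only mirrors the documented behaviour (environment variable first, then the per-platform default from the table) and deliberately leaves out directory creation and the consent prompt.

```r
# Hypothetical helper for illustration only; not the ricu implementation.
resolve_data_dir <- function() {

  # An explicitly set RICU_DATA_PATH takes precedence over platform defaults
  env <- Sys.getenv("RICU_DATA_PATH", unset = NA_character_)

  if (!is.na(env) && nzchar(env)) {
    return(env)
  }

  # Fall back to the documented per-platform locations
  switch(
    Sys.info()[["sysname"]],
    Darwin  = file.path("~", "Library", "Application Support", "ricu"),
    Windows = file.path(Sys.getenv("LOCALAPPDATA"), "ricu"),
    file.path("~", ".local", "share", "ricu")  # Linux and anything else
  )
}

# With the variable set, the override is returned unchanged
Sys.setenv(RICU_DATA_PATH = file.path(tempdir(), "ricu-data"))
resolve_data_dir()
```

In practice, data_dir() should be preferred over a hand-rolled lookup like this, since it also takes care of creating the directory.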
Data source directories are typically sub-directories of data_dir(), named
the same as the respective dataset. For the demo datasets corresponding to
mimic and eicu, however, the file location deviates from this scheme. The
function src_data_dir() is used to determine the expected data location of a
given dataset.
Configuration files, used both for data source configuration and for
dictionary definitions, potentially involve multiple files that are read and
merged. For that reason, get_config() will iterate over the directories
passed as cfg_dirs and look for the specified file (with the suffix .json
appended; the file might be missing in some of the queried directories). All
found files are read by jsonlite::read_json() and the resulting lists are
combined by reduction with the binary function passed as combine_fun.
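As a rough illustration of the read-and-reduce step just described, the following sketch uses a hypothetical function read_merged_config(). It is not the ricu source, only an approximation of the described behaviour, and it assumes the jsonlite package is installed.

```r
# Hypothetical re-implementation for illustration; not the ricu source.
stopifnot(requireNamespace("jsonlite", quietly = TRUE))

read_merged_config <- function(name, cfg_dirs, combine_fun = c, ...) {

  files <- file.path(cfg_dirs, paste0(name, ".json"))
  files <- files[file.exists(files)]  # the file may be absent in some dirs

  # Read every file that was found, then fold the lists pairwise
  lists <- lapply(files, jsonlite::read_json, ...)

  Reduce(combine_fun, lists)
}

# Two temporary config directories plus one that is never created
dir_a <- file.path(tempdir(), "cfg_a")
dir_b <- file.path(tempdir(), "cfg_b")
dir_c <- file.path(tempdir(), "cfg_c")

dir.create(dir_a, showWarnings = FALSE)
dir.create(dir_b, showWarnings = FALSE)

jsonlite::write_json(list(alpha = 1), file.path(dir_a, "demo.json"),
                     auto_unbox = TRUE)
jsonlite::write_json(list(beta = 2), file.path(dir_b, "demo.json"),
                     auto_unbox = TRUE)

merged <- read_merged_config("demo", c(dir_a, dir_c, dir_b))
names(merged)
```

Since Reduce() applies combine_fun from left to right, the order of cfg_dirs determines the order of entries in the merged result when combining with c().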
With default arguments, get_config() will simply concatenate lists
corresponding to files found in the default config locations as returned by
config_paths(): first the directory specified by the environment variable
RICU_CONFIG_PATH (if set), followed by the directory at
system.file("extdata", "config", package = "ricu").
Further arguments are passed to jsonlite::read_json(), which is called with
slightly modified defaults: simplifyVector = TRUE, simplifyDataFrame = FALSE
and simplifyMatrix = FALSE.
The utility function set_config() writes the list passed as x to file
dir/name.json, using jsonlite::write_json(), also with slightly modified
defaults (which can be overridden by passing arguments as ...):
null = "null", auto_unbox = TRUE and pretty = TRUE.
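One way such overridable defaults could be wired up is sketched below. The function write_config_sketch() is hypothetical and only approximates the behaviour described for set_config(); it assumes the jsonlite package is installed and relies on utils::modifyList() to let arguments passed as ... take precedence over the defaults.

```r
# Hypothetical sketch for illustration; not the ricu implementation.
stopifnot(requireNamespace("jsonlite", quietly = TRUE))

write_config_sketch <- function(x, name, dir, ...) {

  # Create the target directory if it does not exist yet
  if (!dir.exists(dir)) {
    dir.create(dir, recursive = TRUE)
  }

  # Documented defaults; anything supplied via ... overrides them
  defaults <- list(null = "null", auto_unbox = TRUE, pretty = TRUE)
  args     <- utils::modifyList(defaults, list(...))

  path <- file.path(dir, paste0(name, ".json"))

  do.call(jsonlite::write_json, c(list(x, path), args))

  invisible(path)
}

out_dir <- file.path(tempdir(), "ricu_cfg_demo")
path    <- write_config_sketch(list(name = "demo", value = NULL), "example",
                               out_dir)

# Inspect the written JSON
readLines(path)
```

Calling write_config_sketch(..., pretty = FALSE) would replace only the pretty default while leaving the other two in place.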
Whenever the package namespace is attached, a summary of dataset
availability is printed using the utility functions auto_attach_srcs() and
src_data_avail(). While the former simply returns a character vector of data
sources that are configured for automatically being set up on package
loading, the latter returns a summary of the number of available tables per
dataset. Finally, is_data_avail() returns a named logical vector indicating
which data sources have all required data available.
Sys.setenv(RICU_DATA_PATH = tempdir())
identical(data_dir(), tempdir())
#> [1] TRUE
dir.exists(file.path(tempdir(), "some_subdir"))
#> [1] FALSE
some_subdir <- data_dir("some_subdir")
dir.exists(some_subdir)
#> [1] TRUE
cfg <- get_config("concept-dict")
identical(
  cfg,
  get_config("concept-dict",
             system.file("extdata", "config", package = "ricu"))
)
#> [1] TRUE