Making a dataset available to ricu consists of three steps: downloading (download_src()), importing (import_src()) and attaching (attach_src()). Downloading and importing are one-time procedures, whereas attaching is repeated every time the package is loaded. Briefly, downloading fetches the raw dataset from the internet (most likely in .csv format), importing preprocesses the data so that it can be accessed more efficiently (by converting it to .fst format), and attaching sets up the data for use by the package.
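The three steps can be chained in a small wrapper. The following is a minimal sketch, assuming that ricu is installed and that the default data directory is acceptable; the helper name setup_source() is hypothetical and not part of ricu.

```r
# Hypothetical convenience wrapper (not part of ricu) that chains the
# three steps described above for a single data source.
setup_source <- function(src) {
  stopifnot(is.character(src), length(src) == 1L)

  # One-time step: fetch the raw files into the default data directory
  ricu::download_src(src)

  # One-time step: convert the raw files to .fst format
  ricu::import_src(src)

  # Per-session step: set up the imported data for use by the package
  ricu::attach_src(src)

  invisible(NULL)
}
```

Calling setup_source("mimic_demo") would then be expected to set up the MIMIC-III demo data, which can be downloaded without credentials.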

download_src(x, data_dir = src_data_dir(x), ...)

# S3 method for src_cfg
download_src(x, data_dir = src_data_dir(x), tables = NULL, force = FALSE, ...)

# S3 method for aumc_cfg
download_src(
  x,
  data_dir = src_data_dir(x),
  tables = NULL,
  force = FALSE,
  token = NULL,
  verbose = TRUE,
  ...
)

# S3 method for character
download_src(
  x,
  data_dir = src_data_dir(x),
  tables = NULL,
  force = FALSE,
  user = NULL,
  pass = NULL,
  verbose = TRUE,
  ...
)

Arguments

x

Object specifying the source configuration

data_dir

Destination directory where the downloaded data is written to.

...

Generic consistency

tables

Character vector specifying the tables to download. If NULL, all available tables are downloaded.

force

Logical flag; if TRUE, existing data will be re-downloaded

token

Download token for AmsterdamUMCdb (see 'Details')

verbose

Logical flag indicating whether to print progress information

user, pass

PhysioNet credentials; if NULL and environment variables RICU_PHYSIONET_USER/RICU_PHYSIONET_PASS are not set, user input is required

Value

Called for side effects and returns NULL invisibly.

Details

The downloaders included with ricu focus on data hosted by PhysioNet, and tools are currently available for downloading the datasets MIMIC-III, eICU and HiRID (see data). While credentials are required for downloading any of the three full datasets, demo datasets for both MIMIC-III and eICU are available without having to log in. Even though access to the full datasets is credentialed, the datasets are in fact publicly available. For setting up an account, please refer to the registration form.

PhysioNet credentials can be entered in an interactive session, passed as function arguments user/pass, or supplied as environment variables RICU_PHYSIONET_USER/RICU_PHYSIONET_PASS. For setting environment variables on session startup, refer to base::.First.sys() and for setting environment variables in general, refer to base::Sys.setenv(). If the openssl package is available, SHA256 hashes of downloaded files are verified using openssl::sha256().
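As an illustration of the environment variable route and of a SHA256 check, consider the following minimal sketch. The credential values are placeholders and file_sha256() is a hypothetical helper, not the routine ricu uses internally.

```r
# Placeholder credentials; replace with your own PhysioNet account details.
# To avoid hard-coding secrets in scripts, the same two variables can
# instead be defined in a startup file such as ~/.Renviron.
Sys.setenv(
  RICU_PHYSIONET_USER = "my-user",
  RICU_PHYSIONET_PASS = "my-password"
)

# TRUE only if both variables are set to non-empty values
creds_available <- all(nzchar(
  Sys.getenv(c("RICU_PHYSIONET_USER", "RICU_PHYSIONET_PASS"))
))

# Hypothetical helper (not part of ricu): hex SHA256 digest of a file,
# or NA if the openssl package is not installed.
file_sha256 <- function(path) {
  if (!requireNamespace("openssl", quietly = TRUE)) {
    return(NA_character_)
  }
  as.character(openssl::sha256(file(path)))
}
```

The digest returned by file_sha256() could then be compared against a published checksum with identical().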

The demo datasets MIMIC-III demo and eICU demo can either be installed directly as R packages by running

install.packages(
  c("mimic.demo", "eicu.demo"),
  repos = "https://eth-mds.github.io/physionet-demo"
)

or downloaded and imported using download_src() and import_src(). Furthermore, ricu specifies mimic.demo and eicu.demo as Suggests dependencies. Therefore, passing dependencies = TRUE to install.packages() when installing ricu will automatically install the demo datasets as well.

While the included data downloaders are intended for data hosted by PhysioNet, download_src() is an S3 generic function that can be extended to new classes. Method dispatch is intended to occur on objects that inherit from or can be coerced to src_cfg. For more information on data source configuration, refer to load_src_cfg().
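A method for a new class might look roughly as follows. This is a sketch under stated assumptions: the class name mysrc_cfg, the urls field and the example URL are hypothetical and do not reflect the actual structure of ricu's src_cfg objects, and data_dir defaults to tempdir() here (instead of src_data_dir(x)) only to keep the example self-contained.

```r
# Sketch of a download_src() method for a hypothetical class "mysrc_cfg".
# The `urls` field (a named character vector mapping table names to URLs)
# is an illustrative assumption, not part of ricu.
download_src.mysrc_cfg <- function(x, data_dir = tempdir(), tables = NULL,
                                   force = FALSE, ...) {

  # Default to all tables listed in the configuration
  if (is.null(tables)) {
    tables <- names(x$urls)
  }
  stopifnot(all(tables %in% names(x$urls)))

  dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

  for (tbl in tables) {
    dest <- file.path(data_dir, paste0(tbl, ".csv"))

    # Skip files that are already present unless re-download is forced
    if (file.exists(dest) && !force) {
      next
    }
    utils::download.file(x$urls[[tbl]], dest, mode = "wb", quiet = TRUE)
  }

  invisible(NULL)
}

# Hypothetical configuration object inheriting from src_cfg
cfg <- structure(
  list(urls = c(patients = "https://example.org/patients.csv")),
  class = c("mysrc_cfg", "src_cfg")
)
```

With ricu attached, download_src(cfg, data_dir = some_dir) would be expected to dispatch to this method. As with the included methods, it is called for its side effects and returns NULL invisibly.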

With the addition of the AmsterdamUMCdb dataset, which unfortunately is not hosted on PhysioNet, a separate downloader for that dataset is available as well. Currently, this requires both the CRAN package xml2 and the command line utility 7zip. Furthermore, data access has to be requested, and for non-interactive download the download token has to be made available as environment variable RICU_AUMC_TOKEN or passed as token argument to download_src(). The download token can be retrieved from the URL provided when access is granted by extracting the string that follows token=. For example,

https://example.org/?s=download&token=0c27af59-72d1-0349-aa59-00000a8076d9

would translate to

Sys.setenv(RICU_AUMC_TOKEN = "0c27af59-72d1-0349-aa59-00000a8076d9")
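The extraction can also be done programmatically. The following is a minimal sketch using base R only; the regular expression assumes that the token is passed as a query parameter named token, as in the example URL.

```r
# Example URL from above; the token value is a placeholder
url <- "https://example.org/?s=download&token=0c27af59-72d1-0349-aa59-00000a8076d9"

# Make sure the URL actually carries a token parameter
stopifnot(grepl("[?&]token=", url))

# Keep everything after "token=" up to the next "&" or "#", if any
token <- sub("^.*[?&]token=([^&#]*).*$", "\\1", url)

Sys.setenv(RICU_AUMC_TOKEN = token)
```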

If the dependencies outlined above are not fulfilled, download and archive extraction can be carried out manually into the corresponding folder, after which import_src() can be run.
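Whether the two dependencies are available can be checked up front. This is a rough sketch: the candidate executable names for 7zip are an assumption, and the name ricu actually looks for may differ on a given system.

```r
# Is the xml2 package installed?
has_xml2 <- requireNamespace("xml2", quietly = TRUE)

# Is a 7zip executable on the PATH? The candidate names are an assumption;
# the executable may be named differently on a given system.
sevenzip <- Sys.which(c("7z", "7za", "7zz"))
has_7zip <- any(nzchar(sevenzip))

if (has_xml2 && has_7zip) {
  message("Automated AmsterdamUMCdb download should be possible.")
} else {
  message("Download and extract the archive manually, then run import_src().")
}
```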

Examples

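if (FALSE) {

# Directory to download to; list its contents before the download
dir <- tempdir()
list.files(dir)

# Download the MIMIC-III demo data and list the files that were written
download_src("mimic_demo", data_dir = dir)
list.files(dir)

# Clean up
unlink(dir, recursive = TRUE)

}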