download.Rd
Making a dataset available to ricu consists of 3 steps: downloading
(download_src()), importing (import_src()) and attaching (attach_src()).
While downloading and importing are one-time procedures, attaching of the
dataset is repeated every time the package is loaded. Briefly, downloading
loads the raw dataset from the internet (most likely in .csv format),
importing consists of some preprocessing to make the data available more
efficiently (by converting it to .fst format) and attaching sets up the data
for use by the package.
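The three steps can be sketched as follows. This is a minimal sketch, assuming the ricu package is installed and that the source name "mimic_demo" (borrowed from the example at the end of this page) is accepted by all three functions. The run_example switch is a hypothetical guard, not part of ricu, added so that nothing is downloaded unless it is set to TRUE.

```r
# Hypothetical guard: set to TRUE to actually run the three steps.
run_example <- FALSE

if (run_example && requireNamespace("ricu", quietly = TRUE)) {
  src <- "mimic_demo"      # assumed source name
  ricu::download_src(src)  # one-time: fetch the raw files
  ricu::import_src(src)    # one-time: convert to .fst
  ricu::attach_src(src)    # repeated in every session
}
```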
download_src(x, data_dir = src_data_dir(x), ...)
# S3 method for src_cfg
download_src(x, data_dir = src_data_dir(x), tables = NULL, force = FALSE, ...)
# S3 method for aumc_cfg
download_src(
  x,
  data_dir = src_data_dir(x),
  tables = NULL,
  force = FALSE,
  token = NULL,
  verbose = TRUE,
  ...
)
# S3 method for character
download_src(
  x,
  data_dir = src_data_dir(x),
  tables = NULL,
  force = FALSE,
  user = NULL,
  pass = NULL,
  verbose = TRUE,
  ...
)
x: Object specifying the source configuration.
data_dir: Destination directory where the downloaded data is written to.
...: Generic consistency.
tables: Character vector specifying the tables to download. If NULL, all
available tables are downloaded.
force: Logical flag; if TRUE, existing data will be re-downloaded.
token: Download token for AmsterdamUMCdb (see 'Details').
verbose: Logical flag indicating whether to print progress information.
user, pass: PhysioNet credentials; if NULL and environment variables
RICU_PHYSIONET_USER/RICU_PHYSIONET_PASS are not set, user input is required.
Called for side effects and returns NULL invisibly.
Downloads by ricu are focused on data hosted by PhysioNet, and tools are
currently available for downloading the datasets MIMIC-III, eICU and HiRID
(see data). While credentials are required for downloading any of the three
datasets, demo datasets for both MIMIC-III and eICU are available without
having to log in. Even though access to the full datasets is credentialed,
the datasets are in fact publicly available. For setting up an account,
please refer to the registration form.
PhysioNet credentials can either be entered in an interactive session,
passed as function arguments user/pass or set as environment variables
RICU_PHYSIONET_USER/RICU_PHYSIONET_PASS. For setting environment variables
on session startup, refer to base::.First.sys() and for setting environment
variables in general, refer to base::Sys.setenv().
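As a minimal sketch of the environment variable route, the following sets both variables for the current session and checks that they are visible. The values "my-user" and "my-pass" are placeholders assumed for illustration, not real credentials.

```r
# Placeholder credentials (assumed values; replace with your own).
Sys.setenv(
  RICU_PHYSIONET_USER = "my-user",
  RICU_PHYSIONET_PASS = "my-pass"
)

# Read both variables back and check that neither is empty.
creds <- Sys.getenv(c("RICU_PHYSIONET_USER", "RICU_PHYSIONET_PASS"))
creds_set <- all(nzchar(creds))
```

Variables set this way last only for the running R session; placing them in a startup file makes them available every time.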
If the openssl package is available, SHA256 hashes of downloaded files are
verified using openssl::sha256().
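The kind of check described here can be sketched as below. This is not the code ricu runs internally; verify_sha256() is a hypothetical helper written for illustration, and it returns NA when openssl is not installed.

```r
# Hypothetical helper: compare a file's SHA256 hash with an expected hex
# string. Returns NA if the openssl package is not installed.
verify_sha256 <- function(path, expected) {
  if (!requireNamespace("openssl", quietly = TRUE)) {
    return(NA)
  }
  hash <- openssl::sha256(file(path))
  # The hash is a raw vector; collapse its bytes into one hex string.
  actual <- paste(as.character(unclass(hash)), collapse = "")
  identical(actual, tolower(expected))
}

# Usage with a small temporary file containing the bytes "abc".
tmp <- tempfile()
writeBin(charToRaw("abc"), tmp)
ok <- verify_sha256(
  tmp,
  "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
)
unlink(tmp)
```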
Demo datasets MIMIC-III demo and eICU demo can either be installed as R packages directly by running
install.packages(
  c("mimic.demo", "eicu.demo"),
  repos = "https://eth-mds.github.io/physionet-demo"
)
or downloaded and imported using download_src() and import_src().
Furthermore, ricu specifies mimic.demo and eicu.demo as Suggests
dependencies. Therefore, passing dependencies = TRUE when calling
install.packages() for installing ricu will automatically install the demo
datasets as well.
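To see whether the demo data packages are already installed, a check along the following lines may help. It is a small sketch using only base R; the variable names are chosen here for illustration.

```r
# Names of the demo data packages mentioned above.
demo_pkgs <- c("mimic.demo", "eicu.demo")

# TRUE for each package that can be loaded, FALSE otherwise.
installed <- vapply(demo_pkgs, requireNamespace, logical(1), quietly = TRUE)

# Packages that still need to be installed (possibly none).
missing_pkgs <- demo_pkgs[!installed]
```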
While the included data downloaders are intended for data hosted by
PhysioNet, download_src() is an S3 generic function that can be extended
to new classes. Method dispatch is intended to occur on objects that
inherit from or can be coerced to src_cfg. For more information on data
source configuration, refer to load_src_cfg().
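How such an extension might look is sketched below with a stand-in generic, so the snippet runs without ricu. The names download_sketch() and my_cfg are hypothetical; with ricu loaded, the analogous step would be to define a download_src method for the new class. The toy configuration object is also an assumption and does not reflect the real structure of src_cfg objects.

```r
# Stand-in generic (hypothetical name), mirroring the shape of download_src().
download_sketch <- function(x, data_dir = tempdir(), ...) {
  UseMethod("download_sketch", x)
}

# Fallback for configurations without a dedicated downloader.
download_sketch.src_cfg <- function(x, data_dir = tempdir(), ...) {
  stop("no downloader for source '", x$name, "'")
}

# Method for a hypothetical new source class.
download_sketch.my_cfg <- function(x, data_dir = tempdir(), ...) {
  message("downloading '", x$name, "' to ", data_dir)
  invisible(NULL)
}

# Toy configuration object that inherits from src_cfg.
cfg <- structure(list(name = "my_src"), class = c("my_cfg", "src_cfg"))
res <- download_sketch(cfg)
```

Because my_cfg is listed before src_cfg in the class vector, dispatch reaches the more specific method first and only falls back to the src_cfg method for classes without their own downloader.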
As such, with the addition of the AmsterdamUMCdb dataset, which
unfortunately is not hosted on PhysioNet, a separate downloader for that
dataset is available as well. Currently this requires both availability of
the CRAN package xml2, as well as the command line utility 7zip.
Furthermore, data access has to be requested and for non-interactive
download the download token has to be made available as environment
variable RICU_AUMC_TOKEN or passed as token argument to download_src().
The download token can be retrieved from the URL provided when access is
granted, by extracting the string following token=:
https://example.org/?s=download&token=0c27af59-72d1-0349-aa59-00000a8076d9
would translate to
Sys.setenv(RICU_AUMC_TOKEN = "0c27af59-72d1-0349-aa59-00000a8076d9")
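The extraction can also be done programmatically. The following is a minimal sketch using base R only; it assumes the URL carries the token as a query parameter named token, as in the placeholder URL above.

```r
# Placeholder URL of the form received when access is granted.
url <- "https://example.org/?s=download&token=0c27af59-72d1-0349-aa59-00000a8076d9"

# Keep everything after "token=" up to the next "&" or "#" (or end of string).
token <- sub("^.*[?&]token=([^&#]+).*$", "\\1", url)

# Make the token available to non-interactive downloads.
Sys.setenv(RICU_AUMC_TOKEN = token)
```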
If the dependencies outlined above are not fulfilled, download and archive
extraction can be carried out manually into the corresponding folder and
import_src() can be run.
if (FALSE) {
  # Download the MIMIC-III demo data into a temporary directory
  dir <- tempdir()
  list.files(dir)
  download_src("mimic_demo", data_dir = dir)
  list.files(dir)
  # Clean up
  unlink(dir, recursive = TRUE)
}