How-to-Run Study
1 Setup Study
1.1 Download Zip
To avoid using git to leverage this study, the easiest way to access the study code is via a zip file. Instructions for downloading the zip file are below:
- Navigate to github repo url.
- Select the green code button revealing a dropdown menu.
- Select Download Zip.
- Unzip the folder on your local computer that is easily accessible within R Studio.
- Open the unzipped folder and select
ehden_hmb.Rproj
.
1.2 Setup R Environment
1.2.1 Using renv
This study uses renv
to reproduce the R environment to execute this study. The study code maintains an renv.lock
file in the main branch of the repository. To activate the R dependencies through renv
use the following code:
::restore() renv
1.2.2 Troubleshooting renv
Sometimes there are errors in the package installation via renv
. If you encounter an error try removing the problematic package from the renv.lock
file and restore again. To remove a package from the lock file find the header of the package and delete all corresponding lines. Once you are able to get the remaining packages to install, manually install the problem package using one of the strategies below:
# Installing an R package from CRAN ------------
## Installing latest version of R package on CRAN
install.packages("ggplot2")
## Installing archived version of R package on CRAN
<- "http://cran.r-project.org/src/contrib/Archive/ggplot2/ggplot2_0.9.1.tar.gz"
packageurl install.packages(packageurl, repos=NULL, type="source")
# Installing an R package from github -----------------
# Installing current version of R package from github
install.packages("remotes") # note you may also work devtools
::install_github("ohdsi/FeatureExtraction")
remotes
# Installing develop version of R package from github
::install_github("ohdsi/Ulysses", ref = "develop")
remotes
# Installing old version of R package from github
::install_github("ohdsi/CohortGenerator", ref = "v0.7.0") remotes
Any additional issues with the renv
lock file please file an issue in the study repository
1.2.3 Conflicts
Some organizational IT setups pose conflicts with renv
. One example is if your organization uses the Broadsea Docker. If you are aware that your OHDSI environment will conflict with renv
, deactivate it by running renv::deactivate()
in the active ehden .RProj. Also delete the renv folder in your project directory. Review the list of R packages required to run the study in the Technical Requirements tab of the study hub and manually install.
It is highly recommended you stick with the renv
snapshot as this is the easiest way to reproduce the study execution environment. Please only deactivate the lock file if it is a last resort.
1.3 Load Execution Credentials
This study uses keyring
and config
to mask and query database credentials needed for execution. The study will help load these credentials using the file extras/KeyringSetup.R
.
1.3.1 Required Credentials
Data nodes executing this study require the following credentials:
- dbms - the name of the dbms you are using (redshift, postgresql, snowflake, etc)
- user - the username credential used to connect to the OMOP database
- password - the password credential used to connect to the OMOP database
- connectionString - a composed string that establishes the connection to the OMOP database. An example of a connection string would be jdbc:dbms://host-url.com:port/database_name.
- cdmDatabaseSchema - the database and schema used to access the cdm. Note that this credential may be separated by a dot to indicate the database and schema, which tends to be the case in sql server. Example: our_cdm.cdm.
- vocabDatabaseSchema - the database and schema used to access the vocabulary tables. Note this is typically the same as the cdmDatabaseSchema.
- workDatabaseSchema - a section of the database where the user has read/write access. This schema is where we write the cohortTable used to enumerate the cohort definitions.
It is recommended that you write these credentials down in a private txt file to make it easier to load into the credential manager for the study.
1.3.2 Loading Credentials
- Open the file in the study names
extras/KeyringSetup.R
. - On L15:16 place a name for your config block and the database. The configBlock name can be an abbreviation, for example:
<- "synpuf"
configBlock <- "synpuf_110k" database
- One at a time run each line in the script and follow any prompts
- L27 asks you to build a new config.yml file which can be done as so
Ulysses::makeConfig(block = configBlock, database = database)
. Running this function will open a new file. Make sure any keyring variable is labelled as ehden_hmb. - L32 will setup a new
keyring
for the study. The name of the keyring is ehden_hmb and the password for the keyring is ohdsi. If R prompts you to place a keyring password it will be ohdsi unless changed by the user. - L36 will setup all the credentials for the study. A prompt will appear asking to input credentials, using your txt file input your credentials into the prompt. Review the credentials once you are done
- L27 asks you to build a new config.yml file which can be done as so
1.3.2.1 Troubleshooting
If you have a problem with the keyring, please post an issue in the study repository. Otherwise you can avoid the keyring api by hard-coding your credentials to the config.yml
file as shown in the example:
db: # replace with an acronym for your database no underscore
databaseName: db_ehr # replace with the name of your database
dbms: <your_dbms>
user: <your_user>
password: <your_pass>
connectionString: <your_connectionString>
cdmDatabaseSchema: <your_ cdmDatabaseSchema>
vocabDatabaseSchema: <your_ vocabDatabaseSchema>
workDatabaseSchema: <your_ workDatabaseSchema>
cohortTable: ehden_hmb_<databaseName>
1.3.3 Multiple Databases (Optional)
Some study participants will be running this study on multiple databases. To add additional credentials rerun the extras/KeyringSetup.R
file replacing the configBlock and database with a second database you need to use. on L28 you can add the following function to add a new config block to the config.yml
file.
::addConfig(block = "[next_block]", database = "[next_database]") Ulysses
2 Run Study
2.1 Execution Script
Running all tasks sequentially can be done using the executeStudy.R
file. Replace L17 with the configBlock of choice and run the script.
2.2 Study Tasks
Once your study is setup you are ready to run the EHDEN HMB study. The EHDEN HMB study contains six analytical tasks:
- Build Cohorts - initialize a cohort table and generate cohort counts from the circe json in cohortsToCreate
- Cohort Diagnostics - review of the hmb cohort definition
- Incidence Analysis - calculation of the incidence of hmb in a population of females between ages 11 to 55
- Baseline Characteristics - generation of prevalence of demographics and comorbidities at baseline (-365d prior to index)
- Treatment Patterns - calculation of drug prevalence post-index, sequences and time to treatment discontinuation
- Procedure Analysis - calculation of post-index prevalence of procedures and time to initial treatment
Each task will output to the results
folder. Note that the first task buildCohorts is required to run any additional analytical task (base dependency). If a cohort definition json has changed it requires rerunning the buildCohorts task.
2.3 Review Cohort Diagnostics
If you would like to review the results of the Cohort Diagnostics prior to sending you can do so by the instructions below:
- Open the file
extras/ExploreDiagnostics.R
- Run the R file line by line
- Replace L20 with the name of your database
- L28 will create a new folder in the directory called
scratchDiagnostics
where it will save the sqlite object - L36 prepares the diagnostics results per database
- L41 launches the shiny app
- Ignore any warning messages about reserved keywords and navigation containers
2.3.1 Troubleshooting
Multiple Databases
Some data partners may be running this study on multiple databases. If that is the case then you to identify a folder that contains all the zip files produced from the CohortDiagnostics
routine. To check if this file was produced, go into the cohortDiagnostics
folder of each database and identify the .zip
file amongst the csv. It should be labelled something like Results_database.zip
. You can collect all of these zip files in a single folder and make the dataFolder
target as the folder with the multiple zip files per database.
Shiny App
The shiny app rendered from CohortDiagnostics
relies on the OHDSI package OhdsiShinyModules
. This package is installed within the renv
snapshot loaded with the package. If you are having trouble launching the app from within the ehden_hmb.RProj
, exit the R project and try launching the app outside the study session. If you are outside the ehden_hmb.RProj
remember you have to provide full file paths because you are no longer zoomed in on the study directory.
2.4 Results Folder
Following successful execution of the study, each database will have its own sub-folder within results. There will be a further 14 subfolders containing results from the execution underneath the database. Within each of these folders there will be a combination of csv
, rds
and parquet
(only for treatmentHistory) that contain the results. You may review these files individually if you wish. Instructions on how to share the results are maintained in the contribution tab of the website.
2.5 Review Results
In version 0.2 of the study code, we have added code to launch a shiny app to locally review results prior to distributing them to the study host. In order to run the shiny app the user needs to prep the data for the app. Navigate to extras/PreviewResults.R
and run the first step. This will take all databases run in the results section and build a data folder for the shiny app. This data folder is ignored to prevent accidental commits back to github. Next launch the shiny app.