# Getting started with R package ctrdata for clinical trial information

## Install package ctrdata on an R system

The R Project website (https://www.r-project.org/) provides installers for the R system.

Alternatively, the R system can be used from software products such as RStudio (https://www.rstudio.com/products/RStudio/), which includes an open source integrated development environment (IDE), or Microsoft R Open (https://mran.microsoft.com/open/).

General information on the ctrdata package is available here: https://github.com/rfhb/ctrdata.

install.packages("ctrdata")

The above should install package ctrdata into the user’s library. If this installation does not succeed, the following sections offer potential solutions.
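Whether the installation succeeded can be checked from within the R session. The following is a minimal sketch that uses only base R functions; the variable names `pkg` and `installed` are illustrative and not part of package ctrdata.

```r
# Check whether the package can be loaded, without attaching it
pkg <- "ctrdata"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  # Report the installed version
  message(pkg, " version ", as.character(utils::packageVersion(pkg)),
          " is installed")
} else {
  message(pkg, " is not installed in any of: ",
          paste(.libPaths(), collapse = ", "))
}
```

If the package is reported as not installed, the listed library paths show where R looked for it.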

To use the development version of package ctrdata, install it from GitHub:

# install preparatory packages
install.packages(c("devtools", "httr"))
# note: unset build_opts so that vignettes are built
devtools::install_github("rfhb/ctrdata", build_opts = "")

## Mongo database

A remote or a local mongo database server can be used with the package ctrdata. Suggested installation instructions for a local database server are here. An example of a remote mongo database server is here.

## Internet access via proxy?

Functions in package ctrdata whose names start with ctr... require access to internet resources via HTTPS. Under MS Windows, package ctrdata detects and automatically uses the proxy that is configured in the system settings.

However, on other operating systems and for authenticating proxies, the user needs to specify the proxy settings, for example as follows:

Sys.setenv(https_proxy = "your_proxy.server.domain:8080")
Sys.setenv(https_proxy_user = "userid:password")
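To confirm that a proxy setting took effect in the current R session, the environment variable can be read back. This is a minimal sketch with base R only; it reuses the placeholder proxy address from above, and the names `proxy` and `proxy_is_set` are illustrative.

```r
# Placeholder value taken from the text above; replace with the real proxy
Sys.setenv(https_proxy = "your_proxy.server.domain:8080")

# Read the setting back; NA is returned if the variable is not set
proxy <- Sys.getenv("https_proxy", unset = NA)
proxy_is_set <- !is.na(proxy) && nzchar(proxy)

if (proxy_is_set) {
  message("https_proxy is set to: ", proxy)
} else {
  message("https_proxy is not set in this session")
}
```

Settings made with `Sys.setenv()` only last for the current R session, so they have to be repeated in each new session.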

## Additional installation aspects for MS Windows

On MS Windows, it appears advisable not to use UNC notation (such as \\server\directory) when specifying the user’s library location, and to use a path with a drive letter instead:

.libPaths("D:/my/directory/")
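The library locations that are currently in use can be inspected for UNC notation. The helper `is_unc()` below is a hypothetical function written for this sketch and is not part of package ctrdata; it treats any path that starts with two backslashes or two forward slashes as UNC notation.

```r
# Hypothetical helper: TRUE for paths that start with \\ or //
is_unc <- function(paths) {
  grepl("^(\\\\\\\\|//)", paths)
}

# Example with one UNC path and one path with a drive letter
example_paths <- c("\\\\server\\directory", "D:/my/directory/")
example_flags <- is_unc(example_paths)

# Apply the check to the library paths of the current session
lib_paths <- .libPaths()
print(data.frame(path = lib_paths, unc = is_unc(lib_paths)))
```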

As noted in the README for package ctrdata, on MS Windows the cygwin environment has to be installed into the local directory c:\cygwin. The applications php, bash, perl, cat and sed in the cygwin environment are required for function ctrLoadQueryIntoDb() of package ctrdata (other functions in the package do not have this requirement). A minimal cygwin environment can be installed on MS Windows from within package ctrdata as follows:

ctrdata::installCygwinWindowsDoInstall() 

If need be, a proxy can be specified:

ctrdata::installCygwinWindowsDoInstall(proxy = "proxy.server.domain:8080") 

Users who want or need to install cygwin manually can download the setup executable from here. In an MS Windows command prompt or PowerShell window, run the following command on a single line. The parameters are explained here.

setup-x86_64.exe --no-admin --quiet-mode --verbose --upgrade-also --root c:/cygwin --site http://www.mirrorservice.org/sites/sourceware.org/pub/cygwin/ --packages perl,php-jsonc,php-simplexml
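After either installation route, the presence of the required applications can be checked from R. The function `check_cygwin_tools()` below is a hypothetical helper written for this sketch, not a function of package ctrdata. It assumes that the executables are located in the `bin` subdirectory of the cygwin root and carry the `.exe` extension, which may differ on a given system.

```r
# Hypothetical helper: report which required applications are found
# under the cygwin root (assumed layout: <root>/bin/<tool>.exe)
check_cygwin_tools <- function(root = "c:/cygwin",
                               tools = c("php", "bash", "perl", "cat", "sed")) {
  exe <- file.path(root, "bin", paste0(tools, ".exe"))
  found <- file.exists(exe)
  names(found) <- tools
  found
}

# Named logical vector, TRUE for each application that was found
cygwin_status <- check_cygwin_tools()
print(cygwin_status)
```

Any application reported as `FALSE` points to an incomplete installation, or to a cygwin root other than c:\cygwin.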

## Attach package ctrdata

library(ctrdata)

## Open register’s advanced search page in browser

The following calls open the browser, where the user can start searching for trials of interest.

ctrOpenSearchPagesInBrowser()

# Open browser with example search:
ctrOpenSearchPagesInBrowser(input = "cancer&age=under-18",
                            register = "EUCTR")

## Click search parameters and execute search in browser

Refine the search until the trials of interest are listed in the browser. Currently, the total number of trials that can be retrieved with package ctrdata is intentionally limited to 5000 (CTGOV).

Using operating system functions, copy the address of the search results page from the browser's address bar to the clipboard.

The next steps are executed in the R environment:

q <- ctrGetQueryUrlFromBrowser()
# Found search query from EUCTR.
# [1] "cancer&age=under-18"

# Open browser with this query
# Note the register needs to be specified
# when it cannot be deduced from the query
ctrOpenSearchPagesInBrowser(input = q,
                            register = "EUCTR")

## Analyse information on clinical trials

# find names of fields of interest in database:
dbFindFields(namepart = "status",
             allmatches = TRUE)
# [3] "p_end_of_trial_status" "location.status"

# Get all records that have values in all specified fields.
# Note that b31_... is a field within the array b1_...
"p_end_of_trial_status"))

# Tabulate the status of the clinical trial on the date of information retrieval
with(result,
table("Status"       = p_end_of_trial_status,
#   Temporarily Halted            14              4