To provide convenient access to epidemiological data on the coronavirus outbreak, we developed an R package, nCov2019 (https://github.com/yulab-smu/nCov2019). Besides detailed basic statistics, it also includes information about vaccine development and therapeutic candidates. We redesigned the plot() function for geographic map visualization and provide an interactive Shiny app. These analytics tools could be useful for informing the public and for studying how this and similar viruses spread in populous countries.

Our R package is designed for both command-line and dashboard-based interactive analysis. As shown in the diagram, dashboard() is the main entry point for GUI exploration, while query() is the main function for command-line exploration. The types of data contained in its return value are explained in the Statistic query section.

Installation

To start, users can use the ‘remotes’ package to install nCov2019 directly from GitHub by running the following in R:

remotes::install_github("yulab-smu/nCov2019", dependencies = TRUE)

Statistic query

Querying the data takes a single command:

library("nCov2019")
res <- query()
## Querying the latest data...
## last update: 2023-03-01
## Querying the global data...
## Gloabl total  679972187  cases; and  6800189  deaths
## Gloabl total affect country or areas: 231
## Gloabl total recovered cases: 26867
## last update: 2023-03-01
## Querying the historical data...
## Query finish, each time you can launch query() to reflash the data

This may take from a few seconds to a few minutes, depending on the user’s network connection. If the connection is broken, a locally stored version of the data will be used for demonstration.
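The fallback behaviour described above can be illustrated with a minimal sketch. This is not the package's actual implementation: fetch_with_fallback(), broken_fetch() and stored_copy() are hypothetical names used only to show the general try-then-fall-back pattern in base R.

```r
# Hypothetical helper: try a live fetch, fall back to a local copy on failure.
# `fetch_live` and `load_local` are placeholder functions, not nCov2019 APIs.
fetch_with_fallback <- function(fetch_live, load_local) {
  tryCatch(
    fetch_live(),
    error = function(e) {
      message("Live query failed (", conditionMessage(e), "); using local copy.")
      load_local()
    }
  )
}

# Simulated usage: the live fetch fails, so the stored data is returned.
broken_fetch <- function() stop("network unreachable")
stored_copy  <- function() data.frame(country = "Demo", cases = 1L)
res_demo <- fetch_with_fallback(broken_fetch, stored_copy)
```

The error handler only reports the failure and substitutes the stored data, so the caller receives a usable object either way.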

The result returned by query() is a list of statistic objects, whose names can be listed as follows:

names(res)
## [1] "latest"     "global"     "historical"

query() only needs to be run once per session. Printing any of the statistic objects shows its update time. For the vaccine and therapeutics query results, printing them returns the number of candidates.

Global data

The global query result is a data frame with 21 types of statistics, which are explained in detail at the bottom of this document. summary(x) returns an overview of the global status.

x = res$global
x$affectedCountries # total affected countries
## [1] 231
summary(x)
## Gloabl total  679972187  cases; and  6800189  deaths
## Gloabl total affect country or areas: 231
## Gloabl total recovered cases: 26867
## last update: 2023-03-01

Latest data

Here is an example of working with the latest data. Once again, all data have already been queried and stored in res.

x = res$latest

print(x) then returns the update time of the latest data:

print(x) # check update time
## last update: 2023-03-01

The latest data can easily be subset with the [ operator. x["Global"] or x["global"] returns the data frame for all countries, and users can also select specific countries, for example:

head(x["Global"]) # return data for all countries
x[c("USA","India")] # return data for USA and India only
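To make the subsetting behaviour concrete, the following sketch defines a "[" method on a mock object. It only illustrates how such an operator could be written with S3 dispatch: the class name mock_latest, its table field and the invented numbers are assumptions, not the package's internals.

```r
# Mock object standing in for a "latest" result; the class name
# "mock_latest" and its fields are hypothetical, not the package's own.
mock <- structure(
  list(table = data.frame(
    country = c("USA", "India", "France"),
    cases   = c(300L, 200L, 100L),
    stringsAsFactors = FALSE
  )),
  class = "mock_latest"
)

# "[" method: "Global" (in any letter case) returns every row;
# otherwise only the requested countries are kept.
`[.mock_latest` <- function(x, i) {
  tbl <- unclass(x)$table
  if (length(i) == 1 && tolower(i) == "global") {
    return(tbl)
  }
  tbl[tbl$country %in% i, , drop = FALSE]
}

all_rows <- mock["global"]            # all countries
two_rows <- mock[c("USA", "India")]   # selected countries only
```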

The data are ordered by the “todayCases” column by default; users can re-sort them by another column, for example by “cases”:

df = x["Global"]
head(df[order(df$cases, decreasing = TRUE), ])

By default, the latest data provide 11 main types of information; 12 more statistic types are available in “latest$detail”. They are also explained at the bottom of this document.

x = res$latest
head(x$detail)  # more detail data 

Historical data

Historical data are useful for retrospective analysis and for building predictive models. They are accessed in the same way as the latest data: users can get the data frame for all countries, or for specific countries by passing a c() vector, such as head(Z[c(country1,country2,country3)]).

Z = res$historical
print(Z) # update time
## last update: 2023-02-27
head(Z["Global"])
head(Z[c("China","UK","USA")]) 

For the following countries, we provide detailed province-level data, which can be obtained in a similar way by passing two arguments to the [ operator: head(Z[country,province])

  • Australia, Canada, China, Denmark, France, Netherlands

head(Z['China','hubei'])
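As a small example of the kind of retrospective analysis mentioned above, the sketch below derives daily new cases from a historical table. It runs on invented mock data and assumes that the cases column is cumulative; the column names country, date and cases follow those used elsewhere in this document, while hist_demo and new_cases are illustrative names.

```r
# Mock historical table; values are invented and assumed to be cumulative.
hist_demo <- data.frame(
  country = rep(c("A", "B"), each = 4),
  date    = rep(as.Date("2020-03-01") + 0:3, times = 2),
  cases   = c(1, 3, 6, 10, 2, 2, 5, 9),
  stringsAsFactors = FALSE
)

# Sort by country and date, then take differences within each country.
# The first day of each country keeps its cumulative value.
hist_demo <- hist_demo[order(hist_demo$country, hist_demo$date), ]
hist_demo$new_cases <- ave(
  hist_demo$cases, hist_demo$country,
  FUN = function(v) c(v[1], diff(v))
)
```

A data frame obtained from Z["Global"] or Z[c(...)] could presumably be processed the same way, provided its counts are indeed cumulative.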

For users’ own historical data, we provide a convert() function. Users can convert other data into the nCov2019History class and then explore them in nCov2019:

userowndata <- read.csv("path_to_user_data.csv")
# userowndata should contain these 6 columns:
# "country","province","date","cases","deaths","recovered"
Z = convert(data=userowndata)
head(Z["Global"])
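Before calling convert(), it may help to confirm that the required columns are present. The check below is a sketch in base R; check_history_columns() and user_demo are hypothetical names and are not part of nCov2019, and the example row is invented.

```r
# Column names required by convert(), as listed above.
required_cols <- c("country", "province", "date", "cases", "deaths", "recovered")

# Hypothetical helper (not part of nCov2019): stop early if columns are missing.
check_history_columns <- function(df) {
  missing_cols <- setdiff(required_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Invented example row with all six columns.
user_demo <- data.frame(
  country = "A", province = "P", date = as.Date("2020-03-01"),
  cases = 10, deaths = 1, recovered = 2
)
check_history_columns(user_demo)
```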

Visualization

We provide a visualization function as a redesigned “plot”:

 plot(
   x,
   region = "Global",
   continuous_scale = FALSE,
   palette = "Reds",
   date = NULL,
   from = NULL,
   to = NULL,
   title = "COVID-19",
   type = "cases",
   ...
 )

Here, type can be one of “cases”, “deaths”, “recovered”, “active”, “todayCases”, “todayDeaths”, “todayRecovered”, “population” and “tests”. By default, the color palette is “Reds”; other palettes can be selected through the palette argument.
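One common way to restrict an argument to a fixed set of values like this is match.arg(). The sketch below shows the idea only; resolve_type() and plot_types are hypothetical names, and the package may validate type differently.

```r
# Values of `type` listed in the text above.
plot_types <- c("cases", "deaths", "recovered", "active", "todayCases",
                "todayDeaths", "todayRecovered", "population", "tests")

# Hypothetical helper: match.arg() returns the matching choice
# (partial matching is allowed) and raises an error for unlisted values.
resolve_type <- function(type = "cases") {
  match.arg(type, choices = plot_types)
}

default_type <- resolve_type()         # falls back to "cases"
tests_type   <- resolve_type("tests")  # a listed value is returned unchanged
```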

To get an overview of the latest status, the minimal code required is:

X <- res$latest
plot(X)
## Warning in geom_map(aes_string("long", "lat", map_id = "region", group =
## "group", : Ignoring unknown aesthetics: x and y

Or, to get an overview of the testing status:

plot(X, type="tests",palette="Green")