To provide convenient access to epidemiological data on the coronavirus outbreak, we developed an R package, nCov2019 (https://github.com/yulab-smu/nCov2019). Besides detailed basic statistics, it also includes information about vaccine development and therapeutic candidates. We redesigned the plot() function for geographic map visualization and provided an interactive shiny app. These analytic tools could be useful in informing the public and in studying how this and similar viruses spread in populous countries.
Our R package is designed for both command-line and dashboard-based
interactive analysis. As shown in the diagram, dashboard() is the main
entry point for GUI exploration, while query() is the main function for
command-line exploration. The result returned by query() contains 5 types
of data, which are explained in the Statistic query part.
To start off, users can install the package directly from GitHub with the ‘remotes’ package by running the following in R:
remotes::install_github("yulab-smu/nCov2019", dependencies = TRUE)
Querying the data takes a single command:
library("nCov2019")
res <- query()
## Querying the latest data...
## last update: 2023-03-01
## Querying the global data...
## Gloabl total 679972187 cases; and 6800189 deaths
## Gloabl total affect country or areas: 231
## Gloabl total recovered cases: 26867
## last update: 2023-03-01
## Querying the historical data...
## Query finish, each time you can launch query() to reflash the data
This may take from a few seconds to a few minutes, depending on the user’s network connection. If the connection is broken, a locally stored version of the data will be used for demonstration.
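The fallback behaviour described above can be illustrated with a generic pattern. This is only a minimal sketch, not the package's implementation: `query_with_fallback`, `fetch_remote`, and `load_local` are hypothetical names introduced here for illustration.

```r
# Hypothetical helper: try a remote fetch and fall back to a local copy on
# failure. `fetch_remote` and `load_local` are stand-ins, not nCov2019 functions.
query_with_fallback <- function(fetch_remote, load_local) {
  tryCatch(
    list(data = fetch_remote(), source = "remote"),
    error = function(e) {
      message("Remote query failed (", conditionMessage(e), "); using local data")
      list(data = load_local(), source = "local")
    }
  )
}

# Simulate a broken connection and a small locally stored data set.
broken_fetch <- function() stop("connection timed out")
cached_load  <- function() {
  data.frame(country = "USA", cases = 100, stringsAsFactors = FALSE)
}

res_demo <- query_with_fallback(broken_fetch, cached_load)
res_demo$source  # which branch supplied the data
```

Passing the two loaders as arguments keeps the sketch independent of any particular data source.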
The result returned by query() contains 5 types of statistics:
names(res)
## [1] "latest" "global" "historical"
global: the global overall summary statistics
latest: the latest statistics for all countries
historical: the historical statistics for all countries
vaccine: the current vaccine development progress
therapeutics: the current therapeutics development progress
The query() function only needs to be run once per session. Printing any
of the statistic objects shows its update time. For the vaccine and
therapeutics query results, printing them returns the number of
candidates.
The query result for the global status contains a data frame with 21
types of statistics, which are explained in detail at the bottom of this
document. summary(x) returns an overview of the global status.
x = res$global
x$affectedCountries # total affected countries
## [1] 231
summary(x)
## Gloabl total 679972187 cases; and 6800189 deaths
## Gloabl total affect country or areas: 231
## Gloabl total recovered cases: 26867
## last update: 2023-03-01
Here is an example of working with the latest data. Once again, all data
have already been queried and stored in res.
x = res$latest
print(x) then returns the update time of the latest data:
print(x) # check update time
## last update: 2023-03-01
The latest data can be easily subset with the [ operator. x["Global"] or
x["global"] returns the data frame for all countries, and users can also
select specific countries, such as:
head(x["Global"]) # return all global countries.
x[c("USA","India")] # return only for USA and India
The data are ordered by the “todayCases” column, but users can sort them by any other column:
df = x["Global"]
head(df[order(df$cases, decreasing = TRUE), ])
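The same order() idiom works for any column or combination of columns. The sketch below uses a small mock data frame so that it runs without a network connection; the column names follow those mentioned above, and the values are made up.

```r
# Mock data using a few of the column names mentioned above; values are made up.
df_demo <- data.frame(
  country    = c("A", "B", "C"),
  cases      = c(500, 1500, 900),
  deaths     = c(20, 10, 45),
  todayCases = c(30, 5, 12),
  stringsAsFactors = FALSE
)

# Sort by deaths, largest first.
by_deaths <- df_demo[order(df_demo$deaths, decreasing = TRUE), ]

# Sort by todayCases, breaking ties by cases (both descending).
by_today <- df_demo[order(-df_demo$todayCases, -df_demo$cases), ]

by_deaths$country
by_today$country
```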
By default, the latest data provide 11 main types of information; 12 more statistic types are available in “latest$detail”. These are also explained at the bottom.
x = res$latest
head(x$detail) # more detail data
Historical data are useful for retrospective analysis or for building
predictive models. They are handled in much the same way as the latest
data: users can get the data frame for all countries, or for specific
countries given in a c() vector, such as
head(Z[c(country1,country2,country3)])
Z = res$historical
print(Z) # update time
## last update: 2023-02-27
head(Z["Global"])
head(Z[c("China","UK","USA")])
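As an illustration of a retrospective analysis, daily new cases can be derived from a cumulative series by taking first differences within each country. This sketch assumes a data frame with `country`, `date`, and cumulative `cases` columns; that layout is an assumption for the example, and the values are made up.

```r
# Mock historical data; the layout (country, date, cumulative cases) is an
# assumption for this example and the numbers are invented.
hist_demo <- data.frame(
  country = rep(c("A", "B"), each = 4),
  date    = rep(as.Date("2023-01-01") + 0:3, times = 2),
  cases   = c(10, 15, 15, 30, 2, 4, 9, 9),
  stringsAsFactors = FALSE
)

# Order by country and date so that differences run forward in time.
hist_demo <- hist_demo[order(hist_demo$country, hist_demo$date), ]

# Daily new cases: first difference of the cumulative count within each country.
# The first day of each country has no previous value, so it is set to NA.
hist_demo$new_cases <- ave(
  hist_demo$cases, hist_demo$country,
  FUN = function(v) c(NA, diff(v))
)

hist_demo
```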
For the following countries, we provide detailed province-level data,
which can be obtained in a similar way by passing both a country and a
province to the [ operator:
head(Z[country,province])
Australia
Canada
China
Denmark
France
Netherlands
head(Z['China','hubei'])
For users’ own historical data, we provide a convert() function. Users
can convert other data into an object of class nCov2019History and then
explore it in nCov2019:
userowndata <- read.csv("path_to_user_data.csv")
# userowndata should contain these 6 columns:
# "country","province","date","cases","deaths","recovered"
Z = convert(data=userowndata)
head(Z["Global"])
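Before converting, it can help to confirm that the six required columns are present. The helper below is a hypothetical sketch and not part of nCov2019; it only checks column names and does not call convert().

```r
required_cols <- c("country", "province", "date", "cases", "deaths", "recovered")

# Hypothetical helper (not part of nCov2019): report missing columns early.
check_history_columns <- function(df, required = required_cols) {
  absent <- setdiff(required, names(df))
  if (length(absent) > 0) {
    stop("Missing column(s): ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

# A one-row mock data set with made-up values.
user_demo <- data.frame(
  country   = "China",
  province  = "hubei",
  date      = as.Date("2020-02-01"),
  cases     = 100,
  deaths    = 2,
  recovered = 10,
  stringsAsFactors = FALSE
)

check_history_columns(user_demo)  # passes silently

# Dropping the "recovered" column triggers an informative error message.
err <- tryCatch(check_history_columns(user_demo[, -6]), error = conditionMessage)
err
```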
We provide a visualization function as a redesigned “plot”:
plot(
x,
region = "Global",
continuous_scale = FALSE,
palette = "Reds",
date = NULL,
from = NULL,
to = NULL,
title = "COVID-19",
type = "cases",
...
)
Here, type can be one of “cases”, “deaths”, “recovered”, “active”, “todayCases”, “todayDeaths”, “todayRecovered”, “population” and “tests”. By default, the color palette is “Reds”; more color palettes can be found here: palette.
To get an overview of the latest status, the minimal code required is:
X <- res$latest
plot(X)
## Warning in geom_map(aes_string("long", "lat", map_id = "region", group =
## "group", : Ignoring unknown aesthetics: x and y
Or, to get an overview of the testing status:
plot(X, type = "tests", palette = "Greens")