This function accesses and munges the cumulative time series of confirmed cases, deaths, and recoveries from the data repository for the 2019 Novel Coronavirus Visual Dashboard operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). The dashboard is also supported by the ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL).

jhu_data()

Value

A tidy data.frame (actually, a tbl_df) with columns:

  • ProvinceState: Province or state. Note: may be NA, as in the Afghanistan rows of the example output.

  • CountryRegion: Country or region. This is the main column for finding countries of interest.

  • Lat: Latitude

  • Long: Longitude

  • date: Date

  • count: The cumulative count of cases for a given geographic area.

  • subset: either confirmed, deaths, or recovered.
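As a rough illustration of how these columns are typically used together, the sketch below filters a small stand-in table by CountryRegion and subset. The object `res_toy`, the country names, and all values in it are invented for the example and are not real JHU data; in practice the table would come from jhu_data().

```r
library(dplyr)

# Toy stand-in for the result of jhu_data(); all values are invented.
res_toy <- tibble(
  ProvinceState = c(NA, NA, "Region A", "Region A"),
  CountryRegion = c("Country X", "Country X", "Country Y", "Country Y"),
  Lat           = c(10, 10, 20, 20),
  Long          = c(30, 30, 40, 40),
  date          = as.Date(rep("2020-03-01", 4)),
  count         = c(100, 5, 200, 8),
  subset        = c("confirmed", "deaths", "confirmed", "deaths")
)

# Keep only the confirmed counts for one country of interest.
x_confirmed <- res_toy %>%
  filter(CountryRegion == "Country X", subset == "confirmed")
```

Filtering on both columns matters because each geographic area appears once per date for every value of subset.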

Details

Data are updated daily by JHU. Each call to this function redownloads the data from GitHub. No data cleansing is performed. Data are downloaded and then munged into a long-form tidy data.frame.
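The wide-to-long step described above can be sketched roughly as follows. This is not the package's actual implementation: it assumes a wide layout with one column per date (headers in m/d/yy form) and uses tidyr::pivot_longer on a small invented table. The object names `wide` and `long`, the input column names, and the values are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

# Toy wide table: one row per area, one column per date (values invented).
wide <- tibble(
  `Province/State` = c(NA, "Region A"),
  `Country/Region` = c("Country X", "Country Y"),
  Lat  = c(10, 20),
  Long = c(30, 40),
  `1/22/20` = c(0, 10),
  `1/23/20` = c(0, 12)
)

long <- wide %>%
  # Rename to the column names documented in the Value section.
  rename(ProvinceState = `Province/State`,
         CountryRegion = `Country/Region`) %>%
  # Every column that is not an identifier is treated as a date column.
  pivot_longer(
    cols      = -c(ProvinceState, CountryRegion, Lat, Long),
    names_to  = "date",
    values_to = "count"
  ) %>%
  # Parse the m/d/yy headers and label which series the rows belong to.
  mutate(date   = as.Date(date, format = "%m/%d/%y"),
         subset = "confirmed")
```

Repeating this for each series and row-binding the results would give a single table with a subset column, as described under Value.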

Note

Uses https://raw.githubusercontent.com/CSSEGISandData/... as the data source, then renames columns and munges the data into a long-form table.

  • US states are treated differently from other countries, so they are not directly included right now.

  • Although the numbers are meant to be cumulative, a day's count is sometimes lower than the prior day's because a case was reclassified. These decreases are not currently corrected in the source data.
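Because such decreases are left as-is, it can be useful to locate them before computing daily differences. The following is a minimal sketch, assuming a table with the columns listed under Value; the object `toy` and its values are invented, and the approach (a grouped lag comparison with dplyr) is one possible check rather than anything the package does itself.

```r
library(dplyr)

# Toy cumulative series with one decrease on the third day (values invented).
toy <- tibble(
  CountryRegion = "Country X",
  ProvinceState = NA_character_,
  subset        = "confirmed",
  date          = as.Date("2020-03-01") + 0:4,
  count         = c(10, 12, 11, 15, 15)
)

decreases <- toy %>%
  # Compare each day with the previous day within the same area and series.
  group_by(CountryRegion, ProvinceState, subset) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(change = count - lag(count)) %>%
  ungroup() %>%
  # Keep only the days where the cumulative count went down.
  filter(!is.na(change), change < 0)
```

The rows in `decreases` mark the dates to inspect; how to handle them (leave, clamp, or smooth) depends on the analysis.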

Examples

res = jhu_data()
colnames(res)
#> [1] "ProvinceState" "CountryRegion" "Lat"           "Long"         
#> [5] "date"          "count"         "subset"       
head(res)
#> # A tibble: 6 × 7
#>   ProvinceState CountryRegion   Lat  Long date       count subset   
#>   <chr>         <chr>         <dbl> <dbl> <date>     <dbl> <chr>    
#> 1 NA            Afghanistan    33.9  67.7 2020-01-22     0 confirmed
#> 2 NA            Afghanistan    33.9  67.7 2020-01-23     0 confirmed
#> 3 NA            Afghanistan    33.9  67.7 2020-01-24     0 confirmed
#> 4 NA            Afghanistan    33.9  67.7 2020-01-25     0 confirmed
#> 5 NA            Afghanistan    33.9  67.7 2020-01-26     0 confirmed
#> 6 NA            Afghanistan    33.9  67.7 2020-01-27     0 confirmed
dplyr::glimpse(res)
#> Rows: 701,406
#> Columns: 7
#> $ ProvinceState <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
#> $ CountryRegion <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanista…
#> $ Lat           <dbl> 33.93911, 33.93911, 33.93911, 33.93911, 33.93911, 33.939…
#> $ Long          <dbl> 67.70995, 67.70995, 67.70995, 67.70995, 67.70995, 67.709…
#> $ date          <date> 2020-01-22, 2020-01-23, 2020-01-24, 2020-01-25, 2020-01…
#> $ count         <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
#> $ subset        <chr> "confirmed", "confirmed", "confirmed", "confirmed", "con…