R/usa_facts_data.R
usa_facts_data.Rd
A single dataset from USAFacts that has state and county data included.
usa_facts_data()
a tidy data frame with columns:
fips
county
state (two-letter abbreviation)
subset: deaths or confirmed
date: observation date
count: count of confirmed cases or deaths (per subset) for that date in that locality
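As a rough illustration of how these columns fit together, the sketch below filters a table to one subset and sums counts to the state level. The `res_toy` data frame and its values are made up for illustration; only the column names come from the description above. Rows labeled "Statewide Unallocated" (as in the example output further down) are included in the sum.

```r
# Toy rows following the documented columns; all values are invented.
res_toy <- data.frame(
  fips   = c("00000", "01001", "01003", "01001"),
  county = c("Statewide Unallocated", "Autauga County",
             "Baldwin County", "Autauga County"),
  state  = "AL",
  subset = c("confirmed", "confirmed", "confirmed", "deaths"),
  date   = as.Date("2020-03-27"),
  count  = c(1L, 6L, 10L, 0L),
  stringsAsFactors = FALSE
)

# Keep only confirmed-case rows, then sum counts per state and date.
confirmed    <- res_toy[res_toy$subset == "confirmed", ]
state_totals <- aggregate(count ~ state + date, data = confirmed, FUN = sum)
state_totals
```

The same pattern should apply to the real return value of usa_facts_data(), assuming it carries the columns listed above.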
From the USAFacts website:
Methodology: This interactive feature aggregates data from the Centers for Disease Control and Prevention (CDC), state- and local-level public health agencies. County-level data is confirmed by referencing state and local agencies directly.
The data for all states was last updated on March 27, 2020, at 7:00 AM Pacific/10:00 AM Eastern Time. We've noted below when we last checked data from the states.
The 21 cases confirmed on the Grand Princess cruise ship on March 5 and 6 are attributed to the state of California, but not to any counties. The national numbers also include the 45 people with coronavirus repatriated from the Diamond Princess.
USAFacts attempts to match each case with a county, but some cases counted at the state level are not allocated to counties due to lack of information.
Because of the frequency with which we are currently updating this data, they may not reflect the exact numbers reported by state and local government organizations or the news media. Numbers may also fluctuate as agencies update their own data. At present, we are working on ensuring that we can provide this data with the most up-to-date information possible.
Uses https://usafacts.org/visualizations/coronavirus-covid-19-spread-map/ as the data source, then renames columns and reshapes the data into a long-form table.
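The wide-to-long reshaping mentioned above might look roughly like the following base R sketch. This is not the package's actual implementation: the `wide` table, its column names, and the ISO-formatted date headers are assumptions chosen for illustration, and the real source file may be laid out differently.

```r
# Toy stand-in for a wide source table: one row per county, one column per date.
# Column names and values are assumptions, not the actual USAFacts layout.
wide <- data.frame(
  countyFIPS   = c(1001L, 1003L),
  county       = c("Autauga County", "Baldwin County"),
  state        = c("AL", "AL"),
  `2020-03-26` = c(6L, 5L),
  `2020-03-27` = c(6L, 10L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

id_cols   <- c("countyFIPS", "county", "state")
date_cols <- setdiff(names(wide), id_cols)

# Stack the date columns: one output row per (county, date) pair.
long <- do.call(rbind, lapply(date_cols, function(d) {
  data.frame(
    fips   = sprintf("%05d", wide$countyFIPS),  # zero-pad FIPS to 5 characters
    county = wide$county,
    state  = wide$state,
    subset = "confirmed",                       # label for this source table
    date   = as.Date(d),
    count  = wide[[d]],
    stringsAsFactors = FALSE
  )
}))
long
```

Repeating the same steps for a deaths table and binding the two results would give the two values of the subset column.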
From the folks at USAFacts
Want to use USAFacts county-level COVID-19 data? Download it here. The data is available under a Creative Commons license. We simply request that you cite USAFacts as the data provider and link back to this page. Don’t forget to share what you've created with the USAFacts data. Please tag @usafacts on social media and use the hashtag #MadewithUSAFacts. We'll reshare the posts with the data-loving community.
Other data-import:
acaps_government_measures_data(), acaps_secondary_impact_data(), apple_mobility_data(), beoutbreakprepared_data(), cci_us_vaccine_data(), cdc_aggregated_projections(), cdc_excess_deaths(), cdc_social_vulnerability_index(), coronadatascraper_data(), coronanet_government_response_data(), cov_glue_lineage_data(), cov_glue_newick_data(), cov_glue_snp_lineage(), covidtracker_data(), descartes_mobility_data(), ecdc_data(), econ_tracker_consumer_spending, econ_tracker_employment, econ_tracker_unemp_data, economist_excess_deaths(), financial_times_excess_deaths(), google_mobility_data(), government_policy_timeline(), jhu_data(), jhu_us_data(), kff_icu_beds(), nytimes_county_data(), oecd_unemployment_data(), owid_data(), param_estimates_published(), test_and_trace_data(), us_county_geo_details(), us_county_health_rankings(), us_healthcare_capacity(), us_hospital_details(), us_state_distancing_policy(), who_cases()
Other case-tracking:
align_to_baseline(), beoutbreakprepared_data(), bulk_estimate_Rt(), combined_us_cases_data(), coronadatascraper_data(), covidtracker_data(), ecdc_data(), estimate_Rt(), jhu_data(), nytimes_county_data(), owid_data(), plot_epicurve(), test_and_trace_data(), who_cases()
res = usa_facts_data()
colnames(res)
#> [1] "fips" "county" "state" "subset" "date" "count"
head(res)
#> # A tibble: 6 × 6
#> fips county state subset date count
#> <chr> <chr> <chr> <chr> <date> <int>
#> 1 00000 Statewide Unallocated AL confirmed NA 1
#> 2 00000 Statewide Unallocated AL confirmed NA 0
#> 3 00000 Statewide Unallocated AL confirmed NA 0
#> 4 00000 Statewide Unallocated AL confirmed NA 0
#> 5 00000 Statewide Unallocated AL confirmed NA 0
#> 6 00000 Statewide Unallocated AL confirmed NA 0
dplyr::glimpse(res)
#> Rows: 5,337,860
#> Columns: 6
#> $ fips <chr> "00000", "00000", "00000", "00000", "00000", "00000", "00000", …
#> $ county <chr> "Statewide Unallocated", "Statewide Unallocated", "Statewide Un…
#> $ state <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL…
#> $ subset <chr> "confirmed", "confirmed", "confirmed", "confirmed", "confirmed"…
#> $ date <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ count <int> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
summary(res)
#> fips county state subset
#> Length:5337860 Length:5337860 Length:5337860 Length:5337860
#> Class :character Class :character Class :character Class :character
#> Mode :character Mode :character Mode :character Mode :character
#>
#>
#>
#>
#> date count
#> Min. :NA Min. : -6
#> 1st Qu.:NA 1st Qu.: 10
#> Median :NA Median : 100
#> Mean :NaN Mean : 4570
#> 3rd Qu.:NA 3rd Qu.: 1478
#> Max. :NA Max. :2751220
#> NA's :5337860
# dataset inclusion as of this build
max(res$date)
#> [1] NA