Understanding the potentially unmeasured death toll due to COVID-19 starts with comparing expected and observed deaths over time. This dataset presents observed and expected deaths by state and by week, from 2017 to the present (the example output below begins with the week ending 2017-01-07). The data can also be split by outcome: deaths from all causes, and deaths from all causes excluding those reported as COVID-19.

cdc_excess_deaths()

Value

a data.frame

Details

Estimates of excess deaths can provide information about the burden of mortality potentially related to COVID-19, beyond the number of deaths directly attributed to COVID-19. Excess deaths are typically defined as the difference between the observed and expected numbers of deaths. This dataset provides weekly data on excess deaths by jurisdiction of occurrence. Counts of deaths in more recent weeks are compared with historical trends to determine whether the number of deaths is significantly higher than expected.

Estimates of excess deaths can be calculated in a variety of ways, and will vary depending on the methodology and on assumptions about how many deaths are expected to occur. The estimates in this dataset were calculated using Farrington surveillance algorithms (1). For each jurisdiction, a model generates a set of expected counts, and the upper bound of the 95% confidence interval (95% CI) of these expected counts is used as a threshold to estimate excess deaths. Observed counts are compared to this upper bound to determine whether a significant increase in deaths has occurred. Provisional counts are weighted to account for potential underreporting in the most recent weeks; even so, data for the most recent week(s) are still likely to be incomplete. Only about 60% of deaths are reported within 10 days of the date of death, and there is considerable variation by jurisdiction. More detail about the methods, weighting, data, and limitations can be found in the CDC Technical Notes.
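
As a rough illustration of the threshold comparison described above, the sketch below rebuilds the first six Alabama rows shown in the Examples output and compares observed deaths with the upper-bound threshold and with the average expected count. The column names `above_threshold`, `excess_vs_threshold`, and `excess_vs_average` are made up for this illustration, and flooring the differences at zero is a choice made for the sketch. Whether either difference matches how the `excess_estimate` column is derived is an assumption that is not confirmed here.

```r
# First six rows from the example output (Alabama, early 2017).
toy <- data.frame(
  date = as.Date(c("2017-01-07", "2017-01-14", "2017-01-21",
                   "2017-01-28", "2017-02-04", "2017-02-11")),
  deaths = c(1121, 1130, 1048, 1026, 1036, 1058),
  upper_bound_threshold = c(1108, 1117, 1118, 1115, 1115, 1113),
  average_expected_count = c(1032, 1045, 1049, 1045, 1043, 1041)
)

# A week is flagged when observed deaths exceed the upper bound of the
# 95% CI around the expected count.
toy$above_threshold <- toy$deaths > toy$upper_bound_threshold

# Two illustrative differences, floored at zero (hypothetical column names;
# not necessarily how the dataset's excess_estimate column is computed).
toy$excess_vs_threshold <- pmax(toy$deaths - toy$upper_bound_threshold, 0)
toy$excess_vs_average   <- pmax(toy$deaths - toy$average_expected_count, 0)

print(toy)
```

For these six rows, `above_threshold` agrees with the `exceeds_threshold` values printed in the Examples output.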

Author

Sean Davis seandavi@gmail.com

Examples

cdcdeaths = cdc_excess_deaths()
head(cdcdeaths)
#> # A tibble: 6 × 12
#>   date       state   deaths upper_bound_thres… exceeds_thresho… average_expecte…
#>   <date>     <chr>    <dbl>              <dbl> <lgl>                       <dbl>
#> 1 2017-01-07 Alabama   1121               1108 TRUE                         1032
#> 2 2017-01-14 Alabama   1130               1117 TRUE                         1045
#> 3 2017-01-21 Alabama   1048               1118 FALSE                        1049
#> 4 2017-01-28 Alabama   1026               1115 FALSE                        1045
#> 5 2017-02-04 Alabama   1036               1115 FALSE                        1043
#> 6 2017-02-11 Alabama   1058               1113 FALSE                        1041
#> # … with 6 more variables: excess_estimate <dbl>, type <chr>, outcome <chr>,
#> #   suppress <chr>, note <chr>, week_of_year <dbl>
colnames(cdcdeaths)
#>  [1] "date"                   "state"                  "deaths"                
#>  [4] "upper_bound_threshold"  "exceeds_threshold"      "average_expected_count"
#>  [7] "excess_estimate"        "type"                   "outcome"               
#> [10] "suppress"               "note"                   "week_of_year"          
table(cdcdeaths$outcome)
#> 
#>                     All causes All causes, excluding COVID-19 
#>                          29916                          14958 


# Examine excess deaths in three states
library(ggplot2)
library(dplyr)
interesting_states = c('Ohio', 'Pennsylvania', 'California')
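# Keep the weighted predictions for all-cause deaths in the chosen states
cdcdeaths %>%
    dplyr::filter(type == "Predicted (weighted)",
                  outcome == "All causes",
                  state %in% interesting_states) %>%
    ggplot(aes(x = date, y = deaths)) +
    # Points: observed weekly deaths, colored by whether they exceed the threshold
    geom_point(aes(color = deaths > upper_bound_threshold)) +
    # Line: the upper-bound threshold for expected deaths
    geom_line(aes(y = upper_bound_threshold)) +
    facet_grid(rows = vars(state)) +
    ggtitle('Excess deaths over time')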
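
A tabular summary can complement the plot. The sketch below defines a hypothetical helper, `weeks_above_threshold()`, that counts for each state how many weeks had observed deaths above the threshold. It is not part of the package; it assumes only the `state`, `deaths`, and `upper_bound_threshold` columns listed above, and it is demonstrated on a small stand-in data frame built from the Alabama rows shown earlier so that it runs without downloading the data.

```r
# Hypothetical helper (not part of the package): count the weeks per state
# in which observed deaths exceeded the upper-bound threshold.
weeks_above_threshold <- function(df) {
  df$above <- df$deaths > df$upper_bound_threshold
  # Summing a logical vector counts its TRUE values.
  out <- aggregate(above ~ state, data = df, FUN = sum)
  names(out)[names(out) == "above"] <- "weeks_above"
  out
}

# Stand-in data: the six Alabama rows from the head() output above.
toy <- data.frame(
  state = "Alabama",
  deaths = c(1121, 1130, 1048, 1026, 1036, 1058),
  upper_bound_threshold = c(1108, 1117, 1118, 1115, 1115, 1113)
)

weeks_above_threshold(toy)
```

With the full dataset, it would likely make sense to filter to a single `type` and `outcome` first, as in the plot above, so that each week is counted only once per state.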