library(tidyverse)
This document presents an analysis of recent presidential popularity, inspired by examples from FiveThirtyEight and Data Science in a Box. Here’s the target plot, which appeared on the FiveThirtyEight website on October 4, 2021.
This web plot has a pull-down box for the poll type, and the plotted values are animated. We won’t be able to reproduce that directly in a static ggplot.
We will base our analysis on the assumption that the approval ratings in the FiveThirtyEight datasets are accurate and useful. For a discussion of how the data were collected and processed, see this article by N. Rakich: How We’re Tracking Joe Biden’s Approval Rating.
The dataset was downloaded directly from the FiveThirtyEight website.
approval_raw_biden <- read_csv("data/approval_topline_biden.csv")
approval_raw_biden
## # A tibble: 1,161 × 10
## president subgroup modeldate appro…¹ appro…² appro…³ disap…⁴ disap…⁵ disap…⁶
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Joe Biden All polls 2/13/2022 41.8 45.7 37.9 53.1 57.8 48.3
## 2 Joe Biden Adults 2/13/2022 41.5 45.5 37.6 52.9 57.4 48.4
## 3 Joe Biden Voters 2/13/2022 42.3 46.3 38.3 52.4 57.2 47.6
## 4 Joe Biden All polls 2/12/2022 41.4 45.3 37.6 52.5 57.2 47.8
## 5 Joe Biden Adults 2/12/2022 40.7 44.6 36.8 52.0 56.3 47.7
## 6 Joe Biden Voters 2/12/2022 42.3 46.3 38.3 52.4 57.2 47.6
## 7 Joe Biden All polls 2/11/2022 41.4 45.3 37.6 52.5 57.2 47.8
## 8 Joe Biden Adults 2/11/2022 40.7 44.6 36.8 52.0 56.3 47.7
## 9 Joe Biden Voters 2/11/2022 42.3 46.3 38.3 52.4 57.2 47.6
## 10 Joe Biden All polls 2/10/2022 41.3 45.2 37.3 52.6 57.3 47.8
## # … with 1,151 more rows, 1 more variable: timestamp <chr>, and abbreviated
## # variable names ¹approve_estimate, ²approve_hi, ³approve_lo,
## # ⁴disapprove_estimate, ⁵disapprove_hi, ⁶disapprove_lo
We’ll focus on the approval estimates over time, renaming some columns, processing the date character string, and ensuring a consistent spelling of Biden’s name.
approval_biden <- approval_raw_biden %>%
  select(president,
         subgroup,
         date = modeldate,
         approval = approve_estimate,
         disapproval = disapprove_estimate) %>%
  mutate(date = lubridate::mdy(date),
         president = "Joe Biden") %>%
  filter(subgroup != "All polls")
approval_biden
## # A tibble: 774 × 5
## president subgroup date approval disapproval
## <chr> <chr> <date> <dbl> <dbl>
## 1 Joe Biden Adults 2022-02-13 41.5 52.9
## 2 Joe Biden Voters 2022-02-13 42.3 52.4
## 3 Joe Biden Adults 2022-02-12 40.7 52.0
## 4 Joe Biden Voters 2022-02-12 42.3 52.4
## 5 Joe Biden Adults 2022-02-11 40.7 52.0
## 6 Joe Biden Voters 2022-02-11 42.3 52.4
## 7 Joe Biden Adults 2022-02-10 40.7 52.0
## 8 Joe Biden Voters 2022-02-10 42.1 52.6
## 9 Joe Biden Adults 2022-02-09 40.9 51.8
## 10 Joe Biden Voters 2022-02-09 42.1 52.6
## # … with 764 more rows
approval_biden %>%
distinct(president)
## # A tibble: 1 × 1
## president
## <chr>
## 1 Joe Biden
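The date conversion is the step most likely to go wrong unnoticed, so it can help to try it on a few strings first. The sketch below assumes only that the strings follow the same month/day/year layout as modeldate; the vector name sample_dates and its values are made up for illustration.

```r
library(lubridate)

# Hypothetical strings in the same month/day/year layout as modeldate
sample_dates <- c("2/13/2022", "12/1/2021", "1/20/2021")

# mdy() reads month, then day, then year
parsed <- mdy(sample_dates)
parsed

# The result is a Date vector, so it sorts and plots chronologically
class(parsed)
range(parsed)
```

Had the strings been left as characters, sorting would be alphabetical and the time axis of any plot would be treated as discrete labels rather than dates.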
We note that the ratings values are split between two columns, which doesn’t allow us to easily plot both approval and disapproval ratings in a single, 2D graph. To do this, we need all the rating values to be in a single column with an additional column indicating the rating type, approval or disapproval.
approval_longer_biden <- approval_biden %>%
  pivot_longer(
    cols = c(approval, disapproval),
    names_to = "rating_type",
    values_to = "rating_value"
  )
approval_longer_biden
## # A tibble: 1,548 × 5
## president subgroup date rating_type rating_value
## <chr> <chr> <date> <chr> <dbl>
## 1 Joe Biden Adults 2022-02-13 approval 41.5
## 2 Joe Biden Adults 2022-02-13 disapproval 52.9
## 3 Joe Biden Voters 2022-02-13 approval 42.3
## 4 Joe Biden Voters 2022-02-13 disapproval 52.4
## 5 Joe Biden Adults 2022-02-12 approval 40.7
## 6 Joe Biden Adults 2022-02-12 disapproval 52.0
## 7 Joe Biden Voters 2022-02-12 approval 42.3
## 8 Joe Biden Voters 2022-02-12 disapproval 52.4
## 9 Joe Biden Adults 2022-02-11 approval 40.7
## 10 Joe Biden Adults 2022-02-11 disapproval 52.0
## # … with 1,538 more rows
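The pivot doubles the row count (774 rows become 1,548) because each original row contributes one row per rating column. As a small check that nothing is lost, the sketch below pivots a toy tibble and then reverses the operation with pivot_wider. The names toy_wide, toy_long and toy_round_trip are hypothetical, and the values are copied from the first two rows shown earlier.

```r
library(tidyr)
library(tibble)

# Toy data with the rating columns of approval_biden
toy_wide <- tibble(
  subgroup = c("Adults", "Voters"),
  date = as.Date(c("2022-02-13", "2022-02-13")),
  approval = c(41.5, 42.3),
  disapproval = c(52.9, 52.4)
)

toy_long <- pivot_longer(
  toy_wide,
  cols = c(approval, disapproval),
  names_to = "rating_type",
  values_to = "rating_value"
)

# Two rating columns become two rows per original row
nrow(toy_long) == 2 * nrow(toy_wide)

# pivot_wider() reverses the operation and recovers the original columns
toy_round_trip <- pivot_wider(
  toy_long,
  names_from = rating_type,
  values_from = rating_value
)
toy_round_trip
```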
One should ask which dataset is properly tidy: the original or this restructured version. In some sense, the approval and disapproval values are all ratings, but adding or averaging them makes little sense, even if we represent disapprovals as negative ratings. The new structure may also make it easier to add rating types, since the original requires a new column for each one, but, again, it’s not clear that this makes the dataset tidier (n.b., the original dataset had additional rating types in separate columns). Consequently, we acknowledge that, for the purposes of this example, we’ve pivoted the data to present it, not to tidy it.
We’re now ready to re-engineer an approximation of FiveThirtyEight’s original plot.
approval_longer_biden %>%
  ggplot() +
  aes(x = date,
      y = rating_value,
      color = rating_type) +
  geom_line() +
  facet_wrap(vars(subgroup)) +
  scale_color_manual(values = c("darkgreen", "orange")) +
  labs(
    x = "Date", y = "Rating",
    color = NULL,
    title = "How (un)popular is Joe Biden?",
    subtitle = "Estimates based on polls of all adults and polls of likely/registered voters",
    caption = "Source: FiveThirtyEight modeling estimates"
  ) +
  theme_minimal()
FiveThirtyEight also presents plots of approval data for some previous presidents, back through H. Truman (see: How Biden compares with past presidents). We’ve downloaded the available data, which only goes back through D. Trump, and processed it in a manner similar to what we did for J. Biden’s approval data.
approval_raw_trump <- read_csv("data/approval_topline_trump.csv")
approval_longer_trump <- approval_raw_trump %>%
  select(president,
         subgroup,
         date = modeldate,
         approval = approve_estimate,
         disapproval = disapprove_estimate) %>%
  mutate(date = lubridate::mdy(date)) %>%
  filter(subgroup != "All polls") %>%
  pivot_longer(
    cols = c(approval, disapproval),
    names_to = "rating_type",
    values_to = "rating_value"
  )
approval_longer_trump
## # A tibble: 5,836 × 5
## president subgroup date rating_type rating_value
## <chr> <chr> <date> <chr> <dbl>
## 1 Donald Trump Voters 2021-01-20 approval 39.4
## 2 Donald Trump Voters 2021-01-20 disapproval 56.7
## 3 Donald Trump Adults 2021-01-20 approval 37.0
## 4 Donald Trump Adults 2021-01-20 disapproval 59.6
## 5 Donald Trump Adults 2021-01-19 approval 38.1
## 6 Donald Trump Adults 2021-01-19 disapproval 59.1
## 7 Donald Trump Voters 2021-01-19 approval 40.2
## 8 Donald Trump Voters 2021-01-19 disapproval 55.8
## 9 Donald Trump Adults 2021-01-18 approval 36.1
## 10 Donald Trump Adults 2021-01-18 disapproval 60.6
## # … with 5,826 more rows
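Since the Trump and Biden pipelines repeat the same steps, one option is to wrap them in a function. The following is a sketch only: tidy_approval is a hypothetical helper that is not part of the original analysis, and the toy input copies three rows from the raw Biden data shown earlier. Unlike the Biden pipeline, it does not overwrite the president column.

```r
library(dplyr)
library(tidyr)
library(lubridate)
library(tibble)

# Hypothetical helper: applies the select/mutate/filter/pivot steps used above
tidy_approval <- function(approval_raw) {
  approval_raw %>%
    select(president,
           subgroup,
           date = modeldate,
           approval = approve_estimate,
           disapproval = disapprove_estimate) %>%
    mutate(date = mdy(date)) %>%
    filter(subgroup != "All polls") %>%
    pivot_longer(
      cols = c(approval, disapproval),
      names_to = "rating_type",
      values_to = "rating_value"
    )
}

# Toy input using the column names of the raw file
toy_raw <- tibble(
  president = "Joe Biden",
  subgroup = c("All polls", "Adults", "Voters"),
  modeldate = "2/13/2022",
  approve_estimate = c(41.8, 41.5, 42.3),
  disapprove_estimate = c(53.1, 52.9, 52.4)
)

# The "All polls" row is dropped; the two remaining rows each yield two ratings
toy_tidy <- tidy_approval(toy_raw)
toy_tidy
```

With such a helper, each president's data could be processed by a single call on the corresponding raw tibble, which keeps the two results structurally identical.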
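To compare the presidents, we combine these datasets into one using the bind_rows function, which stacks the rows of the second dataset beneath those of the first. Unlike a set union, it does not remove duplicate records. Here, it is important that the two datasets have the same column names and types, because bind_rows matches columns by name.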
approval_trump_biden <- bind_rows(approval_longer_trump,
                                  approval_longer_biden)
approval_trump_biden
## # A tibble: 7,384 × 5
## president subgroup date rating_type rating_value
## <chr> <chr> <date> <chr> <dbl>
## 1 Donald Trump Voters 2021-01-20 approval 39.4
## 2 Donald Trump Voters 2021-01-20 disapproval 56.7
## 3 Donald Trump Adults 2021-01-20 approval 37.0
## 4 Donald Trump Adults 2021-01-20 disapproval 59.6
## 5 Donald Trump Adults 2021-01-19 approval 38.1
## 6 Donald Trump Adults 2021-01-19 disapproval 59.1
## 7 Donald Trump Voters 2021-01-19 approval 40.2
## 8 Donald Trump Voters 2021-01-19 disapproval 55.8
## 9 Donald Trump Adults 2021-01-18 approval 36.1
## 10 Donald Trump Adults 2021-01-18 disapproval 60.6
## # … with 7,374 more rows
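The combined row count, 7,384, equals 5,836 + 1,548, so every record from both datasets was kept. Stacking rows and taking a set union differ only when records repeat, which the minimal sketch below illustrates with toy tibbles. The names a, b, stacked and unioned are hypothetical, and the duplicated row is deliberate.

```r
library(dplyr)
library(tibble)

# Toy inputs; the first tibble contains a deliberately duplicated row
a <- tibble(president = c("Donald Trump", "Donald Trump"),
            rating_value = c(39.4, 39.4))
b <- tibble(president = "Joe Biden",
            rating_value = 41.5)

# bind_rows() keeps every row, including duplicates
stacked <- bind_rows(a, b)
nrow(stacked)

# dplyr::union() treats the inputs as sets and drops duplicate rows
unioned <- dplyr::union(a, b)
nrow(unioned)
```

For time-series data like these, where each record is identified by president, subgroup, date and rating type, the two approaches should give the same rows, but bind_rows makes no assumption about uniqueness.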
Now, we can reproduce the approval chart for the two most recent presidents as a time series.
approval_trump_biden %>%
  ggplot() +
  aes(x = date,
      y = rating_value,
      color = rating_type) +
  geom_line() +
  facet_grid(vars(subgroup)) +
  scale_color_manual(values = c("darkgreen", "orange")) +
  labs(
    x = "Date", y = "Rating",
    color = NULL,
    title = "How (un)popular are Trump (2017-2021) & Biden (2021-present)?",
    subtitle = "Estimates based on polls of all adults and polls of likely/registered voters",
    caption = "Source: FiveThirtyEight modeling estimates"
  ) +
  theme_minimal()
We see here that the approval ratings of Republican D. Trump and Democrat J. Biden reversed when Biden took office in January 2021, but have seen another reversal in 2022.