Notes:

The goal of this exercise is to build an analysis of potential gerrymandering in Michigan based on the model provided by the textbook for North Carolina.

Loading and Wrangling the Data

See install notes for fec12 in the text (https://github.com/baumer-lab/fec12).

library(tidyverse)  # provides %>%, mutate(), summarize(), and parse_number() used below
library(fec12)

results_house
## # A tibble: 2,343 × 13
##    state district_id cand_id   incumbent party primary_votes primary_percent
##    <chr> <chr>       <chr>     <lgl>     <chr>         <dbl>           <dbl>
##  1 AL    01          H2AL01077 TRUE      R             48702          0.555 
##  2 AL    01          H2AL01176 FALSE     R             21308          0.243 
##  3 AL    01          H2AL01184 FALSE     R             13809          0.158 
##  4 AL    01          H0AL01030 FALSE     R              3854          0.0440
##  5 AL    02          H0AL02087 TRUE      R                NA         NA     
##  6 AL    02          H2AL02141 FALSE     D                NA         NA     
##  7 AL    03          H2AL03032 TRUE      R                NA         NA     
##  8 AL    03          H2AL03099 FALSE     D                NA         NA     
##  9 AL    04          H6AL04098 TRUE      R                NA         NA     
## 10 AL    04          H2AL04055 FALSE     D             10971          0.514 
## # ℹ 2,333 more rows
## # ℹ 6 more variables: runoff_votes <dbl>, runoff_percent <dbl>,
## #   general_votes <dbl>, general_percent <dbl>, won <lgl>, footnotes <chr>
district_elections <- results_house %>%
  mutate(district = parse_number(district_id)) %>%
  group_by(state, district) %>%
  summarize(
    N = n(),
    total_votes = sum(general_votes, na.rm = TRUE),
    d_votes = sum(ifelse(party == "D", general_votes, 0), na.rm = TRUE),
    r_votes = sum(ifelse(party == "R", general_votes, 0), na.rm = TRUE)
  ) %>%
  mutate(
    other_votes = total_votes - d_votes - r_votes,
    r_prop = r_votes / total_votes,
    winner = ifelse(r_votes > d_votes, "Republican", "Democrat")
  )
## `summarise()` has grouped output by 'state'. You can override using the
## `.groups` argument.
mi_results <- district_elections %>%
  filter(state == "MI")
mi_results %>%
  select(-state)
## Adding missing grouping variables: `state`
## # A tibble: 14 × 9
## # Groups:   state [1]
##    state district     N total_votes d_votes r_votes other_votes r_prop winner   
##    <chr>    <dbl> <int>       <dbl>   <dbl>   <dbl>       <dbl>  <dbl> <chr>    
##  1 MI           1     4      347037  165179  167060       14798  0.481 Republic…
##  2 MI           2     5      318267       0  194653      123614  0.612 Republic…
##  3 MI           3     4      326281  144108  171675       10498  0.526 Republic…
##  4 MI           4     5      312949  104996  197386       10567  0.631 Republic…
##  5 MI           5     5      330146  214531  103931       11684  0.315 Democrat 
##  6 MI           6     5      320475  136563  174955        8957  0.546 Republic…
##  7 MI           7     6      318069  136849  169668       11552  0.533 Republic…
##  8 MI           8     6      345054  128657  202217       14180  0.586 Republic…
##  9 MI           9     6      337316  208846  114760       13710  0.340 Democrat 
## 10 MI          10     4      328612   97734  226075        4803  0.688 Republic…
## 11 MI          11    15      687253  318137  333524       35592  0.485 Republic…
## 12 MI          12     5      319223  216884   92472        9867  0.290 Democrat 
## 13 MI          13     8      284270  235336   38769       10165  0.136 Democrat 
## 14 MI          14     8      328792  270450   51395        6947  0.156 Democrat
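
The two messages in the output above (the `summarise()` note and "Adding missing grouping variables") both arise because the summarized table is still grouped by state. The following is a minimal sketch on made-up data (the `toy` table and its values are hypothetical, not taken from fec12) of how `.groups = "drop"` returns an ungrouped result, so that `select(-state)` really removes the column:

```r
library(dplyr)

# Hypothetical candidate-level rows, standing in for results_house.
toy <- tibble(
  state = c("MI", "MI", "MI", "OH"),
  district = c(1, 1, 2, 1),
  general_votes = c(100, 50, 80, 60)
)

# Summarize per district and drop all grouping in the result.
by_district <- toy %>%
  group_by(state, district) %>%
  summarize(total_votes = sum(general_votes), .groups = "drop")

# With no grouping left, select(-state) removes the column silently.
mi_only <- by_district %>%
  filter(state == "MI") %>%
  select(-state)

mi_only
```

Calling `ungroup()` after the final `mutate()` in `district_elections` should have the same effect.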

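The number of votes cast per district varies more widely here than in North Carolina, mostly because district 11's total (687,253 across 15 candidate rows) is about double that of any other district, which suggests that more than one contest is being counted there. Even so, Republicans won 9 of the 14 districts.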

mi_results %>%
  skim(total_votes) %>%
  select(-na)

Variable type: numeric

var         state  n     mean       sd     p0    p25      p50      p75   p100
total_votes MI    14 350267.4 98189.55 284270 318506 327446.5 335523.5 687253
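
The maximum in this summary is roughly twice the median, so it is worth checking which districts drive the spread. Here is a small sketch using the `total_votes` values copied from the Michigan table above; the cutoff of 1.5 times the median is an arbitrary choice for illustration, not something taken from the text:

```r
# Total votes per district, copied from the mi_results output above.
total_votes <- c(347037, 318267, 326281, 312949, 330146, 320475, 318069,
                 345054, 337316, 328612, 687253, 319223, 284270, 328792)
names(total_votes) <- 1:14

# Express each district's total as a multiple of the state median.
ratio_to_median <- total_votes / median(total_votes)

# Flag districts well above the median (1.5 is an assumed, arbitrary cutoff).
outliers <- ratio_to_median[ratio_to_median > 1.5]
round(outliers, 2)
```

Only district 11 is flagged. Listing its candidate rows in `results_house` would be the natural next step to see why it has 15 of them.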

Here is the key data. Measured by how far r_prop lies from 0.5, districts 12–14 are heavily Democratic and are the three most concentrated districts for either party. Then come districts 10 (Republican) and 5 (Democratic).

mi_results %>% 
  select(district, r_prop, winner) %>% 
  arrange(desc(r_prop))
## Adding missing grouping variables: `state`
## # A tibble: 14 × 4
## # Groups:   state [1]
##    state district r_prop winner    
##    <chr>    <dbl>  <dbl> <chr>     
##  1 MI          10  0.688 Republican
##  2 MI           4  0.631 Republican
##  3 MI           2  0.612 Republican
##  4 MI           8  0.586 Republican
##  5 MI           6  0.546 Republican
##  6 MI           7  0.533 Republican
##  7 MI           3  0.526 Republican
##  8 MI          11  0.485 Republican
##  9 MI           1  0.481 Republican
## 10 MI           9  0.340 Democrat  
## 11 MI           5  0.315 Democrat  
## 12 MI          12  0.290 Democrat  
## 13 MI          14  0.156 Democrat  
## 14 MI          13  0.136 Democrat
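
The ranking described above can be checked by sorting districts on the distance of `r_prop` from 0.5. This sketch uses the rounded `r_prop` values from the table. Treat it as a rough measure only, since `r_prop` is a share of all votes, including those for other parties:

```r
# Republican share of total votes per district, copied from the table above.
r_prop <- c(`1` = 0.481, `2` = 0.612, `3` = 0.526, `4` = 0.631, `5` = 0.315,
            `6` = 0.546, `7` = 0.533, `8` = 0.586, `9` = 0.340, `10` = 0.688,
            `11` = 0.485, `12` = 0.290, `13` = 0.136, `14` = 0.156)

# Distance from an even split; larger means a more one-sided district.
lopsidedness <- abs(r_prop - 0.5)

# The five most one-sided districts, most extreme first.
top5 <- head(sort(lopsidedness, decreasing = TRUE), 5)
top5
```

The names of `top5` come out in the order 13, 14, 12, 10, 5, which matches the description.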

Statewide, Democratic candidates received more raw votes than Republican candidates, and their winning margin was larger in Michigan than in North Carolina.

mi_results %>%
  summarize(
    N = n(), 
    state_votes = sum(total_votes), 
    state_d = sum(d_votes), 
    state_r = sum(r_votes)
  ) %>%
  mutate(
    d_prop = state_d / state_votes, 
    r_prop = state_r / state_votes
  )
## # A tibble: 1 × 7
##   state     N state_votes state_d state_r d_prop r_prop
##   <chr> <int>       <dbl>   <dbl>   <dbl>  <dbl>  <dbl>
## 1 MI       14     4903744 2378270 2238540  0.485  0.456
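
To make the gap between votes and seats explicit, the statewide totals above can be set beside the seat counts from the district table (5 Democratic and 9 Republican winners). This is a minimal sketch with those numbers typed in by hand:

```r
# Statewide totals copied from the summary output above.
state_votes <- 4903744
state_d <- 2378270
state_r <- 2238540

# Seats won, counted from the winner column of the district table.
seats <- c(Democrat = 5, Republican = 9)

# Share of all votes cast versus share of seats won, per party.
vote_share <- c(Democrat = state_d, Republican = state_r) / state_votes
seat_share <- seats / sum(seats)

round(rbind(vote_share, seat_share), 3)
```

Democrats received about 48.5% of the votes but 35.7% of the seats. Republicans received about 45.6% of the votes and 64.3% of the seats.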

Presenting the Data

library(sf)
## Linking to GEOS 3.11.2, GDAL 3.6.2, PROJ 9.2.0; sf_use_s2() is TRUE
fs::path("data")
## data

This code downloads and unzips the district data file used in the text. Note that the sub-directory named by the fs::path() argument must exist.

# Can't get this to run, so just download and unzip manually.
# src <- "http://cdmaps.polisci.ucla.edu/shp/districts113.zip"
# dsn_districts <- usethis::use_zip(src, destdir = fs::path("data"))
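
If `usethis::use_zip()` does not run, one possible alternative is a small base-R helper. The sketch below is untested against the real archive. `fetch_districts` is a hypothetical name, and it assumes the zip file contains a top-level `districtShapes` folder, so that the shapefile ends up in `data/districts113/districtShapes`:

```r
# Hypothetical helper: download a zip archive and unpack it into dest_dir.
fetch_districts <- function(src, dest_dir = file.path("data", "districts113")) {
  # Create the target directory if it does not exist yet.
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)

  # Download to a temporary file; mode = "wb" keeps the binary intact on Windows.
  zip_file <- tempfile(fileext = ".zip")
  on.exit(unlink(zip_file))
  download.file(src, destfile = zip_file, mode = "wb")

  # Unpack the archive and return the directory invisibly.
  unzip(zip_file, exdir = dest_dir)
  invisible(dest_dir)
}
```

Calling `fetch_districts("http://cdmaps.polisci.ucla.edu/shp/districts113.zip")` would then stand in for the commented-out lines above, provided the URL is still reachable.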

Once the data is downloaded and unzipped into data/districts113/districtShapes/, this code reads it directly from disk.

dsn_districts <- fs::path(fs::path_wd(), "data", "districts113", "districtShapes")

st_layers(dsn_districts)
## Driver: ESRI Shapefile 
## Available layers:
##     layer_name geometry_type features fields crs_name
## 1 districts113       Polygon      436     15    NAD83
districts <- st_read(dsn_districts, layer = "districts113") %>%
  mutate(DISTRICT = parse_number(as.character(DISTRICT))) %>% 
  janitor::clean_names()
## Reading layer `districts113' from data source 
##   `C:\projects\info601\docs\13geospatial\lab\data\districts113\districtShapes' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 436 features and 15 fields (with 1 geometry empty)
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -179.1473 ymin: 18.91383 xmax: 179.7785 ymax: 71.35256
## Geodetic CRS:  NAD83
glimpse(districts)
## Rows: 436
## Columns: 16
## $ statename  <chr> "Louisiana", "Maine", "Maine", "Maryland", "Maryland", "Mar…
## $ id         <chr> "022113114006", "023113114001", "023113114002", "0241131140…
## $ district   <dbl> 6, 1, 2, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 9,…
## $ startcong  <chr> "113", "113", "113", "113", "113", "113", "113", "113", "11…
## $ endcong    <chr> "114", "114", "114", "114", "114", "114", "114", "114", "11…
## $ districtsi <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ county     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ page       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ law        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ note       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ bestdec    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ finalnote  <chr> "{\"From US Census website\"}", "{\"From US Census website\…
## $ rnote      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ lastchange <chr> "2016-05-29 16:44:10.857626", "2016-05-29 16:44:10.857626",…
## $ fromcounty <chr> "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F",…
## $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-91.82288 3..., MULTIPOLYGON (…
library(sf)
library(ggspatial)

mi_shp <- districts %>%
 filter(statename == "Michigan")

mi_shp %>%
  ggplot() +
  geom_sf()

# Not sure what this non-standard stuff did for us.
# mi_shp %>%
#  st_geometry() %>%
#  plot(col = gray.colors(nrow(mi_shp)))
mi_merged <- mi_shp %>%
  st_transform(4326) %>%
  inner_join(mi_results, by = c("district" = "district"))
glimpse(mi_merged)
## Rows: 14
## Columns: 24
## $ statename   <chr> "Michigan", "Michigan", "Michigan", "Michigan", "Michigan"…
## $ id          <chr> "026113114001", "026113114002", "026113114003", "026113114…
## $ district    <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
## $ startcong   <chr> "113", "113", "113", "113", "113", "113", "113", "113", "1…
## $ endcong     <chr> "114", "114", "114", "114", "114", "114", "114", "114", "1…
## $ districtsi  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ county      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ page        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ law         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ note        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ bestdec     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ finalnote   <chr> "{\"From US Census website\"}", "{\"From US Census website…
## $ rnote       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ lastchange  <chr> "2016-05-29 16:44:10.857626", "2016-05-29 16:44:10.857626"…
## $ fromcounty  <chr> "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F"…
## $ state       <chr> "MI", "MI", "MI", "MI", "MI", "MI", "MI", "MI", "MI", "MI"…
## $ N           <int> 4, 5, 4, 5, 5, 5, 6, 6, 6, 4, 15, 5, 8, 8
## $ total_votes <dbl> 347037, 318267, 326281, 312949, 330146, 320475, 318069, 34…
## $ d_votes     <dbl> 165179, 0, 144108, 104996, 214531, 136563, 136849, 128657,…
## $ r_votes     <dbl> 167060, 194653, 171675, 197386, 103931, 174955, 169668, 20…
## $ other_votes <dbl> 14798, 123614, 10498, 10567, 11684, 8957, 11552, 14180, 13…
## $ r_prop      <dbl> 0.4813896, 0.6116028, 0.5261569, 0.6307290, 0.3148031, 0.5…
## $ winner      <chr> "Republican", "Republican", "Republican", "Republican", "D…
## $ geometry    <MULTIPOLYGON [°]> MULTIPOLYGON (((-84.26702 4..., MULTIPOLYGON (((-86.43104 …
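
Because `inner_join()` silently drops rows without a match, it can help to compare the join keys on both sides. This is a minimal sketch in which the two vectors are typed in by hand from the output above. In the real session they would be `mi_shp$district` and `mi_results$district`:

```r
# District numbers on each side of the join, as shown in the output above.
shape_districts <- 1:14   # from the Michigan rows of the shapefile
result_districts <- 1:14  # from mi_results

# Keys present on one side only; both should be empty for a complete join.
only_in_shapes <- setdiff(shape_districts, result_districts)
only_in_results <- setdiff(result_districts, shape_districts)

stopifnot(length(only_in_shapes) == 0, length(only_in_results) == 0)
```

Here all 14 districts match, which agrees with the 14 rows reported by `glimpse(mi_merged)`.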
mi <- mi_merged %>%
  ggplot() +
  aes(fill = winner) +
  annotation_map_tile(zoom = 6, type = "osm") + 
  geom_sf(alpha = 0.5) +
  scale_fill_manual("Winner", values = c(Democrat = "blue", Republican = "red")) +
  geom_sf_label(aes(label = district), fill = "white") + 
  theme_void()
mi
## Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
## give correct results for longitude/latitude data
## Zoom: 6
## Fetching 6 missing tiles
## ...complete!
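
The `st_point_on_surface` warning above appears because the label positions are computed on longitude/latitude coordinates. One way to avoid it is to compute the points in a projected CRS and transform them back. The sketch below uses a made-up square rather than the real districts. The choice of EPSG:32616 (WGS 84 / UTM zone 16N) is an assumption that seems reasonable for Michigan, not something taken from the text:

```r
library(sf)

# Toy square in lon/lat (EPSG:4326), roughly in mid-Michigan; not real district data.
square <- st_polygon(list(rbind(
  c(-85, 43), c(-84, 43), c(-84, 44), c(-85, 44), c(-85, 43)
)))
toy <- st_sf(district = 1, geometry = st_sfc(square, crs = 4326))

# Project to a planar CRS (assumed: UTM zone 16N) before computing label points.
toy_planar <- st_transform(toy, 32616)
pts_planar <- st_point_on_surface(st_geometry(toy_planar))

# Transform the label points back to lon/lat and extract X/Y columns.
pts_lonlat <- st_transform(pts_planar, 4326)
coords <- st_coordinates(pts_lonlat)
coords
```

For the maps here the warning is mostly cosmetic, since the points are only used to place labels.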

Here’s a map showing the proportion of votes won by Republicans in each district.

mi +
  aes(fill = r_prop) + 
  scale_fill_distiller(
    "Proportion\nRepublican", 
    palette = "RdBu"
#    limits = c(0.2, 0.8)
  )
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
## give correct results for longitude/latitude data
## Zoom: 6

Here are versions of these maps with better labels (N. Tammenga, Spring 2022).

library(ggspatial)
library(ggrepel)

mi <- ggplot(data = mi_merged, 
             aes(fill = winner)) +
  annotation_map_tile(zoom = 6, 
                      type = "osm") +
  geom_sf(alpha = 0.5) +
  scale_fill_manual("Winner", 
                  values = c(Democrat = "blue", Republican = "red")) +
  theme_void()
  
  
mi1 <- mi + ggrepel::geom_label_repel(data = mi_merged %>%
                                   sf::st_set_geometry(NULL) %>%
                                   bind_cols(mi_merged %>% 
                                        sf::st_centroid() %>% 
                                        sf::st_coordinates() %>% as_tibble()),
                                 aes(label = district, x = X, y = Y), color = 'white')
## Warning: st_centroid assumes attributes are constant over geometries
mi1
## Zoom: 6

mi2 <- mi + 
  aes(fill = r_prop) +
  scale_fill_distiller(
    "Proportion\nRepublican",
    palette = "RdBu",
    limits = c(0.1, 0.9)
  ) +
  ggrepel::geom_label_repel(data = mi_merged %>%
                            sf::st_set_geometry(NULL) %>%
                            bind_cols(mi_merged %>% 
                                        sf::st_centroid() %>% 
                                        sf::st_coordinates() %>% as_tibble()),
                            aes(label = district, x = X, y = Y)
  )
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Warning: st_centroid assumes attributes are constant over geometries
mi2
## Zoom: 6

Conclusions

There is gerrymandering, but it can be hard to see in the previous maps because the packed Democratic districts are so small. The zoomable leaflet map below makes them easier to see.

library(leaflet)
pal <- colorNumeric(palette = "RdBu", domain = c(0, 1))

leaflet_mi <- leaflet(mi_merged) %>% 
  addTiles() %>%
  addPolygons(
    weight = 1, fillOpacity = 0.7, 
    color = ~pal(1 - r_prop),
    popup = ~paste("District", district, "<br/>", round(r_prop, 4))
  ) %>%
  setView(lng = -85, lat = 43.5, zoom = 6.2)

leaflet_mi
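
The `1 - r_prop` in the call above is there because the "RdBu" palette runs from red at the low end to blue at the high end, so the Republican share has to be flipped for Republican districts to come out red. This short sketch uses two `r_prop` values from the table to show the flip, alongside the `reverse = TRUE` argument of `colorNumeric()`, which should give the same colours:

```r
library(leaflet)

# Same palette as above: low values map to red, high values to blue.
pal <- colorNumeric(palette = "RdBu", domain = c(0, 1))

# Alternative: reverse the palette instead of flipping the data.
pal_rev <- colorNumeric(palette = "RdBu", domain = c(0, 1), reverse = TRUE)

# Republican share in district 10 (most Republican) and 13 (most Democratic).
r_prop <- c(0.688, 0.136)

flipped <- pal(1 - r_prop)    # approach used in the map above
reversed <- pal_rev(r_prop)   # reversed-palette alternative

# Red, green, and blue channels of the flipped colours, one column per district.
rgb_flipped <- col2rgb(flipped)
rgb_flipped
```

In `rgb_flipped`, the first column (district 10) has more red than blue, and the second (district 13) has more blue than red.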

Here’s a nice summary from N. Tammenga, Spring 2022:

Returning to the goal of looking for gerrymandering: as we saw in the analysis of North Carolina, many of its districts were very warped without much reason. In contrast, much of Michigan appears block-like and more uniform. However, the Detroit area most likely shows gerrymandering. Grouping the city of Detroit together makes sense because it is densely populated, but the branches out from Detroit are less justifiable. Most notably, a district branches out from southern Detroit to grab Ann Arbor. In short, while gerrymandering is likely present, as the Detroit-Ann Arbor district shows, Michigan appears to have less of a problem than North Carolina with its tortured-looking districts.