Lab 1.2 - The Seattle Pets Dataset

This RMarkdown document presents a preliminary analysis of Seattle’s pet licenses dataset. It interleaves text, code, and code output.

Loading the Dataset

The following code chunk loads the tidyverse library, reads the CSV file containing the Seattle pet license dataset, and saves that dataset under the name seattle_pets. The code assumes that the CSV file is stored in a sub-directory, data, of the directory that contains this document.

library(tidyverse)
seattle_pets <- read_csv("data/Seattle_Pet_Licenses.csv")
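Because the chunk above depends on the CSV file being in the data sub-directory, a missing file is a common cause of errors. The following is a minimal sketch of a guard around the read; the helper name load_pets is hypothetical and not part of the lab.

```r
library(readr)

# Hypothetical helper (an assumption, not part of the lab): read the license
# CSV, but stop with a readable message if the file is not where expected.
load_pets <- function(path = file.path("data", "Seattle_Pet_Licenses.csv")) {
  if (!file.exists(path)) {
    stop("Cannot find '", path, "' relative to ", getwd(), call. = FALSE)
  }
  read_csv(path)
}
```

Calling load_pets() with no arguments reads the same path as the chunk above, so seattle_pets <- load_pets() could stand in for the read_csv() line.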

Viewing the Structure of the Dataset

We can now view the dataset as it is stored in R.

seattle_pets

## # A tibble: 46,062 x 7
##    `License Issue Date` `License Number` `Animal's Name` Species `Primary Breed`
##    <chr>                <chr>            <chr>           <chr>   <chr>          
##  1 November 12 2015     819997           Dixie           Dog     Terrier        
##  2 March 24 2016        900605           Chloe           Dog     Chihuahua, Sho~
##  3 May 21 2018          21081            Molly           Dog     Retriever, Lab~
##  4 May 27 2018          283603           Whidbey         Dog     Terrier        
##  5 June 18 2018         359079           Chinook         Dog     Retriever, Lab~
##  6 June 19 2018         S144848          Penny           Dog     Retriever, Lab~
##  7 June 21 2018         215454           Peggy Sue       Dog     Bulldog, French
##  8 July 08 2018         S144996          Summer          Dog     Australian She~
##  9 July 10 2018         S112107          Tess            Dog     Border Collie  
## 10 July 10 2018         S116838          Emmy            Dog     Schnauzer, Min~
## # ... with 46,052 more rows, and 2 more variables: `Secondary Breed` <chr>,
## #   `ZIP Code` <dbl>

Based on the tibble output shown above, record how many of each of the following the pets dataset contains:

Pets (i.e., rows, a.k.a. records): 🚧 ??
Variables (i.e., columns, a.k.a. fields): 🚧 ??
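The row and column counts can also be checked in code rather than read off the printed header. This is a small sketch using a toy tibble with made-up rows as a stand-in for seattle_pets; the name toy_pets is an assumption for illustration only.

```r
library(tibble)

# Toy stand-in for seattle_pets: made-up rows, column names in the same style.
toy_pets <- tibble(
  `Animal's Name` = c("Dixie", "Chloe", "Molly"),
  Species = c("Dog", "Dog", "Cat")
)

n_records <- nrow(toy_pets)    # number of rows (records)
n_variables <- ncol(toy_pets)  # number of columns (variables)
shape <- dim(toy_pets)         # both at once: rows first, then columns
```

Running nrow(seattle_pets) and ncol(seattle_pets) on the real dataset should agree with the dimensions printed in the first line of the tibble output.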

Analyzing the Dataset

We can now count the number of each species using the count() function.

count(seattle_pets, Species, sort = TRUE)

## # A tibble: 4 x 2
##   Species     n
##   <chr>   <int>
## 1 Dog     31893
## 2 Cat     14134
## 3 Goat       31
## 4 Pig         4
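To make clearer what count() is doing, here is a sketch that compares it with an explicit group, summarise, and sort pipeline. It uses a toy tibble with made-up rows (toy_pets is a hypothetical name), so the numbers are not those of the real dataset.

```r
library(dplyr)

# Toy stand-in for seattle_pets with a made-up Species column.
toy_pets <- tibble(Species = c("Dog", "Cat", "Dog", "Goat", "Dog", "Cat"))

# Shorthand: one row per species, most frequent first.
by_count <- count(toy_pets, Species, sort = TRUE)

# The same result spelled out step by step.
by_hand <- toy_pets %>%
  group_by(Species) %>%      # one group per distinct species
  summarise(n = n()) %>%     # count the rows in each group
  arrange(desc(n))           # largest count first
```

Both by_count and by_hand hold one row per species with its count in column n, which is the shape of the output printed above.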

🚧 Replace this line with a description of what the output of the last code chunk tells us.

🚧 Finally, add one more code chunk that computes the most popular names in the dataset and describes the results.
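One possible shape for such a chunk is sketched below on a toy tibble with made-up names (toy_pets is hypothetical). The column name `Animal's Name` contains a space and an apostrophe, so it has to be wrapped in backticks. Dropping missing names first is an assumption about how the analysis should treat unnamed pets, not something the lab specifies.

```r
library(dplyr)

# Toy stand-in for seattle_pets: made-up names, including one missing value.
toy_pets <- tibble(
  `Animal's Name` = c("Lucy", "Max", "Lucy", NA, "Bella", "Max", "Lucy")
)

top_names <- toy_pets %>%
  filter(!is.na(`Animal's Name`)) %>%      # assumption: ignore unnamed pets
  count(`Animal's Name`, sort = TRUE) %>%  # most frequent names first
  head(2)                                  # keep only the top two
```

On the real dataset, the same pipeline would start from seattle_pets, and the argument to head() controls how many names are shown.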

Author Goes Here

Semester Goes Here
