This lab exercise introduces R and Posit/RStudio using data from the Seattle pet registration website. We will use Posit’s RStudio Integrated Development Environment (IDE) to manage data, analysis, and scripting.

The Dataset

Go to the dataset website (linked above) and do the following:

  1. Review the description of the dataset and, in particular, study the data dictionary (i.e., “Columns in this Dataset”), which specifies the data fields stored for each pet registration entry.

  2. Download (i.e., “Export”) the dataset to your local machine and load it into MS Excel. Check that the data matches the website’s description: verify that the columns match the fields listed in the data dictionary, and check how many entries there are.

This dataset is a well-formatted, comma-separated-values (CSV) file. If you have problems downloading the file, you can access this local copy (data from October 7, 2021).
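
Once you are working in R (see the RStudio section), the same two checks, field names and number of entries, can also be done in code. The following is a minimal sketch in base R, not part of the required lab steps. It assumes a tiny made-up CSV written to a temporary file; the column names, the rows, and the object name `pets` are hypothetical stand-ins for the real download.

```r
# Made-up sample rows for illustration only; not taken from the real dataset.
csv_text <- c(
  "License Number,Name,Species",
  "A001,Rex,Dog",
  "A002,Tom,Cat",
  "A003,Bella,Dog"
)

# Write the sample to a temporary file so the example is self-contained.
sample_path <- tempfile(fileext = ".csv")
writeLines(csv_text, sample_path)

# check.names = FALSE keeps the header text exactly as written in the file.
pets <- read.csv(sample_path, check.names = FALSE)

names(pets)  # the fields (columns), to compare against the data dictionary
nrow(pets)   # the number of entries (rows), not counting the header line
```

To try this on the real file, replace `sample_path` with the path to your downloaded CSV.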

RStudio

RStudio integrates file editing and R programming. Log in to the course Posit server (see the main course page for the link) and do the following:

  1. Console — The Console pane (on the lower left) provides an R interpreter into which you can enter R commands. Try typing the following commands:

    1. 1 + 1 — This arithmetic expression, and others like it, should evaluate properly.

    2. library(tidyverse) — This function call loads the Tidyverse library, which provides basic data wrangling and analysis functions.

  2. Files — The Files pane (on the lower right) provides a file browser. For both this lab and future labs, do the following:

    1. Create a “New Folder” for this course with an appropriate name (e.g., info601) and then, inside that folder, create a sub-folder named lab01 for this lab. Use this folder (e.g., info601/lab01) to store all your lab 1 files and to keep them separate from the files for your other lab and homework assignments.

    2. For this lab (and others), you’ll also need data files, so create a data sub-folder (e.g., info601/lab01/data) and “upload” the pet registration CSV file linked from the Moodle course into that folder.

  3. Editor — The Editor pane (which will appear on the upper left) provides a way to view and modify files. Do the following:

    1. Click on the CSV file in the Files pane and select “View File”. This will open the raw CSV file in the Editor pane.

    2. Be sure that you can answer these questions:

      1. What data is stored in the first line of the CSV file?

      2. How do the other lines of the file compare with the way they are presented in Excel?

      3. Why are there no spaces around the commas?

  4. Environment — The Environment pane (which will appear on the upper right) provides a view of the R objects currently being stored by RStudio. Do the following:

    1. Click on the CSV file again in the Files pane, but this time select “Import Dataset”. This will open a formatted version of the CSV file in the Editor pane and add the registry dataset to the Environment pane. Verify that this happens.

    2. Create a new variable by typing x <- 1 in the Console pane. Verify that this adds a new object named x to the Environment pane.
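
The Editor and Environment steps above show two views of the same file: the raw text and the parsed dataset. The sketch below mirrors that contrast in base R. It is only an illustration under stated assumptions: the sample rows, the column names, and the object names `raw_lines` and `pets` are hypothetical, and the code RStudio generates for “Import Dataset” may differ from the `read.csv` call used here.

```r
# Made-up sample rows for illustration only; not taken from the real dataset.
sample_path <- tempfile(fileext = ".csv")
writeLines(
  c("License Number,Name,Species",
    "A001,Rex,Dog",
    "A002,Tom,Cat"),
  sample_path
)

# Raw view: each line of the file as plain text, comparable to "View File".
raw_lines <- readLines(sample_path)
raw_lines[1]  # the first line of the file

# Parsed view: a data frame with one column per field, similar in spirit
# to the formatted table shown after "Import Dataset".
pets <- read.csv(sample_path, check.names = FALSE)

# Assigning a value creates a new object in the current environment.
x <- 1

# ls() lists the names of the objects in the current environment.
ls()
```

Run from the Console, `ls()` should list the same object names that appear in the Environment pane.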

There’s nothing to turn in for this exercise.
