This lab exercise introduces R and RStudio using the Seattle pet registration website. We will use the RStudio Integrated Development Environment (IDE) to manage data, analysis, and scripting.

The Dataset

Go to the dataset website (linked above) and do the following:

  1. Review the description of the dataset and, in particular, study the data dictionary (i.e., “Columns in this Dataset”), which specifies the data fields stored for each pet registration entry.

  2. Download (i.e., “Export”) the dataset to your local machine and load it into MS Excel. Check that the data matches the website’s description. Do the fields listed in the data dictionary match? How many entries are there?

This dataset is a well-formatted, comma-separated-values (CSV) file. If you have problems downloading the file, you can access this local copy (data from October 7, 2021).
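
The same checks can be made in R instead of Excel. The following is a minimal sketch, not the lab's required method: it writes a tiny made-up CSV to a temporary file (the column names and rows are hypothetical, not taken from the Seattle data) and then inspects it. For the real file, you would replace `csv_path` with the path to your downloaded export.

```r
# Hypothetical stand-in for the downloaded file: the column names and
# rows below are made up for illustration, not taken from the dataset.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("name,species,zip",
    "Rex,Dog,98101",
    "Tom,Cat,98102"),
  csv_path
)

# Read the CSV into a data frame.
pets <- read.csv(csv_path)

names(pets)  # the fields, to compare against the data dictionary
nrow(pets)   # the number of entries
```

Comparing `names(pets)` with the data dictionary and `nrow(pets)` with the count reported on the website answers the two questions in step 2.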

RStudio

RStudio integrates file editing and R programming. Log in to the Calvin RStudio Service and do the following:

  1. Console — The Console pane (on the lower left) provides an R interpreter into which you can enter R commands. Try typing the following commands:

    1. 1 + 1 — This arithmetic expression, and others like it, should evaluate properly.

    2. library(tidyverse) — This function call loads the Tidyverse library, which provides basic data wrangling and analysis functions.

  2. Files — The Files pane (on the lower right) provides a file browser. Do the following:

    1. Create a “New Folder” named info601. Inside that folder, create a sub-folder named lab01 for this lab, and inside lab01 create a sub-folder named data.

    2. “Upload” the pet registration CSV file you downloaded earlier into the lab 1 data sub-folder (i.e., info601/lab01/data).

  3. Editor — The Editor pane (which will appear on the upper left) provides a way to view and modify files. Do the following:

    1. Click on the CSV file in the Files pane and select “View File”. This will open the raw CSV file in the Editor pane.

    2. Be sure that you can answer these questions:

      1. What data is stored in the first line of the CSV file?

      2. How do the other lines of the file compare with the way they are presented in Excel?

      3. Why are there no spaces around the commas?

  4. Environment — The Environment pane (which will appear on the upper right) provides a view of the R objects currently being stored by RStudio. Do the following:

    1. Click on the CSV file again in the Files pane, but this time select “Import Dataset”. This will open a formatted version of the CSV file in the Editor pane and add the registry dataset to the Environment pane. Verify that this happens.

    2. Create a new variable by typing x <- 1 in the Console pane. Verify that this adds a new object named x to the Environment pane.
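
Some of the point-and-click steps above have Console equivalents. The sketch below is one possible way to do them, using base R functions only. It works in a temporary directory (an assumption made here so that the example does not touch your real files); on the RStudio server you would create the folders relative to your home directory instead.

```r
# Work in a temporary directory so the sketch leaves your real files alone.
base_dir <- tempdir()

# Create info601/lab01/data in one call; recursive = TRUE creates the
# intermediate folders as needed.
data_dir <- file.path(base_dir, "info601", "lab01", "data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)
dir.exists(data_dir)

# Create a variable, then list the objects in the current environment.
# These are the same objects that the Environment pane displays.
x <- 1
ls()
```

The Files and Environment panes show the same state that `dir.exists()` and `ls()` report, so either route can be used to verify the steps.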

There’s nothing to turn in for this exercise.


©Calvin University, 2022
kvlinden, 2023-01-17 13:56:30