This lab exercise introduces R and RStudio using the Seattle
pet registration website. We will use the RStudio Integrated
Development Environment (IDE) to manage data, analysis, and
scripting.
The Dataset
Go to the dataset website (linked above) and do the following:
Review the description of the dataset and, in particular, study
the data dictionary (i.e., “Columns in this Dataset”), which
specifies the data fields stored for each pet registration
entry.
Download (i.e., “Export”) the dataset to your local machine and
load it into MS Excel. Check that the data matches the website’s
description. Do the columns in the file match the fields listed in the data dictionary? How many
entries are there?
This dataset is a well-formatted, comma-separated-values (CSV) file.
If you have problems downloading the file, you can access this local copy
(data from October 7, 2021).
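Once you are working in R (see the RStudio section of this lab), the same two checks can be repeated there. The following is a minimal sketch using base R's `read.csv`. It assumes a small made-up file written to a temporary location; the column names and rows are hypothetical stand-ins, not the real Seattle data, and `csv_path` should point at your own download instead.

```r
# Hypothetical stand-in for the downloaded file: three made-up rows,
# NOT the real Seattle data. Point csv_path at your own download instead.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "License Number,Animal Name,Species",
  "A001,Rex,Dog",
  "A002,Tom,Cat",
  "A003,Bea,Dog"
), csv_path)

# Read the file; the first line is treated as the column names.
# check.names = FALSE keeps names with spaces exactly as written.
pets <- read.csv(csv_path, check.names = FALSE)

field_names <- names(pets)   # compare these with the data dictionary
entry_count <- nrow(pets)    # number of entries (header line not counted)

print(field_names)
print(entry_count)
```

Comparing `field_names` with the data dictionary and `entry_count` with the row count reported by Excel should give the same answers as the manual check.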
RStudio
RStudio integrates file editing and R programming. Log in to the Calvin RStudio Service and
do the following:
Console — The Console pane (on the lower left) provides
an R interpreter into which you can enter R commands. Try typing the
following commands:
1 + 1 — This arithmetic expression, and
others like it, should evaluate properly.
library(tidyverse) — This function call
loads the Tidyverse library, which provides basic data
wrangling and analysis functions.
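As a small illustration of the two Console commands above, the sketch below evaluates a few arithmetic expressions and then loads the Tidyverse. The variable names and the `requireNamespace` check are additions for this example (not part of the lab steps); the check is there only because `library()` stops with an error when a package is not installed.

```r
# Arithmetic expressions evaluate to a value that the Console prints.
sum_result <- 1 + 1              # 2
product_result <- (3 + 4) * 2    # 14
quotient_result <- 10 / 4        # 2.5

print(sum_result)
print(product_result)
print(quotient_result)

# library() fails if the package is missing, so check first.
# This guard is an assumption added for illustration.
if (requireNamespace("tidyverse", quietly = TRUE)) {
  library(tidyverse)
  tidyverse_loaded <- TRUE
} else {
  message("tidyverse is not installed; try install.packages(\"tidyverse\")")
  tidyverse_loaded <- FALSE
}
```

Typing an expression on its own line in the Console prints its value directly; `print()` is used here so the sketch behaves the same way when run as a script.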
Files — The Files pane (on the lower right) provides a
file browser. Do the following:
Create a “New Folder” named info601 and then, inside
that folder, create a sub-folder named lab01 for this lab,
and then, inside the lab01 folder, create a sub-folder named
data.
“Upload” the pet registration CSV file you downloaded earlier
into the lab 1 data sub-folder (i.e.,
info601/lab01/data).
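The same folder structure can also be created from the Console rather than with the “New Folder” button. This is a minimal sketch, assuming a temporary directory as the starting point so that it does not touch your real files; `base_dir` and `data_dir` are names chosen for this example.

```r
# base_dir is a hypothetical starting point; a temporary directory is used
# here so the sketch leaves your real files alone. On the RStudio server
# you would more likely start from your home directory ("~").
base_dir <- tempdir()

# Build the nested path info601/lab01/data in a platform-independent way.
data_dir <- file.path(base_dir, "info601", "lab01", "data")

# recursive = TRUE creates info601 and lab01 along the way.
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

print(dir.exists(data_dir))
```

Either approach should leave you with the same three nested folders in the Files pane.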
Editor — The Editor pane (which will appear on the upper
left) provides a way to view and modify files. Do the following:
Click on the CSV file in the Files pane and select “View File”.
This will open the raw CSV file in the Editor pane.
Be sure that you can answer these questions:
What data is stored in the first line of the CSV file?
How do the other lines of the file compare with the way they are
presented in Excel?
Why are there no spaces around the commas?
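To explore these questions from the Console, the sketch below reads a CSV file as plain text and splits a line on its commas. It assumes a tiny made-up file (not the real dataset), and the comma split is a deliberately naive stand-in for a real CSV parser: it ignores quoted fields, for example.

```r
# Made-up sample lines standing in for the raw CSV file (not real data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "License Number,Animal Name,Species",
  "A001,Rex,Dog",
  "A002,Tom,Cat"
), csv_path)

# Read the file as plain text, one string per line, as the Editor shows it.
raw_lines <- readLines(csv_path)
first_line <- raw_lines[1]
print(first_line)

# Splitting on commas keeps everything between them, including any spaces.
tight_fields <- strsplit("A001,Rex,Dog", ",", fixed = TRUE)[[1]]
spaced_fields <- strsplit("A001, Rex, Dog", ",", fixed = TRUE)[[1]]
print(tight_fields)    # "A001" "Rex"  "Dog"
print(spaced_fields)   # "A001" " Rex" " Dog"
```

Note how the second split produces values that begin with a space, which suggests one reason a well-formatted CSV file avoids padding around its commas.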
Environment — The Environment pane (which will appear on
the upper right) provides a view of the R objects currently being stored
by RStudio. Do the following:
Click on the CSV file again in the Files pane, but this time
select “Import Dataset”. This will open a formatted version of the CSV file
in the Editor pane and add the registry dataset to the Environment pane.
Verify that this happens.
Create a new variable by typing x <- 1 in the
Console pane. Verify that this adds a new object named x to
the Environment pane.
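Both steps above come down to assigning a value to a name. The sketch below shows the idea with base R, assuming a one-row made-up file and a hypothetical object name `registry`; the code that “Import Dataset” actually generates may differ (for instance, in the read function it uses).

```r
# Made-up stand-in for the uploaded file (not real data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("License Number,Animal Name,Species", "A001,Rex,Dog"), csv_path)

# Reading the file and assigning the result creates a data frame object;
# "registry" is a hypothetical name chosen for this sketch.
registry <- read.csv(csv_path)

# Assignment with <- creates (or overwrites) an object named x.
x <- 1

# ls() lists the object names in the current environment, much like the
# Environment pane does.
print(ls())
print(exists("x"))
print(exists("registry"))
```

After running these lines in the Console, both `registry` and `x` should appear in the Environment pane.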
There’s nothing to turn in for this exercise.
©Calvin University, 2022
kvlinden, 2023-01-17 13:56:30