In this homework, we design and build a relational database for an organization. Start by creating a new RMarkdown document named hw5-teams.Rmd using the standard homework format. Include your PostgreSQL commands as non-executed, SQL code chunks.

The purpose of this database is to maintain records on people and teams for an organization.

Data

The database needs to represent information on the following:

Design and implement a relational schema using SQL that is appropriate for this database, and then add around a dozen records to this database, making sure that there are:

Queries

Write SQL queries to retrieve:

R Access

Finally, load your database into RStudio using dplyr to reproduce the queries you wrote in SQL. Start by loading the required libraries.

library(tidyverse)
library(DBI)

You can load an ElephantSQL database in R by going to the “Details” tab for you ElephantSQL data instance and copying the following information into R.

DB_HOST <- "COPY_THE_SERVER_HERE"
DB_USER <- "COPY_THE_USER_HERE"
DB_PASSWORD <- "COPY_THE_PASSWORD_HERE"

You can then connect to the database and load the data as follows.

db <- DBI::dbConnect(RPostgres::Postgres(), 
                     host = DB_HOST, 
                     port = 5432,
                     user = DB_USER,
                     password = DB_PASSWORD)

dbListTables(db)
YOUR_TABLE <- tbl(db, "YOUR_TABLE_NAME")

Then, you can craft the required dplyr commands.

Conclusion

Because this is not an exploratory analysis, there are no analytical conclusions. Instead, give a one-or-two-paragraph discussion of the relative value of the dataset technologies we’ve used before this unit (e.g., CSV files, spreadsheets) and the relational database technologies we’ve used in this unit. Why would an organization choose one over the other? Which is more common in business organizations?