In this homework, we design and build a relational database for an
organization. Start by creating a new RMarkdown document named
hw5-teams.Rmd using the standard homework format. Include
your PostgreSQL commands as non-executed, SQL code chunks.
The purpose of this database is to maintain records on people and teams for an organization.
The database needs to represent information on the following:
NULL).Design and implement a relational schema using SQL that is appropriate for this database, and then add around a dozen records to this database, making sure that there are:
Write SQL queries to retrieve:
Finally, load your database into RStudio using dplyr to
reproduce the queries you wrote in SQL. Start by loading the required
libraries.
library(tidyverse)
library(DBI)
You can load an ElephantSQL database in R by going to the “Details” tab for you ElephantSQL data instance and copying the following information into R.
DB_HOST <- "COPY_THE_SERVER_HERE"
DB_USER <- "COPY_THE_USER_HERE"
DB_PASSWORD <- "COPY_THE_PASSWORD_HERE"
You can then connect to the database and load the data as follows.
db <- DBI::dbConnect(RPostgres::Postgres(),
host = DB_HOST,
port = 5432,
user = DB_USER,
password = DB_PASSWORD)
dbListTables(db)
YOUR_TABLE <- tbl(db, "YOUR_TABLE_NAME")
Then, you can craft the required dplyr commands.
Because this is not an exploratory analysis, there are no analytical conclusions. Instead, give a one-or-two-paragraph discussion of the relative value of the dataset technologies we’ve used before this unit (e.g., CSV files, spreadsheets) and the relational database technologies we’ve used in this unit. Why would an organization choose one over the other? Which is more common in business organizations?