Start your solution notebook by describing the purpose of the analysis. For this homework assignment, imagine that you’ve been hired by Capital Bikeshare to help them understand and predict the hourly demand for rental bikes. This understanding will help them plan the number of bikes that they need to make available at different parts of the system at different times so that they can avoid cases in which:

After this introduction, your solution notebook should include the following sections.

Data

Describe the source and nature of your dataset here.

The data for this problem were collected from Capital Bikeshare over two years, 2011 and 2012 (the raw data are available here: Capital Bikeshare System Data). Researchers at the University of Porto processed these data and augmented them with extra information, as described on this UCI ML Repository webpage. We’ll use this simplified version of the dataset, which we’ve derived from the original data. It is in CSV format.
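As a starting point for describing the dataset, here is a minimal sketch of reading a CSV file and inspecting its columns. The file contents, the column names `date` and `rides`, and the values of `workingday` are made up for illustration; the course dataset may be structured differently, so substitute the real file path and check the real column names.

```r
library(readr)
library(dplyr)

# Hypothetical stand-in for the course CSV file: a tiny file with made-up
# values, written to a temporary location so this sketch runs on its own.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "date,rides,workingday",
    "2011-01-01,100,weekend",
    "2011-01-03,200,workingday",
    "2011-01-04,300,workingday"
  ),
  csv_path
)

# Read the CSV into a tibble; read_csv() guesses each column's type.
bikes <- read_csv(csv_path, show_col_types = FALSE)

# List every column with its type and first few values.
glimpse(bikes)
```

Output like this from `glimpse()` is a convenient basis for a short written description of what each row and column represents.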

To complete this section:

Analysis

Do the following data exploration exercises and include descriptions of your work in the document:

1. Create a scatterplot of the number of rides.

Create a scatterplot showing the total number of rides each day. Sample code is provided here.

____ %>%
  ggplot() +
  aes(x = ___, y = ___) +
  geom_point() +
  geom_smooth() +
  labs(
    x = "___", 
    y = "___"
  )

Fill in the blanks as needed. Here are some notes on this code:

The result should look like this.

Write a one- or two-sentence interpretation of the graph with respect to the goals of this analysis. In particular, do you see patterns in the data that would help inform Capital Bikeshare’s deployment of bikes?

2. Create a new plot that distinguishes weekdays from weekends.

Add a new code chunk that creates the same plot again, but add a mapping of workingday to the color aesthetic. Your result should look like:

We suggest that you start this section by copying and pasting your previous code chunk.
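To illustrate what mapping a variable to the color aesthetic does, here is a minimal sketch on a small toy data frame. The data frame `toy_rides`, its columns `day` and `rides`, and all of its values are made up for illustration and are not taken from the Capital Bikeshare data; only the column name `workingday` comes from the assignment, and its values in the real dataset may be coded differently.

```r
library(ggplot2)

# Toy data for illustration only; these values are invented.
toy_rides <- data.frame(
  day = 1:8,
  rides = c(120, 135, 90, 80, 150, 160, 95, 85),
  workingday = c(
    "workingday", "workingday", "weekend", "weekend",
    "workingday", "workingday", "weekend", "weekend"
  )
)

# Adding color = workingday inside aes() draws each group of points in its
# own color and adds a legend for that variable.
toy_plot <- ggplot(toy_rides) +
  aes(x = day, y = rides, color = workingday) +
  geom_point() +
  labs(
    x = "Day",
    y = "Number of rides",
    color = "Type of day"
  )

toy_plot
```

Because the mapping is set in `aes()`, layers added to the plot inherit it, so layers that group by color are drawn once per group.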

Again, write a one- or two-sentence interpretation of the graph, this time focusing on the differences between bike use on weekdays and on weekends.

Conclusion

Include a recommendation for Capital Bikeshare, based on your analysis, on how to plan the number of bikes.

Submit your solution as specified in the Moodle page for this assignment.


*Exercise based on Data Science in a Box