The final project in CS106 is an opportunity for you to showcase what you have learned in this class, and to begin applying your newfound knowledge and abilities to a problem that interests you personally.
The final project is to be an individual or a small group project, although I hope that in either case you consult with others in the class and with professors (if possible) for help in design, implementation, and debugging.
I encourage you to investigate and (hopefully) use existing Python packages. Interesting ones include:
- Plotting and Visualizing: matplotlib, plotly, vpython
- Simulation: vpython, pymunk
- Calculating: numpy, scipy, pandas, sklearn
- Others: sympy, spacy, transformers, way way more
Many final projects fall into one of two categories: simulations and data analyses.
For simulations, it is often best if you can create a class representing each of the types of "actors" in your simulation. Then, if you create multiple instances of these classes, how do they interact with each other? For a simulation project, you must have a hypothesis you are trying to test. You cannot just create multiple agents and “see what happens.”
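As a rough sketch of that structure (not a required design), the example below gives one kind of actor its own class and lets a small loop step every instance. The names `Walker` and `run_simulation`, the step sizes, and the toy hypothesis ("walkers with very different step sizes rarely end up close together") are all made up for illustration.

```python
import random


class Walker:
    """One actor: a point that takes random steps along a line."""

    def __init__(self, name, step_size, rng):
        self.name = name
        self.step_size = step_size
        self.rng = rng          # shared random generator, so runs are repeatable
        self.position = 0.0

    def step(self):
        """Move left or right by step_size, chosen at random."""
        self.position += self.rng.choice([-1, 1]) * self.step_size

    def distance_to(self, other):
        """Return how far apart this walker and another walker are."""
        return abs(self.position - other.position)


def run_simulation(num_steps, seed=0):
    """Step two walkers and count the steps on which they are within 1 unit."""
    rng = random.Random(seed)
    walkers = [Walker("slow", 0.5, rng), Walker("fast", 2.0, rng)]
    close_encounters = 0
    for _ in range(num_steps):
        for walker in walkers:
            walker.step()
        if walkers[0].distance_to(walkers[1]) <= 1.0:
            close_encounters += 1
    return close_encounters
```

Calling `run_simulation(1000, seed=1)` returns a count you could compare against what your hypothesis predicts. Passing the same seed gives the same result, which makes debugging much easier.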
Data analysis projects take data and manipulate it, either to analyze its properties, or to allow users to visualize it in new ways. To do this kind of project, you need data (that should be obvious, but I thought I'd point it out anyway). I would much prefer that you don't pick a project where you have to collect, gather, or fabricate your own data. You don't need extra work to do.
In either case, your project must be interactive with the user. You may ask the user to enter data via input(), read data from a file (preferred), or take values on the command line (using sys.argv, or something fancier like click or the built-in argparse library). You should implement a DEBUG flag or “verbose” option to help you debug your project as you write it.
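Here is one possible way to combine command-line values with a verbose option using argparse. The option names (`datafile`, `--num-steps`, `--verbose`) and the helper `debug` are assumptions chosen for this sketch, so pick whatever fits your project.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Run the final project.")
    parser.add_argument("datafile", help="path to the input data file")
    parser.add_argument("-n", "--num-steps", type=int, default=100,
                        help="how many steps to run (default: 100)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print extra debugging output")
    return parser.parse_args(argv)


def debug(args, message):
    """Print message only when the verbose option is on."""
    if args.verbose:
        print("DEBUG:", message)


def main(argv=None):
    """Read the options, then hand them to the rest of the program."""
    args = parse_args(argv)
    debug(args, f"reading {args.datafile} for {args.num_steps} steps")
    return args
```

If this were saved as, say, `project.py` with a call to `main()` at the bottom, running `python project.py data.csv -v -n 50` would print the DEBUG line, and leaving off `-v` would keep the output quiet.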
Submitting your project
Submit your code as a ZIP file on Moodle. Include a plain-text README.txt file that looks like:
Title: (a title for your project)
Author: (your name)
Objective: 1-2 sentence description of your project's goal
How to demo:
*Specific instructions for what to do to run through a basic demo
of the main functionality of your project.* You don't need to show
off all features here.
Highlights:
*List a few parts of the program that you're proud of.*
This could be something tricky you got working or how you organized
your code.
Process:
*A one-paragraph summary on your process for creating the code.
Include at least one specific difficulty that you encountered
and how you overcame it.*
Testing:
*What steps could someone do to check that your code works correctly?*
(If you use assert statements, running the code may suffice.)
Also include a screenshot or very brief video of your project in action.
Timeline
There will be four deliverables for this project, due according to the following schedule:
- Project Design (2%): You must submit to me a document outlining what your project will do. See below for more details. Due Friday, Nov. 19, at 23:59:59.
- Project Walkthrough (3%): You and I meet to look at what you have completed. I will recommend solutions to problems you are having. At least 50% of your code must be complete when we meet. Scheduled for the week of Nov. 29.
- Project Showcase (5%): After the written portion of our final exam, those who have presentable work will present their work to the class. You should be able to run your code at this time. If you have concrete results to show, please do so. You may want to have a web page or PowerPoint slides to describe what your project does.
- Project Submission (90%): You submit your final project code by midnight on the day of the final. If you use data for your project, you must submit the data files as well.
I will combine the points from these deliverables to compute your score for the final project/presentation.
Grading
I will grade your project submission as follows:
- 50%: code is complete and correct
- 10%: the README describes a simple and clear way to test that the code is complete and correct
- 10%: code is well structured and minimizes duplication
- 10%: documentation: naming is clear and consistent; comments and docstrings are sufficient and accurate
- 5%: submission ZIP file includes a README with all elements given above
- 5%: submission ZIP file includes a screenshot or brief demo video
- 10%: complexity/significance of the project
If any data is needed to run the code, please either include the data in the ZIP file or provide specific instructions for how to obtain it (e.g., go to a certain URL).
Note that it is better to choose a final project that is not overly complex and get it right than it is to choose a project that is too complex and not finish it. I recommend that you find a project that you can implement in stages, so that at multiple points you can have a "finished" project, and then decide if you want to or have time to proceed to the next stage. If you choose this route, it would be best to document these stages in your design document.
Project Design
This document must include:
- A high-level description of the project,
- The main algorithm of your project, as pseudo-code. E.g., for a hangman game, you might write:
    1. Print introductory message
    2. Get words from a file into a list of words.
    3. Main loop:
       3.1. Ask user for how many letters in their word
       3.2. Initialize variables to hold the number of guesses made already and letters guessed already.
       3.3. while there are still guesses available:
            3.3.1. get guess from the user, repeatedly until they give a letter they haven't guessed yet.
       etc...
- A description of each class or module you intend to implement, including the class variables and methods.
- What you expect the input and output to look like: if it is text, then a quick sample of what it will look like. If it is graphical, a description of it.
- The ways a user can alter the run of the program by changing input.
The document should be a Word document, PDF document, or text document. It must not be Python code.
Note that the more work you do on your design, the less time you'll have to spend writing the code (because you'll implement more of it correctly the first time).
Ways students lose points
- Poor documentation: no comments, inscrutable variable or function names.
- Poor software organization: repeated code that could easily be in functions, mega-giant main loops, or functions defined within a mega-giant main loop. The main loop should ideally be quite small, perhaps 20 to 40 lines, with calls out to functions or class methods to do the work.
- Magic numbers in the code: repeated checks against a literal value that should be defined as a constant.
- Methods in classes that use input(): this makes the class completely non-portable.
- No or poor testing: complicated code with no way of testing whether it’s right. Ideally you’d use assert statements or other kinds of automated testing, but other kinds of testing are fine too.
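To make these points concrete, here is a small sketch, not a template you must follow, that avoids each pitfall in a simplified hangman game. The names `HangmanGame`, `MAX_WRONG_GUESSES`, and `test_hangman_game`, and the limit of 6 wrong guesses, are assumptions made up for this example.

```python
MAX_WRONG_GUESSES = 6  # named constant instead of a magic number


class HangmanGame:
    """Game logic only: no input() or print() calls in here."""

    def __init__(self, secret_word):
        self.secret_word = secret_word
        self.guessed = set()
        self.wrong_guesses = 0

    def guess(self, letter):
        """Record a guess; return True if the letter is in the word."""
        self.guessed.add(letter)
        if letter in self.secret_word:
            return True
        self.wrong_guesses += 1
        return False

    def is_won(self):
        return all(letter in self.guessed for letter in self.secret_word)

    def is_lost(self):
        return self.wrong_guesses >= MAX_WRONG_GUESSES


def test_hangman_game():
    """Automated checks that need no user at the keyboard."""
    game = HangmanGame("cat")
    assert game.guess("c") is True
    assert game.guess("z") is False
    assert game.wrong_guesses == 1
    assert not game.is_won()
    game.guess("a")
    game.guess("t")
    assert game.is_won()
    assert not game.is_lost()


def main():
    """Small main loop: all user interaction lives here."""
    game = HangmanGame("cat")
    while not (game.is_won() or game.is_lost()):
        letter = input("Guess a letter: ").strip().lower()
        if len(letter) != 1:
            continue  # ignore anything that is not a single character
        print("Yes!" if game.guess(letter) else "No.")
    print("You won!" if game.is_won() else "You lost.")
```

Because `HangmanGame` never calls input(), `test_hangman_game()` can run on its own, and the same class could later sit behind a graphical interface without changes.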
Reuse vs Plagiarism
It can be very helpful to find an example we did in class (lab, homework, etc.) that is similar to what you want to do and adapt it. This is highly encouraged. Just note that you did so in your code documentation.
If you intend to use code from other people outside of the class, talk to us first and remember that we'll grade you on the code you write, not on what other people write. If you do make use of existing code and libraries, be sure to clearly indicate who wrote what parts of the code; using code without proper attribution is a form of plagiarism.
Feel free to discuss ideas with us or with your classmates, but don’t copy code (i.e. plagiarize). Here are examples of what plagiarism looks like:
- You find a program online and copy the entire contents of the file into your submission without attribution.
- You find code online, and change the variable names.
- Your roommate writes some code, which you add to your program. You add documentation that shows you understand the code, but never indicate the source of the code.
- Your older sibling sends you a function that will help your program. You add it to your submission without attribution.
Consider these rules of thumb:
- If you found it efficient to use copy/paste to create some portion of your application, you must supply documentation that indicates the original source of the code.
- If the moment you figure out how to do something occurs while you are looking at a website, you should document that website.
Note that these rules of thumb apply to the code supplied in this course’s materials as well.
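One way to document a source, sketched here with a hypothetical URL and a made-up function, is a comment directly above the borrowed or adapted code that says where it came from and what you changed:

```python
# Adapted from an example at https://example.com/clamp-snippet
# (hypothetical URL, used here only for illustration).
# Changes from the original: renamed the parameters, added the docstring.
def clamp(value, low, high):
    """Return value limited to the range [low, high]."""
    return max(low, min(value, high))
```

The exact wording matters less than making it obvious which lines are yours and which are not.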
Ideas
- Simulate a ball being shot out of a cannon and bouncing on the ground. What if the wind picks up? Or gravity increases?
- Simulate a predator/prey situation: wolves eat mice, so the mouse population goes down, so the wolf population goes down, so the mouse population goes up, so the wolf population goes up, and so on. Can it be extended to involve 3 species? I want to see graphs! And, it would be nice if it could be based on some actual data (found on the web, or elsewhere).
- Build a traffic simulation.
- Use SymPy to model mathematical equations and do, like, mathematical-type stuff with them, like, you know.
- Use DendroPy to, like, do some, like, phylogenetic stuff, or whatever.
- Iterate over a collection of atoms/molecules and compute whether or not they can (theoretically) combine. If they can combine to form a new molecule, can you determine its official name? Can you use pymol to visualize the molecule?
- Create a star/planets/satellites simulation to model our solar system with actual values for planet masses, distances, etc.
- Implement code to demonstrate the Monty Hall Problem.
- Newton's method illustration
- Evil Hangman
- Work with some text data using spacy.
- Another Monte Carlo simulation of a solitaire game (all in your hand, pyramid, etc.).
- Conway's Game of Life or Water world.
- See more ideas on the website under “Resources”.
Sources of data online
- Kaggle Datasets
- data.gov
- Our World in Data
- CORGIS Datasets Project (“The Collection of Really Great, Interesting, Situated Datasets”): https://think.cs.vt.edu/corgis/
- NOAA climate data
- US Geological Survey live feeds, including recent earthquakes
Acknowledgments
Like most CS 106 material, these instructions were based on material written for a prior offering of this class by Dr. Victor Norman.