Deadlines
- Moodle Reading Mon Nov 1, midnight
- Practice 7 Mon Nov 1, midnight
- Book Ch. 4 — How Numbers Get Their Clout — read for Forum 4, Fri Nov 5 (next week)
- Quiz 4 Fri Oct 29, in class
Week 9: Feature Engineering
Learning Objectives
- 09A I can apply preprocessing steps — scaling and one-hot encoding — and explain why each is needed before modeling.
- 09B I can build a scikit-learn pipeline that chains preprocessing and modeling steps into a single reproducible workflow.
- 09C I can identify how missing data arise (MCAR, MAR, MNAR) and choose appropriate imputation strategies for each case.
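One way the three objectives above might fit together is sketched below. This is a minimal illustration, not the course's official solution: the toy data, the column names (`age`, `income`, `region`), and the choice of median imputation and logistic regression are all assumptions made for the example. Median imputation is easiest to defend when values are missing completely at random (MCAR); under MAR or MNAR a different strategy may be more appropriate.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: column names and values are invented for illustration.
df = pd.DataFrame({
    "age": [25.0, 32.0, np.nan, 51.0, 46.0, 38.0],
    "income": [40.0, 52.0, 61.0, np.nan, 75.0, 58.0],
    "region": ["north", "south", "south", "east", "east", "north"],
})
y = np.array([0, 0, 1, 1, 1, 0])

# 09C: fill missing numeric values (median is an assumed, MCAR-friendly choice).
# 09A: scale numeric features so they share a comparable range.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# 09A: one-hot encode the categorical column; unseen categories become all zeros.
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, ["age", "income"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["region"]),
    ],
    sparse_threshold=0,  # always return a dense array, for easy inspection
)

# 09B: chain preprocessing and modeling into a single reproducible workflow.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression()),
])
model.fit(df, y)

# Inspect the transformed features: 2 scaled numeric + 3 one-hot region columns.
X_t = model.named_steps["preprocess"].transform(df)
predictions = model.predict(df)
```

Because imputation, scaling, and encoding live inside the pipeline, their parameters are learned only from whatever data is passed to `fit`, which helps avoid leaking information from a test set into the preprocessing.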
Perspectival Reading
Reading: TBD
Reflection Questions
- Feature engineering requires domain knowledge — whose knowledge counts, and who is excluded from this process?
- Imputation fills in missing values with estimates. What assumptions does a chosen strategy make about why the data are missing?
- When you encode a variable like gender or ethnicity, how does your choice of encoding shape the way the model treats those groups?