Model Development

Keith VanderLinden
Calvin University

Model Evaluation

Modeling is an iterative process in which we experiment with different models and model configurations. Some guiding principles:

  • Simple is better than complex.
  • People are biased.
  • Time changes things.
  • There are no perfect solutions.
  • All models are wrong.

Experiment Tracking and Versioning

Success in modeling often depends on experimentation with:

  • Datasets and Preprocessing
  • Model Architectures & Hyperparameters

All leading to different:

  • Evaluation results
  • Memory and compute requirements

We must be able to compare these experiments (tracking) and to reproduce them (versioning).
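Even without a dedicated tracking tool, the core idea can be sketched in a few lines of standard-library Python. This is a minimal, hypothetical example (the `log_run` helper and the `runs.jsonl` log file are assumptions, not a standard API): hashing the configuration yields a stable run ID for reproducibility, and appending JSON records supports later comparison across experiments.

```python
import hashlib
import json
import time

def log_run(config: dict, metrics: dict, path: str = "runs.jsonl") -> str:
    """Append one experiment record to a JSON-lines log.

    Hashing the (key-sorted) config yields a stable run ID, so reruns of
    the same configuration can be matched up when comparing experiments.
    """
    run_id = hashlib.sha256(
        json.dumps(config, sort_keys=True).encode()
    ).hexdigest()[:12]
    record = {"run_id": run_id, "time": time.time(),
              "config": config, "metrics": metrics}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return run_id

run_id = log_run({"lr": 0.01, "layers": 2}, {"val_acc": 0.87})
```

Tools such as MLflow or Weights & Biases provide the same tracking-plus-versioning pattern at production scale.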

Distributed Training

In production, the simplifying assumptions we make in academic work are generally not true.

  • Datasets can’t fit into main memory.
  • Models can’t be trained on a single machine.

“Big data” is when your workflow breaks. — R. Pruim, MDSR2e
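A common first response when a dataset won’t fit in main memory is to stream it from disk in fixed-size batches. Below is a minimal sketch using only the standard library (`stream_batches` is a hypothetical helper, not part of any framework):

```python
import csv
from typing import Dict, Iterator, List

def stream_batches(path: str, batch_size: int = 256) -> Iterator[List[Dict[str, str]]]:
    """Yield fixed-size batches of CSV rows without loading the whole file."""
    with open(path, newline="") as f:
        batch = []
        for row in csv.DictReader(f):
            batch.append(row)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:  # final, possibly smaller, batch
            yield batch
```

A training loop can then consume one batch at a time; the same generator pattern underlies the data loaders in most ML frameworks.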

AutoML

Much of the success of machine learning has come from deploying relatively simple approaches trained on voluminous datasets using powerful compute.

AutoML applies this approach to:

  • Hyper-parameter tuning
  • Architecture search
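Hyper-parameter tuning can be illustrated with a simple random search over a discrete space. This is a sketch, not a production tuner; the `random_search` helper and its `train_and_eval` callback are hypothetical names:

```python
import random

def random_search(train_and_eval, space: dict, n_trials: int = 20, seed: int = 0):
    """Sample random hyper-parameter combinations from `space` and
    return the best-scoring configuration found."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = train_and_eval(config)  # caller trains a model, returns a score
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

AutoML systems replace this blind sampling with smarter strategies (e.g., Bayesian optimization) and extend the search space to the architecture itself.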

Phases of ML Adoption

Huyen presents these phases of ML adoption.

  1. Before ML
  2. Simple models
  3. Optimizing
  4. Complex models

I found these to be a helpful tonic against ML mania.

Model Metrics

You can’t succeed if you can’t measure success.

  • Start with well-understood baseline models.
  • Evaluate beyond aggregate performance metrics:
    • Perturbation tests
    • Invariance tests
    • Calibration tests
    • Slice-based tests
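As one example, an invariance test checks that predictions do not change under label-preserving edits to the input, such as swapping a person's name in a sentiment example. A minimal sketch (the `invariance_test` helper and the `predict` function are hypothetical):

```python
def invariance_test(predict, example, variants) -> bool:
    """Return True if the model's prediction is unchanged on every
    label-preserving variant of the example."""
    baseline = predict(example)
    return all(predict(v) == baseline for v in variants)
```

A perturbation test is the same idea with small noise added to the input, and a slice-based test applies the chosen metric separately to subgroups of the data.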