Systems Engineering for Data Science
Keith VanderLinden
Calvin University
Systems Engineering
Systems engineering conceives of, designs, builds, and deploys systems that satisfy business requirements.
Machine Learning
Machine learning is an approach to learn complex patterns from existing data and use these patterns to make predictions on unseen data.
Machine Learning: Patterns
For machine learning to be viable:
The patterns should be complex and changeable enough that they can’t be specified in advance. Distinguish:
Machine Learning: Data
For machine learning to be viable:
There must be data for learning that’s:
- Appropriate
- Voluminous
- Balanced
- Available
- Unbiased
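As a rough illustration of the "Balanced" criterion above, the sketch below measures each class's share of a labeled dataset. The function names, the example labels, and the 20% threshold are all assumptions made for illustration; a suitable threshold depends on the problem.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the dataset as a fraction of all labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def is_balanced(labels, min_share=0.2):
    """True if every class holds at least min_share of the examples.

    The 0.2 default is an arbitrary, hypothetical threshold.
    """
    return all(share >= min_share for share in class_balance(labels).values())


# Hypothetical label column: 5 positive examples out of 100.
labels = ["spam"] * 5 + ["ham"] * 95
print(class_balance(labels))
print(is_balanced(labels))
```

A check like this only covers balance; whether the data is appropriate or unbiased usually takes domain review rather than a single metric.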
Machine Learning: Predictions
For machine learning to be viable:
Predictions must be:
- Possible
- Non-Mission-Critical
- Valuable
- Appropriate
Software vs ML
Software: Code & Data
- Code/data are separate.
- Only code is versioned.
- Code-bases are small.
- Code is unit-tested.
- Code updates are infrequent.
ML: Datasets and Models
- Data/models are coupled.
- Data/models are versioned.
- Data/models are huge.
- ML is hard to test.
- Data/model updates are frequent.
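One way to picture "Data/models are coupled" and "Data/models are versioned" is to derive a version id from the dataset's content and record it alongside the model trained on it. This is a minimal sketch under that assumption, not a description of any particular versioning tool; `dataset_version`, `model_record`, and the sample records are hypothetical.

```python
import hashlib
import json


def dataset_version(records):
    """Derive a short version id from the dataset's content.

    Any change to the records yields a different id.
    """
    canonical = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]


def model_record(model_name, records, params):
    """Tie a trained model to the exact data version it was trained on."""
    return {
        "model": model_name,
        "data_version": dataset_version(records),
        "params": params,
    }


# Hypothetical training data, before and after one new record arrives.
v1_data = [{"text": "hello", "label": 0}, {"text": "buy now", "label": 1}]
v2_data = v1_data + [{"text": "free prize", "label": 1}]

print(model_record("spam-classifier", v1_data, {"learning_rate": 0.01}))
print(dataset_version(v1_data) != dataset_version(v2_data))
```

Hashing the content rather than trusting a file name means a silent change to the data always shows up as a new version.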
Systems Engineering Process
Software Engineering
- Analysis
- Design
- Implementation
- Testing
- Deployment & Maintenance
ML/Data Engineering
- Project Scoping
- Data & Model Engineering
- System Deployment
- System Monitoring
- Business Analysis
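The System Monitoring step can take many forms. As one hedged example, the sketch below flags when a feature's live values have drifted away from the values seen in training. The function names, the sample numbers, and the threshold of 2 standard deviations are assumptions for illustration only.

```python
from statistics import mean, stdev


def drift_score(train_values, live_values):
    """Shift of the live mean, measured in training standard deviations."""
    spread = stdev(train_values)
    shift = abs(mean(live_values) - mean(train_values))
    if spread == 0:
        # Constant training data: any shift at all counts as unbounded drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / spread


def needs_review(train_values, live_values, threshold=2.0):
    """True if the live data has moved more than `threshold` deviations."""
    return drift_score(train_values, live_values) > threshold


# Hypothetical feature values seen in training and in production.
train = [10, 12, 11, 13, 9]
print(needs_review(train, [11, 12, 10]))
print(needs_review(train, [20, 21, 19]))
```

A flagged drift would typically feed back into the earlier steps, for example prompting new data collection or retraining.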