class: left, top, title-slide

.title[
# Structuring Predictive Analytics Projects
]

.author[
### Ken Arnold
Calvin University
]

---
class: center, middle

## *Connect* the Dots

---

## Connect goal with real-world scenario

- How will this be *useful*?
- To whom?
- Necessary conditions for the model?

---

## Connect data with goal

- Source
- Structure

---

## Connect EDA with goal

- Plots and tables
- Insights and Implications

---

## Connect validation with real-world scenario

- How will we measure if it's useful?
- What evaluation metrics will tell us that? Multiple metrics?
- Data splitting strategy (not just random)?

---

## Connect modeling with data (structure, EDA) and goal

- What target? How does this connect with goal? How does your data represent this?
- What features? Why might each one be useful?

---

## Go beyond the numbers

- What's the model doing?
  - Make a simple model, show its thinking
  - Variable importance plots
  - Other explanation techniques: see <https://www.tmwr.org/explain.html>
- EDA on modeling results
  - Where did it work well?
  - Where did it mess up?
  - Insights and implications

Draw implications for improving model, data, etc.

---

## Compare models

Compare multiple models by accuracy and at least one other characteristic

- Robustness
- Understandability
- Complexity
- etc.

---

## Connect results with goal and real world

- In what ways does it succeed (or not) at the goal?
- Recommendations for business use of this model

---

## Be mindful

- of *decisions* and possible alternatives
- of *limitations*: data, validation, conclusions
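
---

## Sketch: a splitting strategy that is not just random

The validation slide asks about a "data splitting strategy (not just random)". One possible reading is a group-aware split: if the model will be used on *new* customers, every row for a given customer should land entirely in train or entirely in test. The sketch below is only an illustration under that assumption; `group_split`, the `customer` field, and the `visits` data are hypothetical names invented for this example, not part of any course material or library.

```python
import random
from collections import defaultdict


def group_split(rows, group_key, test_fraction=0.25, seed=0):
    """Split rows so each group lands entirely in train or entirely in test."""
    # Collect rows by group (hypothetical example: one group per customer).
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    # Shuffle the group IDs, not the rows, so randomness applies across groups only.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    # Hold out a fraction of the groups (at least one) for testing.
    n_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:n_test])

    train = [r for g in group_ids if g not in test_ids for r in groups[g]]
    test = [r for g in group_ids if g in test_ids for r in groups[g]]
    return train, test


# Hypothetical data: three visits for each of eight customers.
visits = [{"customer": c, "visit": v} for c in "ABCDEFGH" for v in range(3)]
train, test = group_split(visits, "customer")

train_customers = {r["customer"] for r in train}
test_customers = {r["customer"] for r in test}
```

A plain row-level random split would usually put visits from the same customer on both sides, so the validation score would reflect performance on already-seen customers rather than the real-world scenario. Whether grouping, time ordering, or something else is the right choice depends on how the model will actually be used.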