class: center, middle, inverse, title-slide

# Wrap-up: Communication and Justice
### K Arnold

---

## Q&A

> Forecasting vs Prediction?

Forecast = prediction... about the **future** of a **time series**. ([more](https://stats.stackexchange.com/questions/65287/difference-between-forecast-and-prediction))

> How can we hold big tech companies accountable?

* Regulation (government, or even industry standards)
* B Corporations ("triple bottom line")
* Collective action (Google Walkout, boycotts, etc.)
* Asking hard questions when they try to recruit you

> How do you calibrate a model?

Train another model to predict true probabilities from classifier probabilities (on held-out data). [sklearn tools](https://scikit-learn.org/stable/modules/calibration.html)

---

## Final Presentation Logistics

* See updates about alternative participation opportunities on Moodle.

---

### Clean-up-the-gradebook time

* Check for any missing or zero grades

The grading distribution needs a tweak because of the midterm project:

.pull-left[
Syllabus:

* 5% Preparation activities
* 10% Discussion forums
* 10% Lab exercises
* 20% Homework
* 20% Quizzes
* 10% Midterm
* 25% Final Project
]

.pull-right[
Revised:

* 5% Prep and Participation
* 10% Discussion forums
* 10% Lab exercises
* 20% Homework
* 20% Quizzes
* 10% Midterm Exam
* 10% Midterm Project
* 15% Final Project
]

---

class: center, middle

## Making a Data-Driven Argument

---

### Make a point

.pull-left[
**Report A**

MAIN POINT

* Supporting chart 1
* Supporting table 2
* Supporting model 3

Discussion about how each supports main point
]

.pull-right[
**Report B**

* Chart 1
* Table 2
* Model 3
* Chart 4
* Table 5
* Chart 6
* Model 7
* Chart 12
* Table 25
]

---

### Tell a Story

* Chart 1
* Therefore, chart 2
* BUT, chart 3

[but-therefore](https://www.youtube.com/watch?v=vGUNqq3jVLg)

---

### Anchor conclusions in data

.pull-left[
* The units are probably seconds<br><br>
* The fit looks good<br><br>
* This was surprising
]

--

.pull-right[
* because the median, 600, would be 10 minutes
* because the mean error of $15 is less than 0.1% of the price
* because I expected that people would leave higher ratings on products they enjoyed more
]

---

### Use appropriate language

.pull-left[
**Plain language** for the overview, conclusion, and visuals.

* Labels in visuals: use real names, not `code_names`. (For all aesthetics, not just x and y.)
* Don't assume the reader knows the structure of the data.
]

.pull-right[
**Technical language** when describing methods (data acquisition, wrangling, modeling, etc.).

* What data representation choices did you make? Why?
* What modeling choices? Why? etc.
]

---

### Some color tips

<https://blog.datawrapper.de/beautifulcolors/>

---

class: center, middle

## Start Simple!

---

## Tools for Communication

* Slides
* Xaringan
* RStudio Connect
* GitHub Pages
* Shiny Apps

---

### Example: [Shiny](https://shiny.rstudio.com/) Apps

<https://shiny.rstudio.com/gallery/>

* [Engineering Production-Grade Shiny Apps](https://engineering-shiny.org/)

---

class: center, middle

## Final Thoughts on Data + Justice

With enormous thanks to: Tim Keller, [A Biblical Critique of Secular Justice and Critical Theory](https://quarterly.gospelinlife.com/a-biblical-critique-of-secular-justice-and-critical-theory/)

---

Keller's bullet-point summary of biblical justice:

* Community above individual (voluntarily)
* Equity: equal treatment, dignity
* Collective responsibility
* Individual responsibility
* Advocacy for poor and marginalized

.question[
What does biblical justice require, in the area of data science?
]

*Post your thoughts in your Cohort channel*.
---

### Some things you pointed out in Discussion

* Care for people *together with* care for environment
* Care for individual people affected, not just general economic impacts
* Cherishing and celebrating what God has made (people, natural resources) instead of exploiting
* Need safeguards to protect from the effects of sin

Some Scripture you mentioned:

* Isaiah 56:1, Psalm 82:3

---

### Community

> The righteous (*saddiq*) are willing to disadvantage themselves to advantage the community; the wicked are willing to disadvantage the community to advantage themselves.

So:

* privacy
* integrity in data collection, analysis, reporting, communication

---

### Equity: Everyone must be treated equally and with dignity.

* *direct* impact:
  * fair risk assessment (see Discussion and COMPAS)
  * fair resource allocation
  * fair surveillance (don't hyper-surveil the poor etc.)
* *indirect* impact:
  * don't show ads for criminal background checks more often for Black names
  * don't tolerate higher speech recognition error rates for minorities
  * show a representative diversity of age/gender/race/... in image searches

---

### Should we even be predicting people's lives?

* Risk assessment for criminality, loan approval, etc. requires predicting people's future actions and situations
* These predictions might be terribly inaccurate. *Should we be trying at all*?

Salganik et al., [**Measuring the predictability of life outcomes with a scientific mass collaboration**](https://www.pnas.org/content/117/15/8398.short). PNAS, April 2020

> Despite using a rich dataset and applying machine-learning methods optimized for prediction, the best predictions were not very accurate and were only slightly better than those from a simple benchmark model.

---

### Corporate responsibility: I am sometimes responsible for and involved in other people’s sins.

* Even if *I* intend no prejudice, my *algorithm* could be prejudiced because of training data.
* Even if my work is honest, I could be supporting a company that exploits other workers directly or relies on conflict minerals and [child labor](https://www.bbc.com/news/world-africa-50812616)
* Environmental responsibility is both individual and collective

???

http://opiniojuris.org/2020/01/13/the-mighty-apple-google-tesla-dell-and-microsoft-in-the-dock-a-look-at-the-child-labour-lawsuit/

---

### Individual responsibility: I am finally responsible for all my sins, but not for all my outcomes.

* I must do what's right, whether or not my company's policies require it.
* When something isn't right, I need to say something even if it risks my job.

---

### Advocacy: We must have special concern for the poor and the marginalized.

* By **listening to** and **amplifying**, not **speaking for**.
* e.g., beware of doing "parachute research" or de-contextualized "Data for Good"

---

### Incarnation

.scripture[
```
In your relationships with one another, have the same mindset as Christ Jesus:
Who, being in very nature God,
did not consider equality with God something to be used to his own advantage;
rather, he made himself nothing
by taking the very nature of a servant,
being made in human likeness.
And being found in appearance as a man,
he humbled himself
by becoming obedient to death—
even death on a cross!
```
.ref[Philippians 2:5-8, NIV]
]

---

## Learning More

---

### Some further reading on data ethics

* [The Oxford Handbook of Ethics of AI](https://global.oup.com/academic/product/the-oxford-handbook-of-ethics-of-ai-9780190067397?cc=ca&lang=en&#)
  * e.g., the chapter on [Race and Gender](https://arxiv.org/abs/1908.06165) was written by Timnit Gebru
* Coded Bias documentary
* Fast.AI [Data Ethics course](https://ethics.fast.ai/)
* [Ethics and Data Science](https://www.amazon.com/Ethics-Data-Science-Mike-Loukides-ebook/dp/B07GTC8ZN7) by Mike Loukides, Hilary Mason, DJ Patil
* Weapons of Math Destruction: *How Big Data Increases Inequality and Threatens Democracy*, by Cathy O'Neil
* [How Charts Lie](https://wwnorton.com/books/9781324001560): *Getting Smarter about Visual Information*, by Alberto Cairo
* [How Deceptive are Deceptive Visualizations?](https://dl.acm.org/doi/10.1145/2702123.2702608) Pandey et al., CHI 2015

---

## Who/What I'm Reading / Following: Data Ethics

* AI Now Institute
* Data and Society
* [AlgorithmWatch](https://algorithmwatch.org/en/)
* [Harvard BKC](https://twitter.com/BKCHarvard)
* Data Feminism
* ACM Conference on Fairness, Accountability, and Transparency ([FAccT](https://facctconference.org/))

People:

* [Timnit Gebru](https://twitter.com/timnitGebru)
* [Rediet Abebe](https://twitter.com/red_abebe)
* [J. Nathan Matias](https://twitter.com/natematias)
* [Joy Buolamwini](https://twitter.com/jovialjoy)

---

## Who/What I'm Reading / Following: Tech

* [RStudio AI blog](https://blogs.rstudio.com/ai/)
* [tidyverse blog](https://www.tidyverse.org/blog/)
* [RWeekly](https://rweekly.org/)
* [distill.pub](https://distill.pub/)
* [Harvard Data Science Review](https://hdsr.mitpress.mit.edu/)
* [TWiML Podcast](https://twimlai.com/shows/)
* [Cassie Kozyrkov](https://decision.substack.com/) ([@quaesita](https://twitter.com/quaesita))

---

## "What can I do?"

* Choose jobs carefully ([How to Interview a Tech Company](https://medium.com/@AINowInstitute/how-to-interview-a-tech-company-d4cc74b436e9))
* You *can* make a difference inside even a "bad" company--but recruit a support network like a missionary.
* Listen a lot. To diverse opinions.
* Keep in touch.