When we approach modeling as a series of design choices, we highlight the assumptions and subjectivity of value judgements made at each stage and begin to expose the inherent biases embedded within our models.
1 minute, talk to your neighbors:
| Authors | Reading | Watching |
|---|---|---|
| Cathy O'Neil | | |
| Virginia Eubanks | | Automating Inequality, PBS 2018 |
| Safiya Umoja Noble | | |
| Meredith Broussard | | |
| Janelle Shane | | The danger of AI is weirder than you think, TED 2019 |
| Hannah Fry | | Should Computers Run the World?, Royal Institution 2019 |
| Caroline Criado Perez | | Invisible Women, Engage 2019 |
| Ruha Benjamin | | Ruha's resources for Race After Tech |
| Melanie Mitchell | | The Collapse of Artificial Intelligence, Santa Fe Institute 2019 |
| Sasha Costanza-Chock | | |
| Kate Crawford | | |
| Catherine D'Ignazio & Lauren F. Klein | | |

"Our success, happiness, and wellbeing are never fully of our own making. Others' decisions can profoundly affect the course of our lives...
Arbitrary, inconsistent, or faulty decision-making thus raises serious concerns..."
- Fairness and Machine Learning, Barocas, Hardt, and Narayanan
Anatomy of an AI system, Crawford and Joler
When handing over the tools of mathematics,
we are responsible as educators
for teaching their responsible use.
It is a sin of omission when we fail to acknowledge the consequences of the content we teach, consequences that include both ethical and technical pitfalls.

| Stage | Step |
|---|---|
| Data | 1. Get the data |
| Preprocess | 2. Clean up the data |
| Explore | 3. Explore the data |
| Model | 4. Model it |
| Communicate | 5. Share the results |

• Design
∘ Turn a problem into a data-problem.
∘ Survey or experimental design
∘ Database infrastructure
• Acquire
∘ Survey or experiment
∘ Download the dataset! CSV, API, etc.
∘ Web scraping
• Wrangle
∘ Format
∘ Clean and organize
∘ Check data integrity
• Prepare
∘ Label
∘ Split into training and testing sets
∘ Normalize
• Visualize
∘ Plot the data and become familiar with it
∘ Look for and compare features visually
∘ Consider appropriate models
• Inspect
∘ Exploratory data analysis
∘ Descriptive statistics
∘ Identify features analytically
• Model
∘ Try and compare multiple models
∘ Consider bias and variance
∘ Interpret model and performance
• Validate
∘ Assess model performance on independent test data
∘ Error analysis and stress-test
∘ Consider consequences
• Reflect
∘ Consider contexts, bias, and consequence
∘ Create audit plan
∘ Document - data and model
• Share
∘ Report documentation
∘ Inform policy
∘ Deploy in product
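
The "Prepare" and "Validate" steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a prescribed implementation: the helper names (`train_test_split`, `fit_scaler`, `apply_scaler`), the 25% test fraction, and the sample values are all assumptions made for the example. The one design choice worth noticing is that the normalization statistics come from the training set alone, so the test set stays independent of anything the model is fitted on.

```python
import random
from statistics import mean, stdev


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(values):
    """Return the mean and sample standard deviation of the given values."""
    return mean(values), stdev(values)


def apply_scaler(values, center, scale):
    """Normalize values with statistics that were computed elsewhere."""
    return [(v - center) / scale for v in values]


# Made-up measurements, for illustration only.
data = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]

train, test = train_test_split(data)

# Fit the scaler on the training set only, then reuse it for the test set.
center, scale = fit_scaler(train)
train_scaled = apply_scaler(train, center, scale)
test_scaled = apply_scaler(test, center, scale)
```

Fitting the scaler on all of the data instead would leak information from the test set into the preparation step, which weakens any later claim about performance on independent data.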
Environment

| Stage | Pitfalls |
|---|---|
| Data | Harmful data collection, lack of consent, insecurity or lack of privacy, historical, representational, or measurement bias, ... |
| Preprocess | Labor exploitation, labeling by non-experts, incorrect labeling, trauma experienced by labelers, ... |
| Explore | Feature selection bias, bias in interpretation of data visualization, data manipulation, feature hacking, ... |
| Model | Bias in model choice, model-amplified bias, environmental impact, learning bias, evaluation bias, peripheral modeling, ... |
| Communicate | Biased model interpretation, ignoring variance, rejecting model, deploying harmful products, deployment bias, ... |
| Meta | "Pernicious feedback loops", runaway homogeneity, susceptibility to adversarial attack, lack of oversight or auditing, ... |

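
One concrete way to look for the evaluation bias listed under "Model" is to report performance per group rather than as a single number. The sketch below is only an illustration under stated assumptions: the records, the group labels `"A"` and `"B"`, and the helper `accuracy_by_group` are all made up for the example, and accuracy is just one of many metrics that could be broken down this way.

```python
from collections import defaultdict

# Made-up (group, true_label, predicted_label) records, for illustration only.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 1, 0),
]


def accuracy_by_group(rows):
    """Return a dict mapping each group to the fraction of correct predictions."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in rows:
        total[group] += 1
        correct[group] += int(truth == predicted)
    return {group: correct[group] / total[group] for group in total}


# A single aggregate score can hide large differences between groups.
overall = sum(truth == predicted for _, truth, predicted in records) / len(records)
per_group = accuracy_by_group(records)

print(f"overall accuracy: {overall:.2f}")
for group, accuracy in sorted(per_group.items()):
    print(f"group {group}: {accuracy:.2f}")
```

In this toy data the aggregate score looks acceptable while one group is served much worse than the other, which is the kind of gap an audit plan would aim to surface.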
| Stage | Step |
|---|---|
| Data | Data problem: What will be the bounce height \(h_{bounce}\) of my bouncy ball when dropped from rest from a given drop height \(h_{drop}\)? |
| | Record several slow-motion videos. |
| Preprocess | Randomly choose a subset of videos as the training set. |
| | Parse the training set videos into a table. |
| Explore | Create a scatter plot of \(h_{bounce}\) versus \(h_{drop}\). |
| | Look for features! Notice and wonder. Consider models. |
| Model | Find a best-fit model on the training data. |
| | Validate the model on the testing data. |
| Communicate | Reflect on the process. |
| | Share out. |

[Figure panels: Training Data, Testing Data]
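
The bouncy ball workflow can be sketched end to end in a few lines. Everything specific here is an assumption made for illustration: the measurements are invented stand-ins for values parsed from the videos, the proportional model \(h_{bounce} = k \, h_{drop}\) is only one candidate a class might consider, and the helper names `fit_proportional` and `rmse` are hypothetical.

```python
import random
from math import sqrt

# Made-up (h_drop, h_bounce) pairs in meters, standing in for values
# parsed from the slow-motion videos.
measurements = [
    (0.20, 0.15), (0.40, 0.31), (0.60, 0.44), (0.80, 0.61),
    (1.00, 0.74), (1.20, 0.91), (1.40, 1.04), (1.60, 1.21),
]

# Randomly choose a subset as the training set; the rest is held out for testing.
rng = random.Random(1)
shuffled = measurements[:]
rng.shuffle(shuffled)
test, train = shuffled[:2], shuffled[2:]


def fit_proportional(pairs):
    """Least-squares slope k for the assumed model h_bounce = k * h_drop."""
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)


def rmse(pairs, k):
    """Root-mean-square error of the model's predictions on the given pairs."""
    return sqrt(sum((y - k * x) ** 2 for x, y in pairs) / len(pairs))


# Find the best-fit model on the training data only.
k = fit_proportional(train)

# Validate on the held-out testing data.
train_error = rmse(train, k)
test_error = rmse(test, k)

print(f"k = {k:.3f}, train RMSE = {train_error:.3f} m, test RMSE = {test_error:.3f} m")
```

Comparing the training and testing errors is the validation step: a test error far above the training error would suggest that the chosen model does not generalize beyond the videos it was fitted on.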