It is natural to assume that algorithms are neutral and unbiased, or that a machine learning model trained on “real world” data will inherently reflect the world at large. However, recent news coverage has shown that machine learning is prone to amplifying the sexist and racist biases that already exist in our society. In one often-cited example, Amazon trained an AI recruitment tool only to discover that it was biased against women: it discounted any resume that mentioned “women’s,” as in “women’s college” or “captain of women’s soccer.” Why? Because the tool was trained on historical data, in other words, resumes from past hiring decisions that skewed heavily male.
Amazon says the algorithm was never actually deployed, but the episode is a warning about how easily these algorithms can become dangerous. Thankfully, a lot of research has been devoted to this area. But we at ReadyAI also believe that teaching students to think proactively about algorithmic bias is essential if they are to become informed citizens or even future ML engineers.
In this lesson, adapted from WashU, students will explore the ways bias can seep into a machine learning model. Students are asked to imagine that they are data scientists at a bank whose core business is lending money. The bank’s traditional way of approving loans manually is slow and costly, so the students’ task is to automate this process by training a model to predict loan approvals. The catch is that the dataset is purposely designed to be unbalanced: women comprise only 15% of the training data.
In the assignment, students will walk through every step of the machine learning workflow in Python, from data collection, data cleaning, and visualization to model training and testing. In the visualization step, the gender imbalance becomes apparent. However, when they go on to train the model (using logistic regression), the validation score comes out quite high. Only when they look a little closer and calculate validation scores for men and women separately does a big discrepancy emerge. The assignment demonstrates how model evaluation metrics like classification accuracy can be misleadingly high, allowing bias to remain undetected.
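To make the last point concrete, here is a minimal sketch of how a single accuracy number can hide a gap between groups. It is not the assignment's actual code or data: the helper name accuracy_by_group, the labels "M" and "F", and the 20 made-up validation rows (3 of them women, matching the 15% share) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return (overall accuracy, {group: accuracy}) for parallel lists."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group


# Hypothetical validation set: 17 men and 3 women (15% women).
groups = ["M"] * 17 + ["F"] * 3
# 1 = loan approved, 0 = loan denied (made-up labels).
y_true = [1] * 10 + [0] * 7 + [1, 1, 0]
# Made-up predictions: one mistake among the men, two among the women.
y_pred = [1] * 10 + [0] * 6 + [1] + [0, 0, 0]

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
print(f"overall: {overall:.2f}")        # overall: 0.85
for group, acc in per_group.items():
    print(f"{group}: {acc:.2f}")        # M: 0.94, then F: 0.33
```

In this toy example the overall score of 0.85 looks healthy, yet the model is right for only one of the three women. Because women make up so little of the data, their errors barely move the combined number, which is why the scores need to be broken out by group.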
Underrepresentation is one of the most common sources of bias in machine learning algorithms. If the data a model is trained on contains too few samples from one group, the model is unlikely to perform as well for that group as it does for others. This is one reason voice assistants can have trouble understanding accents. It is also what lay behind the Amazon recruiting tool’s bias against women, mentioned above.
We conclude the lesson by prompting students to consider the impact machine learning algorithms can have when they are not used carefully and fairly. Students also discuss what it means for a machine learning model to be fair and share their ideas for alleviating the problem.
Through a hands-on Python assignment, students learn about algorithmic bias firsthand while getting familiar with the machine learning workflow and methods such as logistic regression. This lesson is a great introduction for high school students who have some programming experience or are interested in machine learning. The earlier and more often we discuss AI fairness, the longer we hope it will stay on students’ minds as they interact with various AI systems in their daily lives.