Machine Learning, Higher Education, and Bias

This week I am attending the Open Source Conference (OSCON) and learning about all kinds of new open source software and projects.  One of the popular new technologies being applied to many problems is machine learning, a technique for extracting meaning from data without explicitly programming the computer to follow pre-defined rules.  Instead, the software infers patterns from past data to predict future outcomes.

Machine learning is being used to analyze data in higher education.  For instance, Moodle has a built-in analytics tool that can make predictions, such as whether a student is likely to drop out of a class.  The tool uses aggregated student data from past courses, including the amount of interaction with activities in the class.  (Swarthmore has not activated this Moodle feature and has no plans to turn it on.)
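To make the idea concrete, here is a minimal sketch of how such a prediction could work.  It is not Moodle's actual algorithm, and the data is entirely made up; the point is only that the cutoff is derived from past examples rather than hard-coded by a programmer.

```python
# Toy illustration: predict course completion from activity hours.
# Hypothetical data — not a real Moodle dataset or Moodle's real model.

# Past courses: (hours of activity in the course site, completed?)
past = [(1, False), (2, False), (3, False), (6, True), (8, True), (9, True)]

def learn_threshold(examples):
    """Learn a cutoff as the midpoint between the class averages."""
    completed = [h for h, done in examples if done]
    dropped = [h for h, done in examples if not done]
    return (sum(completed) / len(completed) + sum(dropped) / len(dropped)) / 2

def predict(hours, threshold):
    """Predict completion for a new student."""
    return hours >= threshold

threshold = learn_threshold(past)   # ~4.8, derived from data, not hard-coded
print(predict(3, threshold))        # a low-activity student -> False
print(predict(7, threshold))        # a high-activity student -> True
```

A real system would use many more signals and a more sophisticated model, but the principle is the same: the rule comes from the historical data it was trained on.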

Machine learning can be useful in many other areas as well.  The Slate admissions service has a module that predicts which accepted students will ultimately enroll in the college, based on their application data.

The appeal of these systems is that we can use the data already being collected about students to help achieve better educational outcomes.  We might get an early warning that a student is struggling in a class, or recommend courses of interest to a student.

One of the main concerns about machine learning is that by training the software on past data, it can reinforce existing biases.  One particularly egregious example occurred when a company designed a machine learning program to automatically review resumes and identify the best candidates for job openings.  Because the existing employee pool may not have been particularly diverse:

… the algorithm found two factors to be most indicative of job performance: their name was Jared, and whether they played high school lacrosse.

Companies are on the hook if their hiring algorithms are biased, Dave Gershgorn, Quartz, October 22, 2018

Because the reasons a machine learning algorithm makes particular connections are hidden, these biases can be hard to detect.
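A small sketch shows how this happens.  The data below is entirely synthetic, and the scoring function is deliberately crude, but it illustrates the mechanism: if everyone hired in the past happened to share an irrelevant trait, the software will learn to treat that trait as predictive.

```python
# Toy illustration (entirely synthetic data) of how training on a
# non-diverse past can encode bias: "played_lacrosse" is just a proxy
# for who happened to be hired before, yet it tracks the outcome
# far better than the genuinely relevant feature does.

# Past candidates: (years_experience, played_lacrosse, was_hired)
past_candidates = [
    (5, 1, True), (6, 1, True), (7, 1, True),
    (5, 0, False), (6, 0, False), (8, 0, False),
]

def feature_correlation(examples, index):
    """Crude score: how strongly a feature separates hired from rejected."""
    hired = [ex[index] for ex in examples if ex[2]]
    rejected = [ex[index] for ex in examples if not ex[2]]
    return sum(hired) / len(hired) - sum(rejected) / len(rejected)

# Experience barely separates the groups; the lacrosse proxy dominates.
print(feature_correlation(past_candidates, 0))  # experience: near zero
print(feature_correlation(past_candidates, 1))  # lacrosse proxy: 1.0
```

In a real system the learned weights are buried inside a model with thousands of inputs, which is exactly why such proxies can go unnoticed.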

Swarthmore ITS staff keeps abreast of emerging technologies and carefully evaluates new products, discussing different options with campus constituents before implementing new services.  If you have any questions or concerns, please get in touch with us.