How machine learning can reinforce systemic racism

Over Thanksgiving, the Washington Post ran a profile of the babysitting startup Predictim:

So she turned to Predictim, an online service that uses “advanced artificial intelligence” to assess a babysitter’s personality, and aimed its scanners at one candidate’s thousands of Facebook, Twitter and Instagram posts.

The system offered an automated “risk rating” of the 24-year-old woman, saying she was at a “very low risk” of being a drug abuser. But it gave a slightly higher risk assessment — a 2 out of 5 — for bullying, harassment, being “disrespectful” and having a “bad attitude.”

Machine learning works by making predictions based on a giant corpus of existing data. In systems with a feedback loop, that corpus grows, is corrected, and ideally becomes more accurate over time: if the algorithm's original picks are off, the user lets the software know, and this signal is incorporated back into the corpus. So to be of any use at all, the system broadly depends on two important factors: the quality of the original data, and the quality of the aggregate user signal.
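To make that loop concrete, here is a minimal Python sketch of a predictor that learns from a labelled corpus and then folds user corrections back in. It is a toy under stated assumptions, not a description of how Predictim or any real product works: the class name `FeedbackModel`, its methods, and the `"pattern_x"` feature are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


class FeedbackModel:
    """Toy predictor (hypothetical): estimates how often a feature was
    labelled positive in the corpus it has seen so far."""

    def __init__(self):
        # feature -> [number of positive labels, total number of labels]
        self.counts = defaultdict(lambda: [0, 0])

    def train(self, examples):
        """Add (feature, label) pairs to the corpus."""
        for feature, label in examples:
            self.counts[feature][0] += int(label)
            self.counts[feature][1] += 1

    def predict(self, feature):
        """Return the share of positive labels seen for this feature.
        Falls back to 0.5 when the feature has never been seen."""
        if feature not in self.counts:
            return 0.5
        positives, total = self.counts[feature]
        return positives / total

    def feedback(self, feature, corrected_label):
        """A user correction is just one more labelled example."""
        self.train([(feature, corrected_label)])


model = FeedbackModel()
model.train([("pattern_x", True), ("pattern_x", False)])
print(model.predict("pattern_x"))  # estimate from the original corpus

model.feedback("pattern_x", True)  # one user says the label should be positive
print(model.predict("pattern_x"))  # estimate after the feedback is absorbed
```

Notice that the model has no way to tell a good label from a bad one. Whatever goes into the original corpus or the feedback, accurate or not, shifts the prediction in exactly the same way.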

In the case of Predictim, it needs a great corpus of data about babysitters' social media posts and how those posts relate to their real-world activity. Somehow, it needs to be able to find patterns in the way they use Instagram, say, and how those patterns relate to whether they use drugs or have gone to jail. Then, assuming Predictim has a user feedback component, the users need to accurately gauge whether the algorithm made a good decision. Whereas in many systems a data point might be reinforced by hundreds or thousands of users giving feedback, presumably a babysitter has far fewer interactions with parents. So the quality of each instance of that parental feedback is really important.

It made me think of COMPAS, a commercial system that provides an assessment of how likely a criminal defendant is to recidivate. It is just one of the tools that courts are using to adjust sentences, particularly with respect to parole. Unsurprisingly, when ProPublica analyzed the data, inaccuracies fell along racial lines:

Black defendants were also twice as likely as white defendants to be misclassified as being a higher risk of violent recidivism. And white violent recidivists were 63 percent more likely to have been misclassified as a low risk of violent recidivism, compared with black violent recidivists.

It all comes down to that corpus of data. And when the underlying system of justice is fundamentally racist - as it is in the United States, and in most places - the data will be too. Any machine learning algorithm supported by that data will, in turn, make racist decisions. The biggest difference is that while we've come to understand that the human-powered justice system is beset with bias, that understanding with respect to artificial intelligence is not yet widespread. For many, in fact, the promise of artificial intelligence is specifically - and erroneously - that it is unbiased.
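A small Python sketch shows how that inheritance happens. Everything here is invented for illustration: the two groups, the counts, and the helper names (`make_group`, `true_rate`, `learned_rate`) are assumptions, and none of the numbers come from COMPAS or ProPublica. The setup assumes both groups behave identically in reality, but one group's behavior is recorded more often than the other's.

```python
def make_group(name, true_positives, recorded_positives, size=10):
    """Build synthetic records for one group (all values are made up).

    "truth" is what actually happened; "recorded" is what ended up in
    the dataset, which is all a model would ever get to see.
    """
    return [
        {
            "group": name,
            "truth": i < true_positives,
            "recorded": i < recorded_positives,
        }
        for i in range(size)
    ]


# Assumption: same true rate in both groups (3 of 10), but group "B"
# is recorded as positive three times as often as group "A".
records = make_group("A", 3, 2) + make_group("B", 3, 6)


def rate(records, group, field):
    """Share of rows in `group` where `field` is True."""
    rows = [r for r in records if r["group"] == group]
    return sum(r[field] for r in rows) / len(rows)


def true_rate(records, group):
    return rate(records, group, "truth")


def learned_rate(records, group):
    # A model trained on the dataset can only learn from "recorded".
    return rate(records, group, "recorded")


for group in ("A", "B"):
    print(group, true_rate(records, group), learned_rate(records, group))
```

In this toy setup the true rates are equal, yet anything trained on the recorded column would learn that group B is three times as risky as group A. The skew is in the records, not in the behavior, and the algorithm has no way to tell the difference.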

Do we think parents - particularly in the affluent, white-dominated San Francisco Bay Area communities where Predictim is likely to launch - are more or less likely to give positive feedback to babysitters from communities of color? Do we think the algorithm will mark down people who use language most often used in underrepresented communities in their social media posts?

Of course, this is before we even touch the Minority Report pre-crime implications of technologies like these: they aim to predict how we will act, rather than judge how we have acted. The only possible outcome is that people whose behavior fits within a narrow set of norms will more easily find gainful employment, because the algorithms will be trained to support this behavior, while others will find it harder to get jobs they might, in reality, be able to do better.

It also incentivizes a broad surveillance society and repaints the tracking of data about our actions as a social good. When knowledge about the very existence of surveillance creates a chilling effect on our actions, and knowledge about our actions can be used to influence democratic elections, this is a serious civil liberties issue.

Technology can have a part to play in building safer, fairer societies. But the rules it enforces must be built with care, empathy, and intelligence. There is an enormous part to play here not just for user researchers, but for sociologists, psychologists, criminal justice specialists, and representatives from the communities that will be most affected. Experts matter here. It's just one more reason that every team should incorporate people from a wide range of backgrounds: one way for a team to make better decisions on issues with societal implications is for it to be more inclusive.