Remember that the darkest black box of all is the pink squishy one between your ears
Algorithms are starting to be used in applications with high-stakes, consequential decisions across a variety of domains. These include sentencing criminals, making medical prescriptions, and hiring employees (among many others). In response to this shift towards AI-driven decision making, much ink has been spilled and many brows furrowed in consternation about the problem of “black box” machine learning algorithms. Many journalists and critics have thoughtfully pointed to the potential of algorithms discriminating against minorities, loading on spurious variables that shouldn’t affect consequential decisions, and using inscrutably complicated logic that can’t be rationalized by any human being.
In many situations, these concerns are well-founded and algorithms should be implemented with a great deal of caution. However, as we continue to find new applications for machine learning algorithms, we should not let this focus on algorithmic explainability blind us to a harsh truth about the world: human decisions are often capricious, irrational, and not any more explainable than the most opaque algorithm out there.
For purposes of this discussion, it’s useful to break down applications of algorithms into two categories: one category is for when algorithms are being used to automate a decision that is currently made by humans; the other category is for applications in which algorithms are being used to replace rule-based processes. Rule-based processes are those in which a simple set of easily measured criteria is used to make a decision. Rule-based processes are great precisely because they are so scrutable. Of course, the rules themselves might not be great (as in many mandatory sentencing statutes), but at least rule-based processes have clearly articulated criteria that can be debated and evaluated against other proposals.
The value of “explainability” in this second category of applications is quite apparent. Moving from a rule-based world to the black box world of random forests and neural nets can understandably be disorienting for policymakers. If a university used to have simple SAT and GPA cut-offs to make its admissions decisions, replacing this process with a deep neural net trained on dozens of features would clearly raise some specific questions about how SAT scores and GPAs factor into the algorithm’s admissions decisions.
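The cut-off process described above can be written down in a few lines, which is exactly what makes it scrutable. The sketch below is purely illustrative: the thresholds, the `Applicant` type, and the function names are hypothetical and not taken from any real admissions policy.

```python
from dataclasses import dataclass

# Hypothetical cut-offs, chosen only for illustration.
SAT_CUTOFF = 1200
GPA_CUTOFF = 3.0


@dataclass
class Applicant:
    sat: int
    gpa: float


def admit(applicant: Applicant) -> bool:
    """Admit only if both published cut-offs are met."""
    return applicant.sat >= SAT_CUTOFF and applicant.gpa >= GPA_CUTOFF


def explain(applicant: Applicant) -> list[str]:
    """List every rule the applicant failed (empty list means admitted)."""
    reasons = []
    if applicant.sat < SAT_CUTOFF:
        reasons.append(f"SAT {applicant.sat} is below the cut-off of {SAT_CUTOFF}")
    if applicant.gpa < GPA_CUTOFF:
        reasons.append(f"GPA {applicant.gpa} is below the cut-off of {GPA_CUTOFF}")
    return reasons
```

Every rejection here comes with a complete, debatable reason. A neural net trained on dozens of features offers no comparable `explain` function for free, which is why explainability matters when an algorithm replaces rules like these.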
However, I do not think the same standards of explainability should be required for applications from the first category, where algorithms are being used to replace purely human decisions. As I’ve mentioned elsewhere (and other researchers have emphasized as well), it is important to evaluate the utility of algorithms against the system that they are replacing. This is why the distinction between the two types of applications (those replacing humans and those replacing rules) is important. And when we focus specifically on applications in which algorithms are replacing humans, it becomes clear that demanding explainability from the algorithm alone is an indefensible double standard.
Humans are predictably irrational
While the latest advancements in machine learning and algorithmic decision making have taken place fairly recently, human brains have been around for a long time. There is plenty of new research emerging about how algorithms make decisions, but researchers have had decades (if not millennia!) to investigate how the human brain makes decisions. And one of the most replicable and consistent findings from this research is that extraneous factors affect human decisions in almost every context imaginable.
A simple example of this is what psychologists call the “anchoring effect”. To demonstrate just how easily humans are influenced by irrelevant information, consider this classic study by Ariely, Loewenstein, & Prelec (2003): The researchers asked students to write down the last two digits of their social security numbers and indicate whether they would be willing to pay that amount for a box of chocolates. To elicit the students’ true valuation of the chocolates, they then had the students bid on the box in an enforced auction. While it should be clear to you and me that the last two digits of your SSN (essentially a random number) should have no bearing on how much you value a box of chocolates, the researchers found a significant correlation between the SSN digits and the students’ actual willingness-to-pay. Furthermore, despite statistical evidence to the contrary, the vast majority of students insisted that their SSN digits had zero impact on their bids.
Another widely publicized example of irrelevant factors influencing human decisions is the “hungry judges” study. The study’s results suggest that judges are more likely to grant favorable parole decisions to defendants just after their lunch break (when their stomachs are full) than just before their lunch break (when their blood sugar is low).
Maybe you have some misgivings about these particular examples: they feel too contrived, the stakes aren’t high enough, the sample sizes weren’t big enough, or the confounding variables weren’t sufficiently controlled for. (Valid criticisms of these studies do exist.) You are more than welcome to ignore these studies, but there are hundreds of well-researched examples of major cognitive biases. Indeed, the behavioral economist Richard Thaler recently won the Nobel Prize, largely for his career’s worth of work demonstrating that these cognitive biases persist even in high-stakes situations with significant consequences. What you can’t ignore is the overwhelming conclusion from this vast body of research on judgment and decision making: humans consistently let extraneous factors affect their decisions.
At least we can explain ourselves… right?
While cognitive biases are pernicious themselves, what’s worse is that when you ask people to explain their decisions, they often have no idea why they acted the way they did. Just as Ariely’s students insisted that their social security numbers did not affect how they perceived the box of chocolates, we often aren’t even aware of how biases enter into our thought processes. Furthermore, even when we do provide plausible reasons for a particular decision, there is ample evidence that these are often mere confabulations.
A classic paper that demonstrates these effects is “Telling more than we can know” by Nisbett and Wilson (1977). I highly recommend reading the entire paper to fully appreciate just how absurdly common it is for humans to pull plausible rationalizations out of thin air, but I will let a simple summary from their abstract illustrate the point:
Evidence is reviewed which suggests that there may be little or no direct introspective access to higher order cognitive processes. Subjects are sometimes (a) unaware of the existence of a stimulus that importantly influenced a response, (b) unaware of the existence of the response, and (c) unaware that the stimulus has affected the response.
This is all a fancy academic way of saying that people often have no idea why they made a particular decision, even when researchers can statistically prove that extraneous factors are involved.
Algorithms aren’t so bad after all
When we properly evaluate the use of algorithms to automate human decisions, keeping in mind the prevalence and predictability of our own cognitive biases, they actually start to look quite favorable in comparison. At least an algorithm will give you the same answer at both the beginning and end of its shift. Algorithms also don’t have any social reputations or egos to maintain. So when we start peeking under the hood and investigating how they arrived at a particular decision, they can’t defend themselves with seemingly plausible, post hoc, just-so rationalizations.
Don’t get me wrong: I am all for a better understanding of how opaque algorithms make their decisions. But it’s time we stop fooling ourselves into believing that human beings are any less opaque when it comes to rationalizing their decisions. In fact, it is only with the determinism and consistency of algorithms — not the unpredictability and capriciousness of humans — that we can even begin to rigorously interrogate their logic and measure their improvement over time.
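One way to picture what “rigorously interrogating” a deterministic decision maker could look like is a counterfactual probe: hold every input fixed, vary one, and record how the output moves. The sketch below is a toy under stated assumptions. `opaque_score` is a hypothetical stand-in for a black-box model, and its weights and feature names are invented for illustration.

```python
import math


def opaque_score(features: dict) -> float:
    """Hypothetical stand-in for a black-box model.

    It is deterministic, and (by construction) it ignores `hour_of_day`.
    The weights are invented for illustration only.
    """
    z = 0.004 * features["sat"] + 1.1 * features["gpa"] - 8.0
    return 1.0 / (1.0 + math.exp(-z))


def probe(model, baseline: dict, feature: str, values: list) -> dict:
    """Vary one feature while holding the others fixed.

    Returns a mapping from each counterfactual value to the model's output.
    """
    results = {}
    for value in values:
        candidate = dict(baseline, **{feature: value})
        results[value] = model(candidate)
    return results


baseline = {"sat": 1250, "gpa": 3.4, "hour_of_day": 9}

# Does the time of day change the decision? For this toy model it cannot.
by_hour = probe(opaque_score, baseline, "hour_of_day", [9, 12, 16])

# Does GPA change the decision? Here the score rises with GPA.
by_gpa = probe(opaque_score, baseline, "gpa", [2.5, 3.0, 3.5])
```

Because the same inputs always produce the same output, the probe can be repeated as often as needed and the answers can be trusted. The equivalent experiment on a human judge would require presenting the identical case before and after lunch, with no memory of the first hearing.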
We lose understanding, but we gain results
To a social scientist or economist, explainability is absolutely paramount: the primary goal in most scientific research is to arrive at a theory that explains how and why things work the way they do. However, to a consequentialist (i.e., someone whose principal concern is what is actually happening in the world), explainability must take a back seat. If we care about reducing the amount of racial injustice and increasing equitable access for all classes of people, then this is the metric by which we should compare human and algorithmic decision makers.
So long as algorithms actually do reduce bias and discrimination — as they have been shown to do in existing studies on the topic — we should sideline explainability as a secondary priority. Ensuring that algorithms be explainable is no doubt a valuable goal — but those who insist on explainability must ask whether this goal is more valuable than actual outcomes in the systems we are seeking to improve.
“Why do we care so much about explainable algorithms? In defense of the black box” was originally published in Towards Data Science on Medium.