Could Counterfactuals Explain Algorithmic Decisions Without Opening the Black Box?

Algorithmic systems (such as those deciding mortgage applications, or sentencing decisions) can be very difficult to understand, for experts as well as the general public. Image: Ken Lane (CC BY-NC 2.0).

The EU General Data Protection Regulation (GDPR) has sparked much discussion about the “right to explanation” for the algorithm-supported decisions made about us in our everyday lives. While there’s an obvious need for transparency in the automated decisions that are increasingly being made in areas like policing, education, healthcare and recruitment, explaining how these complex algorithmic decision-making systems arrive at any particular decision is a technically challenging problem—to put it mildly.

In their article “Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR” which is forthcoming in the Harvard Journal of Law & Technology, Sandra Wachter, Brent Mittelstadt, and Chris Russell present the concept of “unconditional counterfactual explanations” as a novel type of explanation of automated decisions that could address many of these challenges. Counterfactual explanations describe the minimum conditions that would have led to an alternative decision (e.g. a bank loan being approved), without the need to describe the full logic of the algorithm.

Relying on counterfactual explanations as a means to help us act rather than merely to understand could help us gauge the scope and impact of automated decisions in our lives. They might also help bridge the gap between the interests of data subjects and data controllers, which might otherwise be a barrier to a legally binding right to explanation.

We caught up with the authors to explore the role of algorithms in our everyday lives, and how a “right to explanation” for decisions might be achievable in practice:

Ed: There’s a lot of discussion about algorithmic “black boxes” — where decisions are made about us, using data and algorithms about which we (and perhaps the operator) have no direct understanding. How prevalent are these systems?

Sandra: Basically, every decision that can be made by a human can now be made by an algorithm, which can be a good thing. Algorithms (when we talk about artificial intelligence) are very good at spotting patterns and correlations that even experienced humans might miss, for example in predicting disease. They are also very cost efficient—they don’t get tired, and they don’t need holidays. This could help to cut costs, for example in healthcare.

Algorithms are also certainly more consistent than humans in making decisions. We have the famous example of judges varying the severity of their judgements depending on whether or not they’ve had lunch. That wouldn’t happen with an algorithm. That’s not to say algorithms are always going to make better decisions: but they do make more consistent ones. If the decision is bad, it’ll be distributed equally, but still be bad. Of course, in a certain way humans are also black boxes—we don’t understand what humans do either. But you can at least try to understand an algorithm: it can’t lie, for example.

Brent: In principle, any sector involving human decision-making could be prone to decision-making by algorithms. In practice, we already see algorithmic systems either making automated decisions or producing recommendations for human decision-makers in online search, advertising, shopping, medicine, criminal justice, etc. The information you consume online, the products you are recommended when shopping, the friends and contacts you are encouraged to engage with, even assessments of your likelihood to commit a crime in the immediate and long-term future—all of these tasks can currently be affected by algorithmic decision-making.

Ed: I can see that algorithmic decision-making could be faster and better than human decisions in many situations. Are there downsides?

Sandra: Simple algorithms that follow a basic decision tree (with parameters decided by people) can be easily understood. But we’re now also using much more complex systems like neural nets that act in a very unpredictable way, and that’s the problem. The system is also starting to become autonomous, rather than being under the full control of the operator. You will see the output, but not necessarily why it got there. This also happens with humans, of course: I could be told by a recruiter that my failure to land a job had nothing to do with my gender (even if it did); an algorithm, however, would not intentionally lie. But of course the algorithm might be biased against me if it’s trained on biased data—thereby reproducing the biases of our world.

We have seen that the COMPAS algorithm used by US judges to calculate the probability of re-offending when making sentencing and parole decisions is a major source of discrimination. Data provenance is massively important, and probably one of the reasons why we have biased decisions. We don’t necessarily know where the data comes from, and whether it’s accurate, complete, biased, etc. We need to have lots of standards in place to ensure that the data set is unbiased. Only then can the algorithm produce nondiscriminatory results.

A more fundamental problem with predictions is that you might never know what would have happened—as you’re just dealing with probabilities; with correlations in a population, rather than with causalities. Another problem is that algorithms might produce correct decisions, but not necessarily fair ones. We’ve been wrestling with the concept of fairness for centuries, without consensus. But lack of fairness is certainly something the system won’t correct itself—that’s something that society must correct.

Brent: The biases and inequalities that exist in the real world and in real people can easily be transferred to algorithmic systems. Humans training learning systems can inadvertently or purposefully embed biases into the model, for example through labelling content as ‘offensive’ or ‘inoffensive’ based on personal taste. Once learned, these biases can spread at scale, exacerbating existing inequalities. Eliminating these biases can be very difficult, hence we currently see much research done on the measurement of fairness or detection of discrimination in algorithmic systems.

These systems can also be very difficult—if not impossible—to understand, for experts as well as the general public. We might traditionally expect to be able to question the reasoning of a human decision-maker, even if imperfectly, but the rationale of many complex algorithmic systems can be highly inaccessible to people affected by their decisions. These potential risks aren’t necessarily reasons to forego algorithmic decision-making altogether; rather, they can be seen as potential effects to be mitigated through other means (e.g. a loan programme weighted towards historically disadvantaged communities), or at least to be weighed against the potential benefits when choosing whether or not to adopt a system.

Ed: So it sounds like many algorithmic decisions could be too complex to “explain” to someone, even if a right to explanation became law. But you propose “counterfactual explanations” as an alternative— i.e. explaining to the subject what would have to change (e.g. about a job application) for a different decision to be arrived at. How does this simplify things?

Brent: So rather than trying to explain the entire rationale of a highly complex decision-making process, counterfactuals allow us to provide simple statements about what would have needed to be different about an individual’s situation to get a different, preferred outcome. You basically work from the outcome: you say “I am here; what is the minimum I need to do to get there?” By providing simple statements that are generally meaningful, and that reveal a small bit of the rationale of a decision, the individual has grounds to change their situation or contest the decision, regardless of their technical expertise. Understanding even a bit of how a decision is made is better than being told “sorry, you wouldn’t understand”—at least in terms of fostering trust in the system.

Sandra: And the nice thing about counterfactuals is that they work with highly complex systems, like neural nets. They don’t explain why something happened, but they explain what happened. And three things people might want to know are:

(1) What happened: why did I not get the loan (or get refused parole, etc.)?

(2) Information so I can contest the decision if I think it’s inaccurate or unfair.

(3) Even if the decision was accurate and fair, tell me what I can do to improve my chances in the future.

Machine learning and neural nets make use of so much information that individuals have really no oversight of what they’re processing, so it’s much easier to give someone an explanation of the key variables that affected the decision. With the counterfactual idea of a “close possible world” you give an indication of the minimal changes required to get what you actually want.

Ed: So would a series of counterfactuals (e.g. “over 18” “no prior convictions” “no debt”) essentially define a space within which a certain decision is likely to be reached? This decision space could presumably be graphed quite easily, to help people understand what factors will likely be important in reaching a decision?

Brent: This would only work for highly simplistic, linear models, which are not normally the type that confound human capacities for understanding. The complex systems that we refer to as ‘black boxes’ are highly dimensional and involve a multitude of (probabilistic) dependencies between variables that can’t be graphed simply. It may be the case that if I were aged between 35-40 with an income of £30,000, I would not get a loan. But, I could be told that if I had an income of £35,000, I would have gotten the loan. I may then assume that an income over £35,000 guarantees me a loan in the future. But, it may turn out that I would be refused a loan with an income above £40,000 because of a change in tax bracket. Non-linear relationships of this type can make it misleading to graph decision spaces. For simple linear models, such a graph may be a very good idea, but not for black box systems; they could, in fact, be highly misleading.

Chris: As Brent says, we’re concerned with understanding complicated algorithms that don’t just use hard cut-offs based on binary features. To use your example, maybe a little bit of debt is acceptable, but it would increase your risk of default slightly, so the amount of money you need to earn would go up. Or maybe certain convictions committed in the past also only increase your risk of defaulting slightly, and can be compensated for with higher salary. It’s not at all obvious how you could graph these complicated interdependencies over many variables together. This is why we picked on counterfactuals as a way to give people a direct and easy to understand path to move from the decision they got now, to a more favourable one at a later date.

Ed: But could a counterfactual approach just end up kicking the can down the road, if we know “how” a particular decision was reached, but not “why” the algorithm was weighted in such a way to produce that decision?

Brent: It depends what we mean by “why”. If this is “why” in the sense of, why was the system designed this way, to consider this type of data for this task, then we should be asking these questions while these systems are designed and deployed. Counterfactuals address decisions that have already been made, but still can reveal uncomfortable knowledge about a system’s design and functionality. So it can certainly inform “why” questions.

Sandra: Just to echo Brent, we don’t want to imply that asking the “why” is unimportant—I think it’s very important, and interpretability as a field has to be pursued, particularly if we’re using algorithms in highly sensitive areas. Even if we have the “what”, the “why” question is still necessary to ensure the safety of those systems.

Chris: And anyone who’s talked to a three-year old knows there is an endless stream of “Why” questions that can be asked. But already, counterfactuals provide a major step forward in answering why, compared to previous approaches that were concerned with providing approximate descriptions of how algorithms make decisions—but not the “why” or the external facts leading to that decision. I think when judging the strength of an explanation, you also have to look at questions like “How easy is this to understand?” and “How does this help the person I’m explaining things to?” For me, counterfactuals are a more immediately useful explanation, than something which explains where the weights came from. Even if you did know, what could you do with that information?

Ed: I guess the question of algorithmic decision making in society involves a hugely complex intersection of industry, research, and policy making? Are we control of things?

Sandra: Artificial intelligence (and the technology supporting it) is an area where many sectors are now trying to work together, including in the crucial areas of fairness, transparency and accountability of algorithmic decision-making. I feel at the moment we see a very multi-stakeholder approach, and I hope that continues in the future. We can see for example that industry is very concerned with it—the Partnership in AI is addressing these topics and trying to come up with a set of industry guidelines, recognising the responsibilities inherent in producing these systems. There are also lots of data scientists (e.g. at the OII and Turing Institute) working on these questions. Policy-makers around the world (e.g. UK, EU, US, China) preparing their countries for the AI future, so it’s on everybody’s mind at the moment. It’s an extremely important topic.

Law and ethics obviously has an important role to play. The opacity, unpredictability of AI and its potentially discriminatory nature, requires that we think about the legal and ethical implications very early on. That starts with educating the coding community, and ensuring diversity. At the same time, it’s important to have an interdisciplinary approach. At the moment we’re focusing a bit too much on the STEM subjects; there’s a lot of funding going to those areas (which makes sense, obviously), but the social sciences are currently a bit neglected despite the major role they play in recognising things like discrimination and bias, which you might not recognise from just looking at code.

Brent: Yes—and we’ll need much greater interaction and collaboration between these sectors to stay ‘in control’ of things, so to speak. Policy always has a tendency to lag behind technological developments; the challenge here is to stay close enough to the curve to prevent major issues from arising. The potential for algorithms to transform society is massive, so ensuring a quicker and more reflexive relationship between these sectors than normal is absolutely critical.

Read the full article: Sandra Wachter, Brent Mittelstadt, Chris Russell (2018) Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR. Harvard Journal of Law & Technology (Forthcoming).

This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1.

Sandra Wachter, Brent Mittelstadt and Chris Russell were talking to blog editor David Sutcliffe.

Should there be a better accounting of the algorithms that choose our news for us?

The Facebook Wall, by René C. Nielsen (Flickr).

A central ideal of democracy is that political discourse should allow a fair and critical exchange of ideas and values. But political discourse is unavoidably mediated by the mechanisms and technologies we use to communicate and receive information—and content personalisation systems (think search engines, social media feeds and targeted advertising), and the algorithms they rely upon, create a new type of curated media that can undermine the fairness and quality of political discourse.

A new article by Brent Mittlestadt explores the challenges of enforcing a political right to transparency in content personalisation systems. Firstly, he explains the value of transparency to political discourse and suggests how content personalisation systems undermine open exchange of ideas and evidence among participants: at a minimum, personalisation systems can undermine political discourse by curbing the diversity of ideas that participants encounter. Second, he explores work on the detection of discrimination in algorithmic decision making, including techniques of algorithmic auditing that service providers can employ to detect political bias. Third, he identifies several factors that inhibit auditing and thus indicate reasonable limitations on the ethical duties incurred by service providers—content personalisation systems can function opaquely and be resistant to auditing because of poor accessibility and interpretability of decision-making frameworks. Finally, Brent concludes with reflections on the need for regulation of content personalisation systems.

He notes that no matter how auditing is pursued, standards to detect evidence of political bias in personalised content are urgently required. Methods are needed to routinely and consistently assign political value labels to content delivered by personalisation systems. This is perhaps the most pressing area for future work—to develop practical methods for algorithmic auditing.

The right to transparency in political discourse may seem unusual and farfetched. However, standards already set by the U.S. Federal Communication Commission’s fairness doctrine—no longer in force—and the British Broadcasting Corporation’s fairness principle both demonstrate the importance of the idealised version of political discourse described here. Both precedents promote balance in public political discourse by setting standards for delivery of politically relevant content. Whether it is appropriate to hold service providers that use content personalisation systems to a similar standard remains a crucial question.

Read the full article: Mittelstadt, B. (2016) Auditing for Transparency in Content Personalization Systems. International Journal of Communication 10(2016), 4991–5002.

We caught up with Brent to explore the broader implications of the study:

Ed: We basically accept that the tabloids will be filled with gross bias, populism and lies (in order to sell copy)—and editorial decisions are not generally transparent to us. In terms of their impact on the democratic process, what is the difference between the editorial boardroom and a personalising social media algorithm?

Brent: There are a number of differences. First, although not necessarily transparent to the public, one hopes that editorial boardrooms are at least transparent to those within the news organisations. Editors can discuss and debate the tone and factual accuracy of their stories, explain their reasoning to one another, reflect upon the impact of their decisions on their readers, and generally have a fair debate about the merits and weaknesses of particular content.

This is not the case for a personalising social media algorithm; those working with the algorithm inside a social media company are often unable to explain why the algorithm is functioning in a particular way, or determined a particular story or topic to be ‘trending’ or displayed to particular users, while others are not. It is also far more difficult to ‘fact check’ algorithmically curated news; a news item can be widely disseminated merely by many users posting or interacting with it, without any purposeful dissemination or fact checking by the platform provider.

Another big difference is the degree to which users can be aware of the bias of the stories they are reading. Whereas a reader of The Daily Mail or The Guardian will have some idea of the values of the paper, the same cannot be said of platforms offering algorithmically curated news and information. The platform can be neutral insofar as it disseminates news items and information reflecting a range of values and political viewpoints. A user will encounter items reflecting her particular values (or, more accurately, her history of interactions with the platform and the values inferred from them), but these values, and their impact on her exposure to alternative viewpoints, may not be apparent to the user.

Ed: And how is content “personalisation” different to content filtering (e.g. as we see with the Great Firewall of China) that people get very worked up about? Should we be more worried about personalisation?

Brent: Personalisation and filtering are essentially the same mechanism; information is tailored to a user or users according to some prevailing criteria. One difference is whether content is merely infeasible to access, or technically inaccessible. Content of all types will typically still be accessible in principle when personalisation is used, but the user will have to make an effort to access content that is not recommended or otherwise given special attention. Filtering systems, in contrast, will impose technical measures to make particular content inaccessible from a particular device or geographical area.

Another difference is the source of the criteria used to set the visibility of different types of content. In the case of personalisation, these criteria are typically based on the users (inferred) interests, values, past behaviours and explicit requests. Critically, these values are not necessarily apparent to the user. For filtering, criteria are typically externally determined by a third party, often a government. Some types of information are set off limits, according to the prevailing values of the third party. It is the imposition of external values, which limit the capacity of users to access content of their choosing, which often causes an outcry against filtering and censorship.

Importantly, the two mechanisms do not necessarily differ in terms of the transparency of the limiting factors or rules to users. In some cases, such as the recently proposed ban in the UK of adult websites that do not provide meaningful age verification mechanisms, the criteria that determine whether sites are off limits will be publicly known at a general level. In other cases, and especially with personalisation, the user inside the ‘filter bubble’ will be unaware of the rules that determine whether content is (in)accessible. And it is not always the case that the platform provider intentionally keeps these rules secret. Rather, the personalisation algorithms and background analytics that determine the rules can be too complex, inaccessible or poorly understood even by the provider to give the user any meaningful insight.

Ed: Where are these algorithms developed: are they basically all proprietary? i.e. how would you gain oversight of massively valuable and commercially sensitive intellectual property?

Brent: Personalisation algorithms tend to be proprietary, and thus are not normally open to public scrutiny in any meaningful sense. In one sense this is understandable; personalisation algorithms are valuable intellectual property. At the same time the lack of transparency is a problem, as personalisation fundamentally affects how users encounter and digest information on any number of topics. As recently argued, it may be the case that personalisation of news impacts on political and democratic processes. Existing regulatory mechanisms have not been successful in opening up the ‘black box’ so to speak.

It can be argued, however, that legal requirements should be adopted to require these algorithms to be open to public scrutiny due to the fundamental way they shape our consumption of news and information. Oversight can take a number of forms. As I argue in the article, algorithmic auditing is one promising route, performed both internally by the companies themselves, and externally by a government agency or researchers. A good starting point would be for the companies developing and deploying these algorithms to extend their cooperation with researchers, thereby allowing a third party to examine the effects these systems are having on political discourse, and society more broadly.

Ed: By “algorithm audit”—do you mean examining the code and inferring what the outcome might be in terms of bias, or checking the outcome (presumably statistically) and inferring that the algorithm must be introducing bias somewhere? And is it even possible to meaningfully audit personalisation algorithms, when they might rely on vast amounts of unpredictable user feedback to train the system?

Brent: Algorithm auditing can mean both of these things, and more. Audit studies are a tool already in use, whereby human participants introduce different inputs into a system, and examine the effect on the system’s outputs. Similar methods have long been used to detect discriminatory hiring practices, for instance. Code audits are another possibility, but are generally prohibitive due to problems of access and complexity. Also, even if you can access and understand the code of an algorithm, that tells you little about how the algorithm performs in practice when given certain input data. Both the algorithm and input data would need to be audited.

Alternatively, auditing can assess just the outputs of the algorithm; recent work to design mechanisms to detect disparate impact and discrimination, particularly in the Fairness, Accountability and Transparency in Machine Learning (FAT-ML) community, is a great example of this type of auditing. Algorithms can also be designed to attempt to prevent or detect discrimination and other harms as they occur. These methods are as much about the operation of the algorithm, as they are about the nature of the training and input data, which may itself be biased. In short, auditing is very difficult, but there are promising avenues of research and development. Once we have reliable auditing methods, the next major challenge will be to tailor them to specific sectors; a one-size-meets-all approach to auditing is not on the cards.

Ed: Do you think this is a real problem for our democracy? And what is the solution if so?

Brent: It’s difficult to say, in part because access and data to study the effects of personalisation systems are hard to come by. It is one thing to prove that personalisation is occurring on a particular platform, or to show that users are systematically displayed content reflecting a narrow range of values or interests. It is quite another to prove that these effects are having an overall harmful effect on democracy. Digesting information is one of the most basic elements of social and political life, so any mechanism that fundamentally changes how information is encountered should be subject to serious and sustained scrutiny.

Assuming personalisation actually harms democracy or political discourse, mitigating its effects is quite a different issue. Transparency is often treated as the solution, but merely opening up algorithms to public and individual scrutiny will not in itself solve the problem. Information about the functionality and effects of personalisation must be meaningful to users if anything is going to be accomplished.

At a minimum, users of personalisation systems should be given more information about their blind spots, about the types of information they are not seeing, or where they lie on the map of values or criteria used by the system to tailor content to users. A promising step would be proactively giving the user some idea of what the system thinks it knows about them, or how they are being classified or profiled, without the user first needing to ask.

Brent Mittelstadt was talking to blog editor David Sutcliffe.