Could Counterfactuals Explain Algorithmic Decisions Without Opening the Black Box?

The EU General Data Protection Regulation (GDPR) has sparked much discussion about the “right to explanation” for the algorithm-supported decisions made about us in our everyday lives. While there’s an obvious need for transparency in the automated decisions that are increasingly being made in areas like policing, education, healthcare and recruitment, explaining how these complex algorithmic decision-making systems arrive at any particular decision is a technically challenging problem—to put it mildly.

In their article “Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR” which is forthcoming in the Harvard Journal of Law & Technology, Sandra Wachter, Brent Mittelstadt, and Chris Russell present the concept of “unconditional counterfactual explanations” as a novel type of explanation of automated decisions that could address many of these challenges. Counterfactual explanations describe the minimum conditions that would have led to an alternative decision (e.g. a bank loan being approved), without the need to describe the full logic of the algorithm.

Relying on counterfactual explanations as a means to help us act rather than merely to understand could help us gauge the scope and impact of automated decisions in our lives. They might also help bridge the gap between the interests of data subjects and data controllers, which might otherwise be a barrier to a legally binding right to explanation.

We caught up with the authors to explore the role of algorithms in our everyday lives, and how a “right to explanation” for decisions might be achievable in practice:

Ed: There’s a lot of discussion about algorithmic “black boxes” — where decisions are made about us, using data and algorithms about which we (and perhaps the operator) have no direct understanding. How prevalent are these systems?

Sandra: Basically, every decision that can be made by a human can now be made by an algorithm. Which can be a good thing. Algorithms (when we talk about artificial intelligence) are very good at spotting patterns and correlations that even experienced humans might miss, for example in predicting disease. They are also very cost efficient—they don’t get tired, and they don’t need holidays. This could help to cut costs, for example in healthcare.

Algorithms are also certainly more consistent than humans in making decisions. We have the famous example of judges varying the severity of their judgements depending on whether or not they’ve had lunch. That wouldn’t happen with an algorithm. That’s not to say algorithms are always going to make better decisions: but they do make more consistent ones. If the decision is bad, it’ll be distributed equally, but still be bad. Of course, in a certain way humans are also black boxes—we don’t understand what humans do either. But you can at least try to understand an algorithm: it can’t lie, for example.

Brent: In principle, any sector involving human decision-making could be prone to decision-making by algorithms. In practice, we already see algorithmic systems either making automated decisions or producing recommendations for human decision-makers in online search, advertising, shopping, medicine, criminal justice, etc. The information you consume online, the products you are recommended when shopping, the friends and contacts you are encouraged to engage with, even assessments of your likelihood to commit a crime in the immediate and long-term future—all of these tasks can currently be affected by algorithmic decision-making.

Ed: I can see that algorithmic decision-making could be faster and better than human decisions in many situations. Are there downsides?

Sandra: Simple algorithms that follow a basic decision tree (with parameters decided by people) can be easily understood. But we’re now also using much more complex systems like neural nets that act in a very unpredictable way, and that’s the problem. The system is also starting to become autonomous, rather than being under the full control of the operator. You will see the output, but not necessarily why it got there. This also happens with humans, of course: I could be told by a recruiter that my failure to land a job had nothing to do with my gender (even if it did); an algorithm, however, would not intentionally lie. But of course the algorithm might be biased against me if it’s trained on biased data—thereby reproducing the biases of our world.

We have seen that the COMPAS algorithm used by US judges to calculate the probability of re-offending when making sentencing and parole decisions is a major source of discrimination. Data provenance is massively important, and probably one of the reasons why we have biased decisions. We don’t necessarily know where the data comes from, and whether it’s accurate, complete, biased, etc. We need to have lots of standards in place to ensure that the data set is unbiased. Only then can the algorithm produce nondiscriminatory results.

A more fundamental problem with predictions is that you might never know what would have happened—as you’re just dealing with probabilities; with correlations in a population, rather than with causalities. Another problem is that algorithms might produce correct decisions, but not necessarily fair ones. We’ve been wrestling with the concept of fairness for centuries, without consensus. But lack of fairness is certainly something the system won’t correct itself—that’s something that society must correct.

Brent: The biases and inequalities that exist in the real world and in real people can easily be transferred to algorithmic systems. Humans training learning systems can inadvertently or purposefully embed biases into the model, for example through labelling content as ‘offensive’ or ‘inoffensive’ based on personal taste. Once learned, these biases can spread at scale, exacerbating existing inequalities. Eliminating these biases can be very difficult, hence we currently see much research done on the measurement of fairness or detection of discrimination in algorithmic systems.

These systems can also be very difficult—if not impossible—to understand, for experts as well as the general public. We might traditionally expect to be able to question the reasoning of a human decision-maker, even if imperfectly, but the rationale of many complex algorithmic systems can be highly inaccessible to people affected by their decisions. These potential risks aren’t necessarily reasons to forego algorithmic decision-making altogether; rather, they can be seen as potential effects to be mitigated through other means (e.g. a loan programme weighted towards historically disadvantaged communities), or at least to be weighed against the potential benefits when choosing whether or not to adopt a system.

Ed: So it sounds like many algorithmic decisions could be too complex to “explain” to someone, even if a right to explanation became law. But you propose “counterfactual explanations” as an alternative— i.e. explaining to the subject what would have to change (e.g. about a job application) for a different decision to be arrived at. How does this simplify things?

Brent: So rather than trying to explain the entire rationale of a highly complex decision-making process, counterfactuals allow us to provide simple statements about what would have needed to be different about an individual’s situation to get a different, preferred outcome. You basically work from the outcome: you say “I am here; what is the minimum I need to do to get there?” By providing simple statements that are generally meaningful, and that reveal a small bit of the rationale of a decision, the individual has grounds to change their situation or contest the decision, regardless of their technical expertise. Understanding even a bit of how a decision is made is better than being told “sorry, you wouldn’t understand”—at least in terms of fostering trust in the system.

Sandra: And the nice thing about counterfactuals is that they work with highly complex systems, like neural nets. They don’t explain why something happened, but they explain what happened. And three things people might want to know are:

(1) What happened: why did I not get the loan (or get refused parole, etc.)?

(2) Information so I can contest the decision if I think it’s inaccurate or unfair.

(3) Even if the decision was accurate and fair, tell me what I can do to improve my chances in the future.

Machine learning and neural nets make use of so much information that individuals have really no oversight of what they’re processing, so it’s much easier to give someone an explanation of the key variables that affected the decision. With the counterfactual idea of a “close possible world” you give an indication of the minimal changes required to get what you actually want.
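
To make the “close possible world” idea concrete, here is a minimal sketch in Python: a toy, differentiable loan-scoring model and a gradient search for a small change to an applicant’s features that pushes the score over the approval threshold. The model, feature names, weights, and optimisation settings are illustrative assumptions, not the authors’ implementation.

```python
# A minimal, illustrative sketch of the "closest possible world" idea:
# search for a small change to an applicant's features that flips a toy
# loan-scoring model's decision. Everything here (model, weights, feature
# names, optimisation settings) is an assumption for illustration only.
import numpy as np

def score(x, w, b):
    """Toy logistic model: probability that the loan is approved."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def find_counterfactual(x0, w, b, target=0.6, lam=100.0, lr=0.01, steps=1000):
    """Gradient descent on lam * (score - target)^2 + ||x - x0||^2:
    cross the 0.5 decision threshold while staying close to x0."""
    x = x0.astype(float).copy()
    for _ in range(steps):
        p = score(x, w, b)
        grad_pred = 2.0 * lam * (p - target) * p * (1.0 - p) * w  # chain rule through the sigmoid
        grad_dist = 2.0 * (x - x0)                                # pull back towards the original input
        x -= lr * (grad_pred + grad_dist)
    return x

# Hypothetical applicant features: [income (£10k), debt (£10k), years employed]
w = np.array([0.9, -1.2, 0.4])    # illustrative model weights
b = -2.0
x0 = np.array([3.0, 1.5, 2.0])    # this applicant is refused: score < 0.5

x_cf = find_counterfactual(x0, w, b)
print("original:      ", x0, "-> score", round(score(x0, w, b), 3))
print("counterfactual:", np.round(x_cf, 2), "-> score", round(score(x_cf, w, b), 3))
# Reading off x_cf - x0 gives the kind of statement a counterfactual explanation
# makes: e.g. "with roughly £X more income and £Y less debt, the loan is approved".
```

This is only a sketch in the same spirit as the loss-plus-distance formulation discussed in the paper; the specific distance metric and optimisation procedure are simplified here, and non-differentiable models would need a search- or sampling-based approach instead of gradients.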

Ed: So would a series of counterfactuals (e.g. “over 18” “no prior convictions” “no debt”) essentially define a space within which a certain decision is likely to be reached? This decision space could presumably be graphed quite easily, to help people understand what factors will likely be important in reaching a decision?

Brent: This would only work for highly simplistic, linear models, which are not normally the type that confound human capacities for understanding. The complex systems that we refer to as ‘black boxes’ are high-dimensional and involve a multitude of (probabilistic) dependencies between variables that can’t be graphed simply. It may be the case that if I were aged between 35 and 40 with an income of £30,000, I would not get a loan. But I could be told that if I had an income of £35,000, I would have gotten the loan. I may then assume that an income over £35,000 guarantees me a loan in the future. But it may turn out that I would be refused a loan with an income above £40,000 because of a change in tax bracket. Non-linear relationships of this type can make it misleading to graph decision spaces. For simple linear models, such a graph may be a very good idea, but not for black box systems; there it could, in fact, be highly misleading.

Chris: As Brent says, we’re concerned with understanding complicated algorithms that don’t just use hard cut-offs based on binary features. To use your example, maybe a little bit of debt is acceptable, but it would increase your risk of default slightly, so the amount of money you need to earn would go up. Or maybe certain past convictions also only increase your risk of defaulting slightly, and can be compensated for with a higher salary. It’s not at all obvious how you could graph these complicated interdependencies over many variables together. This is why we picked counterfactuals as a way to give people a direct and easy-to-understand path to move from the decision they got now to a more favourable one at a later date.

Ed: But could a counterfactual approach just end up kicking the can down the road, if we know “how” a particular decision was reached, but not “why” the algorithm was weighted in such a way to produce that decision?

Brent: It depends what we mean by “why”. If this is “why” in the sense of, why was the system designed this way, to consider this type of data for this task, then we should be asking these questions while these systems are designed and deployed. Counterfactuals address decisions that have already been made, but still can reveal uncomfortable knowledge about a system’s design and functionality. So it can certainly inform “why” questions.

Sandra: Just to echo Brent, we don’t want to imply that asking the “why” is unimportant—I think it’s very important, and interpretability as a field has to be pursued, particularly if we’re using algorithms in highly sensitive areas. Even if we have the “what”, the “why” question is still necessary to ensure the safety of those systems.

Chris: And anyone who’s talked to a three-year-old knows there is an endless stream of “Why” questions that can be asked. But already, counterfactuals provide a major step forward in answering why, compared to previous approaches that were concerned with providing approximate descriptions of how algorithms make decisions—but not the “why” or the external facts leading to that decision. I think when judging the strength of an explanation, you also have to look at questions like “How easy is this to understand?” and “How does this help the person I’m explaining things to?” For me, counterfactuals are a more immediately useful explanation than something which explains where the weights came from. Even if you did know, what could you do with that information?

Ed: I guess the question of algorithmic decision-making in society involves a hugely complex intersection of industry, research, and policy-making? Are we in control of things?

Sandra: Artificial intelligence (and the technology supporting it) is an area where many sectors are now trying to work together, including in the crucial areas of fairness, transparency and accountability of algorithmic decision-making. I feel at the moment we see a very multi-stakeholder approach, and I hope that continues in the future. We can see for example that industry is very concerned with it—the Partnership on AI is addressing these topics and trying to come up with a set of industry guidelines, recognising the responsibilities inherent in producing these systems. There are also lots of data scientists (e.g. at the OII and the Turing Institute) working on these questions. Policy-makers around the world (e.g. UK, EU, US, China) are preparing their countries for the AI future, so it’s on everybody’s mind at the moment. It’s an extremely important topic.

Law and ethics obviously have an important role to play. The opacity and unpredictability of AI, and its potentially discriminatory nature, require that we think about the legal and ethical implications very early on. That starts with educating the coding community, and ensuring diversity. At the same time, it’s important to have an interdisciplinary approach. At the moment we’re focusing a bit too much on the STEM subjects; there’s a lot of funding going to those areas (which makes sense, obviously), but the social sciences are currently a bit neglected despite the major role they play in recognising things like discrimination and bias, which you might not recognise from just looking at code.

Brent: Yes—and we’ll need much greater interaction and collaboration between these sectors to stay ‘in control’ of things, so to speak. Policy always has a tendency to lag behind technological developments; the challenge here is to stay close enough to the curve to prevent major issues from arising. The potential for algorithms to transform society is massive, so ensuring a quicker and more reflexive relationship between these sectors than normal is absolutely critical.

Read the full article: Sandra Wachter, Brent Mittelstadt, Chris Russell (2018) Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR. Harvard Journal of Law & Technology (Forthcoming).

This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1.


Sandra Wachter, Brent Mittelstadt and Chris Russell were talking to blog editor David Sutcliffe.

Latest Report by UN Special Rapporteur for the Right to Freedom of Expression is a Landmark Document

“The digital access industry is in the business of digital expression (…). Since privately owned networks are indispensable to the contemporary exercise of freedom of expression, their operators also assume critical social and public functions. The industry’s decisions (…) can directly impact freedom of expression and related human rights in both beneficial and detrimental ways.” [Report of the Special Rapporteur on the right to freedom of expression, June 2017]

The Internet is often portrayed as a disruptive equalizer, an information medium able to directly give individuals access to information and provide a platform to share their opinions unmediated. But the Internet is also a tool for surveillance, censorship, and information warfare. Often states drive such practices, but increasingly the private sector plays a role. While states have a clear obligation to protect human rights on the Internet, questions surrounding the human rights accountability of the private sector remain unresolved. This raises the question: what responsibility does the private industry, which owns and runs much of the Internet, have towards human rights?

During the 35th session of the United Nations (UN) Human Rights Council this month, David Kaye, UN Special Rapporteur (UNSR) for the right to freedom of expression, presented his latest report [1], which focuses on the role of the private sector in the provision of Internet and telecommunications access. The UNSR on freedom of expression is an independent expert, appointed by the Human Rights Council to analyse, document, and report on the state of freedom of expression globally [2]. The rapporteur is also expected to make recommendations towards ‘better promoting and protection of the right to freedom of expression’ [3]. In recent years, the UNSRs on freedom of expression increasingly focus on the intersection between access to information, expression, and the Internet [4].

This most recent report is a landmark document. Its focus on the role and responsibilities of the private sector towards the right to freedom of expression presents a necessary step forward in the debate about the responsibility for the realization of human rights online. The report takes on the legal difficulties surrounding the increased reliance of states on access to privately owned networks and data, whether by necessity, through cooperation, or through coercion, for surveillance, security, and service provision. It also tackles the legal responsibilities that private organizations have to respect human rights.

The first half of Kaye’s report emphasises the role of states in protecting the right to freedom of expression and access to information online, in particular in the context of state-mandated Internet shutdowns and private-public data sharing. Kaye highlights several major Internet shutdowns across the world and argues that considering ‘the number of essential activities and services they affect, shutdowns restrict expression and interfere with other fundamental rights’ [5]. In order to address this issue, he recommends that the Human Rights Council supplements and specifies resolution 32/13, on ‘the promotion, protection and enjoyment of human rights on the Internet’ [6], in which it condemns such disruptions to the network. On the interaction between private actors and the state, Kaye walks a delicate line. On the one hand, he argues that governments should not pressure or threaten companies to provide them with access to data. On the other hand, he also argues that states should not allow companies to make network management decisions that treat data differentially based on its origin.

The second half of the report focusses on the responsibility of the private sector. In this context, the UNSR highlights the responsibilities of private actors towards the right to freedom of expression. Kaye argues that this sector plays a crucial role in providing access to information and communication services to millions across the globe. He looks specifically at the role of telecommunication and Internet service providers, Internet exchange points, content delivery networks, network equipment vendors, and other private actors. He argues that four contextual factors are relevant to understanding the responsibility of private actors vis-à-vis human rights:

(1) private actors provide access to ‘a public good’,
(2) due to the technical nature of the Internet, any restrictions on access affect freedom of expression on a global level,
(3) the private sector is vulnerable to state pressure,
(4) but it is also in a unique position to respect users’ rights.

The report draws out the dilemma of the boundaries of responsibility. When should companies decide to comply with state policies that might undermine the rights of Internet end-users? What remedies should they offer end-users if they are complicit in human rights violations? How can private actors assess what impact their technologies might have on human rights?

Private actors across the spectrum, from multinational social media platforms to garage-based start-ups, are likely to run into these questions. As the Internet underpins a large part of the functioning of our societies, and will only continue to do so as physical devices increasingly become part of the network (aka the Internet of Things), it is even more important to understand and allocate private sector responsibility for protecting human rights.

The report has a dedicated addendum [7] that specifically details the responsibility of Internet Standard Developing Organizations (SDOs). In it, Kaye relies on the article written by Corinne Cath and Luciano Floridi of the Oxford Internet Institute (OII) entitled ‘The Design of the Internet’s Architecture by the Internet Engineering Task Force (IETF) and Human Rights’ [8] to support his argument that SDOs should take on a credible approach to human rights accountability.

Overall, Kaye argues that companies should adopt the UN Guiding Principles on Business and Human Rights [9], which would provide a ‘minimum baseline for corporate human rights accountability’. To operationalize this commitment, the private sector will need to take several urgent steps. It should ensure that sufficient resources are reserved for meeting its responsibility towards human rights, and it should integrate the principles of due diligence, human rights by design, stakeholder engagement, mitigation of the harms of government-imposed restrictions, transparency, and effective remedies to complement its ‘high level commitment to human rights’.

While this report is not binding [10] on states or companies, it does set out a much-needed and detailed blueprint for how to address questions of corporate responsibility towards human rights in the digital age.

References

[1] https://documents-dds-ny.un.org/doc/UNDOC/GEN/G17/077/46/PDF/G1707746.pdf?OpenElement
[2] http://www.ijrcenter.org/un-special-procedures/
[3] http://www.ohchr.org/EN/Issues/FreedomOpinion/Pages/OpinionIndex.aspx
[4] http://www2.ohchr.org/english/bodies/hrcouncil/docs/17session/A.HRC.17.27_en.pdf
[5] The author of this blog has written about this issue here: https://www.cfr.org/blog-post/should-technical-actors-play-political-role-internet-age
[6] http://ap.ohchr.org/documents/dpage_e.aspx?si=A/HRC/32/L.20
[7] https://documents-dds-ny.un.org/doc/UNDOC/GEN/G17/141/31/PDF/G1714131.pdf?OpenElement
[8] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2912308
[9] http://www.ohchr.org/Documents/Publications/GuidingPrinciplesBusinessHR_EN.pdf
[10] http://www.ohchr.org/Documents/Publications/FactSheet27en.pdf

Our knowledge of how automated agents interact is rather poor (and that could be a problem)

Recent years have seen a huge increase in the number of bots online — including search engine Web crawlers, online customer service chat bots, social media spambots, and content-editing bots in online collaborative communities like Wikipedia. (Bots are important contributors to Wikipedia, completing about 15% of all Wikipedia edits overall in 2014, and more than 50% in certain language editions.)

While the online world has turned into an ecosystem of bots (by which we mean computer scripts that automatically handle repetitive and mundane tasks), our knowledge of how these automated agents interact with each other is rather poor. But since bots are automata without the capacity for emotions, meaning-making, creativity, or sociality, we might expect their interactions to be relatively predictable and uneventful.

In their PLOS ONE article “Even good bots fight: The case of Wikipedia“, Milena Tsvetkova, Ruth García-Gavilanes, Luciano Floridi, and Taha Yasseri analyze the interactions between bots that edit articles on Wikipedia. They track the extent to which bots undid each other’s edits over the period 2001–2010, model how pairs of bots interact over time, and identify different types of interaction outcomes. Although Wikipedia bots are intended to support the encyclopaedia — identifying and undoing vandalism, enforcing bans, checking spelling, creating inter-language links, importing content automatically, mining data, identifying copyright violations, greeting newcomers, etc. — the authors find they often undid each other’s edits, with these sterile “fights” sometimes continuing for years.
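
As an aside on method: a common way to identify such reverts in an edit history is to treat an edit as a revert when it restores a previously seen version of the page, for example by matching content hashes. The sketch below illustrates that idea on a toy revision list; the revision format and bot names are illustrative assumptions, not the dataset or code used in the study.

```python
# Hedged sketch: detect bot-bot reverts in a page's revision history by
# checking when an edit restores a previously seen content hash (sha1).
# The revision format and bot names below are illustrative assumptions.
def bot_bot_reverts(revisions, bot_names):
    """Return (reverting_bot, reverted_bot) pairs from revisions ordered oldest to newest."""
    first_seen = {}   # sha1 -> index of the earliest revision with that content
    pairs = []
    for i, rev in enumerate(revisions):
        sha1 = rev["sha1"]
        if sha1 in first_seen:
            # Every revision between the restored state and this edit was undone.
            for j in range(first_seen[sha1] + 1, i):
                reverter, reverted = rev["user"], revisions[j]["user"]
                if reverter in bot_names and reverted in bot_names and reverter != reverted:
                    pairs.append((reverter, reverted))
        else:
            first_seen[sha1] = i
    return pairs

# Toy example with hypothetical bot accounts:
history = [
    {"user": "AlphaBot", "sha1": "aaa"},
    {"user": "BetaBot",  "sha1": "bbb"},   # changes the page
    {"user": "AlphaBot", "sha1": "aaa"},   # restores the earlier state -> reverts BetaBot
]
print(bot_bot_reverts(history, {"AlphaBot", "BetaBot"}))   # [('AlphaBot', 'BetaBot')]
```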

They suggest that even relatively “dumb” bots may give rise to complex interactions, carrying important implications for Artificial Intelligence research. Understanding these bot-bot interactions will be crucial for managing social media, providing adequate cyber-security, and designing autonomous vehicles (that don’t crash).

We caught up with Taha Yasseri and Luciano Floridi to discuss the implications of the findings:

Ed.: Is there any particular difference between the way individual bots interact (and maybe get bogged down in conflict), and lines of vast and complex code interacting badly, or having unforeseen results (e.g. flash-crashes in automated trading): i.e. is this just (another) example of us not always being able to anticipate how code interacts in the wild?

Taha: There are similarities and differences. The most notable difference is that here the bots are not competing. They all work based on the same rules and, more importantly, towards the same goal: to increase the quality of the encyclopaedia. Considering these features, the rather antagonistic interactions between the bots come as a surprise.

Ed.: Wikipedia have said that they know about it, and that it’s a minor problem: but I suppose Wikipedia presents a nice, open, benevolent system to make a start on examining and understanding bot interactions. What other bot-systems are you aware of, or that you could have looked at?

Taha: In terms of content-generating bots, Twitter bots have turned out to be very important for online propaganda. The crawler bots that collect information from social media or the web (such as personal information or email addresses) are also being heavily deployed. In fact, we have come up with a first typology of Internet bots based on their type of action and their intentions (benevolent vs malevolent), which is presented in the article.

Ed.: You’ve also done work on human collaborations (e.g. in the citizen science projects of the Zooniverse) — is there any work comparing human collaborations with bot collaborations — or even examining human-bot collaborations and interactions?

Taha: In the present work we do compare bot-bot interactions with human-human interactions to observe similarities and differences. The most striking difference is in the dynamics of negative interactions. While human conflicts heat up very quickly and then disappear after a while, bots undo each other’s contributions in a steady flow that might persist over years. In the HUMANE project, we discuss the co-existence of humans and machines in the digital world from a theoretical point of view, and there we discuss such ecosystems in detail.

Ed.: Humans obviously interact badly, fairly often (despite being a social species) .. why should we be particularly worried about how bots interact with each other, given humans seem to expect and cope with social inefficiency, annoyances, conflict and break-down? Isn’t this just more of the same?

Luciano: The fact that bots can be as bad as humans is far from reassuring. The fact that this happens even when they are programmed to collaborate is more disconcerting than what happens among humans when they compete, or fight each other. Here we have very elementary mechanisms that, through simple interactions, generate messy and conflictual outcomes. One may hope this is not evidence of what may happen when more complex systems and interactions are in question. The lesson I learnt from all this is that without rules or some kind of normative framework that promotes collaboration, not even good mechanisms ensure a good outcome.

Read the full article: Tsvetkova M, Garcia-Gavilanes R, Floridi, L, Yasseri T (2017) Even good bots fight: The case of Wikipedia. PLoS ONE 12(2): e0171774. doi:10.1371/journal.pone.0171774


Taha Yasseri and Luciano Floridi were talking to blog editor David Sutcliffe.

Exploring the world of digital detoxing

As our social interactions become increasingly entangled with the online world, there are some who insist on the benefits of disconnecting entirely from digital technology. These advocates of “digital detoxing” view digital communication as eroding our ability to concentrate, to empathise, and to have meaningful conversations.

A 2016 survey by OnePoll found that 40% of respondents felt they had “not truly experienced valuable moments such as a child’s first steps or graduation” because “technology got in the way”, and Ofcom’s 2016 survey showed that 15 million British Internet users (a third of those online) have already tried a digital detox. In recent years, America has sought to pathologise a perceived over-use of digital technology as “Internet addiction”. While the term is not recognized by the DSM, the idea is commonly used in media rhetoric and forms an important backdrop to digital detoxing.

The article Disconnect to reconnect: The food/technology metaphor in digital detoxing (First Monday) by Theodora Sutton presents a short ethnography of the digital detoxing community in the San Francisco Bay Area. Her informants attend an annual four-day digital detox and summer camp for adults in the Californian forest called Camp Grounded. She attended two Camp Grounded sessions in 2014, and followed up with semi-structured interviews with eight detoxers.

We caught up with Theodora to examine the implications of the study and to learn more about her PhD research, which focuses on the same field site.

Ed.: In your forthcoming article you say that Camp Grounded attendees used food metaphors (and words like “snacking” and “nutrition”) to understand their own use of technology and behaviour. How useful is this as an analogy?

Theodora: The food/technology analogy is an incredibly neat way to talk about something we think of as immaterial in a more tangible way. We know that our digital world relies on physical connections, but we forget that all the time. Another thing the analogy does, in lending a dietary connotation, is to imply that we should regulate our digital consumption; that there are healthy and unhealthy, or inappropriate, ways of using technology.

I explore more pros and cons to the analogy in the paper, but the biggest con in my opinion is that while it’s neat, it’s often used to make value judgments about technology use. For example, saying that online sociality is like processed food is implying that it lacks authenticity. So the food analogy is a really useful way to understand how people are interpreting technology culturally, but it’s important to be aware of how it’s used.

Ed.: How do people rationalise ideas of the digital being somehow “less real” or “genuine” (less “nourishing”), despite the fact that it obviously is all real: just different? Is it just a peg to blame an “other” and excuse their own behaviour .. rather than just switching off their phones and going for a run / sail etc. (or any other “real” activity..).

Theodora: The idea of new technologies being somehow less real or less natural is a pretty established Western concept, and it’s been fundamental in moral panics following new technologies. That digital sociality is different, not lesser, is something we can academically agree on, but people very often believe otherwise.

My personal view is that figuring out what kind of digital usage suits you and then acting in moderation is ideal, without the need for extreme lengths, but in reality moderation can be quite difficult to achieve. And the thing is, we’re not just talking about choosing to text rather than meet in person, or to read a book instead of going on Twitter. We’re talking about digital activities that are increasingly inescapable and part of life, like work e-mail or government services being moved online.

Going for a run or sailing are, again, privileged activities for people with free time. Many people think getting back to nature or meeting in person are really important for human needs. But, increasingly, not everyone is able to get away from their devices, especially if you don’t have enough money to visit friends or travel to a forest, or you’re just too tired from working all the time. So Camp Grounded is part of what they feel is an urgent conversation about whether the technology we design addresses human, emotional needs.

Ed.: You write in the paper that “upon arrival at Camp Grounded, campers are met with hugs and milk and cookies” .. not to sound horrible, but isn’t this replacing one type of (self-focused) reassurance with another? I mean, it sounds really nice (as does the rest of the Camp), but it sounds a tiny bit like their “problem” is being fetishised / enjoyed a little bit? Or maybe that their problem isn’t to do with technology, but rather with confidence, anxiety etc.

Theodora: The people who run Camp Grounded would tell you themselves that digital detoxing is not really about digital technology. That’s just the current scapegoat for all the alienating aspects of modern life. They also take away real names, work talk, watches, and alcohol. One of the biggest things Camp Grounded tries to do is build up attendees’ confidence to be silly and playful and have their identities less tied to their work persona, which is a bit of a backlash against Silicon Valley’s intense work ethic. Milk and cookies comes from childhood, or America’s summer camps which many attendees went to as children, so it’s one little thing they do to get you to transition into that more relaxed and childlike way of behaving.

I’m not sure about “fetishized,” but Camp Grounded really jumps on board with the technology idea, using really ironic things like an analog dating service called “embers,” a “human powered search” where you pin questions on a big noticeboard and other people answer, and an “inbox” where people leave you letters.

And you’re right, there is an aspect of digital detoxing which is very much a “middle class ailment” in that it can seem rather surface-level and indulgent, and tickets are pretty pricey, making it quite a privileged activity. But at the same time I think it is a genuine conversation starter about our relationship with technology and how it’s designed. I think a digital detox is more than just escapism or reassurance, for them it’s about testing a different lifestyle, seeing what works best for them and learning from that.

Ed.: Many of these technologies are designed to be “addictive” (to use the term loosely: maybe I mean “seductive”) in order to drive engagement and encourage retention: is there maybe an analogy here with foods that are too sugary, salty, fatty (i.e. addictive) for us? I suppose the line between genuine addiction and free choice / agency is a difficult one; and one that may depend largely on the individual. Which presumably makes any attempts to regulate (or even just question) these persuasive digital environments particularly difficult? Given the massive outcry over perfectly rational attempts to tax sugar, fat etc.

Theodora: The analogy between sugary, salty, or fatty foods and seductive technologies is drawn a lot — it was even made by danah boyd in 2009. Digital detoxing comes from a standpoint that tech companies aren’t necessarily working to enable meaningful connection, and are instead aiming to “hook” people in. That’s often compared to food companies that exist to make a profit rather than improve your individual nutrition, using whatever salt, sugar, flavourings, or packaging they have at their disposal to make you keep coming back.

There are two different ways of “fixing” perceived problems with tech: there’s technical fixes that might only let you use the site for certain amounts of time, or re-designing it so that it’s less seductive; then there’s normative fixes, which could be on an individual level deciding to make a change, or even society wide, like the French labour law giving the “right to disconnect” from work emails on evenings and weekends.

One that sort of embodies both of these is The Time Well Spent project, run by Tristan Harris and the OII’s James Williams. They suggest different metrics for tech platforms, such as how well they enable good experiences away from the computer altogether. Like organic food stickers, they’ve suggested putting a stamp on websites whose companies have these different metrics. That could encourage people to demand better online experiences, and encourage tech companies to design accordingly.

So that’s one way that people are thinking about regulating it, but I think we’re still in the stages of sketching out what the actual problems are and thinking about how we can regulate or “fix” them. At the moment, the issue seems to depend on what the individual wants to do. I’d be really interested to know what other ideas people have had to regulate it, though.

Ed.: Without getting into the immense minefield of evolutionary psychology (and whether or not we are creating environments that might be detrimental to us mentally or socially: just as the Big Mac and Krispy Kreme are not brilliant for us nutritionally) — what is the lay of the land — the academic trends and camps — for this larger question of “Internet addiction” .. and whether or not it’s even a thing?

Theodora: In my experience academics don’t consider it a real thing, just as you wouldn’t say someone had an addiction to books. But again, that doesn’t mean it isn’t used all the time as a shorthand. And there are some academics who use it, like Kimberly Young, who proposed it in the 1990s. She still runs an Internet addiction treatment centre in New York, and there’s another in Fall City, Washington state.

The term certainly isn’t going away any time soon and the centres treat people who genuinely seem to have a very problematic relationship with their technology. People like the OII’s Andrew Przybylski (@ShuhBillSkee) are working on untangling this kind of problematic digital use from the idea of addiction, which can be a bit of a defeatist and dramatic term.

Ed.: As an ethnographer working at the Camp according to its rules (hand-written notes, analogue camera) .. did it affect your thinking or subsequent behaviour / habits in any way?

Theodora: Absolutely. In a way that’s a struggle, because I never felt that I wanted or needed a digital detox, yet having been to it three times now I can see the benefits. Going to camp made a strong case for the argument to be more careful with my technology use, for example not checking my phone mid-conversation, and I’ve been much more aware of it since. For me, that’s been part of an on-going debate that I have in my own life, which I think is a really useful fuel towards continuing to unravel this topic in my studies.

Ed.: So what are your plans now for your research in this area — will you be going back to Camp Grounded for another detox?

Theodora: Yes — I’ll be doing an ethnography of the digital detoxing community again this summer for my PhD and that will include attending Camp Grounded again. So far I’ve essentially done just preliminary fieldwork and visited to touch base with my informants. It’s easy to listen to the rhetoric around digital detoxing, but I think what’s been missing is someone spending time with them to really understand their point of view, especially their values, that you can’t always capture in a survey or in interviews.

In my PhD I hope to understand things like: how digital detoxers even think about technology, what kind of strategies they have to use it appropriately once they return from a detox, and how metaphor and language work in talking about the need to “unplug.” The food analogy is just one preliminary finding that shows how fascinating the topic is as soon as you start scratching away the surface.

Read the full article: Sutton, T. (2017) Disconnect to reconnect: The food/technology metaphor in digital detoxing. First Monday 22 (6).


OII DPhil student Theodora Sutton was talking to blog editor David Sutcliffe.

Exploring the world of digital detoxing

As our social interactions become increasingly entangled with the online world, there are some who insist on the benefits of disconnecting entirely from digital technology. These advocates of “digital detoxing” view digital communication as eroding our ability to concentrate, to empathise, and to have meaningful conversations.

A 2016 survey by OnePoll found that 40% of respondents felt they had “not truly experienced valuable moments such as a child’s first steps or graduation” because “technology got in the way”, and OfCom’s 2016 survey showed that 15 million British Internet users (representing a third of those online), have already tried a digital detox. In recent years, America has sought to pathologise a perceived over-use of digital technology as “Internet addiction”. While the term is not recognized by the DSM, the idea is commonly used in media rhetoric and forms an important backdrop to digital detoxing.

The article Disconnect to reconnect: The food/technology metaphor in digital detoxing (First Monday) by Theodora Sutton presents a short ethnography of the digital detoxing community in the San Francisco Bay Area. Her informants attend an annual four-day digital detox and summer camp for adults in the Californian forest called Camp Grounded. She attended two Camp Grounded sessions in 2014, and followed up with semi-structured interviews with eight detoxers.

We caught up with Theodora to examine the implications of the study and to learn more about her PhD research, which focuses on the same field site.

Ed.: In your forthcoming article you say that Camp Grounded attendees used food metaphors (and words like “snacking” and “nutrition”) to understand their own use of technology and behaviour. How useful is this as an analogy?

Theodora: The food/technology analogy is an incredibly neat way to talk about something we think of as immaterial in a more tangible way. We know that our digital world relies on physical connections, but we forget that all the time. Another thing it does in lending a dietary connotation is to imply we should regulate our consumption of digital use; that there are healthy and unhealthy or inappropriate ways of using it.

I explore more pros and cons to the analogy in the paper, but the biggest con in my opinion is that while it’s neat, it’s often used to make value judgments about technology use. For example, saying that online sociality is like processed food is implying that it lacks authenticity. So the food analogy is a really useful way to understand how people are interpreting technology culturally, but it’s important to be aware of how it’s used.

Ed.: How do people rationalise ideas of the digital being somehow “less real” or “genuine” (less “nourishing”), despite the fact that it obviously is all real: just different? Is it just a peg to blame an “other” and excuse their own behaviour .. rather than just switching off their phones and going for a run / sail etc. (or any other “real” activity..).

Theodora: The idea of new technologies being somehow less real or less natural is a pretty established Western concept, and it’s been fundamental in moral panics following new technologies. That digital sociality is different, not lesser, is something we can academically agree on, but people very often believe otherwise.

My personal view is that figuring out what kind of digital usage suits you and then acting in moderation is ideal, without the need for extreme lengths, but in reality moderation can be quite difficult to achieve. And the thing is, we’re not just talking about choosing to text rather than meet in person, or read a book instead of go on Twitter. We’re talking about digital activities that are increasingly inescapable and part of life, like work e-mail or government services being moved online.

The ability to go for a run or go sailing are again privileged activities for people with free time. Many people think getting back to nature or meeting in person are really important for human needs. But increasingly, not everyone has the ability to get away from devices, especially if you don’t have enough money to visit friends or travel to a forest, or you’re just too tired from working all the time. So Camp Grounded is part of what they feel is an urgent conversation about whether the technology we design addresses human, emotional needs.

Ed.: You write in the paper that “upon arrival at Camp Grounded, campers are met with hugs and milk and cookies” .. not to sound horrible, but isn’t this replacing one type of (self-focused) reassurance with another? I mean, it sounds really nice (as does the rest of the Camp), but it sounds a tiny bit like their “problem” is being fetishised / enjoyed a little bit? Or maybe that their problem isn’t to do with technology, but rather with confidence, anxiety etc.

Theodora: The people who run Camp Grounded would tell you themselves that digital detoxing is not really about digital technology. That’s just the current scapegoat for all the alienating aspects of modern life. They also take away real names, work talk, watches, and alcohol. One of the biggest things Camp Grounded tries to do is build up attendees’ confidence to be silly and playful and have their identities less tied to their work persona, which is a bit of a backlash against Silicon Valley’s intense work ethic. Milk and cookies comes from childhood, or America’s summer camps which many attendees went to as children, so it’s one little thing they do to get you to transition into that more relaxed and childlike way of behaving.

I’m not sure about “fetishized,” but Camp Grounded really jumps on board with the technology idea, using really ironic things like an analog dating service called “embers,” a “human powered search” where you pin questions on a big noticeboard and other people answer, and an “inbox” where people leave you letters.

And you’re right, there is an aspect of digital detoxing which is very much a “middle class ailment” in that it can seem rather surface-level and indulgent, and tickets are pretty pricey, making it quite a privileged activity. But at the same time I think it is a genuine conversation starter about our relationship with technology and how it’s designed. I think a digital detox is more than just escapism or reassurance, for them it’s about testing a different lifestyle, seeing what works best for them and learning from that.

Ed.: Many of these technologies are designed to be “addictive” (to use the term loosely: maybe I mean “seductive”) in order to drive engagement and encourage retention: is there maybe an analogy here with foods that are too sugary, salty, fatty (i.e. addictive) for us? I suppose the line between genuine addiction and free choice / agency is a difficult one; and one that may depend largely on the individual. Which presumably makes any attempts to regulate (or even just question) these persuasive digital environments particularly difficult? Given the massive outcry over perfectly rational attempts to tax sugar, fat etc.

Theodora: The analogy between sugary, salty, or fatty foods and seductive technologies is drawn a lot — it was even made by danah boyd in 2009. Digital detoxing comes from a standpoint that tech companies aren’t necessarily working to enable meaningful connection, and are instead aiming to “hook” people in. That’s often compared to food companies that exist to make a profit rather than improve your individual nutrition, using whatever salt, sugar, flavourings, or packaging they have at their disposal to make you keep coming back.

There are two different ways of “fixing” perceived problems with tech: there’s technical fixes that might only let you use the site for certain amounts of time, or re-designing it so that it’s less seductive; then there’s normative fixes, which could be on an individual level deciding to make a change, or even society wide, like the French labour law giving the “right to disconnect” from work emails on evenings and weekends.

One that sort of embodies both of these is The Time Well Spent project, run by Tristan Harris and the OII’s James Williams. They suggest different metrics for tech platforms, such as how well they enable good experiences away from the computer altogether. Like organic food stickers, they’ve suggested putting a stamp on websites whose companies have these different metrics. That could encourage people to demand better online experiences, and encourage tech companies to design accordingly.

So that’s one way that people are thinking about regulating it, but I think we’re still in the stages of sketching out what the actual problems are and thinking about how we can regulate or “fix” them. At the moment, the issue seems to depend on what the individual wants to do. I’d be really interested to know what other ideas people have had to regulate it, though.

Ed.: Without getting into the immense minefield of evolutionary psychology (and whether or not we are creating environments that might be detrimental to us mentally or socially: just as the Big Mac and Krispy Kreme are not brilliant for us nutritionally) — what is the lay of the land — the academic trends and camps — for this larger question of “Internet addiction” .. and whether or not it’s even a thing?

Theodora: In my experience academics don’t consider it a real thing, just as you wouldn’t say someone had an addiction to books. But again, that doesn’t mean it isn’t used all the time as a shorthand. And there are some academics who use it, like Kimberly Young, who proposed it in the 1990s. She still runs an Internet addiction treatment centre in New York, and there’s another in Fall City, Washington state.

The term certainly isn’t going away any time soon and the centres treat people who genuinely seem to have a very problematic relationship with their technology. People like the OII’s Andrew Przybylski (@ShuhBillSkee) are working on untangling this kind of problematic digital use from the idea of addiction, which can be a bit of a defeatist and dramatic term.

Ed.: As an ethnographer working at the Camp according to its rules (hand-written notes, analogue camera) .. did it affect your thinking or subsequent behaviour / habits in any way?

Theodora: Absolutely. In a way that’s a struggle, because I never felt that I wanted or needed a digital detox, yet having been to it three times now I can see the benefits. Going to camp made a strong case for being more careful with my technology use, for example not checking my phone mid-conversation, and I’ve been much more aware of it since. For me, that’s been part of an ongoing debate in my own life, which I think is really useful fuel for continuing to unravel this topic in my studies.

Ed.: So what are your plans now for your research in this area — will you be going back to Camp Grounded for another detox?

Theodora: Yes — I’ll be doing an ethnography of the digital detoxing community again this summer for my PhD, and that will include attending Camp Grounded again. So far I’ve essentially done preliminary fieldwork and visited to touch base with my informants. It’s easy to listen to the rhetoric around digital detoxing, but I think what’s been missing is someone spending time with detoxers to really understand their point of view, especially their values, which you can’t always capture in a survey or in interviews.

In my PhD I hope to understand things like: how digital detoxers even think about technology, what kind of strategies they have to use it appropriately once they return from a detox, and how metaphor and language work in talking about the need to “unplug.” The food analogy is just one preliminary finding that shows how fascinating the topic is as soon as you start scratching away the surface.

Read the full article: Sutton, T. (2017) Disconnect to reconnect: The food/technology metaphor in digital detoxing. First Monday 22 (6).


OII DPhil student Theodora Sutton was talking to blog editor David Sutcliffe.

Five Pieces You Should Probably Read On: Fake News and Filter Bubbles

This is the second post in a series that will uncover great writing by faculty and students at the Oxford Internet Institute, things you should probably know, and things that deserve to be brought out for another viewing. This week: Fake News and Filter Bubbles!

Fake news, post-truth, “alternative facts”, filter bubbles — this is the news and media environment we apparently now inhabit, and that has formed the fabric and backdrop of Brexit (“£350 million a week”) and Trump (“This was the largest audience to ever witness an inauguration — period”). Do social media divide us, hide us from each other? Are you particularly aware of what content is personalised for you, what it is you’re not seeing? How much can we do with machine-automated or crowd-sourced verification of facts? And are things really any worse now than when Bacon complained in 1620 about the false notions that “are now in possession of the human understanding, and have taken deep root therein”?

 

1. Bernie Hogan: How Facebook divides us [Times Literary Supplement]

27 October 2016 / 1000 words / 5 minutes

“Filter bubbles can create an increasingly fractured population, such as the one developing in America. For the many people shocked by the result of the British EU referendum, we can also partially blame filter bubbles: Facebook literally filters our friends’ views that are least palatable to us, yielding a doctored account of their personalities.”

Bernie Hogan says it’s time Facebook considered ways to use the information it has about us to bring us together across political, ideological and cultural lines, rather than hide us from each other or push us into polarized and hostile camps. He says it’s not only possible for Facebook to help mitigate the issues of filter bubbles and context collapse; it’s imperative, and it’s surprisingly simple.

 

2. Luciano Floridi: Fake news and a 400-year-old problem: we need to resolve the ‘post-truth’ crisis [the Guardian]

29 November 2016 / 1000 words / 5 minutes

“The internet age made big promises to us: a new period of hope and opportunity, connection and empathy, expression and democracy. Yet the digital medium has aged badly because we allowed it to grow chaotically and carelessly, lowering our guard against the deterioration and pollution of our infosphere. […] some of the costs of misinformation may be hard to reverse, especially when confidence and trust are undermined. The tech industry can and must do better to ensure the internet meets its potential to support individuals’ wellbeing and social good.”

The Internet echo chamber satiates our appetite for pleasant lies and reassuring falsehoods, and has become the defining challenge of the 21st century, says Luciano Floridi. So far, the strategy for technology companies has been to deal with the ethical impact of their products retrospectively, but this is not good enough, he says. We need to shape and guide the future of the digital, and stop making it up as we go along. It is time to work on an innovative blueprint for a better kind of infosphere.

 

3. Philip Howard: Facebook and Twitter’s real sin goes beyond spreading fake news

3 January 2017 / 1000 words / 5 minutes

“With the data at their disposal and the platforms they maintain, social media companies could raise standards for civility by refusing to accept ad revenue for placing fake news. They could let others audit and understand the algorithms that determine who sees what on a platform. Just as important, they could be the platforms for doing better opinion, exit and deliberative polling.”

Only Facebook and Twitter know how pervasive fabricated news stories and misinformation campaigns have become during referendums and elections, says Philip Howard — and allowing fake news and computational propaganda to target specific voters is an act against democratic values. But in a time of weakening polling systems, withholding data about public opinion is actually their major crime against democracy, he says.

 

4. Brent Mittelstadt: Should there be a better accounting of the algorithms that choose our news for us?

7 December 2016 / 1800 words / 8 minutes

“Transparency is often treated as the solution, but merely opening up algorithms to public and individual scrutiny will not in itself solve the problem. Information about the functionality and effects of personalisation must be meaningful to users if anything is going to be accomplished. At a minimum, users of personalisation systems should be given more information about their blind spots, about the types of information they are not seeing, or where they lie on the map of values or criteria used by the system to tailor content to users.”

A central ideal of democracy is that political discourse should allow a fair and critical exchange of ideas and values. But political discourse is unavoidably mediated by the mechanisms and technologies we use to communicate and receive information, says Brent Mittelstadt. And content personalization systems and the algorithms they rely upon create a new type of curated media that can undermine the fairness and quality of political discourse.

 

5. Heather Ford: Verification of crowd-sourced information: is this ‘crowd wisdom’ or machine wisdom?

19 November 2013 / 1400 words / 6 minutes

“A key question being asked in the design of future verification mechanisms is the extent to which verification work should be done by humans or non-humans (machines). Here, verification is not a binary categorisation, but rather there is a spectrum between human and non-human verification work, and indeed, projects like Ushahidi, Wikipedia and Galaxy Zoo have all developed different verification mechanisms.”

‘Human’ verification, a process of checking whether a particular report meets a group’s truth standards, is an acutely social process, says Heather Ford. If code is law and if other aspects in addition to code determine how we can act in the world, it is important that we understand the context in which code is deployed. Verification is a practice that determines how we can trust information coming from a variety of sources — only by illuminating such practices and the variety of impacts that code can have in different environments can we begin to understand how code regulates our actions in crowdsourcing environments.

 

.. and just to prove we’re capable of understanding and acknowledging and assimilating multiple viewpoints on complex things, here’s Helen Margetts, with a different slant on filter bubbles: “Even if political echo chambers were as efficient as some seem to think, there is little evidence that this is what actually shapes election results. After all, by definition echo chambers preach to the converted. It is the undecided people who (for example) the Leave and Trump campaigns needed to reach. And from the research, it looks like they managed to do just that.”

 

The Authors

Bernie Hogan is a Research Fellow at the OII; his research interests lie at the intersection of social networks and media convergence.

Luciano Floridi is the OII’s Professor of Philosophy and Ethics of Information. His research areas are the philosophy of information, information and computer ethics, and the philosophy of technology.

Philip Howard is the OII’s Professor of Internet Studies. He investigates the impact of digital media on political life around the world.

Brent Mittelstadt is an OII Postdoc. His research interests include the ethics of information handled by medical ICT, theoretical developments in discourse and virtue ethics, and the epistemology of information.

Heather Ford completed her doctorate at the OII, where she studied how Wikipedia editors write history as it happens. She is now a University Academic Fellow in Digital Methods at the University of Leeds. Her forthcoming book “Fact Factories: Wikipedia’s Quest for the Sum of All Human Knowledge” will be published by MIT Press.

Helen Margetts is the OII’s Director, and Professor of Society and the Internet. She specialises in digital era government, politics and public policy, and data science and experimental methods. Her most recent book is Political Turbulence (Princeton).

 

Coming up! .. It’s the economy, stupid / Augmented reality and ambient fun / The platform economy / Power and development / Internet past and future / Government / Labour rights / The disconnected / Ethics / Staying critical

Should there be a better accounting of the algorithms that choose our news for us?

A central ideal of democracy is that political discourse should allow a fair and critical exchange of ideas and values. But political discourse is unavoidably mediated by the mechanisms and technologies we use to communicate and receive information — and content personalization systems (think search engines, social media feeds and targeted advertising), and the algorithms they rely upon, create a new type of curated media that can undermine the fairness and quality of political discourse.

A new article by Brent Mittelstadt explores the challenges of enforcing a political right to transparency in content personalization systems. First, he explains the value of transparency to political discourse and suggests how content personalization systems undermine the open exchange of ideas and evidence among participants: at a minimum, personalization systems can undermine political discourse by curbing the diversity of ideas that participants encounter. Second, he explores work on the detection of discrimination in algorithmic decision making, including techniques of algorithmic auditing that service providers can employ to detect political bias. Third, he identifies several factors that inhibit auditing and thus indicate reasonable limitations on the ethical duties incurred by service providers — content personalization systems can function opaquely and be resistant to auditing because of poor accessibility and interpretability of decision-making frameworks. Finally, Brent concludes with reflections on the need for regulation of content personalization systems.

He notes that no matter how auditing is pursued, standards to detect evidence of political bias in personalized content are urgently required. Methods are needed to routinely and consistently assign political value labels to content delivered by personalization systems. This is perhaps the most pressing area for future work—to develop practical methods for algorithmic auditing.
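To make that concrete, here is a minimal sketch (in Python) of the kind of measurement such labels would enable, assuming some upstream classifier has already attached a political-lean label to each delivered item: it compares one user’s exposure with a baseline distribution. The label names, the toy feed and the baseline figures are illustrative assumptions, not anything proposed in the article.

```python
from collections import Counter

# Illustrative label set; a real scheme would need agreed, validated categories.
LABELS = ("left", "centre", "right")

def exposure_profile(delivered):
    """delivered: list of (item_id, lean_label) pairs; the labels are assumed
    to come from an upstream content classifier (not specified here)."""
    counts = Counter(label for _, label in delivered)
    total = sum(counts.values()) or 1
    return {label: counts[label] / total for label in LABELS}

def exposure_skew(user_profile, baseline_profile):
    """Total variation distance between what one user saw and a baseline,
    e.g. the label distribution of everything published on the platform."""
    return 0.5 * sum(abs(user_profile[l] - baseline_profile[l]) for l in LABELS)

# Toy example: a feed weighted heavily towards one label versus a balanced baseline.
feed = [("a", "left"), ("b", "left"), ("c", "left"), ("d", "centre")]
baseline = {"left": 0.34, "centre": 0.33, "right": 0.33}
print(exposure_skew(exposure_profile(feed), baseline))  # ~0.41
```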

The right to transparency in political discourse may seem unusual and far-fetched. However, standards already set by the U.S. Federal Communications Commission’s fairness doctrine — no longer in force — and the British Broadcasting Corporation’s fairness principle both demonstrate the importance of the idealized version of political discourse described here. Both precedents promote balance in public political discourse by setting standards for the delivery of politically relevant content. Whether it is appropriate to hold service providers that use content personalization systems to a similar standard remains a crucial question.

Read the full article: Mittelstadt, B. (2016) Auditing for Transparency in Content Personalization Systems. International Journal of Communication 10(2016), 4991–5002.

We caught up with Brent to explore the broader implications of the study:

Ed: We basically accept that the tabloids will be filled with gross bias, populism and lies (in order to sell copy) — and editorial decisions are not generally transparent to us. In terms of their impact on the democratic process, what is the difference between the editorial boardroom and a personalising social media algorithm?

Brent: There are a number of differences. First, although not necessarily transparent to the public, one hopes that editorial boardrooms are at least transparent to those within the news organisations. Editors can discuss and debate the tone and factual accuracy of their stories, explain their reasoning to one another, reflect upon the impact of their decisions on their readers, and generally have a fair debate about the merits and weaknesses of particular content.

This is not the case for a personalising social media algorithm; those working with the algorithm inside a social media company are often unable to explain why the algorithm is functioning in a particular way, or why it has determined that a particular story or topic is ‘trending’ or should be displayed to some users but not others. It is also far more difficult to ‘fact check’ algorithmically curated news; a news item can be widely disseminated merely by many users posting or interacting with it, without any purposeful dissemination or fact checking by the platform provider.

Another big difference is the degree to which users can be aware of the bias of the stories they are reading. Whereas a reader of The Daily Mail or The Guardian will have some idea of the values of the paper, the same cannot be said of platforms offering algorithmically curated news and information. The platform can be neutral insofar as it disseminates news items and information reflecting a range of values and political viewpoints. A user will encounter items reflecting her particular values (or, more accurately, her history of interactions with the platform and the values inferred from them), but these values, and their impact on her exposure to alternative viewpoints, may not be apparent to the user.

Ed: And how is content “personalisation” different to content filtering (e.g. as we see with the Great Firewall of China) that people get very worked up about? Should we be more worried about personalisation?

Brent: Personalisation and filtering are essentially the same mechanism; information is tailored to a user or users according to some prevailing criteria. One difference is whether content is merely infeasible to access, or technically inaccessible. Content of all types will typically still be accessible in principle when personalisation is used, but the user will have to make an effort to access content that is not recommended or otherwise given special attention. Filtering systems, in contrast, will impose technical measures to make particular content inaccessible from a particular device or geographical area.

Another difference is the source of the criteria used to set the visibility of different types of content. In the case of personalisation, these criteria are typically based on the user’s (inferred) interests, values, past behaviours and explicit requests. Critically, these criteria are not necessarily apparent to the user. For filtering, the criteria are typically determined externally by a third party, often a government: some types of information are put off limits according to the prevailing values of that third party. It is this imposition of external values, limiting users’ capacity to access content of their choosing, that often causes an outcry against filtering and censorship.

Importantly, the two mechanisms do not necessarily differ in terms of the transparency of the limiting factors or rules to users. In some cases, such as the recently proposed ban in the UK of adult websites that do not provide meaningful age verification mechanisms, the criteria that determine whether sites are off limits will be publicly known at a general level. In other cases, and especially with personalisation, the user inside the ‘filter bubble’ will be unaware of the rules that determine whether content is (in)accessible. And it is not always the case that the platform provider intentionally keeps these rules secret. Rather, the personalisation algorithms and background analytics that determine the rules can be too complex, inaccessible or poorly understood even by the provider to give the user any meaningful insight.

Ed: Where are these algorithms developed: are they basically all proprietary? i.e. how would you gain oversight of massively valuable and commercially sensitive intellectual property?

Brent: Personalisation algorithms tend to be proprietary, and thus are not normally open to public scrutiny in any meaningful sense. In one sense this is understandable; personalisation algorithms are valuable intellectual property. At the same time the lack of transparency is a problem, as personalisation fundamentally affects how users encounter and digest information on any number of topics. As recently argued, it may be the case that personalisation of news impacts on political and democratic processes. Existing regulatory mechanisms have not been successful in opening up the ‘black box’ so to speak.

It can be argued, however, that legal requirements should be adopted to require these algorithms to be open to public scrutiny due to the fundamental way they shape our consumption of news and information. Oversight can take a number of forms. As I argue in the article, algorithmic auditing is one promising route, performed both internally by the companies themselves, and externally by a government agency or researchers. A good starting point would be for the companies developing and deploying these algorithms to extend their cooperation with researchers, thereby allowing a third party to examine the effects these systems are having on political discourse, and society more broadly.

Ed: By “algorithm audit” — do you mean examining the code and inferring what the outcome might be in terms of bias, or checking the outcome (presumably statistically) and inferring that the algorithm must be introducing bias somewhere? And is it even possible to meaningfully audit personalisation algorithms, when they might rely on vast amounts of unpredictable user feedback to train the system?

Brent: Algorithm auditing can mean both of these things, and more. Audit studies are a tool already in use, whereby human participants introduce different inputs into a system, and examine the effect on the system’s outputs. Similar methods have long been used to detect discriminatory hiring practices, for instance. Code audits are another possibility, but are generally prohibitive due to problems of access and complexity. Also, even if you can access and understand the code of an algorithm, that tells you little about how the algorithm performs in practice when given certain input data. Both the algorithm and input data would need to be audited.

Alternatively, auditing can assess just the outputs of the algorithm; recent work to design mechanisms to detect disparate impact and discrimination, particularly in the Fairness, Accountability and Transparency in Machine Learning (FAT-ML) community, is a great example of this type of auditing. Algorithms can also be designed to attempt to prevent or detect discrimination and other harms as they occur. These methods are as much about the operation of the algorithm as they are about the nature of the training and input data, which may itself be biased. In short, auditing is very difficult, but there are promising avenues of research and development. Once we have reliable auditing methods, the next major challenge will be to tailor them to specific sectors; a one-size-fits-all approach to auditing is not on the cards.
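As a rough illustration of the output-focused end of that spectrum, the sketch below uses matched synthetic profiles that differ only in one attribute, measures how often each is shown a given kind of content, and flags a disparate impact ratio below the “four-fifths” threshold commonly used in discrimination testing. The query_platform() stub, the groups and the rates are invented for the example; it is a toy, not any platform’s real API.

```python
import random

def query_platform(profile):
    """Stand-in for collecting one feed impression for a synthetic profile.
    Here it is a toy simulator that shows political ads more often to group 'A';
    a real audit study would query the platform under scrutiny instead."""
    rate = 0.7 if profile["group"] == "A" else 0.4
    return ["political_ad"] if random.random() < rate else ["other"]

def shown_rate(profiles, target="political_ad", trials=200):
    """Fraction of impressions in which the target content appears."""
    hits = sum(target in query_platform(p) for p in profiles for _ in range(trials))
    return hits / (len(profiles) * trials)

# Matched tester profiles that differ only in the attribute under test.
groups = {g: [{"group": g, "interests": ["news", "sport"]}] for g in ("A", "B")}
rates = {g: shown_rate(ps) for g, ps in groups.items()}
ratio = min(rates.values()) / max(rates.values())
print(rates, "disparate impact ratio:", round(ratio, 2), "flagged:", ratio < 0.8)
```

The point of the design is that nothing about the algorithm’s internals needs to be inspected: only its behaviour towards carefully matched inputs.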

Ed: Do you think this is a real problem for our democracy? And what is the solution if so?

Brent: It’s difficult to say, in part because access and data to study the effects of personalisation systems are hard to come by. It is one thing to prove that personalisation is occurring on a particular platform, or to show that users are systematically displayed content reflecting a narrow range of values or interests. It is quite another to prove that these effects are having an overall harmful effect on democracy. Digesting information is one of the most basic elements of social and political life, so any mechanism that fundamentally changes how information is encountered should be subject to serious and sustained scrutiny.

Assuming personalisation actually harms democracy or political discourse, mitigating its effects is quite a different issue. Transparency is often treated as the solution, but merely opening up algorithms to public and individual scrutiny will not in itself solve the problem. Information about the functionality and effects of personalisation must be meaningful to users if anything is going to be accomplished.

At a minimum, users of personalisation systems should be given more information about their blind spots, about the types of information they are not seeing, or where they lie on the map of values or criteria used by the system to tailor content to users. A promising step would be proactively giving the user some idea of what the system thinks it knows about them, or how they are being classified or profiled, without the user first needing to ask.
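One hedged sketch of what such a proactive disclosure might look like, assuming the provider already holds an inferred interest profile for each user: report the weights the system has inferred and the catalogue topics the user rarely or never sees. The profile format, topic names and the 5% cut-off are assumptions for illustration only.

```python
def blind_spot_report(inferred_profile, catalogue_topics, cutoff=0.05):
    """inferred_profile: topic -> weight in [0, 1], as estimated by the system.
    Returns what the system thinks the user likes, and the topics it has
    effectively stopped showing them."""
    seen = {t: w for t, w in inferred_profile.items() if w >= cutoff}
    hidden = sorted(set(catalogue_topics) - set(seen))
    return {"inferred interests": seen, "rarely or never shown": hidden}

# Illustrative profile and topic catalogue (assumed, not from any real system).
profile = {"sport": 0.55, "technology": 0.30, "local politics": 0.02}
topics = ["sport", "technology", "local politics", "science", "arts"]
print(blind_spot_report(profile, topics))
```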


Brent Mittelstadt was talking to blog editor David Sutcliffe.

Alan Turing Institute and OII: Summit on Data Science for Government and Policy Making

The benefits of big data and data science for the private sector are well recognised. So far, considerably less attention has been paid to the power and potential of the growing field of data science for policy-making and public services. On Monday 14th March 2016 the Oxford Internet Institute (OII) and the Alan Turing Institute (ATI) hosted a Summit on Data Science for Government and Policy Making, funded by the EPSRC. Leading policy makers, data scientists and academics came together to discuss how the ATI and government could work together to develop data science for the public good. The convenors of the Summit, Professors Helen Margetts (OII) and Tom Melham (Computer Science), report on the day’s proceedings.

The Alan Turing Institute will build on the UK’s existing academic strengths in the analysis and application of big data and algorithm research to place the UK at the forefront of world-wide research in data science. The University of Oxford is one of five university partners, and the OII is the only partnering department in the social sciences. The aim of the summit on Data Science for Government and Policy-Making was to understand how government can make better use of big data and the ATI – with the academic partners in listening mode.

We hoped that the participants would bring forward their own stories, hopes and fears regarding data science for the public good. Crucially, we wanted to work out a roadmap for how different stakeholders can work together on the distinct challenges facing government, as opposed to commercial organisations. At the same time, data science research and development has much to gain from the policy-making community. Some of the things that government does – collect tax from the whole population, or give money away at scale, or possess the legitimate use of force – it does by virtue of being government. So the sources of data and some of the data science challenges that public agencies face are unique and tackling them could put government working with researchers at the forefront of data science innovation.

During the Summit a range of stakeholders provided insight from their distinctive perspectives: the Government Chief Scientific Advisor, Sir Mark Walport; the Deputy Director of the ATI, Patrick Wolfe; the National Statistician and Director of the ONS, John Pullinger; and the Director of Data at the Government Digital Service, Paul Maltby. Representatives of frontline departments recounted how algorithmic decision-making is already bringing predictive capacity into operational business, improving efficiency and effectiveness.

Discussion revolved around the challenges of how to build core capability in data science across government, rather than outsourcing it (as happened in an earlier era with information technology) or confining it to a data science profession. Some delegates talked of being in the ‘foothills’ of data science. The scale, heterogeneity and complexity of some government departments currently work against data science innovation, particularly when larger departments can operate thousands of databases, creating legacy barriers to interoperability. Outdated policies can work against data science methodologies. Attendees repeatedly voiced concerns about sharing data across government departments, in some cases because of limitations of legal protections, in others because people were unsure what they can and cannot do.

The potential power of data science creates an urgent need for discussion of ethics. Delegates and speakers repeatedly affirmed the importance of an ethical framework and of thought leadership in this area, so that ethics is ‘part of the science’. The clear emergent option was a national Council for Data Ethics (along the lines of the Nuffield Council on Bioethics) convened by the ATI, as recommended in the recent Science and Technology parliamentary committee report The big data dilemma and the government response. Luciano Floridi (the OII’s Professor of Philosophy and Ethics of Information) warned that we cannot reduce ethics to mere compliance. Ethical problems do not normally have a single straightforward ‘right’ answer, but require dialogue and thought and extend far beyond individual privacy. There was consensus that the UK has the potential to provide global thought leadership and to set the standard for the rest of Europe. It was announced during the Summit that an ATI Working Group on the Ethics of Data Science has been confirmed, to take these issues forward.

So what happens now?

Throughout the Summit there were calls from policy makers for more data science leadership. We hope that the ATI will be instrumental in providing this, and an interface both between government, business and academia, and between separate Government departments. This Summit showed just how much real demand – and enthusiasm – there is from policy makers to develop data science methods and harness the power of big data. No-one wants to repeat with data science the history of government information technology – where in the 1950s and 60s, government led the way as an innovator, but has struggled to maintain this position ever since. We hope that the ATI can act to prevent the same fate for data science and provide both thought leadership and the ‘time and space’ (as one delegate put it) for policy-makers to work with the Institute to develop data science for the public good.

So since the Summit, in response to the clear need that emerged from the discussion and other conversations with stakeholders, the ATI has been designing a Policy Innovation Unit, with the aim of working with government departments on ‘data science for public good’ issues. Activities could include:

  • Secondments at the ATI for data scientists from government
  • Short term projects in government departments for ATI doctoral students and postdoctoral researchers
  • Developing ATI as an accredited data facility for public data, as suggested in the current Cabinet Office consultation on better use of data in government
  • ATI pilot policy projects, using government data
  • Policy symposia focused on specific issues and challenges
  • ATI representation in regular meetings at the senior level (for example, between Chief Scientific Advisors, the Cabinet Office, the Office for National Statistics, GO-Science)
  • ATI acting as an interface between public and private sectors, for example through knowledge exchange and the exploitation of non-government sources as well as government data
  • ATI offering a trusted space, time and a forum for formulating questions and developing solutions that tackle public policy problems and push forward the frontiers of data science
  • ATI as a source of cross-fertilization of expertise between departments
  • Reviewing the data science landscape in a department or agency, identifying feedback loops (or lack thereof) between policy-makers, analysts and front-line staff, and identifying possibilities for an ‘intelligent centre’ model through strategic development of expertise

The Summit, and a series of Whitehall Roundtables convened by GO-Science which led up to it, have initiated a nascent network of stakeholders across government, which we aim to build on and develop over the coming months. If you are interested in being part of this, please do be in touch with us.

Helen Margetts, Oxford Internet Institute, University of Oxford (director@oii.ox.ac.uk)

Tom Melham, Department of Computer Science, University of Oxford