Five Pieces You Should Probably Read On: Fake News and Filter Bubbles

This is the second post in a series that will uncover great writing by faculty and students at the Oxford Internet Institute, things you should probably know, and things that deserve to be brought out for another viewing. This week: Fake News and Filter Bubbles!

Fake news, post-truth, “alternative facts”, filter bubbles — this is the news and media environment we apparently now inhabit, and that has formed the fabric and backdrop of Brexit (“£350 million a week”) and Trump (“This was the largest audience to ever witness an inauguration — period”). Do social media divide us, hide us from each other? Are you particularly aware of what content is personalised for you, what it is you’re not seeing? How much can we do with machine-automated or crowd-sourced verification of facts? And are things really any worse now than when Bacon complained in 1620 about the false notions that “are now in possession of the human understanding, and have taken deep root therein”?


1. Bernie Hogan: How Facebook divides us [Times Literary Supplement]

27 October 2016 / 1000 words / 5 minutes

“Filter bubbles can create an increasingly fractured population, such as the one developing in America. For the many people shocked by the result of the British EU referendum, we can also partially blame filter bubbles: Facebook literally filters our friends’ views that are least palatable to us, yielding a doctored account of their personalities.”

Bernie Hogan says it’s time Facebook considered ways to use the information it has about us to bring us together across political, ideological and cultural lines, rather than hide us from each other or push us into polarized and hostile camps. He says it’s not only possible for Facebook to help mitigate the issues of filter bubbles and context collapse; it’s imperative, and it’s surprisingly simple.


2. Luciano Floridi: Fake news and a 400-year-old problem: we need to resolve the ‘post-truth’ crisis [the Guardian]

29 November 2016 / 1000 words / 5 minutes

“The internet age made big promises to us: a new period of hope and opportunity, connection and empathy, expression and democracy. Yet the digital medium has aged badly because we allowed it to grow chaotically and carelessly, lowering our guard against the deterioration and pollution of our infosphere. […] some of the costs of misinformation may be hard to reverse, especially when confidence and trust are undermined. The tech industry can and must do better to ensure the internet meets its potential to support individuals’ wellbeing and social good.”

The Internet echo chamber satiates our appetite for pleasant lies and reassuring falsehoods, and has become the defining challenge of the 21st century, says Luciano Floridi. So far, the strategy for technology companies has been to deal with the ethical impact of their products retrospectively, but this is not good enough, he says. We need to shape and guide the future of the digital, and stop making it up as we go along. It is time to work on an innovative blueprint for a better kind of infosphere.


3. Philip Howard: Facebook and Twitter’s real sin goes beyond spreading fake news

3 January 2017 / 1000 words / 5 minutes

“With the data at their disposal and the platforms they maintain, social media companies could raise standards for civility by refusing to accept ad revenue for placing fake news. They could let others audit and understand the algorithms that determine who sees what on a platform. Just as important, they could be the platforms for doing better opinion, exit and deliberative polling.”

Only Facebook and Twitter know how pervasive fabricated news stories and misinformation campaigns have become during referendums and elections, says Philip Howard — and allowing fake news and computational propaganda to target specific voters is an act against democratic values. But in a time of weakening polling systems, withholding data about public opinion is actually their major crime against democracy, he says.


4. Brent Mittelstadt: Should there be a better accounting of the algorithms that choose our news for us?

7 December 2016 / 1800 words / 8 minutes

“Transparency is often treated as the solution, but merely opening up algorithms to public and individual scrutiny will not in itself solve the problem. Information about the functionality and effects of personalisation must be meaningful to users if anything is going to be accomplished. At a minimum, users of personalisation systems should be given more information about their blind spots, about the types of information they are not seeing, or where they lie on the map of values or criteria used by the system to tailor content to users.”

A central ideal of democracy is that political discourse should allow a fair and critical exchange of ideas and values. But political discourse is unavoidably mediated by the mechanisms and technologies we use to communicate and receive information, says Brent Mittelstadt. And content personalization systems and the algorithms they rely upon create a new type of curated media that can undermine the fairness and quality of political discourse.


5. Heather Ford: Verification of crowd-sourced information: is this ‘crowd wisdom’ or machine wisdom?

19 November 2013 / 1400 words / 6 minutes

“A key question being asked in the design of future verification mechanisms is the extent to which verification work should be done by humans or non-humans (machines). Here, verification is not a binary categorisation, but rather there is a spectrum between human and non-human verification work, and indeed, projects like Ushahidi, Wikipedia and Galaxy Zoo have all developed different verification mechanisms.”

‘Human’ verification, a process of checking whether a particular report meets a group’s truth standards, is an acutely social process, says Heather Ford. If code is law and if other aspects in addition to code determine how we can act in the world, it is important that we understand the context in which code is deployed. Verification is a practice that determines how we can trust information coming from a variety of sources — only by illuminating such practices and the variety of impacts that code can have in different environments can we begin to understand how code regulates our actions in crowdsourcing environments.


.. and just to prove we’re capable of understanding and acknowledging and assimilating multiple viewpoints on complex things, here’s Helen Margetts, with a different slant on filter bubbles: “Even if political echo chambers were as efficient as some seem to think, there is little evidence that this is what actually shapes election results. After all, by definition echo chambers preach to the converted. It is the undecided people who (for example) the Leave and Trump campaigns needed to reach. And from the research, it looks like they managed to do just that.”


The Authors

Bernie Hogan is a Research Fellow at the OII; his research interests lie at the intersection of social networks and media convergence.

Luciano Floridi is the OII’s Professor of Philosophy and Ethics of Information. His research areas are the philosophy of information, information and computer ethics, and the philosophy of technology.

Philip Howard is the OII’s Professor of Internet Studies. He investigates the impact of digital media on political life around the world.

Brent Mittelstadt is an OII Postdoc. His research interests include the ethics of information handled by medical ICT, theoretical developments in discourse and virtue ethics, and epistemology of information.

Heather Ford completed her doctorate at the OII, where she studied how Wikipedia editors write history as it happens. She is now a University Academic Fellow in Digital Methods at the University of Leeds. Her forthcoming book “Fact Factories: Wikipedia’s Quest for the Sum of All Human Knowledge” will be published by MIT Press.

Helen Margetts is the OII’s Director, and Professor of Society and the Internet. She specialises in digital era government, politics and public policy, and data science and experimental methods. Her most recent book is Political Turbulence (Princeton).


Coming up! .. It’s the economy, stupid / Augmented reality and ambient fun / The platform economy / Power and development / Internet past and future / Government / Labour rights / The disconnected / Ethics / Staying critical

Should there be a better accounting of the algorithms that choose our news for us?

A central ideal of democracy is that political discourse should allow a fair and critical exchange of ideas and values. But political discourse is unavoidably mediated by the mechanisms and technologies we use to communicate and receive information — and content personalization systems (think search engines, social media feeds and targeted advertising), and the algorithms they rely upon, create a new type of curated media that can undermine the fairness and quality of political discourse.

A new article by Brent Mittelstadt explores the challenges of enforcing a political right to transparency in content personalization systems. First, he explains the value of transparency to political discourse and suggests how content personalization systems undermine the open exchange of ideas and evidence among participants: at a minimum, personalization systems can undermine political discourse by curbing the diversity of ideas that participants encounter. Second, he explores work on the detection of discrimination in algorithmic decision making, including techniques of algorithmic auditing that service providers can employ to detect political bias. Third, he identifies several factors that inhibit auditing and thus indicate reasonable limitations on the ethical duties incurred by service providers: content personalization systems can function opaquely and be resistant to auditing because of poor accessibility and interpretability of decision-making frameworks. Finally, he concludes with reflections on the need for regulation of content personalization systems.

He notes that no matter how auditing is pursued, standards to detect evidence of political bias in personalized content are urgently required. Methods are needed to routinely and consistently assign political value labels to content delivered by personalization systems. This is perhaps the most pressing area for future work—to develop practical methods for algorithmic auditing.

The right to transparency in political discourse may seem unusual and far-fetched. However, standards already set by the U.S. Federal Communications Commission’s fairness doctrine (no longer in force) and the British Broadcasting Corporation’s fairness principle both demonstrate the importance of the idealized version of political discourse described here. Both precedents promote balance in public political discourse by setting standards for delivery of politically relevant content. Whether it is appropriate to hold service providers that use content personalization systems to a similar standard remains a crucial question.

Read the full article: Mittelstadt, B. (2016) Auditing for Transparency in Content Personalization Systems. International Journal of Communication 10(2016), 4991–5002.

We caught up with Brent to explore the broader implications of the study:

Ed: We basically accept that the tabloids will be filled with gross bias, populism and lies (in order to sell copy) — and editorial decisions are not generally transparent to us. In terms of their impact on the democratic process, what is the difference between the editorial boardroom and a personalising social media algorithm?

Brent: There are a number of differences. First, although not necessarily transparent to the public, one hopes that editorial boardrooms are at least transparent to those within the news organisations. Editors can discuss and debate the tone and factual accuracy of their stories, explain their reasoning to one another, reflect upon the impact of their decisions on their readers, and generally have a fair debate about the merits and weaknesses of particular content.

This is not the case for a personalising social media algorithm; those working with the algorithm inside a social media company are often unable to explain why the algorithm is functioning in a particular way, or why it determined a particular story or topic to be ‘trending’ or displayed it to particular users and not others. It is also far more difficult to ‘fact check’ algorithmically curated news; a news item can be widely disseminated merely by many users posting or interacting with it, without any purposeful dissemination or fact checking by the platform provider.

Another big difference is the degree to which users can be aware of the bias of the stories they are reading. Whereas a reader of The Daily Mail or The Guardian will have some idea of the values of the paper, the same cannot be said of platforms offering algorithmically curated news and information. The platform can be neutral insofar as it disseminates news items and information reflecting a range of values and political viewpoints. A user will encounter items reflecting her particular values (or, more accurately, her history of interactions with the platform and the values inferred from them), but these values, and their impact on her exposure to alternative viewpoints, may not be apparent to the user.

Ed: And how is content “personalisation” different to content filtering (e.g. as we see with the Great Firewall of China) that people get very worked up about? Should we be more worried about personalisation?

Brent: Personalisation and filtering are essentially the same mechanism; information is tailored to a user or users according to some prevailing criteria. One difference is whether content is merely infeasible to access, or technically inaccessible. Content of all types will typically still be accessible in principle when personalisation is used, but the user will have to make an effort to access content that is not recommended or otherwise given special attention. Filtering systems, in contrast, will impose technical measures to make particular content inaccessible from a particular device or geographical area.

Another difference is the source of the criteria used to set the visibility of different types of content. In the case of personalisation, these criteria are typically based on the user’s (inferred) interests, values, past behaviours and explicit requests. Critically, these values are not necessarily apparent to the user. For filtering, criteria are typically externally determined by a third party, often a government. Some types of information are set off limits, according to the prevailing values of the third party. It is the imposition of external values, which limit the capacity of users to access content of their choosing, that often causes an outcry against filtering and censorship.

Importantly, the two mechanisms do not necessarily differ in terms of the transparency of the limiting factors or rules to users. In some cases, such as the recently proposed ban in the UK of adult websites that do not provide meaningful age verification mechanisms, the criteria that determine whether sites are off limits will be publicly known at a general level. In other cases, and especially with personalisation, the user inside the ‘filter bubble’ will be unaware of the rules that determine whether content is (in)accessible. And it is not always the case that the platform provider intentionally keeps these rules secret. Rather, the personalisation algorithms and background analytics that determine the rules can be too complex, inaccessible or poorly understood even by the provider to give the user any meaningful insight.

Ed: Where are these algorithms developed: are they basically all proprietary? i.e. how would you gain oversight of massively valuable and commercially sensitive intellectual property?

Brent: Personalisation algorithms tend to be proprietary, and thus are not normally open to public scrutiny in any meaningful sense. In one sense this is understandable; personalisation algorithms are valuable intellectual property. At the same time the lack of transparency is a problem, as personalisation fundamentally affects how users encounter and digest information on any number of topics. As recently argued, it may be the case that personalisation of news impacts on political and democratic processes. Existing regulatory mechanisms have not been successful in opening up the ‘black box’ so to speak.

It can be argued, however, that legal requirements should be adopted to require these algorithms to be open to public scrutiny due to the fundamental way they shape our consumption of news and information. Oversight can take a number of forms. As I argue in the article, algorithmic auditing is one promising route, performed both internally by the companies themselves, and externally by a government agency or researchers. A good starting point would be for the companies developing and deploying these algorithms to extend their cooperation with researchers, thereby allowing a third party to examine the effects these systems are having on political discourse, and society more broadly.

Ed: By “algorithm audit” — do you mean examining the code and inferring what the outcome might be in terms of bias, or checking the outcome (presumably statistically) and inferring that the algorithm must be introducing bias somewhere? And is it even possible to meaningfully audit personalisation algorithms, when they might rely on vast amounts of unpredictable user feedback to train the system?

Brent: Algorithm auditing can mean both of these things, and more. Audit studies are a tool already in use, whereby human participants introduce different inputs into a system, and examine the effect on the system’s outputs. Similar methods have long been used to detect discriminatory hiring practices, for instance. Code audits are another possibility, but are generally prohibitive due to problems of access and complexity. Also, even if you can access and understand the code of an algorithm, that tells you little about how the algorithm performs in practice when given certain input data. Both the algorithm and input data would need to be audited.
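The audit-study idea described above (vary the inputs, watch the outputs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_recommender` is a hypothetical stand-in for the system being audited, and the profile fields and item names are invented for the example, not taken from any real platform.

```python
from typing import Callable, Dict, List

Profile = Dict[str, str]


def toy_recommender(profile: Profile) -> List[str]:
    """Hypothetical black-box system under audit (invented for illustration)."""
    if profile["past_clicks"] == "left":
        return ["left-1", "left-2", "centre-1"]
    return ["right-1", "right-2", "centre-1"]


def paired_audit(system: Callable[[Profile], List[str]],
                 base: Profile, attribute: str,
                 values: List[str]) -> Dict[str, List[str]]:
    """Vary one attribute, hold everything else fixed, record the outputs."""
    results = {}
    for value in values:
        profile = dict(base, **{attribute: value})
        results[value] = system(profile)
    return results


def overlap(a: List[str], b: List[str]) -> float:
    """Jaccard overlap of two result lists (1.0 means identical sets)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


base = {"age": "35", "location": "UK", "past_clicks": "left"}
results = paired_audit(toy_recommender, base, "past_clicks", ["left", "right"])
score = overlap(results["left"], results["right"])
print(f"Overlap between the two profiles' feeds: {score:.2f}")
```

A low overlap between otherwise identical profiles suggests the varied attribute is driving what each user sees. It says nothing about why, which is the limitation of treating the system as a black box.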

Alternatively, auditing can assess just the outputs of the algorithm; recent work to design mechanisms to detect disparate impact and discrimination, particularly in the Fairness, Accountability and Transparency in Machine Learning (FAT-ML) community, is a great example of this type of auditing. Algorithms can also be designed to attempt to prevent or detect discrimination and other harms as they occur. These methods are as much about the operation of the algorithm as they are about the nature of the training and input data, which may itself be biased. In short, auditing is very difficult, but there are promising avenues of research and development. Once we have reliable auditing methods, the next major challenge will be to tailor them to specific sectors; a one-size-fits-all approach to auditing is not on the cards.
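As a rough illustration of an output-only audit, the sketch below computes a simple disparate impact ratio from a log of what was shown to whom. The log format, the group labels and the 0.8 cut-off (borrowed here from the "four-fifths" rule of thumb used in US employment contexts) are all assumptions made for the example; they are not a method proposed in the article.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def exposure_rates(log: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Share of impressions in which an item was shown, per user group.

    `log` holds (group, was_shown) pairs; the format is hypothetical.
    """
    shown = defaultdict(int)
    total = defaultdict(int)
    for group, was_shown in log:
        total[group] += 1
        shown[group] += int(was_shown)
    return {group: shown[group] / total[group] for group in total}


def disparate_impact_ratio(rates: Dict[str, float]) -> float:
    """Lowest group rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Toy log: a news item reached 8 of 10 users in group A, 4 of 10 in group B.
log = ([("A", True)] * 8 + [("A", False)] * 2
       + [("B", True)] * 4 + [("B", False)] * 6)
rates = exposure_rates(log)
ratio = disparate_impact_ratio(rates)
flagged = ratio < 0.8  # assumed threshold, for illustration only
print(rates, ratio, flagged)
```

The ratio only describes outcomes. As the answer above notes, a skewed result may come from the algorithm, from biased training or input data, or from both.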

Ed: Do you think this is a real problem for our democracy? And what is the solution if so?

Brent: It’s difficult to say, in part because access and data to study the effects of personalisation systems are hard to come by. It is one thing to prove that personalisation is occurring on a particular platform, or to show that users are systematically displayed content reflecting a narrow range of values or interests. It is quite another to prove that these effects are having an overall harmful effect on democracy. Digesting information is one of the most basic elements of social and political life, so any mechanism that fundamentally changes how information is encountered should be subject to serious and sustained scrutiny.

Assuming personalisation actually harms democracy or political discourse, mitigating its effects is quite a different issue. Transparency is often treated as the solution, but merely opening up algorithms to public and individual scrutiny will not in itself solve the problem. Information about the functionality and effects of personalisation must be meaningful to users if anything is going to be accomplished.

At a minimum, users of personalisation systems should be given more information about their blind spots, about the types of information they are not seeing, or where they lie on the map of values or criteria used by the system to tailor content to users. A promising step would be proactively giving the user some idea of what the system thinks it knows about them, or how they are being classified or profiled, without the user first needing to ask.
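A "blind spot" report of this kind might look something like the sketch below, which compares the mix of content available on a platform with the mix one user was actually shown. It assumes every item already carries a value label, which is exactly the open problem noted earlier; the labels, the counts and the 0.5 threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def blind_spot_report(available: Iterable[str], shown: Iterable[str],
                      threshold: float = 0.5) -> Dict[str, Tuple[float, float]]:
    """Return labels the user sees far less often than their available share.

    A label is reported when its share of the user's feed falls below
    `threshold` times its share of all available content.
    """
    avail = Counter(available)
    seen = Counter(shown)
    total_avail = sum(avail.values())
    total_seen = sum(seen.values())
    report = {}
    for label, count in avail.items():
        share_available = count / total_avail
        share_shown = seen[label] / total_seen if total_seen else 0.0
        if share_shown < threshold * share_available:
            report[label] = (share_available, share_shown)
    return report


# Toy data: the labels are assumed to exist already.
available = ["left"] * 40 + ["centre"] * 20 + ["right"] * 40
shown = ["left"] * 17 + ["centre"] * 3
report = blind_spot_report(available, shown)
for label, (share_available, share_shown) in report.items():
    print(f"{label}: {share_available:.0%} of content, {share_shown:.0%} of your feed")
```

Showing a summary like this to the user unprompted would be one way to make the system's picture of them visible without requiring them to ask first.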

Brent Mittelstadt was talking to blog editor David Sutcliffe.