Edit wars! Measuring and mapping society’s most controversial topics

Ed: How did you construct your quantitative measure of ‘conflict’? Did you go beyond just looking at content flagged by editors as controversial?

Taha: Yes, we did. In fact, we have shown that controversy measures based on “controversial” flags are not inclusive at all: although they might have high precision, they have very low recall. Instead, we constructed an automated algorithm to locate and quantify the editorial wars taking place on the Wikipedia platform. Our algorithm is based on reversions, i.e. when editors undo each other’s contributions. We focused specifically on mutual reverts between pairs of editors, and we assigned a maturity score to each editor based on the total volume of their previous contributions. When counting the mutual reverts, we gave more weight to those committed by or on editors with higher maturity scores, as a revert between two experienced editors indicates a more serious problem. We always validated our method and compared it with other methods, using human judgement on a random selection of articles.
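As a rough illustration of the idea described above, the following Python sketch counts mutual reverts and weights them by editor maturity. It is not the authors' implementation: the function names, the use of an editor's total edit count as the maturity score, and the choice to weight each mutually reverting pair by the smaller of the two scores are all assumptions made for the example.

```python
def mutual_revert_pairs(reverts):
    """Return the unordered pairs of editors who have reverted each other.

    `reverts` is a list of (reverter, reverted) tuples (hypothetical input format).
    """
    directed = set(reverts)
    return {
        frozenset((a, b))
        for (a, b) in directed
        if a != b and (b, a) in directed
    }


def controversy_score(reverts, edit_counts):
    """Sum a maturity-based weight over all mutually reverting pairs.

    Assumption: maturity is an editor's total edit count, and each pair is
    weighted by the smaller of the two counts, so a pair only scores highly
    when both editors are experienced.
    """
    score = 0
    for pair in mutual_revert_pairs(reverts):
        a, b = tuple(pair)
        score += min(edit_counts.get(a, 0), edit_counts.get(b, 0))
    return score


# Toy data: alice and bob revert each other, as do dave and erin;
# carol reverts alice but is never reverted back, so that pair is ignored.
reverts = [
    ("alice", "bob"), ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "erin"), ("erin", "dave"),
]
edit_counts = {"alice": 500, "bob": 300, "carol": 10, "dave": 4, "erin": 2}

print(controversy_score(reverts, edit_counts))  # 302
```

In this toy case the dispute between the two experienced editors contributes 300 to the score, while the one between the two newcomers contributes only 2, which mirrors the point that a revert between experienced editors signals a more serious disagreement.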

Ed: Was there any discrepancy between the content deemed controversial by your own quantitative measure, and what the editors themselves had flagged?

Taha: We were able to capture all the flagged content, but not all the articles found to be controversial by our method are flagged. And when you check the editorial history of those articles, you soon realise that they are indeed controversial but for some reason have not been flagged. It’s worth mentioning that the flagging process is not very well implemented in smaller language editions of Wikipedia. Even if the controversy is detected and flagged in English Wikipedia, it might not be in the smaller language editions. Our model is of course independent of the size and editorial conventions of different language editions.

Ed: Were there any differences in the way conflicts arose / were resolved in the different language versions?

Taha: We found the main differences to be the topics of controversial articles. Although some topics are globally debated, like religion and politics, there are many topics which are controversial only in a single language edition. This reflects the local preferences of different editorial communities and the importance they assign to topics. The way editorial wars start and, more importantly, fade to consensus also differs between language editions. In some languages moderators intervene very soon, while in others the war might go on for a long time without any moderation.

Ed: In general, what were the most controversial topics in each language? And overall?

Taha: Generally, religion, politics, and geographical places like countries and cities (sometimes even villages) are the topics of debates. But each language edition also has its own focus: for example, football in Spanish and Portuguese, animations and TV series in Chinese and Japanese, sex and gender-related topics in Czech, and science and technology-related topics in French Wikipedia are very often behind editing wars.

Ed: What other quantitative studies of this sort of conflict (i.e. over knowledge and points of view) are there?

Taha: My favourite work is one by researchers from Barcelona Media Lab. In their paper, “Jointly They Edit: Examining the Impact of Community Identification on Political Interaction in Wikipedia”, they provide quantitative evidence that editors interested in political topics identify themselves more significantly as Wikipedians than as political activists, even though they try hard to reflect their opinions and political orientations in the articles they contribute to. And I think that’s the key issue here. While there are lots of debates and editorial wars between editors, in the end what really counts for most of them is Wikipedia as a whole project, and the concept of shared knowledge. It might explain how Wikipedia really works despite all the diversity among its editors.

Ed: How would you like to extend this work?

Taha: Of course some of the controversial topics change over time. While Jesus might stay a controversial figure for a long time, I’m sure the article on President (W) Bush will soon reach a consensus and most likely disappear from the list of the most controversial articles. In the current study we examined the aggregated data from the inception of each Wikipedia-edition up to March 2010. One possible extension that we are working on now is to study the dynamics of these controversy-lists and the positions of topics in them.

Read the full paper: Yasseri, T., Spoerri, A., Graham, M. and Kertész, J. (2014) The most controversial topics in Wikipedia: A multilingual and geographical analysis. In: P.Fichman and N.Hara (eds) Global Wikipedia: International and cross-cultural issues in online collaboration. Scarecrow Press.

Taha was talking to blog editor David Sutcliffe.

Taha Yasseri is the Big Data Research Officer at the OII. Prior to coming to the OII, he spent two years as a Postdoctoral Researcher at the Budapest University of Technology and Economics, working on the socio-physical aspects of the community of Wikipedia editors, focusing on conflict and editorial wars, along with Big Data analysis to understand human dynamics, language complexity, and popularity spread. His interests also include government-society interactions, mass collaboration, and opinion dynamics.

Harnessing ‘generative friction’: can conflict actually improve quality in open systems?

Image from “The Iraq War: A Historiography of Wikipedia Changelogs”, a twelve-volume set of all changes to the Wikipedia article on the Iraq War (totalling over 12,000 changes and almost 7,000 pages), by STML.

Ed: I really like the way that, contrary to many current studies on conflict and Wikipedia, you focus on how conflict can actually be quite productive. How did this insight emerge?

Kim: I was initially looking for instances of collaboration in Wikipedia to see how popular debates about peer production played out in reality. What I found was that conflict was significantly more prevalent than I had assumed. It struck me as interesting, as most of the popular debates at the time framed conflict as hindering the collaborative editorial process. After several stages of coding, I found that the conversations that involved even a minor degree of conflict were fascinating. A pattern emerged where disagreements about the editorial process resulted in community members taking positive actions to solve the discord and achieve consensus. This was especially prominent in early discussions prior to 2005 before many of the policies that regulate content production in the encyclopaedia were formulated. The more that differing points of view and differing evaluative frames came into contact, the more the community worked together to generate rules and norms to regulate and improve the production of articles.

Ed: You use David Stark’s concept of generative friction to describe how conflict is ‘central to the editorial processes of Wikipedia’. Can you explain why this is important?

Kim: Having different points of view come into contact is the premise of Wikipedia’s collaborative editing model. When these views meet, Stark maintains there is an overlap of individuals’ evaluative frames, or worldviews, and it is in this overlap that creative solutions to problems can occur. People come across solutions they may not otherwise have encountered in the typical homogeneous, hierarchical system that is traditionally the standard for institutions trying to maximize efficiency. In this respect, conflict is central to the process as it is about the struggle to negotiate meaning and achieve a consensus among editors with differing opinions and perspectives. Conflict can therefore be framed as generative, given it can result in innovative solutions to problems identified in the editorial process. In Wikipedia’s case this can be seen through the creation of policies to regulate this process, or developing technical tools to automate repetitive editing tasks, and the like. When thinking about large, collaborative systems where more views are coming into contact, then this research points to the fact that opening up processes that have traditionally been closed, like encyclopaedic print production, or indeed government or institutional processes, can result in creative and innovative solutions to problems.

Ed: This ‘generative friction’ is different from what you sometimes see on Wikipedia articles, where conflict degenerates into personal attacks. Did you find any evidence of this in your case study? Can this type of conflict ‘poison’ the others?

Kim: I actually found relatively few discussions where competing evaluative frames resulted in editors engaging in personal attacks. I was initially quite surprised by this finding as I was familiar with Wikipedia’s early edit wars. On further examination of the conversations, I found that editors often referred to Wikipedia’s policies as a way to manage debate and keep conflict to a minimum. For example, by referring to policies on civility to keep behaviour within community norms, or by referring to policies on verifiability to explain why some content sources aren’t acceptable, relatively few instances of conflict devolved into personal attacks.

I do, however, feel that it is really important to further examine the role that conflict plays in the editorial process. At what point does conflict stop being productive and actually start to impede the production of quality content? What role does conflict play in the participation pattern of different social groups? There is still considerable research to be done on the role of conflict in Wikipedia, especially if we are to have a more nuanced understanding of how the encyclopaedia actually works.

Similarly, if we are to apply this to the concept of open government and politics, or transparency in public policy and public institutions, then these forums will need to know whether they are providing truly open and inclusive online or open spaces, or simply reflecting the most dominant voices.

Ed: You refer in your paper to how Wikipedia has changed over time. Can you talk a bit more about this and whether there are good longitudinal studies that you referred to?

Kim: Tracing conversations about an article over time has provided a snapshot of not only how the topic has been viewed and constructed in that time period, but also of how Wikipedia has been constructed as both a platform and an encyclopaedia. When Wikipedia’s Australia article (which my case study was based on) was a new entry, editors worked together to discuss and talk out larger structural and ideological issues about the article. Who would be reading the article? Where should the infobox go? Should there be a standardised format across the encyclopaedia? How should articles be organised? As the article matured and the editorial community grew, discussions on the article talk page tended to be more content specific.

This finding should be taken in light of the study by Viégas et al. (2007) who found that active editors’ involvement with Wikipedia changes over time, from initially having a local (article) focus, to being more involved with issues of quality and the overall health of the community. This may account for how early active contributors to the “Australia” article were not present in more recent discussions on the talk page of the article. Indeed there have been a number of excellent studies and accounts of how the behaviour of editors has changed over time, including Suh et al. (2009) who found participation in Wikipedia to be declining, attributable in part to the conflict between existing active editors and new contributors, along with increased costs for managing the community as a whole.

These studies, and others like them, are really important for contributing to a wider understanding of Wikipedia and how it works, as it is only with more research about open collaboration and how it is played out, that we can apply the lessons learned to other situations.

Ed: What do you think is the relevance of this research to other avenues?

Kim: Societies are becoming more aware of the importance of active citizenship and involving diverse sections of the community in public consultation, and much of this activity can be carried out over the Internet. I would hope that this research adds to scholarship about participation in online spaces, be they social, political, cultural or civic. While it is about Wikipedia in particular, I hope that it adds to a growing knowledge base from which we can start to draw similarities and differences about how a variety of online communities operate, and the role of conflict in these spaces. Rather than relying on discourses about the conflict that results when many voices and views meet in an open space, we can start, as researchers, to investigate how friction and debate play out in reality. I do think that it is important to recognise the constructive role that conflict can play in a community like Wikipedia.

I also feel it’s really important to conduct more research on the role of conflict in online communities, as we don’t really know yet at what point the conflict stops being generative and starts to hinder the processes of a particular community. For instance, how does it affect the participation of conflict-avoiding cultures in different Wikipedias? How does it affect the participation of women? We know from the Wikimedia Foundation’s own research that these groups are significantly under-represented in the editorial community. So while conflict can play a positive role in content creation and production and this needs to be acknowledged, further research on conflict needs to consider how it affects participation in open spaces.


Kim Osman is a PhD candidate at the ARC Centre of Excellence for Creative Industries and Innovation at the Queensland University of Technology. She is currently investigating the history of Wikipedia as a new media institution. Kim’s research interests include regulation and diversity in open environments, the social construction of technologies, and controversies in the history of technology.

Kim Osman was talking to blog editor Heather Ford.