Could Voting Advice Applications force politicians to keep their manifesto promises?

In many countries, Voting Advice Applications (VAAs) have become an almost indispensable part of the electoral process: they play an important role in the campaigning activities of parties and candidates, are an essential element of media coverage of elections, and are widely used by citizens. A number of studies have shown that VAA use has an impact on the cognitive behaviour of users, on their likelihood to participate in elections, and on the choice of the party they vote for.

These applications are based on the idea of issue and proximity voting — the parties and candidates recommended by VAAs are those with the highest number of matching positions on a number of political questions and issues. Many of these questions are much more specific and detailed than party programs and electoral platforms, and show the voters exactly what the party or candidates stand for and how they will vote in parliament once elected. In his Policy & Internet article “Do VAAs Encourage Issue Voting and Promissory Representation? Evidence From the Swiss Smartvote,” Andreas Ladner examines the extent to which VAAs alter the way voters perceive the meaning of elections, and encourage them to hold politicians to account for election promises.

His main hypothesis is that VAAs lead to “promissory representation”, where parties and candidates are elected for their promises and sanctioned by the electorate if they don’t keep them. He suggests that as these tools become more popular, the “delegate model” is likely to increase in popularity: i.e. one in which politicians are regarded as delegates voted into parliament to keep their promises, rather than being given a free mandate to act as they see fit (the “trustee model”).

We caught up with Andreas to discuss his findings:

Ed.: You found that issue-voters were more likely (than other voters) to say they would sanction a politician who broke their election promises. But also that issue voters are less politically engaged. So is this maybe a bit moot: i.e. if the people most likely to force the “delegate model” system are the least likely to enforce it?

Andreas: It perhaps looks a bit moot at first glance, but what happens if the less engaged are given the possibility to sanction them more easily or by default? Sanctioning a politician who breaks an election promise is not per se a good thing; it depends on the reason why he or she broke it, on the situation, and on the promise. VAAs can easily provide information on the extent to which candidates keep their promises, and then it gets very easy to sanction them simply for that without taking other arguments into consideration.

Ed.: Do voting advice applications work best in complex, multi-party political systems? (I’m not sure anyone would need one to distinguish between Trump / Clinton, for example?)

Andreas: Yes, I believe that in very complex systems – like for example in the Swiss case where voters not only vote for parties but also for up to 35 different candidates – VAAs are particularly useful since they help to process a huge amount of information. If the choice is only between two parties or two candidates which are completely different, then VAAs are less helpful.

Ed.: I guess the recent elections / referendum I am most familiar with (US, UK, France) have been particularly lurid and nasty: but I guess VAAs rely on a certain quiet rationality to work as intended? How do you see your Swiss results (and Swiss elections, generally) comparing with these examples? Do VAAs not just get lost in the noise?

Andreas: The idea of VAAs is to help voters to make better informed choices. This is, of course, opposed to decisions based on emotions. In Switzerland, elections are not of utmost importance, due to specific features of our political system such as direct democracy and power sharing, but voters seem to appreciate the information provided by smartvote. Almost 20% of voters cast their vote after having consulted the website.

Ed.: Macron is a recent example of someone who clearly sought (and received) a general mandate, rather than presenting a detailed platform of promises. Is that unusual? He was criticised in his campaign for being “too vague,” but it clearly worked for him. What use are manifesto pledges in politics — as opposed to simply making clear to the electorate where you stand on the political spectrum?

Andreas: Good VAAs combine electoral promises on concrete issues as well as more general political positions. Voters can base their decisions on either of them, or on a combination of both of them. I am not arguing in favour of one or the other, but they clearly have different implications. The former is closer to the delegate model, the latter to the trustee model. I think good VAAs should make the differences clear and should even allow the voters to choose.

Ed.: I guess Trump is a contrasting example of someone whose campaign was all about promises (while also seeking a clear mandate to “make America great again”), but who has lied, and broken these (impossible) promises seemingly faster than people can keep track of them. Do you think his supporters care, though?

Andreas: His promises were too far away from what he can possibly keep. Quite a few of his voters, I believe, do not want them to be fully realized but rather that the US move a bit more in this direction.

Ed.: I suppose another example of an extremely successful quasi-pledge was the Brexit campaign’s obviously meaningless — but hugely successful — “We send the EU £350 million a week; let’s fund our NHS instead.” Not to sound depressing, but do promises actually mean anything? Is it the candidate / issue that matters (and the media response to that), or the actual pledges?

Andreas: I agree that the media play an important role, and not always in the direction they intend. I do not think that it is the £350 million a week which made the difference. It is much more a general discontent and a situation which was not sufficiently explained and legitimized which led to this unexpected decision. If you lose the support for your policy then it gets much easier for your opponents. It is difficult to imagine that you can get a majority built on nothing.

Ed.: I’ve read all the articles in the Policy & Internet special issue on VAAs: one thing that struck me is that there’s lots of incomplete data, e.g. no knowledge of how people actually voted in the end (or would vote in future). What are the strengths and weaknesses of VAAs as a data source for political research?

Andreas: The quality of the data varies between countries and voting systems. We have a self-selection bias in the use of VAAs and often also in the surveys conducted among the users. In general we don’t know how they voted, and we have to believe what they tell us. In many respects the data does not differ that much from what we get from classic electoral studies, especially since they also encounter difficulties in addressing a representative sample. VAAs usually have much larger Ns on the side of the voters, generate more information about their political positions and preferences, and provide very interesting information about the candidates and parties.

Read the full article: Ladner, A. (2016) Do VAAs Encourage Issue Voting and Promissory Representation? Evidence From the Swiss Smartvote. Policy & Internet 8 (4). DOI: 10.1002/poi3.137.

Andreas Ladner was talking to blog editor David Sutcliffe.

What explains the worldwide patterns in user-generated geographical content?

The geographies of codified knowledge have always been uneven, affording some people and places greater voice and visibility than others. While the rise of the geosocial Web seemed to promise a greater diversity of voices, opinions, and narratives about places, many regions remain largely absent from the websites and services that represent them to the rest of the world. These highly uneven geographies of codified information matter because they shape what is known and what can be known. As geographic content and geospatial information become increasingly integral to our everyday lives, places that are left off the ‘map of knowledge’ will be absent from our understanding of, and interaction with, the world.

We know that Wikipedia is important to the construction of geographical imaginations of place, and that it has immense power to augment our spatial understandings and interactions (Graham et al. 2013). In other words, the presences and absences in Wikipedia matter. If a person’s primary free source of information about the world is the Persian or Arabic or Hebrew Wikipedia, then the world will look fundamentally different from the world presented through the lens of the English Wikipedia. The capacity to represent oneself to outsiders is especially important in those parts of the world that are characterized by highly uneven power relationships: Brunn and Wilson (2013) and Graham and Zook (2013) have already demonstrated the power of geospatial content to reinforce power in a South African township and Jerusalem, respectively.

Until now, there has been no large-scale empirical analysis of the factors that explain information geographies at the global scale; this is something we have aimed to address in this research project on Mapping and measuring local knowledge production and representation in the Middle East and North Africa. Using regression models of geolocated Wikipedia data we have identified what are likely to be the necessary conditions for representation at the country level, and have also identified the outliers, i.e. those countries that fare considerably better or worse than expected. We found that a large part of the variation could be explained by just three factors: namely, (1) country population, (2) availability of broadband Internet, and (3) the number of edits originating in that country. [See the full paper for an explanation of the data and the regression models.]
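As a rough illustration of how a regression of this kind can surface outliers, the Python sketch below fits an ordinary least squares model to nine synthetic "countries" and flags those whose residuals are large. Everything in it is an assumption made for illustration: the labels C1 to C9, the log-scale input values, the generating coefficients, and the 0.5 threshold are invented, and the paper's actual data and model specification may well differ.

```python
# Illustrative sketch only: synthetic values and hypothetical country labels,
# not the data or the exact model specification used in the paper.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Predicted value for one row of predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical predictors per country:
# (log10 population, broadband availability index, log10 edits from the country)
predictors = {
    "C1": (7.0, 0.5, 5.0),
    "C2": (8.0, 0.5, 5.0),
    "C3": (6.0, 0.5, 5.0),
    "C4": (7.0, 0.8, 5.0),
    "C5": (7.0, 0.2, 5.0),
    "C6": (7.0, 0.5, 6.0),
    "C7": (7.0, 0.5, 4.0),
    "C8": (7.5, 0.7, 5.5),
    "C9": (6.5, 0.3, 4.5),
}

# Synthetic outcome (log10 geotagged articles) from an invented linear rule.
log_articles = {
    name: 0.5 + 0.4 * p[0] + 1.0 * p[1] + 0.3 * p[2]
    for name, p in predictors.items()
}
log_articles["C1"] -= 0.9  # plant one country with far fewer articles than the rule implies

names = list(predictors)
coefs = fit_ols([predictors[n] for n in names], [log_articles[n] for n in names])

# Residual = observed minus expected; large negative values mean "fewer articles than expected".
resid = {n: log_articles[n] - predict(coefs, predictors[n]) for n in names}
flagged = [n for n in names if abs(resid[n]) > 0.5]  # 0.5 is an arbitrary threshold

for n in names:
    print(n, round(resid[n], 3), "outlier" if n in flagged else "")
```

In this toy setup only the planted country is flagged. With real data the threshold and any transformations of the variables would need to be justified rather than assumed.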

But how do we explain the significant inequalities in the geography of user-generated information that remain after adjusting for differing conditions using our regression model? While these three variables help to explain the sparse amount of content written about much of Sub-Saharan Africa, most of the Middle East and North Africa have quantities of geographic information below their expected values. For example, despite high levels of wealth and connectivity, Qatar and the United Arab Emirates have far fewer articles than we might expect from the model.

These three factors independently matter, but they will also be subject to a number of constraints. A country’s population will probably affect the number of human sites, activities, and practices of interest; i.e. the number of things one might want to write about. The size of the potential audience might also be influential, encouraging editors in more densely populated regions and those writing in major languages. However, societal attitudes towards learning and information sharing will probably also affect the propensity of people in some places to contribute content. Factors discouraging the number of edits to local content might include a lack of local Wikimedia chapters, the attractiveness of writing content about other (better-represented) places, or contentious disputes in local editing communities that divert time into edit wars and away from content generation.

We might also be seeing a principle of increasing informational poverty. Not only is a broader base of traditional source material (such as books, maps, and images) needed for the generation of any Wikipedia article, but it is likely that the very presence of content itself is a generative factor behind the production of further content. This makes information produced about information-sparse regions most useful for people in informational cores — who are used to integrating digital information into their everyday practices — rather than those in informational peripheries.

Various practices and procedures of Wikipedia editing likely amplify this effect. There are strict guidelines on how knowledge can be created and represented in Wikipedia, including a ban on original research, and the need to source key assertions. Editing incentives and constraints probably also encourage work around existing content (which is relatively straightforward to edit) rather than creation of entirely new material. In other words, the very policies and norms that govern the encyclopedia’s structure make it difficult to populate the white space with new geographic content. In addressing these patterns of increasing informational poverty, we need to recognize that no one of these three conditions can ever be sufficient for the generation of geographic knowledge. As well as highlighting the presences and absences in user-generated content, we also need to ask what factors encourage or limit production of that content.

In interpreting our model, we have come to a stark conclusion: increasing representation doesn’t occur in a linear fashion, but it accelerates in a virtuous cycle, benefitting those with strong editing cultures in local languages. For example, Britain, Sweden, Japan and Germany are extensively georeferenced on Wikipedia, whereas much of the MENA region has not kept pace, even accounting for their levels of connectivity, population, and editors. Thus, while some countries are experiencing the virtuous cycle of more edits and broadband begetting more georeferenced content, those on the periphery of these information geographies might fail to reach a critical mass of editors, or even dismiss Wikipedia as a legitimate site for user-generated geographic content: a problem that will need to be addressed if Wikipedia is indeed to be considered as the “sum of all human knowledge”.

Read the full paper: Graham, M., Hogan, B., Straumann, R.K., and Medhat, A. (2014) Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers.


Brunn, S. D., and M. W. Wilson (2013) Cape Town’s million plus black township of Khayelitsha: Terrae incognitae and the geographies and cartographies of silence. Habitat International 39: 284–294.

Graham M., and M. Zook. (2013) Augmented Realities and Uneven Geographies: Exploring the Geolinguistic Contours of the Web. Environment and Planning A 45(1): 77–99.

Graham, M., M. Zook, and A. Boulton (2013) Augmented Reality in the Urban Environment: Contested Content and the Duplicity of Code. Transactions of the Institute of British Geographers 38(3): 464–479.

Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

What is stopping greater representation of the MENA region?

Negotiating the wider politics of Wikipedia can be a daunting task, particularly when it comes to content about the MENA region. Image of the Dome of the Rock (Qubbat As-Sakhrah), Jerusalem, by 1yen

Wikipedia has famously been described as a project that “works great in practice and terrible in theory”. One of the ways in which it succeeds is through its extensive consensus-based governance structure. While this has led to spectacular success (over 4.5 million articles in the English Wikipedia alone), the governance structure is neither obvious nor immediately accessible, and can present a barrier for those seeking entry. Editing Wikipedia can be a tough challenge: an often draining and frustrating task, involving heated disputes and arguments where it is often the most tenacious, belligerent, or connected editor who wins out in the end.

Broadband access and literacy are not the only pre-conditions for editing Wikipedia; ‘digital literacy’ is also crucial. This includes the ability to obtain and critically evaluate online sources, locate Wikipedia’s editorial and governance policies, master Wiki syntax, and confidently articulate and assert one’s views about an article or topic. Experienced editors know how to negotiate the rules, build a consensus with some editors to block others, and how to influence administrators during dispute resolution. This strict adherence to the word (if not the spirit) of Wikipedia’s ‘law’ can lead to marginalization or exclusion of particular content, particularly when editors are scared off by unruly mobs who ‘weaponize’ policies to fit a specific agenda.

Governing such a vast collaborative platform as Wikipedia obviously presents a difficult balancing act between being open enough to attract a high volume of contributions, and moderated enough to ensure their quality. Many editors consider Wikipedia’s governance structure (which varies significantly between the different language versions) essential to ensuring the quality of its content, even if it means that certain editors can (for example) arbitrarily ban other users, lock down certain articles, and exclude moderate points of view. One of the editors we spoke to noted that: “A number of articles I have edited with quality sources, have been subjected to editors cutting information that doesn’t fit their ideas […] I spend a lot of time going back to reinstate information. Today’s examples are in the ‘Battle of Nablus (1918)’ and the ‘Third Transjordan attack’ articles. Bullying does occur from time to time […] Having tried the disputes process I wouldn’t recommend it.” Community building might help support MENA editors faced with discouragement or direct opposition as they try to build content about the region, but easily locatable translations of governance materials would also help. Few of the extensive Wikipedia policy discussions have been translated into Arabic, leading to replication of discussions or ambiguity surrounding correct dispute resolution.

Beyond arguments with fractious editors over minutiae (something that comes with the platform), negotiating the wider politics of Wikipedia can be a daunting task, particularly when it comes to content about the MENA region. It would be an understatement to say that the Middle East is a politically sensitive region, with more than its fair share of apparently unresolvable disputes, competing ideologies (it’s the birthplace of three world religions…), repressive governments, and ongoing and bloody conflicts. Editors shared stories with us about meddling from state actors (e.g. Tunisia, Iran) and a lack of trust in a platform that is generally considered to be a foreign, and sometimes explicitly American, tool. Rumors abound that several states (e.g. Israel, Iran) have concerted efforts to work on Wikipedia content, creating a chilling effect for new editors who might feel that editing certain pages might prove dangerous, or simply frustrating or impossible. Some editors spoke of being asked by Syrian government officials for advice on how to remove critical content, or how to identify the editors responsible for putting it there. Again: the effect is chilling.

A lack of locally produced and edited content about the region clearly can’t be blamed entirely on ‘outsiders’. Many editors in the Arabic Wikipedia have felt snubbed by the creation of an explicitly “Egyptian Arabic” Wikipedia, which has not only forked the content and editorial effort, but also stymied any ‘pan-Arab’ identity on the platform. There is a culture of administrators deleting articles they do not think are locally appropriate; often relating to politically (or culturally) sensitive topics. Due to Arabic Wikipedia’s often vicious edit wars, it is heavily moderated (unlike for example the English version), and anonymous edits do not appear instantly.

Some editors at the workshops noted other systemic and cultural issues, for example complaining of an education system that encourages rote learning, reinforcing the notion that only experts should edit (or moderate) a topic, rather than amateurs with local familiarity. Editors also pointed to the marked gender disparities on the site, a longstanding issue for other Wikipedia versions as well. None of these discouragements are helped by what some editors noted as a larger ‘image problem’ with editing in the Arabic Wikipedia, given it would always be overshadowed by the dominant English Wikipedia, one editor commenting that: “the English Wikipedia is vastly larger than its Arabic counterpart, so it is not unthinkable that there is more content, even about Arab-world subjects, in English. From my (unscientific) observation, many times, content in Arabic about a place or a tribe is not very encyclopedic, but promotional, and lacks citations”. Translating articles into Arabic might be seen as menial and unrewarding work, when the exciting debates about an article are happening elsewhere.

When we consider the coming-together of all of these barriers, it might be surprising that Wikipedia is actually as large as it is. However, the editors we spoke with were generally optimistic about the site, considering it an important activity that serves the greater good. Wikipedia is without doubt one of the most significant cultural and political forces on the Internet. Wikipedians are remarkably generous with their time, and it’s their efforts that are helping to document, record, and represent much of the world – including places where documentation is scarce. Most of the editors at our workshop ultimately considered Wikipedia a path to a more just society, through consensus, voting, and an aspiration to record certain truths; they saw it not just as a site of conflict, but also as a site of regional (and local) pride. When asked why he writes geographic content, one editor simply replied: “It’s my own town”.

Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

How well represented is the MENA region in Wikipedia?

There are more Wikipedia articles in English than Arabic about almost every Arabic speaking country in the Middle East. Image of rock paintings in the Tadrart Acacus region of Libya by Luca Galuzzi.
Wikipedia is often seen to be both an enabler and an equalizer. Every day hundreds of thousands of people collaborate on an (encyclopaedic) range of topics; writing, editing and discussing articles, and uploading images and video content. This structural openness combined with Wikipedia’s tremendous visibility has led some commentators to highlight it as “a technology to equalize the opportunity that people have to access and participate in the construction of knowledge and culture, regardless of their geographic placing” (Lessig 2003). However, despite Wikipedia’s openness, there are also fears that the platform is simply reproducing worldviews and knowledge created in the Global North at the expense of Southern viewpoints (Graham 2011; Ford 2011). Indeed, there are indications that global coverage in the encyclopaedia is far from ‘equal’, with some parts of the world heavily represented on the platform, and others largely left out (Hecht and Gergle 2009; Graham 2011, 2013, 2014).

These second-generation digital divides are not merely divides of Internet access (much discussed in the late 1990s), but gaps in representation and participation (Hargittai and Walejko 2008). Whereas most Wikipedia articles written about most European and East Asian countries are written in their dominant languages, for much of the Global South we see a dominance of articles written in English. These geographic differences in the coverage of different language versions of Wikipedia matter, because fundamentally different narratives can be (and are) created about places and topics in different languages (Graham and Zook 2013; Graham 2014).

If we undertake a ‘global analysis’ of this pattern by examining the number of geocoded articles (i.e. articles about a specific place) across Wikipedia’s main language versions (Figure 1), the first thing we can observe is the incredible human effort that has gone into describing ‘place’ in Wikipedia. The second is the clear and highly uneven geography of information, with Europe and North America home to 84% of all geolocated articles. Almost all of Africa is poorly represented in the encyclopaedia: remarkably, there are more Wikipedia articles written about Antarctica (14,959) than any country in Africa, and more geotagged articles relating to Japan (94,022) than the entire MENA region (88,342). In Figure 2 it is even more obvious that Europe and North America lead in terms of representation on Wikipedia.

Figure 1. Total number of geotagged Wikipedia articles across all 44 surveyed languages.
Figure 2. Number of regional geotagged articles and population.

Knowing how many articles describe a place only tells a part of the ‘representation story’. Figure 3 adds the linguistic element, showing the dominant language of Wikipedia articles per country. The broad pattern is that some countries largely define themselves in their own languages, and others appear to be largely defined from outside. For instance, almost all European countries have more articles about themselves in their dominant language; that is, most articles about the Czech Republic are written in Czech. Most articles about Germany are written in German (not English).

Figure 3. Language with the most geocoded articles by country (across 44 top languages on Wikipedia).

We do not see this pattern across much of the South, where English dominates across much of Africa, the Middle East, South and East Asia, and even parts of South and Central America. French dominates in five African countries, and German is dominant in one former German colony (Namibia) and a few other countries (e.g. Uruguay, Bolivia, East Timor).

The scale of these differences is striking. Not only are there more Wikipedia articles in English than Arabic about almost every Arabic speaking country in the Middle East, but there are more English articles about North Korea than there are Arabic articles about Saudi Arabia, Libya, and the UAE. Not only do we see most of the world’s content written about global cores, but it is largely dominated by relatively few languages.

Figure 4 shows the total number of geotagged Wikipedia articles in English per country. The sheer density of this layer of information over some parts of the world is astounding (with 928,542 articles about places in English). Nonetheless, in this layer of geotagged English content, only 3.23% of the articles are about Africa, and 1.67% are about the MENA region.
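To give a sense of scale, these shares can be converted into approximate article counts. The short Python sketch below uses only the figures quoted above; the variable names are illustrative, and the results are approximate because the published percentages are rounded.

```python
# Figures quoted in the text for the English Wikipedia's geotagged layer.
total_english_geotagged = 928_542
share_percent = {"Africa": 3.23, "MENA": 1.67}

# Approximate article counts implied by the rounded percentages.
implied_articles = {
    region: round(total_english_geotagged * pct / 100)
    for region, pct in share_percent.items()
}
print(implied_articles)
```

That works out to roughly 30,000 English geotagged articles about Africa and roughly 15,500 about the MENA region.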

Figure 4. Number of geotagged articles in the English Wikipedia by country.

We see a somewhat different pattern when looking at the global geography of the 22,548 geotagged articles of the Arabic Wikipedia (Figure 5). Algeria and Syria are both defined by a relatively high number of articles in Arabic (as are the US, Italy, Spain, Russia and Greece). These information densities are substantially greater than what we see for many other MENA countries in which Arabic is an official language (such as Egypt, Morocco, and Saudi Arabia). This is even more surprising when we realise that the Italian and Spanish populations are smaller than the Egyptian, but there are nonetheless far more geotagged articles in Arabic about Italy (2,428) and Spain (1,988) than about Egypt (433).
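The contrast is easier to see when the counts are expressed as shares of the Arabic Wikipedia's geotagged total. The Python sketch below uses only the numbers quoted above; the variable names are illustrative assumptions.

```python
# Counts quoted in the text for the Arabic Wikipedia's geotagged articles.
total_arabic_geotagged = 22_548
articles_about = {"Italy": 2_428, "Spain": 1_988, "Egypt": 433}

# Share of all Arabic geotagged articles devoted to each country, in percent.
arabic_share_percent = {
    country: round(100 * count / total_arabic_geotagged, 1)
    for country, count in articles_about.items()
}

# How many Arabic articles about Italy exist for each one about Egypt.
italy_to_egypt = articles_about["Italy"] / articles_about["Egypt"]

print(arabic_share_percent)
print(round(italy_to_egypt, 1))
```

On these figures, Italy alone accounts for about a tenth of the Arabic geotagged layer, against under 2% for Egypt.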

Figure 5. Total number of geotagged articles in the Arabic Wikipedia by country.

By mapping the geography of Wikipedia articles in both global and regional languages, we can begin to examine the layers of representation that ‘augment’ the world we live in. We have seen that, notable exceptions aside (e.g. ‘Iran’ in Farsi and ‘Israel’ in Hebrew) the MENA region tends to be massively underrepresented — not just in major world languages, but also in its own: Arabic. Clearly, much is being left unsaid about that part of the world. Although we entered the project anticipating that the MENA region would be under-represented in English, we did not anticipate the degree to which it is under-represented in Arabic.


Ford, H. (2011) The Missing Wikipedians. In Critical Point of View: A Wikipedia Reader, ed. G. Lovink and N. Tkacz, 258-268. Amsterdam: Institute of Network Cultures.

Graham, M. (2014) The Knowledge Based Economy and Digital Divisions of Labour. In Companion to Development Studies, 3rd edition, eds V. Desai and R. Potter. Hodder, pp. 189-195.

Graham, M. (2013) The Virtual Dimension. In Global City Challenges: Debating a Concept, Improving the Practice. Eds. Acuto, M. and Steele, W. London: Palgrave.

Graham, M. (2011) Wiki Space: Palimpsests and the Politics of Exclusion. In Critical Point of View: A Wikipedia Reader. Eds. Lovink, G. and Tkacz, N. Amsterdam: Institute of Network Cultures, pp. 269-282.

Graham M., and M. Zook (2013) Augmented Realities and Uneven Geographies: Exploring the Geolinguistic Contours of the Web. Environment and Planning A 45 (1) 77–99.

Hargittai, E. and G. Walejko (2008) The Participation Divide: Content Creation and Sharing in the Digital Age. Information, Communication and Society 11 (2) 239–256.

Hecht B., and D. Gergle (2009) Measuring self-focus bias in community-maintained knowledge repositories. In Proceedings of the 4th International Conference on Communities and Technologies, Penn State University, 2009, pp. 11–20. New York: ACM.

Lessig, L. (2003) An Information Society: Free or Feudal. Talk given at the World Summit on the Information Society, Geneva, 2003.

Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

The sum of (some) human knowledge: Wikipedia and representation in the Arab World

Arabic is one of the least represented major world languages on Wikipedia: few languages have more speakers and fewer articles than Arabic. Image of the Umayyad Mosque (Damascus) by Travel Aficionado

Wikipedia currently contains over 9 million articles in 272 languages, far surpassing any other publicly available information repository. Being the first point of contact for most general topics (therefore an effective site for framing any subsequent representations) it is an important platform from which we can learn whether the Internet facilitates increased open participation across cultures — or reinforces existing global hierarchies and power dynamics. Because the underlying political, geographic and social structures of Wikipedia are hidden from users, and because there have not been any large scale studies of the geography of these structures and their relationship to online participation, entire groups of people (and regions) may be marginalized without their knowledge.

This process is important to understand, for the simple reason that Wikipedia content has begun to form a central part of services offered elsewhere on the Internet. When you look for information about a place on Facebook, the description of that place (including its geographic coordinates) comes from Wikipedia. If you want to “check in” to a museum in Doha to signify to your friends that you were there, the place you check in to was created with Wikipedia data. When you Google “House of Saud” you are presented not only with a list of links (with Wikipedia at the top) but also with a special ‘card’ summarising the House. This data comes from Wikipedia. When you look for people or places, Google now has these terms inside its ‘knowledge graph’, a network of related concepts with data coming directly from Wikipedia. Similarly, on Google Maps, Wikipedia descriptions for landmarks are presented as part of the default information.

Ironically, Wikipedia editorship is actually on a slow and steady decline, even as its content and readership increase year on year. Since 2007 and the introduction of significant devolution of administrative powers to volunteers, Wikipedia has not been able to effectively retain newcomers, something which has been noted as a concern by many at the Wikimedia Foundation. Some think Wikipedia might be levelling off because there’s only so much to write about. This is extremely far from the truth; there are still substantial gaps in geographic content in English and overwhelming gaps in other languages. Wikipedia often brands itself as aspiring to contain “the sum of human knowledge”, but behind this mantra lie policy pitfalls, tedious editor debates and delicate sourcing issues that hamper greater representation of the region. Of course these challenges form part of Wikipedia’s continuing evolution as the de facto source for online reference information, but they also (disturbingly) act to entrench particular ways of “knowing”, and ways of validating what is known.

There are over 260,000 articles in Arabic, receiving 240,000 views per hour. Even so, this makes Arabic one of the least represented major world languages on Wikipedia: few languages have more speakers and fewer articles than Arabic. This relative lack of MENA voice and representation means that the tone and content of this globally useful resource, in many cases, is being determined by outsiders with a potential misunderstanding of the significance of local events, sites of interest and historical figures. In an area that has seen substantial social conflict and political upheaval, greater participation from local actors would help to ensure balance in content about contentious issues. Unfortunately, most research on MENA’s Internet presence has so far been drawn from anecdotal evidence, and no comprehensive studies currently exist.

In this project we wanted to understand where place-based content comes from, to explain reasons for the relative lack of Wikipedia articles in Arabic and about the MENA region, and to understand which parts of the region are particularly underrepresented. We also wanted to understand the relationship between Wikipedia’s administrative structure and the treatment of new editors; in particular, we wanted to know whether editors from the MENA region have less of a voice than their counterparts from elsewhere, and whether the content they create is considered more or less legitimate, as measured through the number of reverts, i.e. the overriding of their work by other editors.

Our practical objectives involved a consolidation of Middle Eastern Wikipedians through a number of workshops focusing on how to create more equitable and representative content, with the ultimate goal of making Wikipedia a more generative and productive site for reference information about the region. Capacity building among key Wikipedians can create greater understanding of barriers to participation and representation and offset much of the (often considerable) emotional labour required to sustain activity on the site in the face of intense arguments and ideological biases. Potential systematic structures of exclusion that could be a barrier to participation include such competitive practices as content deletion, indifference to content produced by MENA authors, and marginalization through bullying and dismissal.

However, a distinct lack of sources — owing both to a lack of legitimacy for MENA journalism and a paucity of open access government documents — is also inhibiting further growth of content about the region. When inclusion of a topic is contested by editors it is typically because there is not enough external source material about it to establish “notability”. As Ford (2011) has already discussed, notability is often culturally mediated. For example, a story in Al Jazeera would not have been considered a sufficient criterion of notability a couple of years ago. However, this has changed dramatically since its central role in reporting on the Arab Spring.

Unfortunately, notability can create a feedback loop. If an area of the world is underreported, there are no sources. If there are no sources, then journalists do not always have enough information to report about that part of the world. ‘Correct’ sourcing trumps personal experience on Wikipedia; even if an author is from a place, and is watching a building being destroyed, their Wikipedia edit will not be accepted by the community unless the event is discussed in another ‘official’ medium. Often the edit will either be branded with a ‘citation needed’ tag, eliminated, or discussed in the talk page. Particularly aggressive editors and administrators will nominate the page for ‘speedy deletion’ (i.e. deletion without discussion), a practice that makes responses from an author difficult.

Why does any of this matter in practical terms? For the simple reason that biases, absences and contestations on Wikipedia spill over into numerous other domains that are in regular and everyday use (Graham and Zook, 2013). If a place is not on Wikipedia, this might have a chilling effect on business and stifle journalism; if a place is represented poorly on Wikipedia this can lead to misunderstandings about the place. Wikipedia is not a legislative body. However, in the court of public opinion, Wikipedia represents one of the world’s strongest forces, as it quietly inserts itself into representations of place worldwide (Graham et al. 2013; Graham 2013).

Wikipedia is not merely a site of reference information, but is rapidly becoming the de facto site for representing the world to itself. We need to understand more about that representation.

Further Reading

Allagui, I., Graham, M., and Hogan, B. 2014. Wikipedia Arabe et la Construction Collective du Savoir. In Wikipedia, objet scientifique non identifie. eds. Barbe, L., and Merzeau, L. Paris: Presses Universitaires du Paris Ouest (in press).

Graham, M., Hogan, B., Straumann, R. K., and Medhat, A. 2014. Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers (forthcoming).

Graham, M. 2012. Die Welt in Der Wikipedia Als Politik der Exklusion: Palimpseste des Ortes und selective Darstellung. In Wikipedia. eds. S. Lampe, and P. Bäumer. Bundeszentrale für politische Bildung/bpb, Bonn.

Graham, M. 2011. Wiki Space: Palimpsests and the Politics of Exclusion. In Critical Point of View: A Wikipedia Reader. Eds. Lovink, G. and Tkacz, N. Amsterdam: Institute of Network Cultures, 269-282.


Ford, H. (2011) The Missing Wikipedians. In Geert Lovink and Nathaniel Tkacz (eds), Critical Point of View: A Wikipedia Reader, Amsterdam: Institute of Network Cultures, 2011. ISBN: 978-90-78146-13-1.

Graham, M., M. Zook, and A. Boulton. 2013. Augmented Reality in the Urban Environment: contested content and the duplicity of code. Transactions of the Institute of British Geographers. 38(3), 464-479.

Graham, M. and M. Zook. 2013. Augmented Realities and Uneven Geographies: Exploring the Geo-linguistic Contours of the Web. Environment and Planning A 45(1), 77-99.

Graham, M. 2013. The Virtual Dimension. In Global City Challenges: debating a concept, improving the practice. eds. M. Acuto and W. Steele. London: Palgrave. 117-139.

Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

Who represents the Arab world online?

Editors from all over the world have played some part in writing about Egypt; in fact, only 13% of all edits actually originate in the country (38% are from the US). More: Who edits Wikipedia? by Mark Graham.

Ed: In basic terms, what patterns of ‘information geography’ are you seeing in the region?

Mark: The first pattern that we see is that the Middle East and North Africa are relatively under-represented in Wikipedia. Even after accounting for factors like population, Internet access, and literacy, we still see less content than would be expected. Second, of the content that exists, a lot of it is in English and French rather than in Arabic (or Farsi or Hebrew). In other words, there is even less in local languages.

And finally, if we look at contributions (or edits), not only do we see a relatively small number of edits originating in the region, but many of those edits are being used to write about other parts of the world rather than their own region. What this broadly seems to suggest is that the participatory potentials of Wikipedia aren’t yet being harnessed in order to even out the differences between the world’s informational cores and peripheries.

Ed: How closely do these online patterns in representation correlate with regional (offline) patterns in income, education, language, access to technology (etc.)? Can you map one to the other?

Mark: Population and broadband availability alone explain a lot of the variance that we see. Other factors like income and education also play a role, but it is population and broadband that have the greatest explanatory power here. Interestingly, it is most countries in the MENA region that fail to fit well to those predictors.

Ed: How much do you think these patterns result from the systematic imposition of a particular view point – such as official editorial policies – as opposed to the (emergent) outcome of lots of users and editors acting independently?

Mark: Particular modes of governance in Wikipedia are likely a factor here. To combat vandalism, the Arabic Wikipedia, for instance, has a feature whereby changes to articles need to be reviewed before being made public. This alone seems to put off some potential contributors. Guidelines around sourcing in places where there are few secondary sources also likely play a role.

Ed: How much discussion (in the region) is there around this issue? Is this even acknowledged as a fact or problem?

Mark: I think it certainly is recognised as an issue now. But there are few viable alternatives to Wikipedia. Our goal is hopefully to identify problems that lead to solutions, rather than simply discouraging people from even using the platform.

Ed: This work has been covered by the Guardian, Wired, the Huffington Post (etc.). How much interest has there been from the non-Western press or bloggers in the region?

Mark: There has been a lot of coverage from the non-Western press, particularly in Latin America and Asia. However, I haven’t actually seen that much coverage from the MENA region.

Ed: As an academic, do you feel at all personally invested in this, or do you see your role to be simply about the objective documentation and analysis of these patterns?

Mark: I don’t believe there is any such thing as ‘objective documentation.’ All research has particular effects in and on the world, and I think it is important to be aware of the debates, processes, and practices surrounding any research project. Personally, I think Wikipedia is one of humanity’s greatest achievements. No previous single platform or repository of knowledge has ever even come close to Wikipedia in terms of its scale or reach. However, that is all the more reason to critically investigate what exactly is, and isn’t, contained within this fantastic resource. By revealing some of the biases and imbalances in Wikipedia, I hope that we’re doing our bit to improve it.

Ed: What factors do you think would lead to greater representation in the region? For example: is this a matter of voices being actively (or indirectly) excluded, or are they maybe just not all that bothered?

Mark: This is certainly a complicated question. I think the most important step would be to encourage participation from the region, rather than just representation of the region. Some of this involves increasing some of the enabling factors that are the prerequisites for participation; factors like: increasing broadband access, increasing literacy, encouraging more participation from women and minority groups.

Some of it is then changing perceptions around Wikipedia. For instance, many people that we spoke to in the region framed Wikipedia as an American or outside project rather than something that is locally created. Unfortunately we seem to be currently stuck in a vicious cycle in which few people from the region participate, therefore fulfilling the very reason why some people think that they shouldn’t participate. There is also the issue of sources. Not only does Wikipedia require all assertions to be properly sourced, but secondary sources themselves can be a great source of raw informational material for Wikipedia articles. However, if few sources about a place exist, then it adds an additional burden to creating content about that place. Again, a vicious cycle of geographic representation.

My hope is that by both working on some of the necessary conditions to participation, and engaging in a diverse range of initiatives to encourage content generation, we can start to break out of some of these vicious cycles.

Ed: The final moonshot question: How would you like to extend this work; time and money being no object?

Mark: Ideally, I’d like us to better understand the geographies of representation and participation outside of just the MENA region. This would involve mixed-methods (large scale big data approaches combined with in-depth qualitative studies) work focusing on multiple parts of the world. More broadly, I’m trying to build a research program that maintains a focus on a wide range of Internet and information geographies. The goal here is to understand participation and representation through a diverse range of online and offline platforms and practices and to share that work through a range of publicly accessible media: for instance the ‘Atlas of the Internet’ that we’re putting together.

Mark Graham was talking to blog editor David Sutcliffe.

Crowdsourcing translation during crisis situations: are ‘real voices’ being excluded from the decisions and policies it supports?

As revolution spread across North Africa and the Middle East in 2011, participants and observers of the events were keen to engage via social media. However, saturation by Arabic-language content demanded a new translation strategy for those outside the region to follow the information flows, and for those inside to reach beyond their domestic audience. Crowdsourcing was seen as the most efficient strategy in terms of cost and time to meet the demand, and translation applications that harnessed volunteers across the internet were integrated with nearly every type of ICT project. For example, as Steve Stottlemyre has already mentioned on this blog, translation played a part in tools like the Libya Crisis Map, and was essential for harnessing tweets from the region’s ‘voices on the ground.’

If you have ever worried about media bias then you should really worry about the impact of translation. Before the revolutions, the translation software for Egyptian Arabic was almost non-existent. Few translation applications were able to handle the different Arabic dialects or supply coding labor and capital to build something that could contend with internet blackouts. Google’s Speak to Tweet became the dominant application used in the Egyptian uprisings, delivering one homogenized source of information that fed the other sources. In 2011, this collaboration helped circumvent the problem of Internet connectivity in Egypt by allowing cellphone users to call their tweet into a voicemail to be transcribed and translated. A crowd of volunteers working for Twitter enhanced translation of Egyptian Arabic after the Tweets were first transcribed by a Mechanical Turk application trained from an initial 10 hours of speech.

Papers on Policy, Activism, Government and Representation: New Issue of Policy and Internet

We are pleased to present the combined third and fourth issue of Volume 4 of Policy and Internet. It contains eleven articles, each of which investigates the relationship between Internet-based applications and data and the policy process. The papers have been grouped into the broad themes of policy, government, representation, and activism.

POLICY: In December 2011, the European Parliament Directive on Combating the Sexual Abuse, Sexual Exploitation of Children and Child Pornography was adopted. The directive’s much-debated Article 25 requires Member States to ensure the prompt removal of child pornography websites hosted in their territory and to endeavor to obtain the removal of such websites hosted outside their territory. Member States are also given the option to block access to such websites to users within their territory. Both these policy choices have been highly controversial; Karel Demeyer, Eva Lievens, and Jos Dumortier analyse the technical and legal means of blocking and removing illegal child sexual content from the Internet, clarifying the advantages and drawbacks of the various policy options.

Another issue of jurisdiction surrounds government use of cloud services. While cloud services promise to render government service delivery more effective and efficient, they are also potentially stateless, triggering government concern over data sovereignty. Kristina Irion explores these issues, tracing the evolution of individual national strategies and international policy on data sovereignty. She concludes that data sovereignty presents national governments with a legal risk that can’t be addressed through technology or contractual arrangements alone, and recommends that governments retain sovereignty over their information.
