Examining the data-driven value chains that are changing Rwanda’s tea sector

Behind the material movement that takes tea from the slopes of Rwanda’s ‘thousand hills’ to a box on a shelf in Tesco is a growing set of less visible digital data flows. Image by pasunejen.
Production of export commodity goods like tea, coffee and chocolate is an important contributor to economies in Africa. Producers sell their goods into international markets, with the final products being sold in supermarkets, here in the UK and throughout the world. So what role is new Internet connectivity playing in changing these sectors — which are often seen as slow to adopt new technologies? As part of our work examining the impacts of growing Internet connectivity and new digital ICTs in East Africa we explored uses of the Internet and ICTs in the tea sector in Rwanda.

Tea is a sector with well-established practices and relations in the region, so we were curious whether ICTs might be changing it. Of course, one cannot ignore the movement of material goods when researching the tea sector. Tea is Rwanda’s main export by value, and in 2012 the country moved over 21,000 tonnes of tea, accruing around $56m in value. During our fieldwork we interviewed cooperatives in remote offices surrounded by tea plantations in the temperate Southern highlands, tea processors in noisy tea factories heavy with the overpowering smell of fermenting tea leaves, and tea buyers and sellers surrounded by corridors piled high with sacks of tea.

But behind the material movement that takes tea from the slopes of Rwanda’s ‘thousand hills’ to a box on a shelf in Tesco is a growing set of less visible digital data flows. Whilst the adoption of digital technologies is not comprehensive in the Rwandan tea sector (with, for example, very low Internet use among tea growers), we did find growing use of the Internet and ICTs. More importantly, where they were present, digital flows of information (such as tea-batch tracking, logistics and sales prices) were increasingly important to the ability of firms to improve production and ultimately to increase their profit share from tea. We have termed this a ‘data-driven value chain’ to highlight that these new digital information flows are becoming as important as the flows of material goods.

So why is tea production becoming increasingly ‘data-driven’? We found two principal drivers at work. Firstly, production of commodities like tea has shifted to private ownership. In Rwanda, tea processing factories are no longer owned by the government (as they were a decade ago) but by private firms, including several multinational tea firms. Prices for buying and selling tea are also no longer fixed by the government, but depend on the market; flat-rate prices ended at the end of 2012. Data on everything from international prices to tea quality and logistics has become increasingly important as Rwandan tea firms look to be part of the global market, by better coordinating production and improving the prices of their tea. For instance, privately owned tea factories (often in remote locations) connect via satellite or microwave Internet links to head offices, and systems integration allows multinational tea firms to track and monitor production at the touch of a button.

Secondly, we need to understand new product innovation in the tea sector. In recent years new products have particularly revolved around growing demand in the retail market for differentiated products (such as ‘environmental’, fair trade or high quality teas) for which the consumer is willing to pay more. This relates most obviously to activities in the fields and tea processors, but digital information is also crucial in allowing for ‘traceability’ of tea. Because traceability guarantees that tea batches have satisfied conditions around location, food safety, chemical use, fair labour and so on, data is a key component of new product innovation: it is integral to firms’ abilities to prove their value-added production or methods.

The idea of agricultural value chains (analysing agricultural production from the perspective of a fragmented network of interconnected firms) has become increasingly influential in strategies and policy making supported by large donors such as the World Bank and the International Fund for Agricultural Development (IFAD), an agency of the UN.

These value chain approaches explore the amount of economic ‘value’ that different actors in the supply chain are able to capture in production. For instance, Rwandan tea farmers are only able to capture very small proportions of the final retail prices — we estimate they are paid less than 6% of the cost of the eventual retail product, and only 22% of the cost of the raw tea that is sold to retailers. Value chain analysis has been popular for policy makers and donors in that it helps them to formulate policies to support how firms in countries like Rwanda improve their value through innovation, improving processes of production, or reaching new customers.
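The two farmer shares quoted above can be combined into a rough bound on how much of the retail price is accounted for by raw tea. The following is a minimal sketch, assuming that both percentages describe the same payment to farmers for the same quantity of tea; the function name is illustrative and is not taken from the report.

```python
def raw_tea_share_of_retail(farmer_share_of_retail, farmer_share_of_raw_tea):
    """Share of the retail price accounted for by raw tea.

    Assumption (not stated in the report): both farmer shares describe the
    same payment, so
        farmer_payment = farmer_share_of_retail * retail_price
                       = farmer_share_of_raw_tea * raw_tea_price
    and therefore raw_tea_price / retail_price is the ratio of the two shares.
    """
    return farmer_share_of_retail / farmer_share_of_raw_tea


# Figures from the text: under 6% of the retail price, 22% of the raw tea price.
upper_bound = raw_tea_share_of_retail(0.06, 0.22)
print(f"Raw tea is at most about {upper_bound:.0%} of the retail price")
```

Because 6% is itself an upper limit, the result is also an upper limit: under these assumptions, more than 70% of the retail price is added after the raw tea has been sold on.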

Yet, at the moment it appears that the types of analysis being done by policy makers and donors pay very little attention to the importance of digital data, and so they are presenting an unclear picture of the ways to improve — with a tendency to focus on material matters such as machinery or business models.

Our research particularly highlighted the importance of considering how to adapt digital data flows. The ways that digital information is codified, digitised and accessed can be exclusionary, reducing the ability for smaller actors in Rwanda to compete. For instance, we found that lack of access to clear information about prices, tea quality and wider market information means that smallholders, small processors and cooperatives may not compete as well as they could, or may miss out on wider innovations in tea production.

While we have focused here only on tea production, our discussions with those working in other agricultural sectors, and in other countries, suggest that our observations have wider significance. In agricultural production, strategists, policy makers and researchers mainly focus on the material elements of production: those which are more visible and quantifiable. However, we suggest that often underlying such actions is a growing layer of digital data activity. It is only through more coherent analysis of the role of digital technologies and data that we can better analyse production, and build appropriate policy and strategies to support commodity producers in sectors like Rwandan tea.

Read the full report: Foster, C., and Graham, M. (2015) Connectivity and the Tea Sector in Rwanda. Value Chains and Networks of Connectivity-Based Enterprises in Rwanda. Project Report, Oxford Internet Institute, University of Oxford.

Chris Foster is a researcher at the Oxford Internet Institute. His research focus is on technologies and innovation in developing and emerging markets, with a particular interest in how ICTs can support development of low income groups.

Why haven’t digital platforms transformed firms in developing countries? The Rwandan tourism sector explored

Tourism is becoming an increasingly important contributor to Rwanda’s economy. Image of Homo sapiens and Gorilla beringei beringei meeting in Rwanda’s Volcanoes National Park by Andries3.

One of the great hopes for new Internet connectivity in the developing world is that it will allow those in developing countries who offer products and services to link to and profit from global customers. With the landing of undersea Internet infrastructure in East Africa, there have been hopes that, as firms begin to use the Internet more extensively, improved links to markets will positively impact them.

Central to enabling new customer transactions is the emergence of platforms — digital services, websites and online exchanges — that allow more direct customer-producer interactions to occur. As part of our work exploring the impacts of growing internet connectivity and digital ICTs in East Africa, we wanted to explore how digital platforms were affecting Rwandan firms. Have Rwandan firms been able to access online platforms? What impact has access to these platforms had on firms?

Tourism is becoming an increasingly important contributor to Rwanda’s economy, with a 3.1% direct contribution to GDP and 7% of employment. Tourism is typically focused on affluent international tourists who come to explore the wildlife of the country, most notably because it is the most accessible location in which to see the mountain gorilla. Rwandan policy makers see tourism as a potential area for expansion, and new connectivity could be one key driver in making the country more accessible to customers.

Tourist service providers in Rwanda have very high Internet adoption, and even the smallest hotel or tour agency is likely to have at least one mobile Internet-connected laptop. Many of the global platforms also have a presence in the region: online travel agents such as Expedia and Hotels.com work with Rwandan hotels, common social media used by tourists such as TripAdvisor and Facebook are also well-known, and firms have been encouraged by the government to integrate into payment platforms like Visa.

So, in the case of Rwandan tourism, Internet connectivity, Internet access and sector-wide platforms are certainly available for tourism firms. During our fieldwork, however (and to our surprise) we found adoption of digital tourism platforms to be low, and the impact on Rwandan tourism minimal. Why? This came down to three mismatches – essentially to do with integration, with fit, and with interactions.

Global tourism platforms offer the potential for Rwandan firms to seamlessly reach a wider range of potential tourists around the globe. However, we found that the requirements for integration into global platforms were often unclear for Rwandan firms, and there was a poor fit with the existing systems and skills. For example, hotels and lodges normally integrate into online travel agencies through integration of internal information systems, which track bookings and availability within hotels. However, in Rwanda, whilst a few larger hotels used booking systems, even the medium-sized hotels lacked internal booking systems, with booking based on custom Excel spreadsheets, or even paper diaries. When tourism firms attempted to integrate into online services they thus ran into problems, and only the large (international) hotel chains tended to be fully integrated.

Integration of East African tourism service providers into global platforms was also limited by the nature of the activities in the region. Global platforms have typically focused on providing facilities for online information, booking and payment for discrete tourism components — a hotel, a flight, a review of an attraction. However, in East Africa much international tourism is ‘packaged’, meaning a third-party (normally a tour operator) will build an itinerary and make all the bookings for customers. This means that online tourism platforms don’t provide a particularly good fit, either for tourists or Rwandan service providers. A tourist will not want the complication of booking a full itinerary online, and a small lodge that gets most of its bookings through tour operators will see little potential in integrating into a global online platform.

Interaction of Rwandan tourism service providers with online platforms is inevitably undertaken over digital networks, based on remote interactions, payments and information flows. This arms-length relationship often becomes problematic where the skills and ability of service providers are lower. For example, Rwandan tourism service providers often require additional information, help or even training on how best to use platforms which are frequently changing. In contexts where lower cost Internet can at times be inconsistent, and payment systems can be busy, having the ability to connect to local help and discuss issues is important. Yet, this is the very element that global platforms like online travel agents are often trying to remove.

So in general, we found that tourism platforms supported the large international hotels and resorts where systems and structures were already in place for seamless integration into platforms. Indeed, as the Rwandan government looks to expand the tourism sector (such as through new national parks and regional integration), there is a risk that the digital domain will support generic international chains entering the country — over the expansion of local firms.

There are potential ways forward, though. Ironically, the most successful online travel agency in Rwanda is one that has contracted a local firm in the capital Kigali to allow for ‘thicker’ interactions between Rwandan service providers and platform providers. There are also a number of South African and Kenyan online platforms in the early stages of development that are more attuned to the regional contexts of tourism (for example Safari Now, a dynamic Safari scheduling platform; Nights Bridge, an online platform for smaller hotels; and WETU, an itinerary sharing platform for service providers), and these may eventually offer a better solution for Rwandan tourism service providers.

We came to similar conclusions in the other sectors we examined as part of our research in East Africa (looking at tea production and Business Process Outsourcing) — that is, that use of online platforms faces limitations in the region. Even as firms find themselves able to access the Internet, the way these global platforms are designed presents a poor fit to the facilities, activities and needs of firms in developing countries. Indeed, in globalised sectors (such as tourism and business outsourcing) platforms can be actively exclusionary, aiding international firms entering developing countries over those local firms seeking to expand outwards.

For platform owners and developers focusing on such developing markets, the impacts of greater access to the Internet are therefore likely to come when platforms are able to balance global reach and standards with the specific needs and contexts of developing countries.

Read the full report: Foster, C., and Graham, M. (2015) The Internet and Tourism in Rwanda. Value Chains and Networks of Connectivity-Based Enterprises in Rwanda. Project Report, Oxford Internet Institute, University of Oxford.


What explains the worldwide patterns in user-generated geographical content?

The geographies of codified knowledge have always been uneven, affording some people and places greater voice and visibility than others. While the rise of the geosocial Web seemed to promise a greater diversity of voices, opinions, and narratives about places, many regions remain largely absent from the websites and services that represent them to the rest of the world. These highly uneven geographies of codified information matter because they shape what is known and what can be known. As geographic content and geospatial information become increasingly integral to our everyday lives, places that are left off the ‘map of knowledge’ will be absent from our understanding of, and interaction with, the world.

We know that Wikipedia is important to the construction of geographical imaginations of place, and that it has immense power to augment our spatial understandings and interactions (Graham et al. 2013). In other words, the presences and absences in Wikipedia matter. If a person’s primary free source of information about the world is the Persian or Arabic or Hebrew Wikipedia, then the world will look fundamentally different from the world presented through the lens of the English Wikipedia. The capacity to represent oneself to outsiders is especially important in those parts of the world that are characterized by highly uneven power relationships: Brunn and Wilson (2013) and Graham and Zook (2013) have already demonstrated the power of geospatial content to reinforce power in a South African township and Jerusalem, respectively.

Until now, there has been no large-scale empirical analysis of the factors that explain information geographies at the global scale; this is something we have aimed to address in this research project on Mapping and measuring local knowledge production and representation in the Middle East and North Africa. Using regression models of geolocated Wikipedia data we have identified what are likely to be the necessary conditions for representation at the country level, and have also identified the outliers, i.e. those countries that fare considerably better or worse than expected. We found that a large part of the variation could be explained by just three factors: namely, (1) country population, (2) availability of broadband Internet, and (3) the number of edits originating in that country. [See the full paper for an explanation of the data and the regression models.]
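To illustrate the general approach of fitting a country-level model and then looking for outliers, here is a minimal sketch in plain Python. It is not the paper's specification: the log transforms, the least-squares solver, the helper names and every figure in the toy table are assumptions made purely for illustration.

```python
import math

# Hypothetical toy data, NOT figures from the paper.
# name -> (population, broadband subscriptions per 100 people, edits, geotagged articles)
TOY = {
    "A": (5e6, 30.0, 200_000, 40_000),
    "B": (80e6, 3.0, 50_000, 6_000),
    "C": (2e6, 10.0, 8_000, 700),
    "D": (30e6, 12.0, 90_000, 9_000),
    "E": (10e6, 25.0, 400_000, 60_000),
    "F": (1e6, 1.0, 1_000, 300),
    "G": (60e6, 35.0, 2_000_000, 200_000),
    "H": (9e6, 20.0, 30_000, 1_500),
}


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ols(rows, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def residuals(rows, y, beta):
    """Observed minus fitted values for each row."""
    return [yi - sum(b * v for b, v in zip(beta, [1.0] + list(r)))
            for r, yi in zip(rows, y)]


# Assumed specification: log10(articles) ~ log10(population) + broadband + log10(edits)
rows = [(math.log10(p), bb, math.log10(e)) for p, bb, e, _ in TOY.values()]
y = [math.log10(a) for _, _, _, a in TOY.values()]
beta = fit_ols(rows, y)
for name, r in sorted(zip(TOY, residuals(rows, y, beta)), key=lambda t: t[1]):
    print(f"{name}: residual {r:+.2f}")
```

A strongly negative residual marks a country with fewer articles than the three factors would predict; a strongly positive one marks a country that fares better than expected.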

But how do we explain the significant inequalities in the geography of user-generated information that remain after adjusting for differing conditions using our regression model? While these three variables help to explain the sparse amount of content written about much of Sub-Saharan Africa, most of the Middle East and North Africa have quantities of geographic information below their expected values. For example, despite high levels of wealth and connectivity, Qatar and the United Arab Emirates have far fewer articles than we might expect from the model.

These three factors independently matter, but they will also be subject to a number of constraints. A country’s population will probably affect the number of human sites, activities, and practices of interest; i.e. the number of things one might want to write about. The size of the potential audience might also be influential, encouraging editors in more densely populated regions and those writing in major languages. However, societal attitudes towards learning and information sharing will probably also affect the propensity of people in some places to contribute content. Factors discouraging the number of edits to local content might include a lack of local Wikimedia chapters, the attractiveness of writing content about other (better-represented) places, or contentious disputes in local editing communities that divert time into edit wars and away from content generation.

We might also be seeing a principle of increasing informational poverty. Not only is a broader base of traditional source material (such as books, maps, and images) needed for the generation of any Wikipedia article, but it is likely that the very presence of content itself is a generative factor behind the production of further content. This makes information produced about information-sparse regions most useful for people in informational cores — who are used to integrating digital information into their everyday practices — rather than those in informational peripheries.

Various practices and procedures of Wikipedia editing likely amplify this effect. There are strict guidelines on how knowledge can be created and represented in Wikipedia, including a ban on original research, and the need to source key assertions. Editing incentives and constraints probably also encourage work around existing content (which is relatively straightforward to edit) rather than creation of entirely new material. In other words, the very policies and norms that govern the encyclopedia’s structure make it difficult to populate the white space with new geographic content. In addressing these patterns of increasing informational poverty, we need to recognize that no one of these three conditions can ever be sufficient for the generation of geographic knowledge. As well as highlighting the presences and absences in user-generated content, we also need to ask what factors encourage or limit production of that content.

In interpreting our model, we have come to a stark conclusion: increasing representation doesn’t occur in a linear fashion, but it accelerates in a virtuous cycle, benefitting those with strong editing cultures in local languages. For example, Britain, Sweden, Japan and Germany are extensively georeferenced on Wikipedia, whereas much of the MENA region has not kept pace, even accounting for their levels of connectivity, population, and editors. Thus, while some countries are experiencing the virtuous cycle of more edits and broadband begetting more georeferenced content, those on the periphery of these information geographies might fail to reach a critical mass of editors, or even dismiss Wikipedia as a legitimate site for user-generated geographic content: a problem that will need to be addressed if Wikipedia is indeed to be considered as the “sum of all human knowledge”.

Read the full paper: Graham, M., Hogan, B., Straumann, R.K., and Medhat, A. (2014) Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers.


Brunn, S. D., and M. W. Wilson. 2013. Cape Town’s million plus black township of Khayelitsha: Terrae incognitae and the geographies and cartographies of silence. Habitat International 39: 284–294.

Graham, M., and M. Zook. 2013. Augmented Realities and Uneven Geographies: Exploring the Geolinguistic Contours of the Web. Environment and Planning A 45(1): 77–99.

Graham, M., M. Zook, and A. Boulton. 2013. Augmented Reality in the Urban Environment: Contested Content and the Duplicity of Code. Transactions of the Institute of British Geographers 38(3): 464–479.

Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

What is stopping greater representation of the MENA region?

Negotiating the wider politics of Wikipedia can be a daunting task, particularly when it comes to content about the MENA region. Image of the Dome of the Rock (Qubbat As-Sakhrah), Jerusalem, by 1yen.

Wikipedia has famously been described as a project that “works great in practice and terrible in theory”. One of the ways in which it succeeds is through its extensive consensus-based governance structure. While this has led to spectacular success (over 4.5 million articles in the English Wikipedia alone), the governance structure is neither obvious nor immediately accessible, and can present a barrier for those seeking entry. Editing Wikipedia can be a tough challenge: an often draining and frustrating task, involving heated disputes and arguments where it is often the most tenacious, belligerent, or connected editor who wins out in the end.

Broadband access and literacy are not the only pre-conditions for editing Wikipedia; ‘digital literacy’ is also crucial. This includes the ability to obtain and critically evaluate online sources, locate Wikipedia’s editorial and governance policies, master Wiki syntax, and confidently articulate and assert one’s views about an article or topic. Experienced editors know how to negotiate the rules, build a consensus with some editors to block others, and how to influence administrators during dispute resolution. This strict adherence to the word (if not the spirit) of Wikipedia’s ‘law’ can lead to marginalization or exclusion of particular content, particularly when editors are scared off by unruly mobs who ‘weaponize’ policies to fit a specific agenda.

Governing such a vast collaborative platform as Wikipedia obviously presents a difficult balancing act between being open enough to attract volume of contributions, and moderated enough to ensure their quality. Many editors consider Wikipedia’s governance structure (which varies significantly between the different language versions) essential to ensuring the quality of its content, even if it means that certain editors can (for example) arbitrarily ban other users, lock down certain articles, and exclude moderate points of view. One of the editors we spoke to noted that: “A number of articles I have edited with quality sources, have been subjected to editors cutting information that doesn’t fit their ideas […] I spend a lot of time going back to reinstate information. Today’s examples are in the ‘Battle of Nablus (1918)’ and the ‘Third Transjordan attack’ articles. Bullying does occur from time to time […] Having tried the disputes process I wouldn’t recommend it.” Community building might help support MENA editors faced with discouragement or direct opposition as they try to build content about the region, but easily locatable translations of governance materials would also help. Few of the extensive Wikipedia policy discussions have been translated into Arabic, leading to replication of discussions or ambiguity surrounding correct dispute resolution.

Beyond arguments with fractious editors over minutiae (something that comes with the platform), negotiating the wider politics of Wikipedia can be a daunting task, particularly when it comes to content about the MENA region. It would be an understatement to say that the Middle East is a politically sensitive region, with more than its fair share of apparently unresolvable disputes, competing ideologies (it’s the birthplace of three world religions…), repressive governments, and ongoing and bloody conflicts. Editors shared stories with us about meddling from state actors (e.g. Tunisia, Iran) and a lack of trust with a platform that is generally considered to be a foreign, and sometimes explicitly American, tool. Rumors abound that several states (e.g. Israel, Iran) have concerted efforts to work on Wikipedia content, creating a chilling effect for new editors who might feel that editing certain pages might prove dangerous, or simply frustrating or impossible. Some editors spoke of being asked by Syrian government officials for advice on how to remove critical content, or how to identify the editors responsible for putting it there. Again: the effect is chilling.

A lack of locally produced and edited content about the region clearly can’t be blamed entirely on ‘outsiders’. Many editors in the Arabic Wikipedia have felt snubbed by the creation of an explicitly “Egyptian Arabic” Wikipedia, which has not only forked the content and editorial effort, but also stymied any ‘pan-Arab’ identity on the platform. There is a culture of administrators deleting articles they do not think are locally appropriate, often those relating to politically (or culturally) sensitive topics. Due to Arabic Wikipedia’s often vicious edit wars, it is heavily moderated (unlike, for example, the English version), and anonymous edits do not appear instantly.

Some editors at the workshops noted other systemic and cultural issues, for example complaining of an education system that encourages rote learning, reinforcing the notion that only experts should edit (or moderate) a topic, rather than amateurs with local familiarity. Editors also pointed to the marked gender disparities on the site, a longstanding issue for other Wikipedia versions as well. None of these discouragements are helped by what some editors noted as a larger ‘image problem’ with editing in the Arabic Wikipedia, given it would always be overshadowed by the dominant English Wikipedia, one editor commenting that: “the English Wikipedia is vastly larger than its Arabic counterpart, so it is not unthinkable that there is more content, even about Arab-world subjects, in English. From my (unscientific) observation, many times, content in Arabic about a place or a tribe is not very encyclopedic, but promotional, and lacks citations”. Translating articles into Arabic might be seen as menial and unrewarding work, when the exciting debates about an article are happening elsewhere.

When we consider the coming-together of all of these barriers, it might be surprising that Wikipedia is actually as large as it is. However, the editors we spoke with were generally optimistic about the site, considering it an important activity that serves the greater good. Wikipedia is without doubt one of the most significant cultural and political forces on the Internet. Wikipedians are remarkably generous with their time, and it’s their efforts that are helping to document, record, and represent much of the world, including places where documentation is scarce. Most of the editors at our workshop ultimately considered Wikipedia a path to a more just society through consensus, voting, and an aspiration to record certain truths, seeing it not just as a site of conflict, but also as a site of regional (and local) pride. When asked why he writes geographic content, one editor simply replied: “It’s my own town”.


How well represented is the MENA region in Wikipedia?

There are more Wikipedia articles in English than Arabic about almost every Arabic speaking country in the Middle East. Image of rock paintings in the Tadrart Acacus region of Libya by Luca Galuzzi.
Wikipedia is often seen to be both an enabler and an equalizer. Every day hundreds of thousands of people collaborate on an (encyclopaedic) range of topics; writing, editing and discussing articles, and uploading images and video content. This structural openness combined with Wikipedia’s tremendous visibility has led some commentators to highlight it as “a technology to equalize the opportunity that people have to access and participate in the construction of knowledge and culture, regardless of their geographic placing” (Lessig 2003). However, despite Wikipedia’s openness, there are also fears that the platform is simply reproducing worldviews and knowledge created in the Global North at the expense of Southern viewpoints (Graham 2011; Ford 2011). Indeed, there are indications that global coverage in the encyclopaedia is far from ‘equal’, with some parts of the world heavily represented on the platform, and others largely left out (Hecht and Gergle 2009; Graham 2011, 2013, 2014).

These second-generation digital divides are not merely divides of Internet access (much discussed in the late 1990s), but gaps in representation and participation (Hargittai and Walejko 2008). Whereas most Wikipedia articles written about most European and East Asian countries are written in their dominant languages, for much of the Global South we see a dominance of articles written in English. These geographic differences in the coverage of different language versions of Wikipedia matter, because fundamentally different narratives can be (and are) created about places and topics in different languages (Graham and Zook 2013; Graham 2014).

If we undertake a ‘global analysis’ of this pattern by examining the number of geocoded articles (i.e. about a specific place) across Wikipedia’s main language versions (Figure 1), the first thing we can observe is the incredible human effort that has gone into describing ‘place’ in Wikipedia. The second is the clear and highly uneven geography of information, with Europe and North America home to 84% of all geolocated articles. Almost all of Africa is poorly represented in the encyclopaedia: remarkably, there are more Wikipedia articles written about Antarctica (14,959) than any country in Africa, and more geotagged articles relating to Japan (94,022) than the entire MENA region (88,342). In Figure 2 it is even more obvious that Europe and North America lead in terms of representation on Wikipedia.

Figure 1. Total number of geotagged Wikipedia articles across all 44 surveyed languages.
Figure 2. Number of regional geotagged articles and population.

Knowing how many articles describe a place only tells a part of the ‘representation story’. Figure 3 adds the linguistic element, showing the dominant language of Wikipedia articles per country. The broad pattern is that some countries largely define themselves in their own languages, and others appear to be largely defined from outside. For instance, almost all European countries have more articles about themselves in their dominant language; that is, most articles about the Czech Republic are written in Czech. Most articles about Germany are written in German (not English).

Figure 3. Language with the most geocoded articles by country (across 44 top languages on Wikipedia).

We do not see this pattern across much of the South, where English dominates across much of Africa, the Middle East, South and East Asia, and even parts of South and Central America. French dominates in five African countries, and German is dominant in one former German colony (Namibia) and a few other countries (e.g. Uruguay, Bolivia, East Timor).

The scale of these differences is striking. Not only are there more Wikipedia articles in English than Arabic about almost every Arabic speaking country in the Middle East, but there are more English articles about North Korea than there are Arabic articles about Saudi Arabia, Libya, and the UAE. Not only do we see most of the world’s content written about global cores, but it is largely dominated by relatively few languages.

Figure 4 shows the total number of geotagged Wikipedia articles in English per country. The sheer density of this layer of information over some parts of the world is astounding (with 928,542 articles about places in English). Nonetheless, in this layer of geotagged English content, only 3.23% of the articles are about Africa, and 1.67% are about the MENA region.
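
To make these percentages more tangible, the short sketch below converts them into approximate article counts. It uses only the three figures quoted above; the function name and the rounding are illustrative assumptions, not part of the original analysis.

```python
# Figures quoted in the text for the geotagged English Wikipedia layer.
TOTAL_GEOTAGGED_EN = 928_542
SHARES = {"Africa": 0.0323, "MENA": 0.0167}


def implied_counts(total, shares):
    """Convert fractional shares of a total into approximate article counts."""
    return {region: round(total * share) for region, share in shares.items()}


counts = implied_counts(TOTAL_GEOTAGGED_EN, SHARES)
for region, n in counts.items():
    print(f"{region}: roughly {n:,} geotagged English articles")
```

On these figures, Africa accounts for roughly 30,000 of the geotagged English articles and the MENA region for roughly 15,500.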

Figure 4. Number of geotagged articles in the English Wikipedia by country.

We see a somewhat different pattern when looking at the global geography of the 22,548 geotagged articles of the Arabic Wikipedia (Figure 5). Algeria and Syria are both defined by a relatively high number of articles in Arabic (as are the US, Italy, Spain, Russia and Greece). These information densities are substantially greater than what we see for many other MENA countries in which Arabic is an official language (such as Egypt, Morocco, and Saudi Arabia). This is even more surprising when we realise that the Italian and Spanish populations are smaller than the Egyptian, but there are nonetheless far more geotagged articles in Arabic about Italy (2,428) and Spain (1,988) than about Egypt (433).
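
The comparison above can be expressed as simple ratios. This is a minimal sketch using only the counts quoted in the paragraph; the helper name is hypothetical, and no population data is included because none is given in the text.

```python
# Geotagged Arabic Wikipedia article counts quoted in the text.
ARABIC_GEOTAGGED = {"Italy": 2428, "Spain": 1988, "Egypt": 433}
TOTAL_ARABIC_GEOTAGGED = 22_548


def ratio_to(baseline, counts):
    """Express each country's article count as a multiple of the baseline's."""
    base = counts[baseline]
    return {c: n / base for c, n in counts.items() if c != baseline}


ratios = ratio_to("Egypt", ARABIC_GEOTAGGED)
shares = {c: n / TOTAL_ARABIC_GEOTAGGED for c, n in ARABIC_GEOTAGGED.items()}

for country, r in ratios.items():
    print(f"{country}: {r:.1f}x as many Arabic geotagged articles as Egypt")
for country, s in shares.items():
    print(f"{country}: {s:.1%} of all geotagged Arabic articles")
```

On these numbers, Italy has between five and six times as many geotagged Arabic articles as Egypt, and Egypt accounts for about 2% of the Arabic layer.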

Figure 5. Total number of geotagged articles in the Arabic Wikipedia by country.

By mapping the geography of Wikipedia articles in both global and regional languages, we can begin to examine the layers of representation that ‘augment’ the world we live in. We have seen that, notable exceptions aside (e.g. ‘Iran’ in Farsi and ‘Israel’ in Hebrew) the MENA region tends to be massively underrepresented — not just in major world languages, but also in its own: Arabic. Clearly, much is being left unsaid about that part of the world. Although we entered the project anticipating that the MENA region would be under-represented in English, we did not anticipate the degree to which it is under-represented in Arabic.


Ford, H. (2011) The Missing Wikipedians. In Critical Point of View: A Wikipedia Reader, ed. G. Lovink and N. Tkacz, 258-268. Amsterdam: Institute of Network Cultures.

Graham, M. (2014) The Knowledge Based Economy and Digital Divisions of Labour. In Companion to Development Studies, 3rd edition, eds. V. Desai and R. Potter. Hodder, pp. 189-195.

Graham, M. (2013) The Virtual Dimension. In Global City Challenges: Debating a Concept, Improving the Practice. Eds. Acuto, M. and Steele, W. London: Palgrave.

Graham, M. (2011) Wiki Space: Palimpsests and the Politics of Exclusion. In Critical Point of View: A Wikipedia Reader. Eds. Lovink, G. and Tkacz, N. Amsterdam: Institute of Network Cultures, pp. 269-282.

Graham M., and M. Zook (2013) Augmented Realities and Uneven Geographies: Exploring the Geolinguistic Contours of the Web. Environment and Planning A 45 (1) 77–99.

Hargittai, E. and G. Walejko (2008) The Participation Divide: Content Creation and Sharing in the Digital Age. Information, Communication and Society 11 (2) 239–256.

Hecht B., and D. Gergle (2009) Measuring self-focus bias in community-maintained knowledge repositories. In Proceedings of the 4th International Conference on Communities and Technologies, Penn State University, 2009, pp. 11–20. New York: ACM.

Lessig, L. (2003) An Information Society: Free or Feudal. Talk given at the World Summit on the Information Society, Geneva, 2003.

Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

The sum of (some) human knowledge: Wikipedia and representation in the Arab World

Arabic is one of the least represented major world languages on Wikipedia: few languages have more speakers and fewer articles than Arabic. Image of the Umayyad Mosque (Damascus) by Travel Aficionado

Wikipedia currently contains over 9 million articles in 272 languages, far surpassing any other publicly available information repository. Being the first point of contact for most general topics (and therefore an effective site for framing any subsequent representations), it is an important platform from which we can learn whether the Internet facilitates increased open participation across cultures, or reinforces existing global hierarchies and power dynamics. Because the underlying political, geographic and social structures of Wikipedia are hidden from users, and because there have not been any large-scale studies of the geography of these structures and their relationship to online participation, entire groups of people (and regions) may be marginalized without their knowledge.

This process is important to understand, for the simple reason that Wikipedia content has begun to form a central part of services offered elsewhere on the Internet. When you look for information about a place on Facebook, the description of that place (including its geographic coordinates) comes from Wikipedia. If you want to “check in” to a museum in Doha to signify to your friends that you were there, the place you check in to was created with Wikipedia data. When you Google “House of Saud” you are presented not only with a list of links (with Wikipedia at the top) but also with a special ‘card’ summarising the House. This data comes from Wikipedia. When you look for people or places, Google now has these terms inside its ‘knowledge graph’, a network of related concepts with data coming directly from Wikipedia. Similarly, on Google Maps, Wikipedia descriptions for landmarks are presented as part of the default information.

Ironically, Wikipedia editorship is actually on a slow and steady decline, even as its content and readership increase year on year. Since 2007 and the introduction of significant devolution of administrative powers to volunteers, Wikipedia has not been able to effectively retain newcomers, something which has been noted as a concern by many at the Wikimedia Foundation. Some think Wikipedia might be levelling off because there’s only so much to write about. This is extremely far from the truth; there are still substantial gaps in geographic content in English and overwhelming gaps in other languages. Wikipedia often brands itself as aspiring to contain “the sum of human knowledge”, but behind this mantra lie policy pitfalls, tedious editor debates and delicate sourcing issues that hamper greater representation of the region. Of course these challenges form part of Wikipedia’s continuing evolution as the de facto source for online reference information, but they also (disturbingly) act to entrench particular ways of “knowing”, and ways of validating what is known.

There are over 260,000 articles in Arabic, receiving 240,000 views per hour. Despite these numbers, Arabic is one of the least represented major world languages on Wikipedia: few languages have more speakers and fewer articles than Arabic. This relative lack of MENA voice and representation means that the tone and content of this globally useful resource, in many cases, is being determined by outsiders with a potential misunderstanding of the significance of local events, sites of interest and historical figures. In an area that has seen substantial social conflict and political upheaval, greater participation from local actors would help to ensure balance in content about contentious issues. Unfortunately, most research on MENA’s Internet presence has so far been drawn from anecdotal evidence, and no comprehensive studies currently exist.

In this project we wanted to understand where place-based content comes from, to explain reasons for the relative lack of Wikipedia articles in Arabic and about the MENA region, and to understand which parts of the region are particularly underrepresented. We also wanted to understand the relationship between Wikipedia’s administrative structure and the treatment of new editors; in particular, we wanted to know whether editors from the MENA region have less of a voice than their counterparts from elsewhere, and whether the content they create is considered more or less legitimate, as measured through the number of reverts; ie the overriding of their work by other editors.

Our practical objectives involved a consolidation of Middle Eastern Wikipedians through a number of workshops focusing on how to create more equitable and representative content, with the ultimate goal of making Wikipedia a more generative and productive site for reference information about the region. Capacity building among key Wikipedians can create greater understanding of barriers to participation and representation and offset much of the (often considerable) emotional labour required to sustain activity on the site in the face of intense arguments and ideological biases. Potential systematic structures of exclusion that could be a barrier to participation include such competitive practices as content deletion, indifference to content produced by MENA authors, and marginalization through bullying and dismissal.

However, a distinct lack of sources — owing both to a lack of legitimacy for MENA journalism and a paucity of open access government documents — is also inhibiting further growth of content about the region. When inclusion of a topic is contested by editors it is typically because there is not enough external source material about it to establish “notability”. As Ford (2011) has already discussed, notability is often culturally mediated. For example, a story in Al Jazeera would not have been considered a sufficient criterion of notability a couple of years ago. However, this has changed dramatically since its central role in reporting on the Arab Spring.

Unfortunately, notability can create a feedback loop. If an area of the world is underreported, there are no sources. If there are no sources, then journalists do not always have enough information to report about that part of the world. ‘Correct’ sourcing trumps personal experience on Wikipedia; even if an author is from a place, and is watching a building being destroyed, their Wikipedia edit will not be accepted by the community unless the event is discussed in another ‘official’ medium. Often the edit will either be branded with a ‘citation needed’ tag, eliminated, or discussed in the talk page. Particularly aggressive editors and administrators will nominate the page for ‘speedy deletion’ (ie deletion without discussion), a practice that makes responses from an author difficult.

Why does any of this matter in practical terms? For the simple reason that biases, absences and contestations on Wikipedia spill over into numerous other domains that are in regular and everyday use (Graham and Zook, 2013). If a place is not on Wikipedia, this might have a chilling effect on business and stifle journalism; if a place is represented poorly on Wikipedia this can lead to misunderstandings about the place. Wikipedia is not a legislative body. However, in the court of public opinion, Wikipedia represents one of the world’s strongest forces, as it quietly inserts itself into representations of place worldwide (Graham et al. 2013; Graham 2013).

Wikipedia is not merely a site of reference information, but is rapidly becoming the de facto site for representing the world to itself. We need to understand more about that representation.

Further Reading

Allagui, I., Graham, M., and Hogan, B. 2014. Wikipedia Arabe et la Construction Collective du Savoir. In Wikipedia, objet scientifique non identifié. eds. Barbe, L., and Merzeau, L. Paris: Presses Universitaires de Paris Ouest (in press).

Graham, M., Hogan, B., Straumann, R. K., and Medhat, A. 2014. Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers (forthcoming).

Graham, M. 2012. Die Welt in Der Wikipedia Als Politik der Exklusion: Palimpseste des Ortes und selective Darstellung. In Wikipedia. eds. S. Lampe, and P. Bäumer. Bundeszentrale für politische Bildung/bpb, Bonn.

Graham, M. 2011. Wiki Space: Palimpsests and the Politics of Exclusion. In Critical Point of View: A Wikipedia Reader. Eds. Lovink, G. and Tkacz, N. Amsterdam: Institute of Network Cultures, 269-282.


Ford, H. (2011) The Missing Wikipedians. In Geert Lovink and Nathaniel Tkacz (eds), Critical Point of View: A Wikipedia Reader, Amsterdam: Institute of Network Cultures, 2011. ISBN: 978-90-78146-13-1.

Graham, M., M. Zook., and A. Boulton. 2013. Augmented Reality in the Urban Environment: contested content and the duplicity of code. Transactions of the Institute of British Geographers. 38(3), 464-479.

Graham, M and M. Zook. 2013. Augmented Realities and Uneven Geographies: Exploring the Geo-linguistic Contours of the Web. Environment and Planning A 45(1) 77-99.

Graham, M. 2013. The Virtual Dimension. In Global City Challenges: debating a concept, improving the practice. eds. M. Acuto and W. Steele. London: Palgrave. 117-139.


The economic expectations and potentials of broadband Internet in East Africa

Ed: There has been a lot of excitement about the potential of increased connectivity in the region: where did this come from? And what sort of benefits were promised?

Chris: Yes, at the end of the 2000s when the first fibre cables landed in East Africa, there was much anticipation about what this new connectivity would mean for the region. I remember I was in Tanzania at the time, and people were very excited about this development – being tired of the slow and expensive satellite connections where even simple websites could take a minute to load. The perception, both in the international press and from East African politicians, was that the cables would be a game changer. Firms would be able to market and sell more directly to customers and reduce inefficient ‘intermediaries’. Connectivity would allow new types of digital-driven business, and it would provide opportunity for small and medium firms to become part of the global economy. We wanted to revisit this discussion. Were firms adopting the Internet as it became cheaper? Had this new connectivity had the effects that were anticipated, or was it purely hype?

Ed: So what is the current level and quality of broadband access in Rwanda? ie how connected are people on the ground?

Chris: Internet access has greatly improved over the previous few years, and the costs of bandwidth have declined markedly. The government has installed a ‘backbone’ fibre network and in the private sector there has also been a growth in the number of firms providing Internet service. There are still some problems though. Prices are still quite high, particularly for dedicated broadband connections, and in the industries we looked at (tea and tourism) many firms couldn’t afford it. Secondly, we heard a lot of complaints that lower bandwidth connections – WiMax and mobile internet – are unreliable and become saturated at peak times. So, Rwanda has come a long way, but we expect there will be more improvements in the future.

Ed: How much impact has the Internet had on Rwanda’s economy generally? And who is it actually helping, if so?

Chris: Economists in the World Bank have calculated that in developing economies a 10% improvement in Internet access leads to an increase in growth of 1.3%, so the effects should be taken seriously. In Rwanda, it’s too early to concretely see the effects in bottom line economic growth. In this work we wanted to examine the effect on already established sectors to get insight on Internet adoption and use. In general, we can say that firms are increasingly adopting Internet connectivity in some form, and that firms have been able to take advantage and improve operations. However, it seems that wider transformational effects of connectivity have so far been limited.

Ed: And specifically in terms of the Rwandan tea and tourism industries: has the Internet had much effect?

Chris: The global tourism industry is driven by Internet use, and so tour firms, guides and hotels in Rwanda have been readily adopting it. We can see that the Internet has been beneficial, particularly for those firms coordinating tourism in Rwanda, who can better handle volumes of tourists. In the tea industry, adoption is a little lower but the Internet is used in similar ways – to coordinate the movement of tea from production to processing to selling, and this simplifies management for firms. So, connectivity has brought benefits through improvements in efficiency, and this complements the fact that both sectors are looking to attract international investment and become better integrated into markets. In that sense, one can say that the growth in Internet connectivity is playing a significant role in strategies of private sector development.

Ed: The project partly focuses on value chains: ie where value is captured at different stages of a chain, leading (for example) from Rwandan tea bush to UK Tesco shelf. How have individual actors in the chain been affected? And has there been much in the way of (the often promised) disintermediation — ie are Rwandan tea farmers and tour operators now able to ‘plug directly’ into international markets?

Chris: Value chains allow us to pay more attention to who are the winners (and losers) of the processes described above, and particularly to see if this benefits Rwandan firms who are linked into global markets. One of the potential benefits originally discussed around new connectivity was that, with the growth of online channels and platforms (and through social media), firms would gain a more direct link to large markets as they became connected, and would be able to disintermediate and improve the benefits they received. Generally, we can say that such disintermediation has not happened, for different reasons. In the tourism sector, many tourists are still reluctant to go directly to Rwandan tourist firms, for reasons related to trust (particularly around payment for holidays). In the tea sector, the value chains are very well established, and with just a few retailers in the end-markets, direct interaction with markets has simply not materialised. So, the hope of connectivity driving disintermediation in value chains has been limited by the market structure of both these sectors.

Ed: Is there any sense that the Internet is helping to ‘lock’ Rwanda into global markets and institutions: for example international standards organisations? And will greater transparency mean Rwanda is better able to compete in global markets, or will it just allow international actors to more efficiently exploit Rwanda’s resources — ie for the value in the chain to accrue to outsiders?

Chris: One of the core activities around the Internet that we found for both tea and tourism was firms using connectivity as a way to integrate themselves into logistic tracking, information systems, and quality and standards; whether this be automation in the tea sector or using global booking systems in the tourism sector. In one sense, this benefits Rwandan firms in that it’s crucial to improving efficiency in global markets, but it’s less clear that benefits of integration always accrue to those in Rwanda. It also moves away from the earlier ideas that connectivity would empower firms, unleashing a wave of innovation. To some of the firms we interviewed, it felt like this type of investment in the Internet was simply a way for others to better monitor, define and control every step they made, dictated by firms far away.

Ed: How do the project findings relate to (or comment on) the broader hopes of ICT4D developers? ie does ICT (magically) solve economic and market problems, and if so, who benefits?

Chris: For ICT developers looking to support development, there is often a tendency to look to build for actors who are struggling to find markets for their goods and services (such as apps linking buyers and producers, or market pricing information). But, the industries we looked at are quite different — actors (even farmers) are already linked via value chains to global markets, and so these types of application were less useful. In interviews, we found other informal uses of the Internet amongst lower-income actors in these sectors, which point the way towards new ICT applications: sectoral knowledge building, adapting systems to allow smallholders to better understand their costs, and systems to allow better links amongst cooperatives. More generally for those interested in ICT and development, this work highlights that changes in economies are not solely driven by connectivity, particularly in industries where rewards are already skewed towards larger global firms over those in developing countries. This calls for a context-dependent analysis of policy and structures, something that can be missed when more optimistic commentators discuss connectivity and the digital future.

Christopher Foster was talking to blog editor David Sutcliffe.

Who represents the Arab world online?

Editors from all over the world have played some part in writing about Egypt; in fact, only 13% of all edits actually originate in the country (38% are from the US). More: Who edits Wikipedia? by Mark Graham.

Ed: In basic terms, what patterns of ‘information geography’ are you seeing in the region?

Mark: The first pattern that we see is that the Middle East and North Africa are relatively under-represented in Wikipedia. Even after accounting for factors like population, Internet access, and literacy, we still see less content than would be expected. Second, of the content that exists, a lot of it is in English and French rather than in Arabic (or Farsi or Hebrew). In other words, there is even less in local languages.

And finally, if we look at contributions (or edits), not only do we also see a relatively small number of edits originating in the region, but many of those edits are being used to write about other parts of the world rather than their own region. What this broadly seems to suggest is that the participatory potentials of Wikipedia aren’t yet being harnessed in order to even out the differences between the world’s informational cores and peripheries.

Ed: How closely do these online patterns in representation correlate with regional (offline) patterns in income, education, language, access to technology (etc.) Can you map one to the other?

Mark: Population and broadband availability alone explain a lot of the variance that we see. Other factors like income and education also play a role, but it is population and broadband that have the greatest explanatory power here. Interestingly, it is most countries in the MENA region that fail to fit well to those predictors.
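
One way to picture what "failing to fit those predictors" means is a regression of article counts on population and broadband, followed by a look at the residuals. The sketch below is illustrative only: the eight countries and all of their numbers are hypothetical, invented for this example, and the log-log ordinary least squares model is an assumption rather than the specification used in the research.

```python
import math

# HYPOTHETICAL data, invented purely for illustration. Each row is
# (name, population, broadband subscriptions, geotagged articles).
COUNTRIES = [
    ("A", 5e6, 1.5e6, 9000),
    ("B", 20e6, 4e6, 21000),
    ("C", 60e6, 20e6, 95000),
    ("D", 10e6, 0.5e6, 3000),
    ("E", 80e6, 3e6, 12000),
    ("F", 35e6, 6e6, 2500),
    ("G", 2e6, 0.9e6, 5000),
    ("H", 120e6, 30e6, 110000),
]


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, ys):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


# Design matrix: intercept, log population, log broadband subscriptions.
X = [[1.0, math.log(pop), math.log(bb)] for _, pop, bb, _ in COUNTRIES]
y = [math.log(n) for *_, n in COUNTRIES]

beta = fit_ols(X, y)
residuals = {
    name: yi - sum(b * x for b, x in zip(beta, xi))
    for (name, *_), xi, yi in zip(COUNTRIES, X, y)
}

# A negative residual means fewer articles than the two predictors suggest.
for name, res in sorted(residuals.items(), key=lambda kv: kv[1]):
    print(f"{name}: residual {res:+.2f}")
```

In a setup like this, a country with a large negative residual has fewer articles than its population and broadband would predict, which is the sense in which most MENA countries are described as fitting the predictors poorly.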

Ed: How much do you think these patterns result from the systematic imposition of a particular view point – such as official editorial policies – as opposed to the (emergent) outcome of lots of users and editors acting independently?

Mark: Particular modes of governance in Wikipedia likely do play a factor here. The Arabic Wikipedia, for instance, to combat vandalism has a feature whereby changes to articles need to be reviewed before being made public. This alone seems to put off some potential contributors. Guidelines around sourcing in places where there are few secondary sources also likely play a role.

Ed: How much discussion (in the region) is there around this issue? Is this even acknowledged as a fact or problem?

Mark: I think it certainly is recognised as an issue now. But there are few viable alternatives to Wikipedia. Our goal is hopefully to identify problems that lead to solutions, rather than simply discouraging people from even using the platform.

Ed: This work has been covered by the Guardian, Wired, the Huffington Post (etc.) How much interest has there been from the non-Western press or bloggers in the region?

Mark: There has been a lot of coverage from the non-Western press, particularly in Latin America and Asia. However, I haven’t actually seen that much coverage from the MENA region.

Ed: As an academic, do you feel at all personally invested in this, or do you see your role to be simply about the objective documentation and analysis of these patterns?

Mark: I don’t believe there is any such thing as ‘objective documentation.’ All research has particular effects in and on the world, and I think it is important to be aware of the debates, processes, and practices surrounding any research project. Personally, I think Wikipedia is one of humanity’s greatest achievements. No previous single platform or repository of knowledge has ever even come close to Wikipedia in terms of its scale or reach. However, that is all the more reason to critically investigate what exactly is, and isn’t, contained within this fantastic resource. By revealing some of the biases and imbalances in Wikipedia, I hope that we’re doing our bit to improve it.

Ed: What factors do you think would lead to greater representation in the region? For example: is this a matter of voices being actively (or indirectly) excluded, or are they maybe just not all that bothered?

Mark: This is certainly a complicated question. I think the most important step would be to encourage participation from the region, rather than just representation of the region. Some of this involves increasing some of the enabling factors that are the prerequisites for participation; factors like: increasing broadband access, increasing literacy, encouraging more participation from women and minority groups.

Some of it is then changing perceptions around Wikipedia. For instance, many people that we spoke to in the region framed Wikipedia as an American or outside project rather than something that is locally created. Unfortunately we seem to be currently stuck in a vicious cycle in which few people from the region participate, therefore fulfilling the very reason why some people think that they shouldn’t participate. There is also the issue of sources. Not only does Wikipedia require all assertions to be properly sourced, but secondary sources themselves can be a great source of raw informational material for Wikipedia articles. However, if few sources about a place exist, then it adds an additional burden to creating content about that place. Again, a vicious cycle of geographic representation.

My hope is that by both working on some of the necessary conditions to participation, and engaging in a diverse range of initiatives to encourage content generation, we can start to break out of some of these vicious cycles.

Ed: The final moonshot question: How would you like to extend this work; time and money being no object?

Mark: Ideally, I’d like us to better understand the geographies of representation and participation outside of just the MENA region. This would involve mixed-methods (large scale big data approaches combined with in-depth qualitative studies) work focusing on multiple parts of the world. More broadly, I’m trying to build a research program that maintains a focus on a wide range of Internet and information geographies. The goal here is to understand participation and representation through a diverse range of online and offline platforms and practices and to share that work through a range of publicly accessible media: for instance the ‘Atlas of the Internet’ that we’re putting together.

Mark Graham was talking to blog editor David Sutcliffe.
