What explains the worldwide patterns in user-generated geographical content?

The geographies of codified knowledge have always been uneven, affording some people and places greater voice and visibility than others. While the rise of the geosocial Web seemed to promise a greater diversity of voices, opinions, and narratives about places, many regions remain largely absent from the websites and services that represent them to the rest of the world. These highly uneven geographies of codified information matter because they shape what is known and what can be known. As geographic content and geospatial information become increasingly integral to our everyday lives, places that are left off the ‘map of knowledge’ will be absent from our understanding of, and interaction with, the world.

We know that Wikipedia is important to the construction of geographical imaginations of place, and that it has immense power to augment our spatial understandings and interactions (Graham et al. 2013). In other words, the presences and absences in Wikipedia matter. If a person’s primary free source of information about the world is the Persian or Arabic or Hebrew Wikipedia, then the world will look fundamentally different from the world presented through the lens of the English Wikipedia. The capacity to represent oneself to outsiders is especially important in those parts of the world that are characterized by highly uneven power relationships: Brunn and Wilson (2013) and Graham and Zook (2013) have already demonstrated the power of geospatial content to reinforce power in a South African township and Jerusalem, respectively.

Until now, there has been no large-scale empirical analysis of the factors that explain information geographies at the global scale; this is something we have aimed to address in our research project on ‘Mapping and measuring local knowledge production and representation in the Middle East and North Africa’. Using regression models of geolocated Wikipedia data we have identified what are likely to be the necessary conditions for representation at the country level, and have also identified the outliers, i.e. those countries that fare considerably better or worse than expected. We found that a large part of the variation could be explained by just three factors: (1) country population, (2) availability of broadband Internet, and (3) the number of edits originating in that country. [See the full paper for an explanation of the data and the regression models.]

But how do we explain the significant inequalities in the geography of user-generated information that remain after adjusting for differing conditions using our regression model? While these three variables help to explain the sparse amount of content written about much of Sub-Saharan Africa, most of the Middle East and North Africa have quantities of geographic information below their expected values. For example, despite high levels of wealth and connectivity, Qatar and the United Arab Emirates have far fewer articles than we might expect from the model.
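The idea of fitting a model and then looking for countries that sit above or below their expected values can be sketched roughly as follows. This is only an illustration: it assumes an ordinary least squares fit on log-transformed counts, which may not be the specification used in the full paper, and the country figures and the function names (`solve`, `fit_ols`, `residuals`) are hypothetical.

```python
import math

# Hypothetical, made-up figures for illustration only (not data from the paper):
# (name, population, broadband subscriptions per 100 people, edits, geotagged articles)
COUNTRIES = [
    ("Country A", 10_000_000, 30.0, 300_000, 40_000),
    ("Country B", 3_000_000, 25.0, 100_000, 16_000),
    ("Country C", 100_000_000, 5.0, 30_000, 10_000),
    ("Country D", 30_000_000, 2.0, 3_000, 1_200),
    ("Country E", 2_000_000, 28.0, 15_000, 1_000),
    ("Country F", 60_000_000, 12.0, 150_000, 25_000),
    ("Country G", 6_000_000, 8.0, 10_000, 2_500),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def residuals(rows, y, coef):
    """Observed minus expected value for each row."""
    return [yi - (coef[0] + sum(c * v for c, v in zip(coef[1:], r)))
            for r, yi in zip(rows, y)]


# Assumption: counts are log-transformed, broadband is left as a rate.
features = [(math.log10(pop), bb, math.log10(edits))
            for _, pop, bb, edits, _ in COUNTRIES]
log_articles = [math.log10(c[4]) for c in COUNTRIES]

coef = fit_ols(features, log_articles)
resid = residuals(features, log_articles, coef)

# Most negative residual first: fewer articles than the model expects.
ranked = sorted(zip((c[0] for c in COUNTRIES), resid), key=lambda p: p[1])
for name, r in ranked:
    print(f"{name}: {r:+.2f}")
```

A negative residual here corresponds to a country with less geographic content than the three factors would predict, which is the sense in which a country is described as falling below its expected value.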

These three factors independently matter, but they will also be subject to a number of constraints. A country’s population will probably affect the number of human sites, activities, and practices of interest; i.e. the number of things one might want to write about. The size of the potential audience might also be influential, encouraging editors in more densely populated regions and those writing in major languages. However, societal attitudes towards learning and information sharing will probably also affect the propensity of people in some places to contribute content. Factors discouraging edits to local content might include a lack of local Wikimedia chapters, the attractiveness of writing content about other (better-represented) places, or contentious disputes in local editing communities that divert time into edit wars and away from content generation.

We might also be seeing a principle of increasing informational poverty. Not only is a broader base of traditional source material (such as books, maps, and images) needed for the generation of any Wikipedia article, but it is likely that the very presence of content itself is a generative factor behind the production of further content. This makes information produced about information-sparse regions most useful for people in informational cores — who are used to integrating digital information into their everyday practices — rather than those in informational peripheries.

Various practices and procedures of Wikipedia editing likely amplify this effect. There are strict guidelines on how knowledge can be created and represented in Wikipedia, including a ban on original research, and the need to source key assertions. Editing incentives and constraints probably also encourage work around existing content (which is relatively straightforward to edit) rather than creation of entirely new material. In other words, the very policies and norms that govern the encyclopedia’s structure make it difficult to populate the white space with new geographic content. In addressing these patterns of increasing informational poverty, we need to recognize that no one of these three conditions can ever be sufficient for the generation of geographic knowledge. As well as highlighting the presences and absences in user-generated content, we also need to ask what factors encourage or limit production of that content.

In interpreting our model, we have come to a stark conclusion: increasing representation doesn’t occur in a linear fashion, but accelerates in a virtuous cycle, benefitting those with strong editing cultures in local languages. For example, Britain, Sweden, Japan and Germany are extensively georeferenced on Wikipedia, whereas much of the MENA region has not kept pace, even accounting for its levels of connectivity, population, and editors. Thus, while some countries are experiencing the virtuous cycle of more edits and broadband begetting more georeferenced content, those on the periphery of these information geographies might fail to reach a critical mass of editors, or even dismiss Wikipedia as a legitimate site for user-generated geographic content: a problem that will need to be addressed if Wikipedia is indeed to be considered as the “sum of all human knowledge”.

Read the full paper: Graham, M., Hogan, B., Straumann, R.K., and Medhat, A. (2014) Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers.

References

Brunn, S. D., and M. W. Wilson. 2013. Cape Town’s million plus black township of Khayelitsha: Terrae incognitae and the geographies and cartographies of silence. Habitat International 39: 284–294.

Graham, M., and M. Zook. 2013. Augmented Realities and Uneven Geographies: Exploring the Geolinguistic Contours of the Web. Environment and Planning A 45(1): 77–99.

Graham, M., M. Zook, and A. Boulton. 2013. Augmented Reality in the Urban Environment: Contested Content and the Duplicity of Code. Transactions of the Institute of British Geographers 38(3): 464–479.


Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

Mapping the Local Geographies of Digital Inequality in Britain

Britain has one of the largest Internet economies in the industrial world. The Internet contributes an estimated 8.3% to Britain’s GDP (Dean et al. 2012), and strongly supports domestic job and income growth by enabling access to new customers, markets and ideas. People benefit from better communications, and businesses are more likely to locate in areas with good digital access, thereby boosting local economies (Malecki & Moriset 2008). While the Internet brings clear benefits, there is also a marked inequality in its uptake and use (the so-called ‘digital divide’). We already know from the Oxford Internet Surveys (OxIS) that Internet use in Britain is strongly stratified by age, by income and by education; and yet we know almost nothing about local patterns of Internet use across the country.

A problem with national sample surveys (the usual source of data about Internet use and non-use) is that the sample sizes become too small to allow accurate generalization to smaller, sub-national areas. No one knows, for example, the proportion of Internet users in Glasgow, because national surveys simply won’t have enough respondents to make reliable city-level estimates. We know that Internet use is not evenly distributed at the regional level; Ofcom reports on broadband speeds and penetration at the county level (Ofcom 2011), and we know that London and the southeast are the most wired parts of the country (Dean et al. 2012). But given the importance of the Internet, the lack of knowledge about local patterns of access and use in Britain is surprising. This is a problem because without detailed information about small areas we can’t identify which areas would benefit most from policy intervention to encourage Internet use and improve access.

We have begun to address this lack of information by combining two important but separate datasets (the 2011 national census and the 2013 OxIS survey) using the technique of small area estimation. By definition, census data are available for very small areas, and because the census reaches (basically) everyone, there are no sampling issues. Unfortunately, the census is extremely expensive to conduct, so it doesn’t collect many variables (it has no data on Internet use, for example). The second dataset, the OII’s Oxford Internet Survey (OxIS), is a very rich dataset of all kinds of Internet activity, measured with a random sample of more than 2,000 individuals across Britain. Because OxIS is unable to survey everyone in Britain, it is based on a random sample of people living in geographical ‘Output Areas’ (OAs). These areas (generally of 40-250 households) represent the fundamental building block of the national census, being the smallest geographical area for which it reports data.

Because OxIS and the census (happily) use the same OAs, we can combine national-level data on Internet use (from OxIS) with local-level demographic information (from the census) to map estimated Internet use across Britain for the first time. We can do this because we can estimate from OxIS the likelihood of an individual using the Internet just from basic demographic data (age, income, education etc.). And because the census records these demographics for everyone in each OA, we can go on to estimate the likely proportion of Internet users in each of these areas. By combining the richness of OxIS survey data with the comprehensive small area coverage of the census we can use the strengths of one to offset the gaps in the other.

Of course, this procedure assumes that people in small areas will generally match national patterns of Internet use; i.e. that those who are better educated, employed, and young are more likely to use the Internet. We assume that this pattern isn’t affected by cultural or social factors (e.g. ‘Northerners just like the Internet more’), or by anything unusual about a particular group of households that makes it buck national trends (e.g. ‘the young people of Wytham Street, Oxford just prefer not to use the Internet’).
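The logic of the procedure described above can be sketched in a few lines. This is a simplified illustration, not the model used in the project: it assumes the likelihood of Internet use is estimated as a plain share within each demographic cell (a real analysis would more likely use a regression on several variables), and the survey responses, census counts, area labels, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical demographic cells: (age group, education).
YOUNG_DEGREE = ("16-34", "degree")
OLDER_NO_DEGREE = ("65+", "no degree")

# Made-up survey responses: (demographic cell, uses the Internet?).
SURVEY = [
    (YOUNG_DEGREE, True), (YOUNG_DEGREE, True),
    (YOUNG_DEGREE, True), (YOUNG_DEGREE, False),
    (OLDER_NO_DEGREE, True), (OLDER_NO_DEGREE, False),
    (OLDER_NO_DEGREE, False), (OLDER_NO_DEGREE, False),
]

# Made-up census counts of residents per cell in two small areas.
CENSUS = {
    "area_1": {YOUNG_DEGREE: 30, OLDER_NO_DEGREE: 10},
    "area_2": {YOUNG_DEGREE: 10, OLDER_NO_DEGREE: 30},
}


def cell_rates(survey):
    """Share of Internet users in each demographic cell of the national survey."""
    users = defaultdict(int)
    total = defaultdict(int)
    for cell, uses_internet in survey:
        total[cell] += 1
        if uses_internet:
            users[cell] += 1
    return {cell: users[cell] / total[cell] for cell in total}


def estimate_area_use(census_counts, rates):
    """Expected proportion of Internet users in one area.

    Applies the national rate for each cell to the number of residents the
    census records in that cell; this is the step that assumes local people
    behave like their national demographic counterparts. A cell that appears
    in the census but not in the survey raises KeyError.
    """
    people = sum(census_counts.values())
    expected_users = sum(n * rates[cell] for cell, n in census_counts.items())
    return expected_users / people


rates = cell_rates(SURVEY)
estimates = {area: estimate_area_use(counts, rates)
             for area, counts in CENSUS.items()}

for area, share in estimates.items():
    print(f"{area}: {share:.1%} estimated Internet users")
```

Because the two hypothetical areas differ only in their demographic mix, any difference between their estimates comes entirely from the census composition, which is exactly the assumption the paragraph above spells out.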

So what do we see when we combine the two datasets? What are the local-level patterns of Internet use across Britain? We can see from the figure that the highest estimated Internet use (88-89%) is concentrated in the south east, with London dominating. Bristol, Southampton, and Nottingham also have high levels of use, as well as the rest of the south (interestingly, including rural Cornwall) with estimated usage levels of 78-83%. Leeds, York and Manchester are also in this category. In the lowest category (59-70% use) we find the entire North East region. Cities show much the same pattern, with southern cities having the highest estimated Internet use, and Newcastle and Middlesbrough having the lowest.

There isn’t room in this post to explore and discuss all the patterns (or to speculate on the underlying reasons), but there are clear policy implications from this work. The Internet has made an enormous difference in our social life, culture, and economy; this is why it is important to bring people online, to encourage them all to participate and benefit. However, despite the importance of the Internet in Britain today, we still know very little about who is, and isn’t, connected. We hope this approach (and this data) can help pinpoint the areas of greatest need. For example, the North East is striking: even the cities don’t seem to stand out from the surrounding rural areas. Allocating resources to improve use in the North East would probably be valuable, with rural areas as a secondary priority. Interestingly, Cornwall (despite being very rural) is actually above average in terms of likely Internet users, and is also the recipient of a major European Regional Development Fund effort to extend its broadband.

Actually getting access via fibre-optic cable is just one part of the story of Internet use (and one we don’t cover in this post); but this is the first time we have been able to estimate the likely use at a local level, based on the known characteristics of the people who live there. Using these small area estimation techniques opens a whole new area for social media research and policy-making around local patterns of digital participation. Going forward, we intend to expand the model to include urban-rural differences, the index of multiple deprivation, occupation, and socio-economic status. But there’s already much more we can do with these data.

References

Dean, D., DiGrande, S., Field, D., Lundmark, A., O’Day, J., Pineda, J., Zwillenberg, P. (2012) The connected world: The Internet economy in the G-20. Boston: Boston Consulting Group.

Malecki, E.J. & Moriset, B. (2008) The digital economy: Business organization, production processes and regional developments. London: Routledge.

Ofcom (2011) Communications infrastructure report: Fixed broadband data. [accessed on 23/9/2013 from http://stakeholders.ofcom.org.uk/binaries/research/broadband-research/Fixed_Broadband_June_2011.pdf ]

Read the full paper: Blank, G., Graham, M., and Calvino, C. (2014) Mapping the Local Geographies of Digital Inequality. [contact the authors for the paper and citation details]


Grant Blank is a Survey Research Fellow at the OII. He is a sociologist who studies the social and cultural impact of the Internet and other new communication media. He is principal investigator on the OII’s Geography of Digital Inequality project, which combines OxIS and census data to produce the first detailed geographic estimates of Internet use across the UK.

UK teenagers without the Internet are ‘educationally disadvantaged’

A major in-depth study examining how teenagers in the UK are using the internet and mobile devices says the benefits of using such technologies far outweigh any perceived risks. The findings are based on a large-scale study of more than 1,000 randomly selected households in the UK, coupled with regular face-to-face interviews with more than 200 teenagers and their families between 2008 and 2011.

While the study reflects a high level of parental anxiety about the potential of social networking sites to distract their offspring, and shows that some parents despair at their children’s tendency to multitask on mobile devices, the research by Oxford University’s Department of Education and Oxford Internet Institute concludes that there are substantial educational advantages in teenagers being able to access the internet at home.

Teenagers who do not have access to the internet in their home have a strong sense of being ‘educationally disadvantaged’, warns the study. At the time of the study, the researchers estimated that around 10 per cent of the teenagers were without online connectivity at home, with most of this group living in poorer households. While recent figures from the Office for National Statistics suggest this dropped to five per cent in 2012, the researchers say that still leaves around 300,000 children without internet access in their homes.

The researchers’ interviews with teenagers reveal that they felt shut out of their peer group socially and also disadvantaged in their studies as so much of the college or school work set for them to do at home required online research or preparation. One teenager, whose parents had separated, explained that he would ring his father who had internet access and any requested materials were then mailed to him through the post.

Researcher Dr Rebecca Eynon commented: ‘While it’s difficult to state a precise figure for teenagers without access to the internet at home, the fact remains that in the UK, there is something like 300,000 young people who do not – and that’s a significant number. Behind the statistics, our qualitative research shows that these disconnected young people are clearly missing out both educationally and socially.’

In an interview with a researcher, one 14-year-old boy said: ‘We get coursework now in Year 9 to see what groups we’re going to go in Year 10. And people with internet, they can get higher marks because they can like research on the internet … my friends are probably on it [MSN] all the day every day. And like they talk about it in school, what happened on MSN.’

Another teenager, aged 15, commented: ‘It was bell gone and I have a lot of things that I could write and I was angry that I haven’t got a computer because I might finish it at home when I’ve got lots of time to do it. But because when I’m at school I need to do it very fast.’

Strikingly, this study contradicts claims that others have made about the potential risks of such technologies adversely affecting the ability of teenagers to concentrate on serious study. The researchers, Dr Chris Davies and Dr Rebecca Eynon, found no evidence to support this claim. Furthermore, their study concludes that the internet has opened up far more opportunities for young people to do their learning at home.

Dr Davies said: ‘Parental anxiety about how teenagers might use the very technologies that they have bought their own children at considerable expense is leading some to discourage their children from becoming confident users. The evidence, based on the survey and hundreds of interviews, shows that parents have tended to focus on the negative side – especially the distracting effects of social networking sites – without always seeing the positive use that their children often make of being online.’

Teenagers’ experiences of the social networking site Facebook appear to be mixed, says the study. Although some regarded Facebook as an integral part of their social life, others were concerned about the number of arguments that had escalated due to others wading in as a result of comments and photographs being posted.

The age at which teenagers first used Facebook was found to fall over the three-year period, from around 16 years old in 2008 to 12 or 13 years old by 2011. Interviews reveal that even the very youngest teenagers who were not particularly interested felt under some peer pressure to join. But the study also suggests that the popularity of Facebook is waning, with teenagers now exploring other forms of social networking.

Dr Davies commented: ‘There is no steady state of teenage technology use – fashions and trends are constantly shifting, and things change very rapidly when they do change.’

The research was part-funded by Becta, the British Educational Communications and Technology Agency, a non-departmental public body formed under the last Labour government. The study findings are contained in a new book entitled Teenagers and Technology, published by Routledge in November 2012.

Understanding low and discontinued Internet use amongst young people in Britain

The Internet has become an important feature of the lives of the majority of young British people, providing them with another avenue to support their learning, inform their life choices about work and life opportunities, make and maintain friendships, and learn about and engage with the world around them. For many it is taken for granted. While the extent to which young people engage with the opportunities of the online world varies considerably, the majority of this age group can be considered to be within the digital mainstream. Indeed, in popular discourse many commentators assume that all young people are digitally included, and notions of the ‘google generation’ or ‘net gen’ continue to flourish.

However, the reality is far more nuanced and complex than this: when we empirically explore how young people really engage with the Internet and related technology we see a significant amount of diversity in how and why they use it, and the influences it has on their lives. We know from nationally representative survey data that around 10% of young people in the UK (aged 17–23) define themselves as people who no longer use the Internet, that is, as ‘lapsed users’. This group is fascinating. Why do these people stop using the Internet given its prevalence and value in the lives of the majority of their peers? What difficulties do they face in being unable to connect properly with the online world?


eHealth: what is needed at the policy level? New special issue from Policy and Internet

The explosive growth of the Internet and its omnipresence in people’s daily lives have facilitated a shift in information seeking on health, with the Internet now a key information source for the general public, patients, and health professionals. The Internet also has obvious potential to drive major changes in the organization and delivery of health services, and many initiatives are harnessing technology to support user empowerment. For example, current health reforms in England are leading to a fragmented, marketized National Health Service (NHS), where competitive choice designed to drive quality improvement and efficiency savings is informed by transparency and patient experiences, and with the notion of an empowered health consumer at its centre.

Is this aim of achieving user empowerment realistic? In their examination of health queries submitted to the NHS Direct online enquiry service, John Powell and Sharon Boden find that while patient empowerment does occur in the use of online health services, it is constrained and context dependent. Policymakers wishing to promote greater choice and control among health system users should therefore take account of the limits to empowerment as well as barriers to participation. The Dutch government’s online public national health and care portal similarly aims to facilitate consumer decision-making and increase transparency and accountability, in order to improve the quality of care and the functioning of health markets. Interestingly, Hans Ossebaard, Lisette van Gemert-Pijnen and Erwin Seydel find the influence of the Dutch portal on choice behavior, awareness, and empowerment of users to actually be small.


New issue of Policy and Internet (2,3)

Welcome to the third issue of Policy & Internet for 2010. We are pleased to present five articles focusing on substantive public policy issues arising from widespread use of the Internet: regulation of trade in virtual goods; development of electronic government in Korea; online policy discourse in UK elections; regulatory models for broadband technologies in the US; and alternative governance frameworks for open ICT standards.

Three of the articles are the first to be published from the highly successful conference ‘Internet, Politics and Policy’ held by the journal in Oxford, 16th-17th September 2010. You may access any of the articles below at no charge.

Helen Margetts: Editorial

Vili Lehdonvirta and Perttu Virtanen: A New Frontier in Digital Content Policy: Case Studies in the Regulation of Virtual Goods and Artificial Scarcity

Joon Hyoung Lim: Digital Divides in Urban E-Government in South Korea: Exploring Differences in Municipalities’ Use of the Internet for Environmental Governance

Darren G. Lilleker and Nigel A. Jackson: Towards a More Participatory Style of Election Campaigning: The Impact of Web 2.0 on the UK 2010 General Election

Michael J. Santorelli: Regulatory Federalism in the Age of Broadband: A U.S. Perspective

Laura DeNardis: E-Governance Policies for Interoperability and Open Standards

New issue of Policy and Internet (2,2)

Welcome to the second issue of Policy & Internet for 2010! We are pleased to present six articles which investigate the role of the Internet in a wide range of policy processes and sectors: agenda setting in online and traditional media; environmental policy networks; online deliberation on climate change; data protection and privacy; net neutrality; and digital inclusion/exclusion. You may access any of the articles below at no charge.

Helen Margetts: Editorial

Ben Sayre, Leticia Bode, Dhavan Shah, Dave Wilcox, and Chirag Shah: Agenda Setting in a Digital Age: Tracking Attention to California Proposition 8 in Social Media, Online News and Conventional News

Kathleen McNutt and Adam Wellstead: Virtual Policy Networks in Forestry and Climate Change in the U.S. and Canada: Government Nodality, Internationalization and Actor Complexity

Julien Talpin and Stéphanie Wojcik: Deliberating Environmental Policy Issues: Comparing the Learning Potential of Online and Face-To-Face Discussions on Climate Change

Andrew A. Adams, Kiyoshi Murata, and Yohko Orito: The Development of Japanese Data Protection

Scott Jordan: The Application of Net Neutrality to Wireless Networks Based on Network Architecture

Alison Powell, Amelia Bryne, and Dharma Dailey: The Essential Internet: Digital Exclusion in Low-Income American Communities