Policy & Internet Conference 2022

Datafication. Platformization. Metaverse. The state of global internet policy

University of Sydney, Australia (28-29 September, 2022).

Through a series of keynote presentations and plenary panels, the 2022 Policy & Internet Conference will set the trajectory for the next 12 months of scholarship in this space.

For event details, address and information on keynote speakers and panelists, please see our Conference page.

For the Conference itinerary, please see below.

Using Wikipedia as PR is a problem, but our lack of a critical eye is worse

That Wikipedia is used for less-than-scrupulously neutral purposes shouldn’t surprise us; it’s our lack of a critical eye that’s the real problem. Reposted from The Conversation.

If you heard that a group of people were creating, editing, and maintaining Wikipedia articles related to brands, firms and individuals, you could point out, correctly, that this is the entire point of Wikipedia. It is, after all, the “encyclopedia that anyone can edit”.

But a group has been creating and editing articles for money. Wikipedia administrators banned more than 300 suspect accounts involved, but those behind the ring are still unknown.

For most Wikipedians, the editors and experts who volunteer their time and effort to develop and maintain the world’s largest encyclopedia for free, this is completely unacceptable. However, what the group was doing was not illegal (although it is prohibited by Wikipedia’s policies), and as it’s extremely hard to detect, it’s difficult to stamp out entirely.

Conflicts of interest among those editing articles have been part of Wikipedia from the beginning. In the early days, a few of the editors making the most contributions wanted a personal Wikipedia entry, at least as a reward for their contribution to the project. Of course, most of these were promptly deleted by the rest of the community for not meeting the notability criteria.

As Wikipedia grew and became the number one source of free-to-access information about everything, Wikipedia entries rose up search engine rankings. Being well-represented on Wikipedia became important for any nation, organisation, firm, political party, entrepreneur, musician, and even scientist. Wikipedians have strived to prohibit self-serving editing, due to the inherent bias that this would introduce. At the same time, “organised” problematic editing developed despite their best efforts.

The glossy sheen of public relations

The first time I learned of non-Wikipedians taking an organised approach to editing articles I was attending a lecture by an “online reputation manager” in 2012. I didn’t know of her, so I pulled up her Wikipedia entry.

It was readily apparent that the article was filled with only positive things. So I did a bit of research about the individual and edited the article to try and introduce a more neutral point of view: softened language, added references and [citation needed] tags where I couldn’t find reference material to back up an important statement.

Online reputation managers and PR firms charge celebrities and “important” people to, among other things, groom Wikipedia pages and fool search engines to push less favourable search results further down the page when their name is searched for. And they get caught doing it, again and again.

Separating fact from fiction

The problem is not so much paid-for or biased editing in itself as the value that many associate with the information found in Wikipedia entries. For example, in academia, professors with Wikipedia entries might be considered more important than those without. Our own research has shown that scholars with Wikipedia articles have no statistically significantly greater scientific impact than those without. So why do some appear on Wikipedia while others do not? The reason is clear: because many of those entries are written by themselves or their students or colleagues. It’s important that this aspect of Wikipedia should be communicated to those reading it, and remembered every single time you’re using it.

The arrival of [citation needed] tags is a good way to alert readers to the potential for statements to be unsafe, unsupported, or flat-out wrong. But these days Google has incorporated Wikipedia articles into its search results, so that an infobox at the right side of the results page will display the information – having first stripped such tags out, presenting it as referenced and reliable information.

A critical eye

Apart from self-editing that displays obvious bias, we know that Wikipedia, however amazing it is, has other shortcomings. Comparing Wikipedia’s different language versions to see the topics they find controversial reveals the attitudes and obsessions of writers from different nations. For example, English Wikipedia is obsessed with global warming, George W Bush and the World Wrestling Federation, the German-language site with Croatia and Scientology, Spanish with Chile, and French with Ségolène Royal, homosexuality and UFOs. There are lots of edit wars behind the scenes, many of which are a lot of fuss about absolutely nothing.

It’s not that I’d suggest abandoning the use of Wikipedia, but a bit of caution and awareness in the reader of these potential flaws is required. And more so, it’s required by the many organisations, academics, journalists and services of all kinds, including Google itself, that scrape or read Wikipedia unthinkingly, assuming that it’s entirely correct.

Were everyone to approach Wikipedia with a little more of a critical eye, eventually the market for paid editing would weaken or dissolve.

Current alternatives won’t light up Britain’s broadband blackspots

Satellites, microwaves, radio towers – how many more options must be tried before the government just shells out for fibre to the home? Reposted from The Conversation.

Despite the British government’s boasts of the steady roll-out of superfast broadband to more than four out of five homes and businesses, you needn’t be a statistician to realise that this means one out of five are still unconnected. In fact, the recent story about a farmer who was so incensed by his slow broadband that he built his own 4G mast in a field to replace it shows that for much of the country, little has improved.

The government’s Broadband Delivery UK (BDUK) programme claims that it will provide internet access of at least 24 Mbps (megabits per second) to 95% of the country by 2017 through fibre to the cabinet, where fast fibre optic networks connect BT’s exchanges to street cabinets dotted around towns and villages. The final connection to the home comes via traditional (slower) copper cables.

Those in rural communities are understandably sceptical of the government’s “huge achievement”, arguing that only a fraction of the properties included in the government’s running total can achieve reasonable broadband speeds, as signals drop off quickly with distance from BT’s street cabinets. Millions of people are still struggling to achieve even basic broadband, and not necessarily just in the remote countryside, but in urban areas such as Redditch, Lancaster and even Pimlico in central London.

Four problems to solve

This cabinet is a problem, not a solution. mikecattell, CC BY

Our research found four recurring problems: connection speeds, latency, contention ratios, and reliability.

Getting high-speed ADSL broadband delivered over existing copper cables is not possible in many areas, as the distance from the exchange or the street cabinet is so great that the broadband signal degrades and speeds drop. Minimum speed requirements are rising as the volume of data we use increases, so such slow connections will become more and more frustrating.

But speed is not the only limiting factor. Network delay, known as latency, can be just as frustrating, as it forces the user to wait for data to arrive or to be assembled into the right order to be processed. Most of our interviewees had high latency connections.

Many home users also suffer from high contention, where a connection slows as more users in the vicinity log on, for example during evenings after work and at weekends. One respondent pointed out that the two or three large companies in the neighbouring village carried out their daily company backups between 6.30pm and 8.30pm. This was obvious, he said, because during that time internet speeds “drop off the end of a cliff”.
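As a rough illustration of contention, the sketch below splits a shared link evenly between whoever is active at a given moment. The 24 Mbps link capacity and the user counts are hypothetical, and real networks do not share capacity this evenly, so treat the output as an order-of-magnitude guide only.

```python
def per_user_speed_mbps(link_capacity_mbps: float, active_users: int) -> float:
    """Even-split estimate of the speed each active user sees on a shared link."""
    if active_users < 1:
        raise ValueError("active_users must be at least 1")
    return link_capacity_mbps / active_users


# Hypothetical figures: one 24 Mbps shared link, quiet afternoon vs. busy evening.
afternoon = per_user_speed_mbps(24.0, 3)
evening = per_user_speed_mbps(24.0, 40)

print(f"afternoon: {afternoon:.1f} Mbps per user")
print(f"evening:   {evening:.1f} Mbps per user")
```

Under these assumed numbers, the same line that feels adequate in the afternoon falls below 1 Mbps per user once the neighbourhood comes online.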

Connection reliability is also a problem, with connections failing randomly for no clear reason, or due to weather such as heavy rain, snow or wind—not very helpful in Britain.

Three band-aid solutions

With delivery by copper cable proving inadequate for many, other alternatives have been suggested to fill the gaps.

Mobile phones are now ubiquitous devices, and mobile phone networks cover a huge proportion of the country. A 4G mobile network connection could potentially provide 100Mbps speeds. Unfortunately, the areas failed by poor fixed line broadband provision are often the same areas with poor mobile phone networks—particularly rural areas. While 2G/3G network coverage is better, it is far slower. Without unlimited data plans, users will also face monthly caps on use as part of their contract. Weather conditions can also adversely affect the service.

Satellite broadband could be the answer and can provide reasonably high speeds of up to around 20 Mbps. But despite the decent bandwidth available, satellite connections have high latency, because signals must travel far larger distances to and from the satellite than they do over ground-based networks. High latency connections make it very difficult or impossible to use internet telephony such as Skype, to stream films, video or music, or play online games. It’s not really an option in mountainous regions, and is a more expensive option.
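The distance argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes the satellite is in geostationary orbit, roughly 35,786 km above the equator (the article does not say which orbit is involved), and that it sits directly overhead. It ignores processing and queuing delays, so it gives a lower bound rather than a measured figure.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
GEO_ALTITUDE_KM = 35_786           # assumption: geostationary orbit altitude


def min_round_trip_seconds(altitude_km: float) -> float:
    """Lower bound on round-trip time via a satellite directly overhead.

    A request travels ground -> satellite -> ground, and the reply makes
    the same journey back, so the signal covers the altitude four times.
    """
    return 4 * altitude_km / SPEED_OF_LIGHT_KM_S


rtt = min_round_trip_seconds(GEO_ALTITUDE_KM)
print(f"minimum round trip: {rtt * 1000:.0f} ms")
```

Under these assumptions the physical minimum is just under half a second per round trip, before any network equipment adds its own delay.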

A third alternative is to use fixed wireless, relaying broadband signals over radio transmitters to cover the distance from where BT’s fixed-line fibre optic network ends. These services generally provide 20Mbps, low latency connections. However, radio towers require line-of-sight access which could be a problem given obstructions from hills or woods—factors that, again, limit use where it’s most needed.

The only one that fits

All these alternatives tend to be more expensive to set up and run, come with stricter data limits, and can be affected by atmospheric conditions such as rain, wind or fog. The only truly superior alternative to fibre to the cabinet is to provide fibre to the home (FTTH), in which the last vestiges of the original copper telephone network are replaced with high-speed fibre optic right to the door of the home or business premises. Fibre optic is faster, can carry signals without loss over greater distances, and is more upgradable than copper. A true fibre optic solution would future-proof Britain’s internet access network for decades to come.

Despite its expense, it is the only solution for many rural communities, which is why some have organised to provide it for themselves, such as B4RN and B4YS in the north of England, and B4RDS in the southwest. But this requires a group of volunteers with knowledge, financial means, and the necessary dedication to lay the infrastructure that could offer a 1,000 Mbps service regardless of line distance and location—which won’t be an option for all.

After dinner: the best time to create 1.5 million dollars of ground-breaking science

Count this! In celebration of the International Year of Astronomy 2009, NASA’s Great Observatories—the Hubble Space Telescope, the Spitzer Space Telescope, and the Chandra X-ray Observatory—collaborated to produce this image of the central region of our Milky Way galaxy. Image: Nasa Marshall Space Flight Center

Since it first launched as a single project called Galaxy Zoo in 2007, the Zooniverse has grown into the world’s largest citizen science platform, with more than 25 science projects and over 1 million registered volunteer citizen scientists. While initially focused on astronomy projects, such as those exploring the surfaces of the moon and the planet Mars, the platform now offers volunteers the opportunity to read and transcribe old ship logs and war diaries, identify animals in nature capture photos, track penguins, listen to whales communicating and map kelp from space.

These projects are examples of citizen science: collaborative research undertaken by professional scientists and members of the public. Through these projects, individuals who are not necessarily knowledgeable about or familiar with science can become active participants in knowledge creation (such as in the examples listed in the Chicago Tribune: Want to aid science? You can Zooniverse).

Although science-public collaborative efforts have long existed, the Zooniverse is a prominent example of citizen science projects that have enjoyed particularly widespread popularity and traction online. In addition to making science more open and accessible, online citizen science accelerates research by leveraging human and computing resources, tapping into rare and diverse pools of expertise, providing informal scientific education and training, motivating individuals to learn more about science, and making science fun and part of everyday life.

While online citizen science is a relatively recent phenomenon, it has attracted considerable academic attention. Various studies have been undertaken to examine and understand user behaviour, motivation, and the benefits and implications of different projects for them. For instance, Sauermann and Franzoni’s analysis of seven Zooniverse projects (Solar Stormwatch, Galaxy Zoo Supernovae, Galaxy Zoo Hubble, Moon Zoo, Old Weather, The Milky Way Project, and Planet Hunters) found that 60 percent of volunteers never return to a project after finishing their first session of contribution. By comparing contributions to these projects with those of research assistants and Amazon Mechanical Turk workers, they also calculated that these voluntary efforts amounted to an equivalent of $1.5 million in human resource costs.

Our own project on the taxonomy and ecology of contributions to the Zooniverse examines the geographical, gendered and temporal patterns of contributions and contributors to 17 Zooniverse projects between 2009 and 2013. Our preliminary results show that:

  • The geographical distribution of volunteers and contributions is highly uneven, with the UK and US contributing the bulk of both. Quantitative analysis of 130 countries shows that of three factors (population, GDP per capita and number of Internet users), the number of Internet users is most strongly correlated with the number of volunteers and number of contributions. However, when population is controlled, GDP per capita is found to have greater correlation with numbers of users and volunteers. The correlations are positive, suggesting that wealthier (or more developed) countries are more likely to be involved in the citizen science projects.
The global distribution of contributions to the projects within our dataset of 35 million records. The number of contributions of each country is normalized to the population of the country.
  • Female volunteers are underrepresented in most countries. Very few countries have gender parity in participation. In many other countries, women make up less than one-third of the volunteers whose gender is known. The female ratio of participation in the UK and Australia, for instance, is 25 per cent, while the figures for the US, Canada and Germany are between 27 and 30 per cent. These figures are notable when compared with the percentage of academic jobs in the sciences held by women. In the UK, women make up only 30.3 per cent of full-time researchers in Science, Technology, Engineering and Mathematics (STEM) departments (UKRC report, 2010), and 24 per cent in the United States (US Department of Commerce report, 2011).
  • Our analysis of user preferences and activity shows that in general, there is a strong subject preference among users, with two main clusters evident among users who participate in more than one project. One cluster revolves around astrophysics projects. Volunteers in these projects are more likely to take part in other astrophysics projects, and when one project ends, volunteers are more likely to start a new project within this cluster. Similarly, volunteers in the other cluster, which is concentrated around life and Earth science projects, have a higher likelihood of being involved in other life and Earth science projects than in astrophysics projects. There is less cross-project involvement between the two main clusters.
Dendrogram showing the overlap of contributors between projects. The scale indicates the similarity between the pools of contributors to pairs of projects. Astrophysics (blue) and Life-Earth Science (green and brown) projects create distinct clusters. Old Weather 1 and WhaleFM are exceptions to this pattern, and Old Weather 1 has the most distinct pool of contributors.
  • In addition to a tendency for cross-project activity to be contained within the same clusters, there is also a gendered pattern of engagement in various projects. Females make up more than half of gender-identified volunteers in life science projects (Snapshot Serengeti, Notes from Nature and WhaleFM have more than 50 per cent women contributors). In contrast, the proportions of women are lowest in astrophysics projects (Galaxy Zoo Supernovae and Planet Hunters have less than 20 per cent female contributors). These patterns suggest that science subjects in general are gendered, a finding that correlates with those by the US National Science Foundation (2014). According to an NSF report, there are relatively few women in engineering (13 per cent) and in computer and mathematical sciences (25 per cent), but they are well-represented in the social sciences (58 per cent) and biological and medical sciences (48 per cent).
  • For the 20 most active countries (led by the UK, US and Canada), the most productive hours in terms of user contributions are between 8pm and 10pm. This suggests that citizen science is an after-dinner activity (presumably, reflecting when most people have free time before bed). This general pattern corresponds with the idea that many types of online peer-production activities, such as citizen science, are driven by ‘cognitive surplus’, that is, the aggregation of free time spent on collective pursuits (Shirky, 2010).
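The country-level comparison in the first finding can be sketched in a few lines. The example below is illustrative only: the five rows of data are invented, the variable names are hypothetical, and dividing by population is a simple stand-in for "controlling for population" rather than the method actually used in the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented rows: (population, internet users, GDP per capita, volunteers)
countries = [
    (65e6, 58e6, 42_000, 40_000),
    (320e6, 270e6, 55_000, 120_000),
    (1300e6, 600e6, 7_000, 5_000),
    (200e6, 110e6, 11_000, 3_000),
    (24e6, 21e6, 60_000, 9_000),
]

population = [c[0] for c in countries]
internet_users = [c[1] for c in countries]
gdp_per_capita = [c[2] for c in countries]
volunteers = [c[3] for c in countries]

# Raw counts: do countries with more internet users supply more volunteers?
r_raw = pearson(internet_users, volunteers)

# Per-capita view: does wealth track the share of the population volunteering?
volunteers_per_capita = [v / p for v, p in zip(volunteers, population)]
r_per_capita = pearson(gdp_per_capita, volunteers_per_capita)

print(f"internet users vs volunteers:            r = {r_raw:.2f}")
print(f"GDP per capita vs volunteers per capita: r = {r_per_capita:.2f}")
```

Normalising by population before correlating stops large countries from dominating the result simply because they have more people.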

These are just some of the results of our study, which has found that despite being informal, relatively more open and accessible, online citizen science exhibits similar geographical and gendered patterns of knowledge production as professional, institutional science. In other ways, citizen science is different. Unlike institutional science, the bulk of citizen science activity happens late in the day, after the workday has ended and people are winding down after dinner and before bed.

We will continue our investigations into the patterns of activity in citizen science and the behaviour of citizen scientists, in order to help improve ways to make science more accessible in general and to tap into the resources of the public for scientific knowledge production. It is anticipated that upcoming projects on the Zooniverse will be more diversified and include topics from the humanities and social sciences. We therefore also aim to examine the implications of a wider range of projects for the user base (in terms of age, gender and geographical coverage) and for user behaviour.


Sauermann, H., & Franzoni, C. (2015). Crowd science user contribution patterns and their implications. Proceedings of the National Academy of Sciences, 112(3), 679-684.

Shirky, C. (2010). Cognitive surplus: Creativity and generosity in a connected age. Penguin: London.

Taha Yasseri is the Research Fellow in Computational Social Science at the OII. Prior to coming to the OII, he spent two years as a Postdoctoral Researcher at the Budapest University of Technology and Economics, working on the socio-physical aspects of the community of Wikipedia editors, focusing on conflict and editorial wars, along with Big Data analysis to understand human dynamics, language complexity, and popularity spread. He has interests in analysis of Big Data to understand human dynamics, government-society interactions, mass collaboration, and opinion dynamics.

A promised ‘right’ to fast internet rings hollow for millions stuck with 20th-century speeds

Tell those living in the countryside about the government’s promised “right to fast internet” and they’ll show you 10 years of similar, unmet promises. Reposted from The Conversation.

In response to the government’s recent declarations that internet speeds of 100Mb/s should be available to “nearly all homes” in the UK, a great many might suggest that this is easier said than done. It would not be the first such bold claim, yet internet connections in many rural areas still languish at 20th-century speeds.

The government’s digital communications infrastructure strategy contains the intention of giving customers the “right” to a broadband connection of at least 5Mb/s in their homes.

There’s no clear indication of any timeline for introduction, nor what is meant by “nearly all homes” and “affordable prices”. But in any case, bumping the minimum speed to 5Mb/s is hardly adequate to keep up with today’s online society. It’s less than the maximum possible ADSL1 speed of 8Mb/s that was common in the mid-2000s, far less than the 24Mb/s maximum speed of ADSL2+ that followed, and far, far less than the 30-60Mb/s speeds typical of fibre optic or cable broadband connections available today.

In fact, a large number of rural homes are still not able to access even the 2Mb/s minimum promised by the Digital Britain report in 2009.

Serious implications

As part of our study of rural broadband access we interviewed 27 people from rural areas in England and Wales about the quality of their internet connection and their daily experiences with slow and unreliable internet. Only three had download speeds of up to 6Mb/s, while most had connections that barely reached 1Mb/s. Even those who reported the faster speeds were still unable to carry out basic online tasks in a reasonable amount of time. For example, using Google Maps, watching online videos, or opening several pages at once would require several minutes of buffering and waiting. Having several devices share the connection at a time wasn’t even an option.

So the pledge for a “right” to 5Mb/s made by the chancellor of the exchequer, George Osborne, is as meaningless as previous promises for 2Mb/s. Nor is it close to fast enough. The advertised figure refers to download speed, and upload speed is typically only a fraction of that. Uploads are therefore far slower even than these slow downloads, leaving the connection all but unusable for those who need to send large files, such as businesses.
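To put these speeds in perspective, the sketch below estimates how long a single file takes to move at the line speeds discussed here. The 100-megabyte file size and the 0.5Mb/s upload speed are assumptions chosen for illustration, and the calculation ignores protocol overhead and dropped connections, so real transfers would take longer.

```python
def transfer_minutes(file_size_megabytes: float, speed_mbps: float) -> float:
    """Minutes needed to move a file at a line speed given in megabits per second."""
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    megabits = file_size_megabytes * 8  # 1 byte = 8 bits
    return megabits / speed_mbps / 60


FILE_MB = 100  # assumption: a 100-megabyte file

scenarios = [
    ("5Mb/s download (the promised minimum)", 5.0),
    ("1Mb/s download (typical interviewee)", 1.0),
    ("0.5Mb/s upload (assumed)", 0.5),
]

for label, speed in scenarios:
    print(f"{label}: {transfer_minutes(FILE_MB, speed):.1f} minutes")
```

Note the factor of eight between megabytes (file sizes) and megabits (line speeds), which is why advertised speeds can look faster than they feel in practice.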

With constantly moving timescales for completion, the government doesn’t seem to regard adequate rural broadband connections as a matter of urgency, even while the consequences for those affected are often serious and urgent at the same time. In Snowdonia, for example, a fast and more importantly reliable broadband connection can be a matter of life and death.

The Llanberis Mountain Rescue team at the foot of Mount Snowdon receives around 200 call-outs a year to rescue mountaineers from danger. Their systems are connected to police and emergency services, all of which run online to provide a quick and precise method of locating lost or injured mountaineers. But their internet connection is below 1Mb/s and cuts out regularly, especially in bad weather, which interferes with dispatching the rescue teams quickly. With low signal or no reception at all in the mountains, neither mobile phone networks nor satellite internet connections are alternatives.

All geared up but no internet connection. Anne-Marie Oostveen, Author provided

Connection interrupted

Even besides life and death situations, slow and unreliable internet can seriously affect people—their social lives, their family connections, their health and even their finances. Some of those we interviewed had to drive one-and-a-half hours to the nearest city in order to find internet connections fast enough to download large files for their businesses. Others reported losing clients because they weren’t able to maintain a consistent online presence or conduct Skype meetings. Families were unable to check up on serious health conditions of their children, while others, unable to work from home, were forced to commute long distances to an office.

Rural areas: high on appeal, low on internet connectivity. Bianca Reisdorf, Author provided

Especially in poorer rural areas such as North Wales, fast and reliable internet could boost the economy by enabling small businesses to emerge and thrive. It’s not a lack of imagination and ability holding people in the region back, it’s the lack of 21st-century communications infrastructure that most of us take for granted.

The government’s strategy document explains that it “wants to support the development of the UK’s digital communications infrastructure”, yet in doing so wishes “to maintain the principle that intervention should be limited to that which is required for the market to function effectively.”

It is exactly this vagueness that is currently preventing communities from taking matters into their own hands. Many of our interviewees said they still hoped BT would deploy fast internet to their village or premises, but had been given no sense of when that might occur, if at all, or had seen the timescales they were given slip. “Soon” seems to be the word that keeps those in the countryside in check, causing them to hold off on looking for alternatives, such as community efforts like the B4RN initiative in Lancashire.

If the government is serious about the country’s role as a digital nation, it needs to provide feasible solutions for all populated areas of the country. That means solutions that are affordable and future-proof, which entails fibre to the premises (FTTP), and sooner rather than later.

Outside the cities and towns, rural Britain’s internet is firmly stuck in the 20th century

The quality of rural internet access in the UK, or lack of it, has long been a bone of contention. Reposted from The Conversation.

The government says “fast, reliable broadband” is essential, but the disparity between urban and rural areas is large and growing, with slow and patchy connections common outside towns and cities.

The main reason for this is the difficulty and cost of installing the infrastructure necessary to bring broadband to all parts of the countryside—certainly to remote villages, hamlets, homes and farms, but even to areas not classified as “deep rural” too.

A countryside unplugged

As part of our project Access Denied, we are interviewing people in rural areas, both very remote and less so, to hear their experiences of slow and unreliable internet connections and the effects on their personal and professional lives. What we’ve found so far is that even in areas less than 20 miles away from big cities, the internet connection slows to far below the minimum of 2Mb/s identified by the government as “adequate”. Whether this is fast enough to navigate today’s data-rich Web 2.0 environment is questionable.

Yes… but where, exactly? Rept0n1x, CC BY-SA

Our interviewees could attain speeds between 0.1Mb/s and 1.2Mb/s, with the latter being a positive outlier among the speed tests we performed. Some interviewees also reported that the internet didn’t work in their homes at all, in some cases for 60% of the time. This wasn’t related to time of day; the dropped connection appeared to be random, and not something they could plan for.

The result is that activities that those in cities and towns would see as entirely normal are virtually impossible in the country—online banking, web searches for information, even sending email. One respondent explained that she was unable to pay her workers’ wages for a full week because the internet was too slow and kept cutting out, causing her online banking session to reset.

Linking villages

So poor quality internet is a major problem for some. The question is what the government and BT—which won the bid to deploy broadband to all rural UK areas—are doing about it.

The key factor affecting the speed and quality of the connection is the copper telephone lines used to connect homes to the street cabinet. While BT is steadily upgrading cabinets with high-speed fibre optic connections that connect them to the local exchange, known as fibre to the cabinet (FTTC), the copper lines slow the connection speed considerably as line quality degrades with distance from the cabinet. While some homes within a few hundred metres of the cabinet in a village centre may enjoy speedier access, for homes that are perhaps several miles away FTTC brings no improvement.

One solution is to leave out cables of any kind, and use microwave radio links, similar to those used by mobile phone networks. BT has recently installed an 80Mb/s microwave link spanning the 4km necessary to connect the village of Northlew, in Devon, to the network—significantly cheaper and easier than laying the same length of fibre optic cable.

Connecting homes

Microwave links require line-of-sight between antennas, so it’s not a solution that will work everywhere. And in any case, while this is another step toward connecting remote villages, it doesn’t solve the problem of connecting individual homes which are still fed by copper cables and which could be miles away from the cabinet, with their internet speeds falling with every metre.

An alternative approach, championed by some community initiatives such as the Broadband For the Rural North (B4RN) project in Lancashire, is fibre-to-the-home (FTTH). This is regarded as future-proof because it provides a huge increase in speed—up to 1,000Mb/s—and because, even as minimum acceptable speeds continue to rise over the following years and decades, fibre can be easily upgraded. Copper cables simply cannot provide rural areas with the internet speeds needed today.

However FTTH is expensive—and BT will opt for the cheapest option or nothing at all. This needs to be addressed more assertively by the government as the UK’s internet speeds are falling behind other countries. According to Akamai’s latest State of the Internet report for 2014, peak and average speeds in the UK lag behind. The UK ranks 16th in Europe, behind others usually perceived as less connected and competitive such as Latvia or Romania.

If the government is serious about staying competitive in the global market this isn’t good enough, which means the government and BT need to get serious about putting some speed into getting Britain online.

Young people are the most likely to take action to protect their privacy on social networking sites

A pretty good idea of what not to do on a social media site. Image by Sean MacEntee.

Standing on a stage in San Francisco in early 2010, Facebook founder Mark Zuckerberg, partly responding to the site’s decision to change the privacy settings of its 350 million users, announced that as Internet users had become more comfortable sharing information online, privacy was no longer a “social norm”. Of course, he had an obvious commercial interest in relaxing norms surrounding online privacy, but this attitude has nevertheless been widely echoed in the popular media. Young people are supposed to be sharing their private lives online—and providing huge amounts of data for commercial and government entities—because they don’t fully understand the implications of the public nature of the Internet.

There has actually been little systematic research on the privacy behaviour of different age groups in online settings. But there is certainly evidence of a growing (general) concern about online privacy (Marwick et al., 2010), with a 2013 Pew study finding that 50 percent of Internet users were worried about the information available about them online, up from 30 percent in 2009. Following the recent revelations about the NSA’s surveillance activities, a Washington Post-ABC poll reported 40 percent of its U.S. respondents as saying that it was more important to protect citizens’ privacy even if it limited the ability of the government to investigate terrorist threats. But what of young people, specifically? Do they really care less about their online privacy than older users?

Privacy concerns an individual’s ability to control what personal information about them is disclosed, to whom, when, and under what circumstances. We present different versions of ourselves to different audiences, and the expectations and norms of the particular audience (or context) will determine what personal information is presented or kept hidden. This highlights a fundamental problem with privacy in some social networking sites (SNSs): that of ‘context collapse’ (Marwick and boyd 2011). This describes what happens when audiences that are normally kept separate offline (such as employers and family) collapse into a single online context, such as a single Facebook account or Twitter channel. This can lead to problems when actions that are appropriate in one context are seen by members of another audience; consider, for example, the US high school teacher who was forced to resign after a parent complained about a Facebook photo of her holding a glass of wine while on holiday in Europe.

SNSs are particularly useful for investigating how people handle privacy. Their tendency to collapse the “circles of social life” may prompt users to reflect more about their online privacy (particularly if they have been primed by media coverage of people losing their jobs, going to prison, etc. as a result of injudicious postings). However, despite SNSs being an incredibly useful source of information about online behaviour, few articles in the large body of literature on online privacy draw on systematically collected data, and the results published so far are probably best described as conflicting (see the literature review in the full paper). Furthermore, they often use convenience samples of college students, meaning they are unable to adequately address either age effects or potentially related variables such as education and income. These ambiguities certainly provide fertile ground for additional research, particularly research based on empirical data.

The OII’s own Oxford Internet Surveys (OxIS) collect data on British Internet users and non-users through nationally representative random samples of more than 2,000 individuals aged 14 and older, surveyed face-to-face. One of the (many) things we are interested in is online privacy behaviour, which we measure by asking respondents who have an SNS profile: “Thinking about all the social network sites you use, on average how often do you check or change your privacy settings?” In addition to the demographic factors we collect about respondents (age, sex, location, education, income etc.), we can construct various non-demographic measures that might have a bearing on this question, such as: comfort revealing personal data; bad experiences online; concern with negative experiences; number of SNSs used; and self-reported ability using the Internet.

So are young people completely unconcerned about their privacy online, gaily granting access to everything to everyone? Well, in a word, no. We actually find a clear inverse relationship: almost 95% of 14-17-year-olds have checked or changed their SNS privacy settings, with the percentage steadily dropping to 32.5% of respondents aged 65 and over. The strength of this effect is remarkable: between the oldest and youngest the difference is over 62 percentage points, and we find little difference in the pattern between the 2013 and 2011 surveys. This immediately suggests that the common assumption that young people don’t care about—and won’t act on—privacy concerns is probably wrong.

Comparing our own data with recent nationally representative surveys from Australia (OAIC 2013) and the US (Pew 2013) we see an amazing similarity: young people are more, not less, likely to have taken action to protect the privacy of their personal information on social networking sites than older people. We find that this age effect remains significant even after controlling for other demographic variables (such as education). And none of the five non-demographic variables changes the age effect either (see the paper for the full data, analysis and modelling). The age effect appears to be real.

So in short, and contrary to the prevailing discourse, we do not find young people to be apathetic when it comes to online privacy. Barnes (2006) outlined the original ‘privacy paradox’ by arguing that “adults are concerned about invasion of privacy, while teens freely give up personal information (…) because often teens are not aware of the public nature of the Internet.” This may once have been true, but it is certainly not the case today.

Existing theories are unable to explain why young people are more likely to act to protect privacy, but maybe the answer lies in the broad, fundamental characteristics of social life. It is social structure that creates context: people know each other based around shared life stages, experiences and purposes. Every person is the centre of many social circles, and different circles have different norms for what is acceptable behaviour, and thus for what is made public or kept private. If we think of privacy as a sort of meta-norm that arises between groups rather than within groups, it provides a way to smooth out some of the inevitable conflicts of the varied contexts of modern social life.

This might help explain why young people are particularly concerned about their online privacy. At a time when they’re leaving their families and establishing their own identities, they will often be doing activities in one circle (e.g. friends) that they do not want known in other circles (e.g. potential employers or parents). As an individual enters the work force, starts to pay taxes, and develops friendships and relationships farther from the home, the number of social circles increases, increasing the potential for conflicting privacy norms. Of course, while privacy may still be a strong social norm, it may not be in the interest of the SNS provider to cater for its differentiated nature.

The real paradox is that these sites have become so embedded in the social lives of users that, to maintain those social lives, users must disclose information on them, despite the significant privacy risk in doing so and the often inadequate controls available to help them meet their diverse and complex privacy needs.

Read the full paper: Blank, G., Bolsover, G., and Dubois, E. (2014) A New Privacy Paradox: Young people and privacy on social network sites. Prepared for the Annual Meeting of the American Sociological Association, 16-19 August 2014, San Francisco, California.


Barnes, S. B. (2006). A privacy paradox: Social networking in the United States. First Monday, 11(9).

Marwick, A. E., Murgia-Diaz, D., & Palfrey, J. G. (2010). Youth, Privacy and Reputation (Literature Review). SSRN Scholarly Paper No. ID 1588163. Rochester, NY: Social Science Research Network.

Marwick, A. E., & boyd, D. (2011). I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience. New Media & Society, 13(1), 114–133. doi:10.1177/1461444810365313

Grant Blank is a Survey Research Fellow at the OII. He is a sociologist who studies the social and cultural impact of the Internet and other new communication media.

Facebook and the Brave New World of Social Research using Big Data

Reports about the Facebook study ‘Experimental evidence of massive-scale emotional contagion through social networks’ have resulted in something of a media storm. Yet it can be predicted that ultimately this debate will result in the question: so what’s new about companies and academic researchers doing this kind of research to manipulate people’s behaviour? Isn’t that what a lot of advertising and marketing research does already: changing people’s minds about things? And don’t researchers sometimes deceive subjects in experiments about their behaviour? What’s new?

This way of thinking about the study has a serious defect, because there are three issues raised by this research. The first is the legality of the study, which, as the authors correctly point out, is covered by the informed consent that Facebook users give when they sign up to the service. Laws or regulation may be required here to prevent this kind of manipulation, but may also be difficult to frame, since it will be hard to draw a line between this experiment and other forms of manipulating people’s responses to media. However, Facebook may not want to lose users for whom this way of manipulating them via its service may ‘cause anxiety’ (as the first author of the study, Adam Kramer, acknowledged in a blog post response to the outcry). In short, it may be bad for business, and hence Facebook may abandon this kind of research (but we’ll come back to this later). But this situation, in which companies use techniques that users don’t like and are then forced to change course, is not new.

The second issue is academic research ethics. This study was carried out by two academic researchers (the other two authors of the study). In retrospect, it is hard to see how this study would have received approval from an institutional review board (IRB), the body through which an academic institution checks the ethics of studies. Perhaps stricter guidelines are needed here since a) big data research is becoming much more prominent in the social sciences and is often based on social media like Facebook, Twitter, and mobile phone data, and b) much (though not all; consider Wikipedia) of this research therefore entails close relations with the social media companies who provide access to these data and the ability to experiment with the platforms, as in this case. Here, again, the ethics of academic research may need to be tightened to provide new guidelines for academic collaboration with commercial platforms. But this is not new either.

The third issue, which is the new and important one, is the increasing power that social research using big data has over our lives. This is of course even more difficult to pin down than the first two points. Where does this power come from? It comes from having access to data of a scale and scope that is a leap or step change from what was available before, and being able to perform computational analysis on these data. This is my definition of ‘big data’ (see note 1), and clearly applies in this case, as in other cases we have documented: almost 700,000 users’ Facebook newsfeeds were changed in order to perform this experiment, and more than 3 million posts containing more than 122 million words were analysed. The result: it was found that more positive words in Facebook newsfeeds led to more positive posts by users, and the reverse for negative words.

What is important here are the implications of this powerful new knowledge. To be sure, as the authors point out, this was a study that is valuable for social science in showing that emotions may be transmitted online via words, not just in face-to-face situations. But it also provides Facebook with knowledge that it can use to further manipulate users’ moods; for example, making their moods more positive so that users will come to its website rather than a competitor’s. In other words, social science knowledge, produced partly by academic social scientists, enables companies to manipulate people’s hearts and minds.

This is not the Orwellian world of the Snowden revelations about phone tapping that have been in the news recently. It’s the Huxleyan Brave New World where companies and governments are able to play with people’s minds, and do so in a way whereby users may buy into it: after all, who wouldn’t like to have their experience on Facebook improved in a positive way? And of course that’s Facebook’s reply to criticisms of the study: the motivation of the research is that we’re just trying to improve your experience, as Kramer says in his blog post response cited above. Similarly, according to The Guardian newspaper, ‘A Facebook spokeswoman said the research…was carried out “to improve our services and to make the content people see on Facebook as relevant and engaging as possible”’. But improving experience and services could also just mean selling more stuff.

This is scary, and academic social scientists should think twice before producing knowledge that supports this kind of impact. But again, we can’t pinpoint this impact without understanding what’s new: big data is a leap in how data can be used to manipulate people in more powerful ways. This point has been lost by those who criticise big data mainly on the grounds of the epistemological conundrums involved (as with boyd and Crawford’s widely cited paper, see note 2). No, it’s precisely because knowledge is more scientific that it enables more manipulation. Hence, we need to identify the point or points at which we should put a stop to sliding down a slippery slope of increasing manipulation of our behaviours. Further, we need to specify when access to big data on a new scale enables research that affects many people without their knowledge, and regulate this type of research.

Which brings us back to the first point: true, Facebook may stop this kind of research, but how would we know? And have academics therefore colluded in research that encourages this kind of insidious use of data? We can only hope for a revolt against this kind of Huxleyan conditioning, but as in Brave New World, perhaps the outlook is rather gloomy in this regard: we may come to like more positive reinforcement of our behaviours online.


1. Schroeder, R. 2014. ‘Big Data: Towards a More Scientific Social Science and Humanities?’, in Graham, M., and Dutton, W. H. (eds.), Society and the Internet. Oxford: Oxford University Press, pp.164-76.

2. boyd, D. and Crawford, K. (2012). ‘Critical Questions for Big Data: Provocations for a cultural, technological and scholarly phenomenon’, Information, Communication and Society, 15(5), 662-79.

Professor Ralph Schroeder has interests in virtual environments, social aspects of e-Science, sociology of science and technology, and has written extensively about virtual reality technology. He is a researcher on the OII project Accessing and Using Big Data to Advance Social Science Knowledge, which follows ‘big data’ from its public and private origins through open and closed pathways into the social sciences, and documents and shapes the ways they are being accessed and used to create new knowledge about the social world.

The social economies of networked cultural production (or, how to make a movie with complete strangers)

Nomad, the perky-looking Mars rover from the crowdsourced documentary Solar System 3D (Wreckamovie).

Ed: You have been looking at “networked cultural production”—ie the creation of cultural goods like films through crowdsourcing platforms—specifically in the ‘wreckamovie’ community. What is wreckamovie?

Isis: Wreckamovie is an open online platform that is designed to facilitate collaborative film production. The main advantage of the platform is that it encourages a granular and modular approach to cultural production; this means that the whole process is broken down into small, specific tasks. In doing so, it allows a diverse range of geographically dispersed, self-selected members to contribute in accordance with their expertise, interests and skills. The platform was launched in 2008 by a group of young Finnish filmmakers who had been successfully producing films with the aid of an online forum since the late 1990s. Officially, there are more than 11,000 Wreckamovie members, but the active core, the community, consists of fewer than 300 individuals.

Ed: You mentioned a tendency in the literature to regard production systems as being either ‘market driven’ (eg Hollywood) or ‘not market driven’ (eg open or crowdsourced things); is that a distinction you recognised in your research?

Isis: There’s been a lot of talk about the disruptive and transformative powers nested in networked technologies, and most often Wikipedia or open source software are highlighted as examples of new production models, denoting a discontinuity from established practices of the cultural industries. Typically, the production models are discriminated based on their relation to the market: are they market-driven or fuelled by virtues such as sharing and collaboration? This way of explaining differences in cultural production isn’t just present in contemporary literature dealing with networked phenomena, though. For example, the sociologist Bourdieu equally theorised cultural production by drawing this distinction between market and non-market production, portraying the irreconcilable differences in their underlying value systems, as proposed in his The Rules of Art. However, one of the key findings of my research is that the shaping force of these productions is constituted by the tensions that arise in an antagonistic interplay between the values of social networked production and the production models of the traditional film industry. That is to say, the production practices and trajectories are equally shaped by the values embedded in peer production virtues and the conventions and drivers of Hollywood.

Ed: There has also been a tendency to regard the participants of these platforms as being either ‘professional’ or ‘amateur’—again, is this a useful distinction in practice?

Isis: I think it’s important we move away from these binaries in order to understand contemporary networked cultural production. The notion of the blurring of boundaries between amateurs and professionals, and associated concepts such as user-generated content, peer production, and co-creation, are fine for pointing to very broad trends and changes in the constellations of cultural production. But if we want to move beyond that, towards explanatory models, we need a more fine-tuned categorisation of cultural workers. Based on my ethnographic research in the Wreckamovie community, I have proposed a typology of crowdsourcing labour, consisting of five distinct orientations. Rather than a priori definitions, the orientations are defined based on the individual production members’ interaction patterns, motivations and interpretation of the conventions guiding the division of labour in cultural production.

Ed: You mentioned that the social capital of participants involved in crowdsourcing efforts is increasingly quantifiable, malleable, and convertible: can you elaborate on this?

Isis: A defining feature of the online environment, in particular social media platforms, is its quantification of participation in the form of lists of followers, view counts, likes and so on. Across the Wreckamovie films I researched, there was a pronounced implicit understanding amongst production leaders of the exchange value of social capital accrued across the extended production networks beyond the Wreckamovie platform (e.g. Facebook, Twitter, YouTube). The quantified nature of social capital in the socio-technical space of the information economy was experienced as a convertible currency; for example, when social capital was used to drive YouTube views (which in turn constituted symbolic capital when employed as a bargaining tool in negotiating distribution deals). For some productions, these conversion mechanisms enabled increased artistic autonomy.

Ed: You also noted that we need to understand exactly where value is generated on these platforms to understand if some systems of ‘open/crowd’ production might be exploitative. How do we determine what constitutes exploitation?

Isis: The question of exploitation in the context of voluntary cultural work is an extremely complex matter, and remains an unresolved debate. I argue that it must be determined partially by examining the flow of value across the entire production network, paying attention to nodes at both the micro and macro levels. Equally, we need to acknowledge the diverse forms of value that volunteers might gain in the form of, for example, embodied cultural or symbolic capital, and assess how this corresponds to their motivation and work orientation. In other words, this isn’t a question about ownership or financial compensation alone.

Ed: There were many movie-failures on the platform; but movies are obviously tremendously costly and complicated undertakings, so we would probably expect that. Was there anything in common between them, or any lessons to be learned from the projects that didn’t succeed?

Isis: You’ll find that the majority of productions on Wreckamovie are virtual ghosts; created on a whim with the expectation that production members will flock to take part and contribute. The projects that succeeded in creating actual cultural goods (such as the 2010 movie Snowblind) were those that were led by engaged producers actively promoting the building of genuine social relationships amongst members, and providing feedback to submitted content in a constructive and supportive manner to facilitate learning. The production periods of the movies I researched ranged from two to six years, so it requires real dedication! Crowdsourcing does not make productions magically happen overnight.

Ed: Crowdsourcing is obviously pretty new and exciting, but are the economics (whether monetary, social or political) of these platforms really understood or properly theorised? ie is this an area where there genuinely does need to be ‘more work’?

Isis: The economies of networked cultural production are under-theorised; this is partially an outcome of the dichotomous framing of market vs. non-market led production. When conceptualised as divorced from market-oriented production, networked phenomena are most often approached through the lens of gift exchanges (in a somewhat uninformed manner). I believe Bourdieu’s concepts of alternative capital in their various guises can serve as an appropriate analytical lens for examining the dynamics and flows of the economics underpinning networked cultural production. However, this requires innovation within field theory. Specifically, the mechanisms of conversion of one form of capital to another must be examined in greater detail; something I have focused on in my thesis, and hope to develop further in the future.

Isis Hjorth was speaking to blog editor David Sutcliffe.

Isis Hjorth is a cultural sociologist focusing on emerging practices associated with networked technologies. She is currently researching microwork and virtual production networks in Sub-Saharan Africa and Southeast Asia.

Read more: Hjorth, I. (2014) Networked Cultural Production: Filmmaking in the Wreckamovie Community. PhD thesis. Oxford Internet Institute, University of Oxford, UK.

Verification of crowd-sourced information: is this ‘crowd wisdom’ or machine wisdom?

‘Code’ or ‘law’? Image from an Ushahidi development meetup by afropicmusing.

In ‘Code and Other Laws of Cyberspace’, Lawrence Lessig (2006) writes that computer code (or what he calls ‘West Coast code’) can have the same regulatory effect as the laws and legal code developed in Washington D.C., so-called ‘East Coast code’. Computer code impacts on a person’s behaviour by virtue of its essentially restrictive architecture: on some websites you must enter a password before you gain access, in other places you can enter unidentified. The problem with computer code, Lessig argues, is that it is invisible, and that it makes it easy to regulate people’s behaviour directly and often without recourse.

For example, fair use provisions in US copyright law enable certain uses of copyrighted works, such as copying for research or teaching purposes. However the architecture of many online publishing systems heavily regulates what one can do with an e-book: how many times it can be transferred to another device, how many times it can be printed, whether it can be moved to a different format—activities that have been unregulated until now, or that are enabled by the law but effectively ‘closed off’ by code. In this case code works to reshape behaviour, upsetting the balance between the rights of copyright holders and the rights of the public to access works to support values like education and innovation.

Working as an ethnographic researcher for Ushahidi, the non-profit technology company that makes tools for people to crowdsource crisis information, has made me acutely aware of the many ways in which ‘code’ can become ‘law’. During my time at Ushahidi, I studied the practices that people were using to verify reports by people affected by a variety of events—from earthquakes to elections, from floods to bomb blasts. I then compared these processes with those followed by Wikipedians when editing articles about breaking news events. In order to understand how to best design architecture to enable particular behaviour, it becomes important to understand how such behaviour actually occurs in practice.

In addition to the impact of code on the behaviour of users, norms, the market and laws also play a role. By interviewing both the users and designers of crowdsourcing tools I soon realised that ‘human’ verification, a process of checking whether a particular report meets a group’s truth standards, is an acutely social process. It involves negotiation between different narratives of what happened and why; identifying the sources of information and assessing their reputation among groups who are considered important users of that information; and identifying gatekeeping and fact checking processes where the source is a group or institution, amongst other factors.

One disjuncture between verification ‘practice’ and the architecture of the verification code developed by Ushahidi for users was that verification categories were set as a default feature, whereas some users of the platform wanted the verification process to be invisible to external users. Items would show up as being ‘unverified’ unless they had been explicitly marked as ‘verified’, thus confusing users about whether the item was unverified because the team hadn’t yet verified it, or whether it was unverified because it had been found to be inaccurate. Some user groups wanted to be able to turn off such features when they could not take responsibility for data verification. In the case of the Christchurch Recovery Map, set up in the aftermath of the 2011 New Zealand earthquake, the government officials working with the volunteers who had set up the Ushahidi instance wanted to be able to turn off such features. They were concerned that they could not ensure that reports were indeed verified, and that having the category show up (as ‘unverified’ until ‘verified’) implied that they were engaged in some kind of verification process.
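The ambiguity described above can be made concrete with a small sketch. This is purely illustrative and is not Ushahidi's code: the names `VerificationStatus`, `binary_label` and `public_label`, and the `show_verification` switch, are hypothetical. It shows how a two-value display collapses two different situations into one label, and how a separate status plus an off switch would address the concerns raised by the user groups.

```python
from enum import Enum


class VerificationStatus(Enum):
    """Hypothetical three-way status for a crowdsourced report."""
    NOT_REVIEWED = "not yet reviewed"
    VERIFIED = "verified"
    FOUND_INACCURATE = "found inaccurate"


def binary_label(status):
    """Two-value display as described in the text: anything not
    explicitly verified is shown as 'unverified'."""
    return "verified" if status is VerificationStatus.VERIFIED else "unverified"


def public_label(status, show_verification=True):
    """Alternative display: keep the three states distinct, and let a
    deployment hide verification entirely (returns None when hidden)."""
    if not show_verification:
        return None
    return status.value


# Two different situations end up with the same public label.
ambiguous = {
    binary_label(VerificationStatus.NOT_REVIEWED),
    binary_label(VerificationStatus.FOUND_INACCURATE),
}
print(ambiguous)
```

Under these assumptions, a deployment that cannot take responsibility for verification would call `public_label(status, show_verification=False)` and display no verification category at all.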

The existence of a default verification category impacted on the Christchurch Recovery Map group’s ability to gain support from multiple stakeholders, including the government, but this feature of the platform’s architecture did not have the same effect in other places and at other times. For other users, like the original Ushahidi Kenya team who worked to collate instances of violence after the Kenyan elections in 2007/08, this detailed verification workflow was essential to counter the misinformation and rumour that dogged those events. As Ushahidi’s use cases have diversified, from reporting death and damage during natural disasters to political events including elections, civil war and revolutions, the architecture of Ushahidi’s code base has needed to expand. Ushahidi has recognised that code plays a defining role in the experience of verification practices, but also that code’s impact will not be the same at all times, and in all circumstances. This is why it invested in research about user diversity in a bid to understand the contexts in which code runs, and how these contexts result in a variety of different impacts.

A key question being asked in the design of future verification mechanisms is the extent to which verification work should be done by humans or non-humans (machines). Here, verification is not a binary categorisation, but rather there is a spectrum between human and non-human verification work, and indeed, projects like Ushahidi, Wikipedia and Galaxy Zoo have all developed different verification mechanisms. Wikipedia uses a set of policies and practices about how content should be added and reviewed, such as the use of ‘citation needed’ tags for information that sounds controversial and that should be backed up by a reliable source. Galaxy Zoo uses an algorithm to detect whether certain contributions are accurate by comparing them to the same work by other volunteers.
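As a rough illustration of what comparing a contribution "to the same work by other volunteers" can look like, here is a minimal majority-agreement sketch. It is not Galaxy Zoo's actual algorithm, which the text does not describe; the function names and the 0.8 agreement threshold are assumptions made only for this example.

```python
from collections import Counter


def consensus(labels, min_agreement=0.8):
    """Return (label, share) for the most common label, or (None, share)
    when the share of volunteers agreeing is below the threshold."""
    if not labels:
        return None, 0.0
    label, count = Counter(labels).most_common(1)[0]
    share = count / len(labels)
    return (label if share >= min_agreement else None), share


def agrees_with_consensus(contribution, labels, min_agreement=0.8):
    """True only if a consensus exists and the contribution matches it."""
    label, _ = consensus(labels, min_agreement)
    return label is not None and contribution == label


# Five hypothetical volunteers classify the same image.
votes = ["spiral", "spiral", "spiral", "spiral", "elliptical"]
print(consensus(votes))
print(agrees_with_consensus("elliptical", votes))
```

The threshold is the human decision hidden inside the machine step: set it high and more items fall back to human review, set it low and more items are accepted automatically.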

Ushahidi leaves it up to individual deployers of their tools and platform to make decisions about verification policies and practices, and is going to be designing new defaults to accommodate this variety of use. In parallel, Veri.ly, a project by Patrick Meier (formerly of Ushahidi) with the organisations Masdar and QCRI, is responding to the large amounts of unverified and often contradictory information that appears on social media following natural disasters by enabling social media users to collectively evaluate the credibility of rapidly crowdsourced evidence. The project was inspired by MIT’s winning entry to DARPA’s ‘Red Balloon Challenge’, which was intended to highlight social networking’s potential to solve widely distributed, time-sensitive problems, in this case by correctly identifying the GPS coordinates of 10 balloons suspended at fixed, undisclosed locations across the US. The winning MIT team crowdsourced the problem by using a monetary incentive structure, promising $2,000 to the first person who submitted the correct coordinates for a single balloon, $1,000 to the person who invited that person to the challenge, $500 to the person who invited the inviter, and so on. The system quickly took root, spawning geographically broad, dense branches of connections. After eight hours and 52 minutes, the MIT team identified the correct coordinates for all 10 balloons.
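The incentive structure can be sketched in a few lines. The amounts $2,000, $1,000 and $500 come from the description above; reading "and so on" as "each further inviter receives half of the previous payout" is an assumption, and the function name and the people in the example chain are hypothetical.

```python
def referral_payouts(chain, finder_reward=2000.0):
    """Payouts for one balloon.

    `chain` lists people starting with the finder and moving up through
    successive inviters. Assumes each step up the chain receives half of
    the payout of the step below it.
    """
    payouts = {}
    reward = finder_reward
    for person in chain:
        payouts[person] = payouts.get(person, 0.0) + reward
        reward /= 2
    return payouts


# Hypothetical chain: Dana found the balloon, Carol invited Dana,
# Bob invited Carol, Alice invited Bob.
payouts = referral_payouts(["dana", "carol", "bob", "alice"])
total = sum(payouts.values())
print(payouts, total)
```

Under the halving assumption the total paid out per balloon is a geometric series, so it stays below twice the finder's reward ($4,000) however long the chain grows. That bound is what lets a team promise rewards to an open-ended chain of recruiters without an open-ended budget.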

Veri.ly aims to apply MIT’s approach to the process of rapidly collecting and evaluating critical evidence during disasters: “Instead of looking for weather balloons across an entire country in less than 9 hours, we hope Veri.ly will facilitate the crowdsourced collection of multimedia evidence for individual disasters in under 9 minutes.” It is still unclear how (or whether) Veri.ly will be able to reproduce the same incentive structure, but a bigger question lies around the scale and spread of social media in the majority of countries where humanitarian assistance is needed. The majority of Ushahidi or Crowdmap installations are, for example, still “small data” projects, with many focused on areas that still require offline verification procedures (such as calling volunteers or paid staff who are stationed across a country, as was the case in Sudan [3]). In these cases, where the social media presence may be insignificant, a team’s ability to achieve a strong local presence will define the quality of verification practices, and consequently the level of trust accorded to their project.

If code is law and if other aspects in addition to code determine how we can act in the world, it is important to understand the context in which code is deployed. Verification is a practice that determines how we can trust information coming from a variety of sources. Only by illuminating such practices and the variety of impacts that code can have in different environments can we begin to understand how code regulates our actions in crowdsourcing environments.

For more on Ushahidi verification practices and the management of sources on Wikipedia during breaking news events, see:

[1] Ford, H. (2012) Wikipedia Sources: Managing Sources in Rapidly Evolving Global News Articles on the English Wikipedia. SSRN Electronic Journal. doi:10.2139/ssrn.2127204

[2] Ford, H. (2012) Crowd Wisdom. Index on Censorship 41(4), 33–39. doi:10.1177/0306422012465800

[3] Ford, H. (2011) Verifying information from the crowd. Ushahidi.

Heather Ford has worked as a researcher, activist, journalist, educator and strategist in the fields of online collaboration, intellectual property reform, information privacy and open source software in South Africa, the United Kingdom and the United States. She is currently a DPhil student at the OII, where she is studying how Wikipedia editors write history as it happens in a format that is unprecedented in the history of encyclopedias. Before this, she worked as an ethnographer for Ushahidi. Read Heather’s blog.

For more on the Christchurch Earthquake, and the role of digital humanities in preserving the digital record of its impact, see: Preserving the digital record of major natural disasters: the CEISMIC Canterbury Earthquakes Digital Archive project on this blog.