How policy makers can extract meaningful public opinion data from social media to inform their actions

Social media analysis can provide insight into the mobilization processes of stakeholders in response to government actions. Image of No-TAV protestors by Darren Johnson (Flickr: CC BY-NC-ND 2.0).

The role of social media in fostering the transparency of governments and strengthening the interaction between citizens and public administrations has been widely studied. Scholars have highlighted how online citizen-government and citizen-citizen interactions favour debates on social and political matters, and positively affect citizens’ interest in political processes, like elections, policy agenda setting, and policy implementation.

However, while top-down social media communication between public administrations and citizens has been widely examined, the bottom-up side of this interaction has been largely overlooked. In their Policy & Internet article “The ‘Social Side’ of Public Policy: Monitoring Online Public Opinion and Its Mobilisation During the Policy Cycle,” Andrea Ceron and Fedra Negri aim to bridge the gap between knowledge and practice, by examining how the information available on social media can support the actions of politicians and bureaucrats along the policy cycle.

Policymakers, particularly politicians, have always been interested in knowing citizens’ preferences, in measuring their satisfaction and in receiving feedback on their activities. Using the technique of Supervised Aggregated Sentiment Analysis, the authors show that meaningful information on public services, programmes, and policies can be extracted from the unsolicited comments posted by social media users, particularly those posted on Twitter. They use this technique to extract and analyse citizen opinion on two major public policies (on labour market reform and school reform) that drove the agenda of the Matteo Renzi cabinet in Italy between 2014 and 2015.

They show how online public opinion reacted to the different policy alternatives formulated and discussed during the adoption of the policies. They also demonstrate how social media analysis allows monitoring of the mobilisation and de-mobilisation processes of rival stakeholders in response to the various amendments adopted by the government, with results comparable to those of a survey and a public consultation that were undertaken by the government.

We caught up with the authors to discuss their findings:

Ed.: You say that this form of opinion monitoring and analysis is cheaper, faster and easier than (for example) representative surveys. That said, how commonly do governments harness this new form of opinion-monitoring (with the requirement for new data skills, as well as attitudes)? Do they recognise the value of it?

Andrea / Fedra: Governments are starting to pay attention to the world of social media. Just to give an idea, the Italian government has issued a call to jointly collect survey data together with the results of social media analysis and these two types of data are provided in a common report. The report has not been publicly shared, suggesting that the cabinet considers such information highly valuable. VOICES from the blogs, a spin-off created by Stefano Iacus, Luigi Curini and Andrea Ceron (University of Milan), has been involved in this and, for sure, we can attest that in a couple of instances the government modified its actions in line with shifts in public opinion observed both through survey polls and sentiment analysis. This happened with the law on Civil Unions and with the abolition of the “voucher” (a flexible form of worker payment). So far these are just instances, although there are signs of enhanced responsiveness, particularly when online public opinion represents the core constituency of ruling parties, as the case of the school reform (discussed in the article) clearly indicates: teachers are in fact the core constituency of the Democratic Party.

Ed.: You mention that the natural language used by social media users evolves continuously and is sensitive to the discussed topic: resulting in error. The method you use involves scaling up of a human-coded (=accurate) ontology. Could you discuss how this might work in practice? Presumably humans would need to code the terms of interest first, as it wouldn’t be able to pick up new issues (e.g. around a completely new word: say, “Bowling Green”?) automatically.

Andrea / Fedra: Gary King says that the best technology is human empowered. There are at least two great advantages in exploiting human coders. First, with our technique coders manage to get rid of noise better than any algorithm, as often a single word can be judged to be in-topic or out of topic based on the context and on the rest of the sentence. Second, human coders can collect deeper information by mining the real opinions expressed in the online conversations. This sometimes allows them to detect, bottom-up, some arguments that were completely ignored ex-ante by scholars or analysts.
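To make the idea of scaling up human coding more concrete, here is a minimal Python sketch of one way an aggregate estimate can be built from a small hand-coded sample. It is a toy illustration only and is not the authors' implementation: the function name `estimate_share`, the single marker word, and all the example posts are invented for this sketch. It assumes just two opinion categories and estimates the overall share of positive posts directly, without classifying any individual post.

```python
def estimate_share(coded, uncoded, marker):
    """Estimate the share of 'pos' posts in `uncoded` at the aggregate level.

    coded   -- list of (text, label) pairs labelled "pos" or "neg" by human coders
    uncoded -- list of texts that nobody has coded
    marker  -- a single word used as the only feature (a toy simplification)
    """
    def rate(texts):
        # Fraction of texts that contain the marker word.
        texts = list(texts)
        return sum(marker in t.lower().split() for t in texts) / len(texts)

    p_marker_pos = rate(t for t, label in coded if label == "pos")
    p_marker_neg = rate(t for t, label in coded if label == "neg")
    p_marker_all = rate(uncoded)

    if p_marker_pos == p_marker_neg:
        raise ValueError("marker does not separate the two categories")

    # Solve P(marker) = P(marker|pos) * share + P(marker|neg) * (1 - share).
    share = (p_marker_all - p_marker_neg) / (p_marker_pos - p_marker_neg)
    return min(1.0, max(0.0, share))  # keep the estimate inside [0, 1]


# Invented hand-coded sample: "support" appears in 3 of 4 positive posts
# and in 1 of 4 negative posts.
coded = [
    ("i support the reform", "pos"),
    ("we support this", "pos"),
    ("full support", "pos"),
    ("good reform", "pos"),
    ("stop the reform", "neg"),
    ("no support from teachers", "neg"),
    ("bad idea", "neg"),
    ("strike now", "neg"),
]

# Invented uncoded stream: 6 of 10 posts contain "support".
uncoded = ["support it"] * 6 + ["against it"] * 4

share = estimate_share(coded, uncoded, "support")
print(f"estimated positive share: {share:.2f}")
```

Note that the negative post “no support from teachers” contains the marker word: exactly the kind of context-dependent case a human coder resolves correctly. Because the estimate is made at the aggregate level, such cases are absorbed into the coded rates rather than producing individual misclassifications. A realistic system would use many features and more than two categories.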

Ed.: There has been a lot of debate in the UK around “false balance”, e.g. the BBC giving equal coverage to climate deniers (despite being a tiny, unrepresentative, and uninformed minority), in an attempt at “impartiality”: how do you get round issues of non-representativeness in social media, when tracking—and more importantly, acting on—opinion?

Andrea / Fedra: Nowadays social media are a non-representative sample of a country’s population. However, the idea of representativeness linked to the concept of “public opinion” dates back to the early days of polling. Today, by contrast, online conversations often represent an “activated public opinion” comprising stakeholders who express their voices in an attempt to build wider support around their views. In this regard, social media data are interesting precisely due to their non-representativeness. A tiny group can speak loudly and this voice can gain the support of an increasing number of people. If the activated public opinion acts as an “influencer”, this implies that social media analysis could anticipate trends and shifts in public opinion.

Ed.: As data becomes increasingly open and tractable (controlled by people like Google, Facebook, or monitored by e.g. GCHQ / NSA), and text-techniques become increasingly sophisticated: what is the extreme logical conclusion in terms of government being able to track opinion, say in 50 years, following the current trajectory? Or will the natural messiness of humans and language act as a natural upper limit on what is possible?

Andrea / Fedra: The purpose of scientific research, particularly applied research, is to improve our well-being and to make our life easier. For sure there could be issues linked with the privacy of our data and, in a sci-fi scenario, government and police will be able to read our minds, either to prevent crimes and terrorist attacks (as in the Minority Report movie) or to detect, isolate and punish dissent. However, technology is not a standalone object and we should not forget that there are humans behind it. Whether these humans are governments, activists or common citizens can certainly make a difference. If governments try to misuse technology, they will certainly meet a reaction from citizens, which can be amplified precisely via this new technology.

Read the full article: Ceron, A. and Negri, F. (2016) The “Social Side” of Public Policy: Monitoring Online Public Opinion and Its Mobilisation During the Policy Cycle. Policy & Internet 8 (2) DOI:10.1002/poi3.117

Andrea Ceron and Fedra Negri were talking to blog editor David Sutcliffe.

Crowdsourcing for public policy and government

If elections were invented today, they would probably be referred to as “crowdsourcing the government.” First coined in a 2006 issue of Wired magazine (Howe, 2006), the term crowdsourcing has come to be applied loosely to a wide variety of situations where ideas, opinions, labor or something else is “sourced” in from a potentially large group of people. Whilst most commonly applied in business contexts, there is an increasing amount of buzz around applying crowdsourcing techniques in government and policy contexts as well (Brabham, 2013).

Though there is nothing qualitatively new about involving more people in government and policy processes, digital technologies in principle make it possible to increase the quantity of such involvement dramatically, by lowering the costs of participation (Margetts et al., 2015) and making it possible to tap into people’s free time (Shirky, 2010). This difference in quantity is arguably great enough to obtain a quality of its own. We can thus be justified in using the term “crowdsourcing for public policy and government” to refer to new digitally enabled ways of involving people in any aspect of democratic politics and government, not replacing but rather augmenting more traditional participation routes such as elections and referendums.

In this editorial, we will briefly highlight some of the key emerging issues in research on crowdsourcing for public policy and government. Our entry point into the discussion is a collection of research papers first presented at the Internet, Politics & Policy 2014 (IPP2014) conference organised by the Oxford Internet Institute (University of Oxford) and the Policy & Internet journal. The theme of this very successful conference—our third since the founding of the journal—was “crowdsourcing for politics and policy.” Out of almost 80 papers presented at the conference in September last year, 14 of the best have now been published as peer-reviewed articles in this journal, including five in this issue. A further handful of papers from the conference focusing on labor issues will be published in the next issue, but we can already now take stock of all the articles focusing on government, politics, and policy.

The growing interest in crowdsourcing for government and public policy must be understood in the context of the contemporary malaise of politics, which is being felt across the democratic world, but most of all in Europe. The problems with democracy have a long history, from the declining powers of parliamentary bodies when compared to the executive, to declining turnouts in elections, declining participation in mass parties, and declining trust in democratic institutions and politicians. But these problems have gained a new salience in the last five years, as the ongoing financial crisis has contributed to the rise of a range of new populist forces all across Europe, and to a fragmentation of the centre ground. Furthermore, the poor accuracy of pre-election polls in recent elections in Israel and the UK has generated considerable debate over the usefulness and accuracy of the traditional way of knowing what the public is thinking: the sample survey.

Many place hopes on technological and institutional innovations such as crowdsourcing to show a way out of the brewing crisis of democratic politics and political science. One of the key attractions of crowdsourcing techniques to governments and grass roots movements alike is the legitimacy such techniques are expected to be able to generate. For example, crowdsourcing techniques have been applied to enable citizens to verify the legality and correctness of government decisions and outcomes. A well-known application is to ask citizens to audit large volumes of data on government spending, to uncover any malfeasance but also to increase citizens’ trust in the government (Maguire, 2011).

Articles emerging from the IPP2014 conference analyze other interesting and comparable applications. In an article titled “Population as Auditor of an Election Process in Honduras: The Case of the VotoSocial Crowdsourcing Platform,” Carlos Arias, Jorge Garcia and Alejandro Corpeño (2015) describe the use of crowdsourcing for auditing election results. Dieter Zinnbauer (2015) discusses the potentials and pitfalls of the use of crowdsourcing for some other types of auditing purposes, in “Crowdsourced Corruption Reporting: What Petrified Forests, Street Music, Bath Towels, and the Taxman Can Tell Us About the Prospects for Its Future.”

Besides allowing citizens to verify the outcome of a process, crowdsourcing can also be used to lend an air of inclusiveness and transparency to a process itself. This process legitimacy can then indirectly legitimate the outcome of the process as well. For example, crowdsourcing-style open processes have been used to collect policy ideas, gather support for difficult policy decisions, and even generate detailed spending plans through participatory budgeting (Wampler & Avritzer, 2004). Articles emerging from our conference further advance this line of research. Roxana Radu, Nicolo Zingales and Enrico Calandro (2015) examine the use of crowdsourcing to lend process legitimacy to Internet governance, in an article titled “Crowdsourcing Ideas as an Emerging Form of Multistakeholder Participation in Internet Governance.” Graham Smith, Robert C. Richards Jr. and John Gastil (2015) write about “The Potential of Participedia as a Crowdsourcing Tool for Comparative Analysis of Democratic Innovations.”

An interesting cautionary tale is presented by Henrik Serup Christensen, Maija Karjalainen and Laura Nurminen (2015) in “Does Crowdsourcing Legislation Increase Political Legitimacy? The Case of Avoin Ministeriö in Finland.” They show how a citizen initiative process ended up decreasing government legitimacy, after the government failed to implement the outcome of an initiative process that was perceived as highly legitimate by its supporters. Taneli Heikka (2015) further examines the implications of citizen initiative processes for the state–citizen relationship in “The Rise of the Mediating Citizen: Time, Space and Citizenship in the Crowdsourcing of Finnish Legislation.”

In many of the contributions that touch on the legitimating effects of crowdsourcing, one can sense a third, latent theme. Besides allowing outcomes to be audited and processes to be potentially more inclusive, crowdsourcing can also increase the perceived legitimacy of a government or policy process by lending an air of innovation and technological progress to the endeavour and those involved in it. This is most explicitly stated by Simo Hosio, Jorge Goncalves, Vassilis Kostakos and Jukka Riekki (2015) in “Crowdsourcing Public Opinion Using Urban Pervasive Technologies: Lessons From Real-Life Experiments in Oulu.” They describe how local government officials collaborating with the research team to test a new polling system based on public screens “expressed that the PR value boosted their public perception as a modern organization.” That some government crowdsourcing initiatives are at least in part motivated by such “crowdwashing” is hardly surprising, but it encourages us to retain a critical approach and analyse actual outcomes instead of accepting dominant discourses about the nature and effects of crowdsourcing at face value.

For instance, we must continue to examine the actual size, composition, internal structures and motivations of the supposed “crowds” that make use of online platforms. Articles emerging from our conference that contributed towards this aim include “Event Prediction With Learning Algorithms—A Study of Events Surrounding the Egyptian Revolution of 2011 on the Basis of Micro Blog Data” by Benedikt Boecking, Margeret Hall and Jeff Schneider (2015) and “Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making” by Pete Burnap and Matthew L. Williams (2015). Anatoliy Gruzd and Ksenia Tsyganova won a best paper award at the IPP2014 conference for an article published in this journal as “Information Wars and Online Activism During the 2013/2014 Crisis in Ukraine: Examining the Social Structures of Pro- and Anti-Maidan Groups.” These articles can be used to challenge the notion that crowdsourcing contributors are simply sets of independent individuals who are neatly representative of a larger population, and instead highlight the clusters, networks, and power structures inherent within them. This has implications for the democratic legitimacy of some of the more naive crowdsourcing initiatives.

One of the most original articles to emerge out of IPP2014 turns the concept of crowdsourcing for public policy and government on its head. While most research has focused on crowdsourcing’s empowering effects (or lack thereof), Gregory Asmolov (2015) analyses crowdsourcing as a form of social control. In an article titled “Vertical Crowdsourcing in Russia: Balancing Governance of Crowds and State–Citizen Partnership in Emergency Situations,” Asmolov draws on empirical evidence and theorists such as Foucault to show how crowdsourcing platforms can be used to institutionalise volunteer resources in order to align them with state objectives and prevent independent collective action. An article by Jorge Goncalves, Yong Liu, Bin Xiao, Saad Chaudhry, Simo Hosio and Vassilis Kostakos (2015) provides a less nefarious example of strategic use of online platforms to further government objectives, under the title “Increasing the Reach of Government Social Media: A Case Study in Modeling Government–Citizen Interaction on Facebook.”

Articles emerging from the conference also include two review articles that provide useful overviews of the field from different perspectives. “A Systematic Review of Online Deliberation Research” by Dennis Friess and Christiane Eilders (2015) takes stock of the use of digital technologies as public spheres. “The Fundamentals of Policy Crowdsourcing” by John Prpić, Araz Taeihagh and James Melton (2015) situates a broad variety of crowdsourcing literature into the context of a public policy cycle framework.

It has been extremely satisfying to follow the progress of these papers from initial conference submissions to high-quality journal articles, and to see that the final product not only advances the state of the art, but also provides certain new and critical perspectives on crowdsourcing. These perspectives will no doubt provoke responses, and Policy & Internet continues to welcome high-quality submissions dealing with crowdsourcing for public policy, government, and beyond.

Read the full editorial: Vili Lehdonvirta and Jonathan Bright (2015) Crowdsourcing for Public Policy and Government. Editorial. Volume 7, Issue 3, pages 263–267.


Arias, C.R., Garcia, J. and Corpeño, A. (2015) Population as Auditor of an Election Process in Honduras: The Case of the VotoSocial Crowdsourcing Platform. Policy & Internet 7 (2) 185–202.

Asmolov, G. (2015) Vertical Crowdsourcing in Russia: Balancing Governance of Crowds and State–Citizen Partnership in Emergency Situations. Policy & Internet 7 (3).

Brabham, D. C. (2013). Citizen E-Participation in Urban Governance: Crowdsourcing and Collaborative Creativity. IGI Global.

Boecking, B., Hall, M. and Schneider, J. (2015) Event Prediction With Learning Algorithms—A Study of Events Surrounding the Egyptian Revolution of 2011 on the Basis of Micro Blog Data. Policy & Internet 7 (2) 159–184.

Burnap, P. and Williams, M.L. (2015) Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making. Policy & Internet 7 (2) 223–242.

Christensen, H.S., Karjalainen, M. and Nurminen, L. (2015) Does Crowdsourcing Legislation Increase Political Legitimacy? The Case of Avoin Ministeriö in Finland. Policy & Internet 7 (1) 25–45.

Friess, D. and Eilders, C. (2015) A Systematic Review of Online Deliberation Research. Policy & Internet 7 (3).

Goncalves, J., Liu, Y., Xiao, B., Chaudhry, S., Hosio, S. and Kostakos, V. (2015) Increasing the Reach of Government Social Media: A Case Study in Modeling Government–Citizen Interaction on Facebook. Policy & Internet 7 (1) 80–102.

Gruzd, A. and Tsyganova, K. (2015) Information Wars and Online Activism During the 2013/2014 Crisis in Ukraine: Examining the Social Structures of Pro- and Anti-Maidan Groups. Policy & Internet 7 (2) 121–158.

Heikka, T. (2015) The Rise of the Mediating Citizen: Time, Space and Citizenship in the Crowdsourcing of Finnish Legislation. Policy & Internet 7 (3).

Hosio, S., Goncalves, J., Kostakos, V. and Riekki, J. (2015) Crowdsourcing Public Opinion Using Urban Pervasive Technologies: Lessons From Real-Life Experiments in Oulu. Policy & Internet 7 (2) 203–222.

Howe, J. (2006). The Rise of Crowdsourcing. Wired.

Maguire, S. (2011). Can Data Deliver Better Government? Political Quarterly, 82(4), 522–525.

Margetts, H., John, P., Hale, S., & Yasseri, T. (2015). Political Turbulence: How Social Media Shape Collective Action. Princeton University Press.

Prpić, J., Taeihagh, A. and Melton, J. (2015) The Fundamentals of Policy Crowdsourcing. Policy & Internet 7 (3).

Radu, R., Zingales, N. and Calandro, E. (2015) Crowdsourcing Ideas as an Emerging Form of Multistakeholder Participation in Internet Governance. Policy & Internet 7 (3).

Shirky, C. (2010). Cognitive Surplus: How Technology Makes Consumers into Collaborators. Penguin Publishing Group.

Smith, G., Richards, R.C. Jr. and Gastil, J. (2015) The Potential of Participedia as a Crowdsourcing Tool for Comparative Analysis of Democratic Innovations. Policy & Internet 7 (2) 243–262.

Wampler, B., & Avritzer, L. (2004). Participatory publics: civil society and new institutions in democratic Brazil. Comparative Politics, 36(3), 291–312.

Zinnbauer, D. (2015) Crowdsourced Corruption Reporting: What Petrified Forests, Street Music, Bath Towels, and the Taxman Can Tell Us About the Prospects for Its Future. Policy & Internet 7 (1) 1–24.

Evidence on the extent of harms experienced by children as a result of online risks: implications for policy and research

The range of academic literature analysing the risks and opportunities of Internet use for children has grown substantially in the past decade, but there’s still surprisingly little empirical evidence on how perceived risks translate into actual harms. Image by Brad Flickinger

Child Internet safety is a topic that continues to gain a great deal of media coverage and policy attention. Recent UK policy initiatives such as Active Choice Plus, in which major UK broadband providers agreed to provide household-level filtering options, or the industry-led Internet Matters portal, reflect a public concern with the potential risks and harms of children’s Internet use. At the same time, the range of academic literature analysing the risks and opportunities of Internet use for children has grown substantially in the past decade, in large part due to the extensive international studies funded by the European Commission as part of the excellent EU Kids Online network. Whilst this has greatly helped us understand how children behave online, there’s still surprisingly little empirical evidence on how perceived risks translate into actual harms. This is problematic, first, because risks can only be identified if we understand what types of harms we wish to avoid, and second, because if we only undertake research on the nature or extent of risk, then it’s difficult to learn anything useful about who is harmed, and what this means for their lives.

Of course, the focus on risk rather than harm is understandable from an ethical and methodological perspective. It wouldn’t be ethical, for example, to conduct a trial in which one group of children was deliberately exposed to very violent or sexual content to observe whether any harms resulted. Similarly, surveys can ask respondents to self-report harms experienced online, perhaps through the lens of upsetting images or experiences. But again, there are ethical concerns about adding to children’s distress by questioning them extensively on difficult experiences, and in a survey context it’s also difficult to avoid imposing adult conceptions of ‘harm’ through the wording of the questions.

Despite these difficulties, there are many research projects that aim to measure and understand the relationship between various types of physical, emotional or psychological harm and activities online, albeit often outside the social sciences. With support from the OUP Fell Fund, I worked with colleagues Vera Slavtcheva-Petkova and Monica Bulger to review the extent of evidence available across these other disciplines. Looking at journal articles published between 1997 and 2012, we aimed to identify any empirical evidence detailing Internet-related harms experienced by children and adolescents and to gain a sense of the types of harm recorded, their severity and frequency.

Our findings demonstrate that there are many good studies out there which do address questions of harm, rather than just risk. The narrowly drawn search found 148 empirical studies which either clearly delineated evidence of very specific harms, or offered some evidence of less well-defined harms. Further, these studies offer rich insights into three broad types of harm: health-related (including harms relating to the exacerbation of eating disorders, self-harming behaviour and suicide attempts); sex-related (largely focused on studies of online solicitation and child abuse); and bullying-related (including the effects on mental health and behaviour). Such a range of coverage would come as no surprise to most researchers focusing on children’s Internet use: these are generally well-documented areas, albeit with the focus more normally on risk rather than harm. Perhaps more surprising was the absence in our search of evidence of harm in relation to privacy violations or economic well-being, both of which are increasingly discussed as significant concerns or risks for minors using the Internet. This gap might have been an artefact of our search terms, of course, but given the policy relevance of both issues, more empirical study of not just risk but actual harm would seem to be merited in these areas.

Another important gap concerned the absence of studies demonstrating that severe harms often befall those without prior evidence of vulnerability or risky behaviour. For example, in relation to websites promoting self-harm or eating disorders, there is little evidence that young people previously unaffected by self-harm or eating disorders are influenced by these websites. This isn’t unexpected, as other researchers have shown that harm more often befalls those who display riskier behaviour, but it is important to bear in mind when devising treatment or policy strategies for reducing such harms.

It’s also worth noting how difficult it is to determine the prevalence of harms. The best-documented cases are often those where medical, police or court records provide great depth of qualitative detail about individual suffering in cases of online grooming and abuse, eating disorders or self-harm. Yet these cases provide little insight into prevalence. And whilst survey research offers more sense of scale, we found substantial disparities in the levels of harm reported on some issues, with the prevalence of cyber-bullying, for example, varying from 9% to 72% across studies with similar age groups of children. It’s also clear that we quite simply need much more research and policy attention on certain issues. The studies relating to the online grooming of children and production of abuse images are an excellent example of how a broad research base can make an important contribution to our understanding of online risks and harms. Here, journal articles offered a remarkably rich understanding, drawing on data from police reports, court records or clinical files as well as surveys and interviews with victims, perpetrators and carers. There would be real benefits to taking a similarly thorough approach to the study of users of pro-eating disorder, self-harm and pro-suicide websites.

Our review flagged up some important lessons for policy-makers. First, whilst we (justifiably) devote a wealth of resources to the small proportion of children experiencing severe harms as a result of online experiences, the number of those experiencing more minor harms such as those caused by online bullying is likely much higher and may thus deserve more attention than currently received. Second, the diversity of topics discussed and types of harm identified seems to suggest that a one-size-fits-all solution will not work when it comes to online protection of minors. Simply banning or filtering all potentially harmful websites, pages or groups might be more damaging than useful if it drives users to less public means of communicating. Further, whilst some content such as child sexual abuse images is clearly illegal and generates great harms, other content and sites are less easy to condemn when the balance between perpetuating harmful behaviour and providing valued peer support is hard to call. It should also be remembered that the need to protect young people from online harms must always be balanced against the need to protect their rights (and opportunities) to freely express themselves and seek information online.

Finally, this study makes an important contribution to public debates about child online safety by reminding us that risk and harm are not equivalent and should not be conflated. More children and young people are exposed to online risks than are actually harmed as a result and our policy responses should reflect this. In this context, the need to protect minors from online harms must always be balanced against their rights and opportunities to freely express themselves and seek information online.

A more detailed account of our findings can be found in this Information, Communication and Society journal article: Evidence on the extent of harms experienced by children as a result of online risks: implications for policy and research. If you can’t access this, please e-mail me for a copy.

Victoria Nash is a Policy and Research Fellow at the Oxford Internet Institute (OII), responsible for connecting OII research with policy and practice. Her own particular research interests draw on her background as a political theorist, and concern the theoretical and practical application of fundamental liberal values in the Internet era. Recent projects have included efforts to map the legal and regulatory trends shaping freedom of expression online for UNESCO, analysis of age verification as a tool to protect and empower children online, and the role of information and Internet access in the development of moral autonomy.

Past and Emerging Themes in Policy and Internet Studies

We can’t understand, analyze or make public policy without understanding the technological, social and economic shifts associated with the Internet. Image from the (post-PRISM) “Stop Watching Us” Berlin Demonstration (2013) by mw238.

In the journal’s inaugural issue, founding Editor-in-Chief Helen Margetts outlined what are essentially two central premises behind Policy & Internet’s launch. The first is that “we cannot understand, analyse or make public policy without understanding the technological, social and economic shifts associated with the Internet” (Margetts 2009, 1). It is simply not possible to consider public policy today without some regard for the intertwining of information technologies with everyday life and society. The second premise is that the rise of the Internet is associated with shifts in how policy itself is made. In particular, she proposed that impacts of Internet adoption would be felt in the tools through which policies are effected, and the values that policy processes embody.

The purpose of the Policy & Internet journal was to take up these two challenges: the public policy implications of Internet-related social change, and Internet-related changes in policy processes themselves. In recognition of the inherently multi-disciplinary nature of policy research, the journal is designed to act as a meeting place for all kinds of disciplinary and methodological approaches. Helen predicted that methodological approaches based on large-scale transactional data, network analysis, and experimentation would turn out to be particularly important for policy and Internet studies. Driving the advancement of these methods was therefore the journal’s third purpose. Today, the journal has reached a significant milestone: over one hundred high-quality peer-reviewed articles published. This seems an opportune moment to take stock of what kind of research we have published in practice, and see how it stacks up against the original vision.

At the most general level, the journal’s articles fall into three broad categories: the Internet and public policy (48 articles), the Internet and policy processes (51 articles), and discussion of novel methodologies (10 articles). The first of these categories, “the Internet and public policy,” can be further broken down into a number of subcategories. One of the most prominent of these streams is fundamental rights in a mediated society (11 articles), which focuses particularly on privacy and freedom of expression. Related streams are children and child protection (six articles), copyright and piracy (five articles), and general e-commerce regulation (six articles), including taxation. A recently emerged stream in the journal is hate speech and cybersecurity (four articles). Of course, an enduring research stream is Internet governance, or the regulation of technical infrastructures and economic institutions that constitute the material basis of the Internet (seven articles). In recent years, the research agenda in this stream has been influenced by national policy debates around broadband market competition and network neutrality (Hahn and Singer 2013). Another enduring stream deals with the Internet and public health (eight articles).

Looking specifically at the “Internet and policy processes” category, the largest stream is e-participation, or the role of the Internet in engaging citizens in national and local government policy processes, through methods such as online deliberation, petition platforms, and voting advice applications (18 articles). Two other streams are e-government, or the use of Internet technologies for government service provision (seven articles), and e-politics, or the use of the Internet in mainstream politics, such as election campaigning and communications of the political elite (nine articles). Another stream that has gained pace in recent years is online collective action, or the role of the Internet in activism, ‘clicktivism,’ and protest campaigns (16 articles). Last year the journal published a special issue on online collective action (Calderaro and Kavada 2013), and the next forthcoming issue includes an invited article on digital civics by Ethan Zuckerman, director of MIT’s Center for Civic Media, with commentary from prominent scholars of Internet activism. A trajectory discernible in this stream over the years is a movement from discussing mere potentials towards analyzing real impacts, including critical analyses of the sometimes inflated expectations and “democracy bubbles” created by digital media (Shulman 2009; Karpf 2012; Bryer 2011).

The final category, discussion of novel methodologies, consists of articles that develop, analyse, and reflect critically on methodological innovations in policy and Internet studies. Empirical articles published in the journal have made use of a wide range of conventional and novel research methods, from interviews and surveys to automated content analysis and advanced network analysis methods. But of those articles where methodology is the topic rather than merely the tool, the majority deal with so-called “big data,” or the use of large-scale transactional data sources in research, commerce, and evidence-based public policy (nine articles). The journal recently devoted a special issue to the potentials and pitfalls of big data for public policy (Margetts and Sutcliffe 2013), based on selected contributions to the journal’s 2012 big data conference: Big Data, Big Challenges? In general, the notion of data science and public policy is a growing research theme.

This brief analysis suggests that research published in the journal over the last five years has indeed followed the broad contours of the original vision. The two challenges, namely policy implications of Internet-related social change and Internet-related changes in policy processes, have both been addressed. In particular, research has addressed the implications of the Internet’s increasing role in social and political life. The journal has also furthered the development of new methodologies, especially the use of online network analysis techniques and large-scale transactional data sources (aka ‘big data’).

As expected, authors from a wide range of disciplines have contributed their perspectives to the journal, and engaged with other disciplines, while retaining the rigour of their own specialisms. The geographic scope of the contributions has been truly global, with authors and research contexts from six continents. I am also pleased to note that a characteristic common to all the published articles is polish; this is no doubt in part due to the high level of editorial support that the journal is able to afford to authors, including copyediting. The justifications for the journal’s establishment five years ago have clearly been borne out, so that the journal now performs an important function in fostering and bringing together research on the public policy implications of an increasingly Internet-mediated society.

And what of my own research interests as an editor? In the inaugural editorial, Helen Margetts highlighted work, finance, exchange, and economic themes in general as being among the prominent areas of Internet-related social change that are likely to have significant future policy implications. I think for the most part, these implications remain to be addressed, and this is an area that the journal can encourage authors to tackle better. As an editor, I will work to direct attention to this opportunity, and welcome manuscript submissions on all aspects of Internet-enabled economic change and its policy implications. This work will be kickstarted by the journal’s 2014 conference (26-27 September), which this year focuses on crowdsourcing and online labor.

Our published articles will continue to be highlighted here in the journal’s blog. We believe this blog, launched last year, will help to expand the reach and impact of research published in Policy and Internet to the wider academic and practitioner communities, promote discussion, and increase authors’ citations. After all, publication is only the start of an article’s public life: we want people reading, debating, citing, and offering responses to the research that we, and our excellent reviewers, feel is important, and worth publishing.

Read the full editorial: Lehdonvirta, V. (2014) Past and Emerging Themes in Policy and Internet Studies. Policy & Internet 6 (2) 109-114.


Bryer, T.A. (2011) Online Public Engagement in the Obama Administration: Building a Democracy Bubble? Policy & Internet 3 (4).

Calderaro, A. and Kavada, A. (2013) Challenges and Opportunities of Online Collective Action for Policy Change. Policy & Internet 5 (1).

Hahn, R. and Singer, H. (2013) Is the U.S. Government’s Internet Policy Broken? Policy & Internet 5 (3) 340-363.

Karpf, D. (2012) Online Political Mobilisation from the Advocacy Group’s Perspective: Looking Beyond Clicktivism. Policy & Internet 2 (4) 7-41.

Margetts, H. (2009) The Internet and Public Policy. Policy & Internet 1 (1).

Margetts, H. and Sutcliffe, D. (2013) Addressing the Policy Challenges and Opportunities of ‘Big Data.’ Policy & Internet 5 (2) 139-146.

Shulman, S.W. (2009) The Case Against Mass E-mails: Perverse Incentives and Low Quality Public Participation in U.S. Federal Rulemaking. Policy & Internet 1 (1) 23-53.

Five recommendations for maximising the relevance of social science research for public policy-making in the big data era

As I discussed in a previous post on the promises and threats of big data for public policy-making, public policy making has entered a period of dramatic change. Widespread use of digital technologies, the Internet and social media means citizens and governments leave digital traces that can be harvested to generate big data. This increasingly rich data environment poses both promises and threats to policy-makers.

So how can social scientists help policy-makers in this changed environment, ensuring that social science research remains relevant? Social scientists have a good record of policy influence, indeed in the UK a better one than other academic fields, including medicine, as recent research from the LSE Public Policy group has shown. Big data holds major promise for social science, which should enable us to further extend our record in policy research. We have access to a cornucopia of data of a kind which is more like that traditionally associated with so-called ‘hard’ science. Rather than leaving us dependent on surveys, the traditional data staple of empirical social science, social media such as Wikipedia, Twitter, Facebook, and Google Search present us with the opportunity to scrape, generate, analyse and archive comparative data of unprecedented quantity. For example, at the OII over the last four years we have been generating a dataset of all petition signing in the UK and US, which contains the joining rate (updated every hour) for the 30,000 petitions created in the last three years. As a political scientist, I am very excited by this kind of data (up to now, we have had big data like this only for voting, and that only at election time), which will allow us to create a complete ecology of petition signing, one of the more popular acts of political participation in the UK. Likewise, we can look at the entire transaction history of online organisations like Wikipedia, or map the link structure of government’s online presence.

But big data holds threats for social scientists too. The technological challenge is ever present. To generate their own big data, researchers and students must learn to code, and for some that is an alien skill. At the OII we run a course on Digital Social Research that all our postgraduate students can take; but not all social science departments could either provide such a course, or persuade their postgraduate students that they needed it. Ours, who study the social science of the Internet, are obviously predisposed to do so. And big data analysis requires multi-disciplinary expertise. Our research team working on petitions data includes a computer scientist (Scott Hale), a physicist (Taha Yasseri) and a political scientist (myself). I can’t imagine doing this sort of research without such technical expertise, and as a multi-disciplinary department we are (reasonably) free to recruit this type of research faculty. But not all social science departments can promise a research career for computer scientists, or physicists, or any of the other disciplinary specialists that might be needed to tackle big data problems.

Five Recommendations for Social Scientists

So, how can social scientists overcome these challenges, and thereby be in a good position to help policy-makers tackle their own barriers to making the most of the possibilities afforded by big data? Here are five recommendations:

Accept that multi-disciplinary research teams are going to become the norm for social science research, extending beyond social science disciplines into the life sciences, mathematics, physics, and engineering. At Policy and Internet’s 2012 Big Data conference, the keynote speaker Duncan Watts (physicist turned sociologist) called for a ‘dating agency’ for engineers and social scientists—with the former providing the technological expertise, and the latter identifying the important research questions. We need to make sure that forums exist where social scientists and technologists meet and discuss big data research at the earliest stages, so that research projects and programmes incorporate the core competencies of both.

We need to provide the normative and ethical basis for policy decisions in the big data era. That means bringing normative political theorists and philosophers of information into our research teams. The government has committed £65 million to big data research funding, but it seems likely that any successful research proposals will have a strong ethics component embedded in the research programme, rather than an ethics add-on or afterthought.

Training in data science. Many leading US universities are now admitting undergraduates to data science courses, but these courses lack social science input. Of the 20 US masters courses in big data analytics compiled by Information Week, nearly all came from computer science or informatics departments. Social science research training needs to incorporate coding and analysis skills of the kind these courses provide, but with a social science focus. If we as social scientists leave the training to computer scientists, we will find that the new cadre of data scientists tends to leave out social science concerns or questions.

Bringing policy makers and academic researchers together to tackle the challenges that big data present. Last month the OII and Policy and Internet convened a workshop at Harvard on Responsible Research Agendas for Public Policy in the Big Data Era, which included various leading academic researchers in the government and big data field, and government officials from the Census Bureau, the Federal Reserve Board, the Bureau of Labor Statistics, and the Office of Management and Budget (OMB). The discussions revealed that there is a continual procession of major events on big data in Washington DC (usually with a corporate or scientific research focus) to which US federal officials are invited, but also how few of these events were really dedicated to tackling the distinctive issues that face government agencies such as those represented around the table.

Taking forward theoretical development in social science, incorporating big data insights. I recently spoke at the Oxford Analytica Global Horizons conference, at a session on Big Data. One of the few policy-makers (in proportion to corporate representatives) in the audience asked the panel “where is the theory”? As social scientists, we need to respond to that question, and fast.

This post is based on discussions at the workshop on Responsible Research Agendas for Public Policy in the era of Big Data and at the Political Studies Association event Why Universities Matter: How Academic Social Science Contributes to Public Policy Impact, held at the LSE on 26 September 2013.

Helen Margetts is the Director of the OII, and Professor of Society and the Internet. She is a political scientist specialising in e-government and digital era governance and politics, investigating the nature and implications of relationships between governments, citizens and the Internet and related digital technologies in the UK and internationally.

The promises and threats of big data for public policy-making

The environment in which public policy is made has entered a period of dramatic change. Widespread use of digital technologies, the Internet and social media means both citizens and governments leave digital traces that can be harvested to generate big data. Policy-making takes place in an increasingly rich data environment, which poses both promises and threats to policy-makers.

On the promise side, such data offers a chance for policy-making and implementation to be more citizen-focused, taking account of citizens’ needs, preferences and actual experience of public services, as recorded on social media platforms. As citizens express policy opinions on social networking sites such as Twitter and Facebook; rate or rank services or agencies on government applications such as NHS Choices; or enter discussions on the burgeoning range of social enterprise and NGO sites, such as Mumsnet and 38 Degrees, they generate a whole range of data that government agencies might harvest to good use. Policy-makers also have access to a huge range of data on citizens’ actual behaviour, as recorded digitally whenever citizens interact with government administration or undertake some act of civic engagement, such as signing a petition.

Data mined from social media or administrative operations in this way also provide a range of new data which can enable government agencies to monitor, and improve, their own performance, for example through log usage data of their own electronic presence or transactions recorded on internal information systems, which are increasingly interlinked. And they can use data from social media for self-improvement, by understanding what people are saying about government, and which policies, services or providers are attracting negative opinions and complaints, enabling identification of a failing school, hospital or contractor, for example. They can solicit such data via their own sites, or those of social enterprises. And they can find out what people are concerned about or looking for, from the Google Search API or Google Trends, which record the search patterns of a huge proportion of internet users.

As for threats, big data is technologically challenging for government, particularly those governments which have always struggled with large-scale information systems and technology projects. The UK government has long been a world leader in this regard and recent events have only consolidated its reputation. Governments have long suffered from information technology skill shortages and the complex skill sets required for big data analytics pose a particularly acute challenge. Even in the corporate sector, over a third of respondents to a recent survey of business technology professionals cited ‘Big data expertise is scarce and expensive’ as their primary concern about using big data software.

And there are particular cultural barriers to government in using social media, with the informal style and blurring of organisational and public-private boundaries which they engender. And gathering data from social media presents legal challenges, as companies like Facebook place barriers to the crawling and scraping of their sites.

More importantly, big data presents new moral and ethical dilemmas to policy makers. For example, it is possible to carry out probabilistic policy-making, where policy is made on the basis of what a small segment of individuals will probably do, rather than what they have done. Predictive policing has had some success, particularly in California, where robberies declined by a quarter after use of the ‘PredPol’ policing software, but it can lead to a “feedback loop of injustice,” as one privacy advocacy group put it, as policing resources are targeted at increasingly small socio-economic groups. What responsibility does the state have to devote disproportionately more, or fewer, resources to the education of those school pupils who are, probabilistically, almost certain to drop out of secondary education? Such challenges are greater for governments than corporations. We (reasonably) happily trade privacy to allow Tesco and Facebook to use our data on the basis it will improve their products, but if government tries to use social media to understand citizens and improve its own performance, will it be accused of spying on its citizenry in order to quash potential resistance?

And of course there is an image problem for government in this field: discussion of big data and government puts the word ‘big’ dangerously close to the word ‘government’ and that is an unpopular combination. Policy-makers’ responses to Snowden’s revelations of the US PRISM and UK Tempora programmes have done nothing to improve this image, with their focus on the use of big data to track down individuals and groups involved in acts of terrorism and criminality, rather than on anything to make policy-making better, or to use the wealth of information that these programmes collect for the public good.

However, policy-makers have no choice but to tackle some of these challenges. Big data has been the hottest trend in the corporate world for some years now, and commentators from IBM to the New Yorker are starting to talk about the big data ‘backlash’. Government has been far slower to recognise the advantages for policy-making and services. But in some policy sectors, big data poses very fundamental questions which call for an answer: how should governments conduct a census, or produce labour statistics, for example, in the age of big data? Policy-makers will need to move fast to beat the backlash.

This post is based on discussions at the workshop on Responsible Research Agendas for Public Policy in the era of Big Data.

Helen Margetts is the Director of the OII, and Professor of Society and the Internet. She is a political scientist specialising in digital era governance and politics.

Can text mining help handle the data deluge in public policy analysis?

Policy makers today must contend with two inescapable phenomena. On the one hand, there has been a major shift in the policies of governments concerning participatory governance—that is, engaged, collaborative, and community-focused public policy. At the same time, a significant proportion of government activities have now moved online, bringing about “a change to the whole information environment within which government operates” (Margetts 2009, 6).

Indeed, the Internet has become the main medium of interaction between government and citizens, and numerous websites offer opportunities for online democratic participation. The Hansard Society, for instance, regularly runs e-consultations on behalf of UK parliamentary select committees. For example, e-consultations have been run on the Climate Change Bill (2007), the Human Tissue and Embryo Bill (2007), and on domestic violence and forced marriage (2008). Councils and boroughs also regularly invite citizens to take part in online consultations on issues affecting their area. The London Borough of Hammersmith and Fulham, for example, recently asked its residents for their views on Sex Entertainment Venues and Sex Establishment Licensing policy.

However, citizen participation poses certain challenges for the design and analysis of public policy. In particular, governments and organisations must demonstrate that all opinions expressed through participatory exercises have been duly considered and carefully weighted before decisions are reached. One method for partly automating the interpretation of large quantities of online content typically produced by public consultations is text mining. Software products currently available range from those primarily used in qualitative research (integrating functions like tagging, indexing, and classification), to those integrating more quantitative and statistical tools, such as word frequency and cluster analysis (more information on text mining tools can be found at the National Centre for Text Mining).

While these methods have certainly attracted criticism and skepticism in terms of the interpretability of the output, they offer four important advantages for the analyst: namely categorisation, data reduction, visualisation, and speed.

1. Categorisation. When analysing the results of consultation exercises, analysts and policymakers must make sense of the high volume of disparate responses they receive; text mining supports the structuring of large amounts of this qualitative, discursive data into predefined or naturally occurring categories by storage and retrieval of sentence segments, indexing, and cross-referencing. Analysis of sentence segments from respondents with similar demographics (eg age) or opinions can itself be valuable, for example in the construction of descriptive typologies of respondents.

2. Data Reduction. Data reduction techniques include stemming (reduction of a word to its root form), combining of synonyms, and removal of non-informative “tool” or stop words. Hierarchical classifications, cluster analysis, and correspondence analysis methods allow the further reduction of texts to their structural components, highlighting the distinctive points of view associated with particular groups of respondents.

3. Visualisation. Important points and interrelationships are easy to miss when responses are read by eye, and rapid generation of visual overviews of responses (eg dendrograms, 3D scatter plots, heat maps, etc.) makes large and complex datasets easier to comprehend in terms of identifying the main points of view and dimensions of a public debate.

4. Speed. Speed depends on whether a special dictionary or vocabulary needs to be compiled for the analysis, and on the amount of coding required. Coding is usually relatively fast and straightforward, and the succinct overview of responses provided by these methods can reduce the time for consultation responses.
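The data reduction step described in point 2 can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not the method of any particular text mining package: the stop-word list, the synonym table, the suffix rule in `crude_stem`, and the three sample responses are all hypothetical, and far simpler than what real tools use.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny resources for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "are",
              "in", "on", "for", "we", "our"}
SYNONYMS = {"lorries": "trucks", "hgvs": "trucks"}  # map variants to one label
SUFFIXES = ("ing", "ed", "es", "s")  # crude suffix list, not a real stemmer


def crude_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def reduce_text(response: str) -> list[str]:
    """Lower-case, tokenise, drop stop words, merge synonyms, then stem."""
    tokens = re.findall(r"[a-z]+", response.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    merged = [SYNONYMS.get(t, t) for t in kept]
    return [crude_stem(t) for t in merged]


def term_frequencies(responses: list[str]) -> Counter:
    """Count reduced terms across all consultation responses."""
    counts: Counter = Counter()
    for response in responses:
        counts.update(reduce_text(response))
    return counts


# Invented consultation responses.
responses = [
    "The lorries are parking on the pavement.",
    "Parked trucks block our road.",
    "We need parking for HGVs.",
]
frequencies = term_frequencies(responses)
print(frequencies.most_common(3))
```

After reduction, "lorries", "trucks" and "HGVs" are counted as one term, as are "parking" and "parked", which is the kind of consolidation that makes a word frequency table or a cluster analysis easier to read.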

Despite the above advantages of automated approaches to consultation analysis, text mining methods present several limitations. Automatic classification of responses runs the risk of missing or miscategorising distinctive or marginal points of view if sentence segments are too short, or if they rely on a rare vocabulary. Stemming can also generate problems if important semantic variations are overlooked (eg lumping together ‘ill+ness’, ‘ill+defined’, and ‘ill+ustration’). Other issues applicable to public e-consultation analysis include the danger that analysts distance themselves from the data, especially when converting words to numbers. This is quite apart from the issues of inter-coder reliability and data preparation, missing data, and insensitivity to figurative language, meaning and context, which can also result in misclassification when not human-verified.
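The stemming problem mentioned above is easy to reproduce. The sketch below uses a hypothetical prefix-truncation rule (`truncate_stem`, which keeps only the first three letters) as a stand-in for an over-aggressive stemmer; it is an assumption made for illustration, not how any named tool works. Grouping the original word forms under each stem gives the analyst a short list to verify by hand before trusting the counts.

```python
from collections import defaultdict


def truncate_stem(word: str, keep: int = 3) -> str:
    """Hypothetical over-aggressive stemmer: keep only the first `keep` letters."""
    return word.lower()[:keep]


def forms_by_stem(words: list[str]) -> dict[str, set[str]]:
    """Group the distinct surface forms that collapse onto each stem."""
    groups: dict[str, set[str]] = defaultdict(set)
    for word in words:
        groups[truncate_stem(word)].add(word.lower())
    return dict(groups)


def suspicious_stems(words: list[str], max_forms: int = 2) -> dict[str, set[str]]:
    """Return stems that merge more than `max_forms` distinct word forms,
    so that a human can check whether the merged words belong together."""
    return {
        stem: forms
        for stem, forms in forms_by_stem(words).items()
        if len(forms) > max_forms
    }


# Invented word list, including the three 'ill' examples from the text.
words = ["illness", "ill-defined", "illustration", "school", "schools"]
flagged = suspicious_stems(words)
for stem, forms in flagged.items():
    print(stem, sorted(forms))
```

The threshold `max_forms` is arbitrary; the point is only that a cheap audit of merged forms puts a human back in the loop at exactly the step where semantic variation is most likely to be lost.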

However, when responding to criticisms of specific tools, we need to remember that different text mining methods are complementary, not mutually exclusive. A single solution to the analysis of qualitative or quantitative data would be very unlikely; and at the very least, exploratory techniques provide a useful first step that could be followed by a theory-testing model, or by triangulation exercises to confirm results obtained by other methods.

Apart from these technical issues, policy makers and analysts employing text mining methods for e-consultation analysis must also consider certain ethical issues in addition to those of informed consent, privacy, and confidentiality. First (of relevance to academics), respondents may not expect to end up as research subjects. They may simply be expecting to participate in a general consultation exercise, interacting exclusively with public officials and not indirectly with an analyst post hoc; much less ending up as a specific, traceable data point.

This has been a particularly delicate issue for healthcare professionals. Sharf (1999, 247) describes various negative experiences of following up online postings: one woman, on being contacted by a researcher seeking consent to gain insights from breast cancer patients about their personal experiences, accused the researcher of behaving voyeuristically and “taking advantage of people in distress.” Statistical interpretation of responses also presents its own issues, particularly if analyses are to be returned or made accessible to respondents.

Respondents might also be confused about or disagree with text mining as a method applied to their answers; indeed, it could be perceived as dehumanising—reducing personal opinions and arguments to statistical data points. In a public consultation, respondents might feel somewhat betrayed that their views and opinions eventually result in just a dot on a correspondence analysis with no immediate, apparent meaning or import, at least in lay terms. Obviously the consultation organiser needs to outline clearly and precisely how qualitative responses can be collated into a quantifiable account of a sample population’s views.

This is an important point; in order to reduce both technical and ethical risks, researchers should ensure that their methodology combines both qualitative and quantitative analyses. While many text mining techniques provide useful statistical output, the UK Government’s prescribed Code of Practice on public consultation is quite explicit on the topic: “The focus should be on the evidence given by consultees to back up their arguments. Analysing consultation responses is primarily a qualitative rather than a quantitative exercise” (2008, 12). This suggests that the perennial debate between quantitative and qualitative methodologists needs to be updated and better resolved.


Margetts, H. 2009. “The Internet and Public Policy.” Policy & Internet 1 (1).

Sharf, B. 1999. “Beyond Netiquette: The Ethics of Doing Naturalistic Discourse Research on the Internet.” In Doing Internet Research, ed. S. Jones, London: Sage.

Read the full paper: Bicquelet, A., and Weale, A. (2011) Coping with the Cornucopia: Can Text Mining Help Handle the Data Deluge in Public Policy Analysis? Policy & Internet 3 (4).

Dr Aude Bicquelet is a Fellow in LSE’s Department of Methodology. Her main research interests include computer-assisted analysis, Text Mining methods, comparative politics and public policy. She has published a number of journal articles in these areas and is the author of a forthcoming book, “Textual Analysis” (Sage Benchmarks in Social Research Methods, in press).

Responsible research agendas for public policy in the era of big data

Last week the OII went to Harvard. Against the backdrop of a gathering storm of interest around the potential of computational social science to contribute to the public good, we sought to bring together leading social science academics with senior government agency staff to discuss its public policy potential. Supported by the OII-edited journal Policy and Internet and its owners, the Washington-based Policy Studies Organization (PSO), this one-day workshop facilitated a thought-provoking conversation between leading big data researchers such as David Lazer, Brooke Foucault-Welles and Sandra Gonzalez-Bailon, e-government experts such as Cary Coglianese, Helen Margetts and Jane Fountain, and senior agency staff from US federal bureaus including Labor Statistics, Census, and the Office of Management and Budget.

It’s often difficult to appreciate the impact of research beyond the ivory tower, but what this productive workshop demonstrated is that policy-makers and academics share many similar hopes and challenges in relation to the exploitation of ‘big data’. Our motivations and approaches may differ, but insofar as the youth of the ‘big data’ concept explains the lack of common language and understanding, there is value in mutual exploration of the issues. Although it’s impossible to do justice to the richness of the day’s interactions, some of the most pertinent and interesting conversations arose around the following four issues.

Managing a diversity of data sources. In a world where our capacity to ask important questions often exceeds the availability of data to answer them, many participants spoke of the difficulties of managing a diversity of data sources. For agency staff this issue comes into sharp focus when available administrative data that is supposed to inform policy formulation is either incomplete or inadequate. Consider, for example, the challenge of regulating an economy in a situation of fundamental data asymmetry, where private sector institutions track, record and analyse every transaction, whilst the state only has access to far more basic performance metrics and accounts. Such asymmetric data practices also affect academic research, where once again private sector tech companies such as Google, Facebook and Twitter often offer access only to portions of their data. In both cases participants gave examples of creative solutions using merged or blended data sources, which raise significant methodological and also ethical difficulties which merit further attention. The Berkman Center’s Rob Faris also noted the challenges of combining ‘intentional’ and ‘found’ data, where the former allow far greater certainty about the circumstances of their collection.

Data dictating the questions. If participants expressed the need to expend more effort on getting the most out of available but diverse data sources, several also cautioned against the dangers of letting data availability dictate the questions that could be asked. As we’ve experienced at the OII, for example, the availability of Wikipedia or Twitter data means that questions of unequal digital access (to political resources, knowledge production etc.) can often be addressed through the lens of these applications or platforms. But these data can provide only a snapshot, and large questions of great social or political importance may not easily be answered through such proxy measurements. Similarly, big data may be very helpful in providing insights into policy-relevant patterns or correlations, such as identifying early indicators of seasonal diseases or neighbourhood decline, but seem ill-suited to answer difficult questions regarding, say, the efficacy of small-scale family interventions. Just because the latter are harder to answer using currently vogue-ish tools doesn’t mean we should cease to ask these questions.

Ethics. Concerns about privacy are frequently raised as a significant limitation of the usefulness of big data. Given that with two or more data sets even supposedly anonymous data subjects may be identified, the general consensus seems to be that ‘privacy is dead.’ Whilst all participants recognised the importance of public debate around this issue, several academics and policy-makers expressed a desire to get beyond this discussion to a more nuanced consideration of appropriate ethical standards. Accountability and transparency are often held up as more realistic means of protecting citizens’ interests, but one workshop participant also suggested it would be helpful to encourage more public debate about acceptable and unacceptable uses of our data, to determine whether some uses might simply be deemed ‘off-limits’, whilst other uses could be accepted as offering few risks.

Accountability. Following on from this debate about the ethical limits of our uses of big data, discussion exposed the starkly differing standards to which government and academics (to say nothing of industry) are held accountable. As agency officials noted on several occasions, it matters less what they actually do with citizens’ data than what they are perceived to do with it, or even what it’s feared they might do. One of the greatest hurdles to be overcome here concerns the fundamental complexity of big data research, and the sheer difficulty of communicating to the public how it informs policy decisions. Quite apart from the opacity of the algorithms underlying big data analysis, the explicit focus on correlation rather than causation or explanation presents a new challenge for the justification of policy decisions, and consequently, for public acceptance of their legitimacy. As Greg Elin of Gitmachines emphasised, policy decisions are still the result of explicitly normative political discussion, but the justifiability of such decisions may be rendered more difficult given the nature of the evidence employed.

We could not resolve all these issues over the course of the day, but they served as pivot points for honest and productive discussion amongst the group. If nothing else, they demonstrate the value of interaction between academics and policy-makers in a research field where the stakes are set very high. We plan to reconvene in Washington in the spring.

*We are very grateful to the Policy Studies Organization (PSO) and the American Public University for their generous support of this workshop. The workshop “Responsible Research Agendas for Public Policy in the Era of Big Data” was held at the Harvard Faculty Club on 13 September 2013.

Also read: Big Data and Public Policy Workshop by Eric Meyer, workshop attendee and PI of the OII project Accessing and Using Big Data to Advance Social Science Knowledge.

Victoria Nash received her M.Phil in Politics from Magdalen College in 1996, after completing a First Class BA (Hons) Degree in Politics, Philosophy and Economics, before going on to complete a D.Phil in Politics from Nuffield College, Oxford University in 1999. She was a Research Fellow at the Institute of Public Policy Research prior to joining the OII in 2002. As Research and Policy Fellow at the OII, her work seeks to connect OII research with policy and practice, identifying and communicating the broader implications of OII’s research into Internet and technology use.

Online collective action and policy change: shifting contentious politics in policy processes

Research has disproved the notion held by political elites—and some researchers—that collective action resides outside of policy-making processes, and is limited in generating a response from government. The Internet can facilitate the involvement of social movements in policymaking processes, but also constitute in itself the object of contentious politics. Most research on online mobilisations focuses on the Internet as a tool for campaigning and for challenging decision-makers at the national and international level.

Meanwhile, less attention is paid to the fact that the Internet has raised new issues for the policy-making agenda around which activists are mobilising, such as issues related to Internet governance, online freedom of expression, digital privacy or copyright. Contemporary social movements serve as indicators of new core challenges within society, and can thus constitute an enriching resource for the policy debate arena, particularly with regard to the urgent issues raised by the fast development of the Internet. The literature on social movements is rich in examples of campaigns that have successfully influenced public policy. Classic works have shown how major reforms can start as a consequence of civic mobilisations, and history provides evidence of the influence of collective action within policy debates on environmental, national security, and peace issues.

However, as mentioned above and argued by Giugni (2004), social movement research has traditionally paid more attention to the process rather than the outcomes of mobilisations. The difficulty of identifying the consequences of collective action and the factors that contribute to its success may lie behind this tendency. As Gamson (1975) argues, the notion of success is elusive and can be most usefully defined with reference to a set of outcomes; these may include the new advantages gained by the group’s beneficiaries after a challenge with targets, or they may refer to the status of the challenging group and its legitimacy.

As for the factors determining success, some scholars believe that collective action is likely to succeed when its claims are close to the aims of political elites. Others, like Kriesi (1995), note that government responses depend on the forms and tactics of contentious politics. Activists occupy different positions on the spectrum between radical and reformist, which in turn affects their willingness to engage with policy-makers. For instance, movements pursuing a type of ‘prefigurative’ politics tend to avoid direct contact with policy-makers, focusing instead on building alternatives which ‘prefigure’ the values that they would like to see on a grander scale.

The Internet has raised new questions around the outcomes of collective action and its interface with policy making. Internet governance, for instance, constitutes a new source of contentious politics, while the use of the Internet for the organisation of protest influences the forms of contemporary collective action. As a tool of collective action, the Internet facilitates the rapid organisation of protests around issues of public concern. Mobilisations can be organised without a formal hierarchy in place, and spread easily as online networking helps the creation of flexible ‘opt-in/opt-out’ coalitions. Information about protest can diffuse through online interpersonal networks and alternative media, and can capture the attention of mainstream news outlets. In addition, the Internet has expanded the ‘repertoire of contention’ of current movements. Petitions, direct action, and occupations now have their online counterparts with tactics such as email bombings, DDoS attacks, and e-petitions.

The affordances of online tools are thought to be affecting the characteristics of collective action and thus its capacities for policy change. Compared to past mobilisations, current movements tend to be more decentralised and flexible, constituted by loose coalitions between a plurality of actors. They are also more inclusive, addressing a diverse and at times incongruent combination of issues. As Della Porta (2005) points out, online mobilisations tend to have a more temporary and fleeting character as they can emerge and dissolve with equal speed; they are also more global in nature, since they can scale up easily and at a low cost.

At the same time, the Internet has facilitated the rise of what Chadwick defines as a new ‘hybrid’ type of civil society actor who combines traditional tactics, such as petitioning, with the more flexible forms of organising favoured by less institutional groups. Organisations like MoveOn in the U.S., Avaaz on the transnational level, and GetUp! in Australia use the power of the Internet to influence policy makers. The Internet helps such organisations to operate at a very low cost, which in turn allows them to be flexible and easily switch the focus of campaigns around issues that capture the public interest.

However, the Internet has become in itself an object of policy, and further attention must be paid to its implications as a source of new contentious politics. Melucci (1996) argues that research on collective action should pay attention to the new types of inequality that generate contentious politics, rather than restricting its focus to the forms of mobilisations. Studies of online collective action should thus consider the Internet not only as a tool for practising politics but also as a new “dominant discourse” which produces new claims and inequalities.

Current inequalities in governing digitally mediated communication generate what Kriesi (1995) calls ‘windows of opportunities’ for collective action to play an important role in policy making on Internet-related issues. Within this framework, an increasing body of research is addressing collective action around the governance of the Internet, the regulation of free and open software, privacy and data retention, file sharing and copyright issues, and online freedom of expression as a fundamental human right.

Whether and to what extent these new types of actors and mobilisations organised through and around the Internet can have an effect on policy making is an issue requiring further research. For instance, the decentralised and temporary character of some movements, together with their lack of clearly identified leaders and spokespersons, can make it difficult for them to establish themselves as legitimate representatives of public opinion. New types of ‘hybrid’ actors may encounter similar problems in establishing their legitimacy. In addition, new online tactics like e-petitions, which require limited time and commitment from participants, tend to have a weaker impact on policy makers. At the same time, the technical nature of Internet regulation complicates efforts to influence public opinion due to the degree of knowledge necessary for the lay public to understand the issues at stake.

Despite these potential limitations, collective action through and around the Internet can enrich policy making. It can help to connect local voices with policy makers and facilitate their involvement in policy-making processes. It can also contribute to new policy debates around urgent issues concerning the Internet. As the Policy and Internet special issue on ‘Online Collective Action and Policy Change’ explores, online collective action can thus constitute an important resource and participant in policy debates with whom policy makers should create new lines of dialogue.

Read the full article at: Calderaro, A. and Kavada, A. (2013) “Challenges and Opportunities of Online Collective Action for Policy Change”, Policy and Internet 5(1).

Twitter: @andreacalderaro / @AnastasiaKavada
Web: Andrea’s Personal Page / Anastasia’s Personal Page


Giugni, Marco. 2004. Social Protest and Policy Change: Ecology, Antinuclear, and Peace Movements in Comparative Perspective. Lanham: Rowman & Littlefield.

Gamson, William A. 1975. The Strategy of Social Protest. Homewood, Ill.: Dorsey Press.

Kriesi, Hanspeter. 1995. “The Political Opportunity Structure of New Social Movements: Its Impact on Their Mobilization.” In The Politics of Social Protest, eds. J. C. Jenkins and B. Klandermans. London: UCL Press, pp. 167–198.

Della Porta, Donatella, and Mario Diani. 2006. Social Movements: An Introduction. 2nd ed. Malden, MA: Blackwell Pub.

Melucci, Alberto. 1996. Challenging Codes: Collective Action in the Information Age. Cambridge: Cambridge University Press.

Online collective action and policy change: new special issue from Policy and Internet

The Internet has multiplied the platforms available to influence public opinion and policy making. It has also provided citizens with a greater capacity for coordination and mobilisation, which can strengthen their voice and representation in the policy agenda. As waves of protest sweep both authoritarian regimes and liberal democracies, this rapidly developing field calls for more detailed enquiry. However, research exploring the relationship between online mobilisation and policy change is still limited. This special issue of ‘Policy and Internet’ addresses this gap through a variety of perspectives. Contributions to this issue view the Internet both as a tool that allows citizens to influence policy making, and as an object of new policies and regulations, such as data retention, privacy, and copyright laws, around which citizens are mobilising. Together, these articles offer a comprehensive empirical account of the interface between online collective action and policy making.

Within this framework, the first article in this issue, “Networked Collective Action and the Institutionalized Policy Debate: Bringing Cyberactivism to the Policy Arena?” by Stefania Milan and Arne Hintz (2013), looks at the Internet as both a tool of collective action and an object of policy. The authors provide a comprehensive overview of how computer-mediated communication creates not only new forms of organisational structure for collective action, but also new contentious policy fields. By focusing on what the authors define as ‘techie activists,’ Milan and Hintz explore how new grassroots actors participate in policy debates around the governance of the Internet at different levels. This article provides empirical evidence of what Kriesi et al. (1995) define as “windows of opportunities” for collective action to contribute to the policy debate around this new space of contentious politics. Milan and Hintz demonstrate how this has happened from the first World Summit on the Information Society (WSIS) in 2003 to more recent debates about Internet regulation.

Yana Breindl and François Briatte’s (2013) article “Digital Protest Skills and Online Activism Against Copyright Reform in France and the European Union” complements Milan and Hintz’s analysis by looking at how the regulation of copyright issues opens up new spaces of contentious politics. The authors compare how online and offline initiatives and campaigns in France around the “Droit d’Auteur et les Droits Voisins dans la Société de l’Information” (DADVSI) and “Haute Autorité pour la diffusion des œuvres et la protection des droits sur Internet” (HADOPI) laws, and in Europe around the Telecoms Package Reform, have contributed to the deliberations within the EU Parliament. They thus add to the rich debate on the contentious issues of intellectual property rights, demonstrating how collective action contributes to this debate at the European level.

The remaining articles in this special issue focus more on the online tactics and strategies of collective actors and the opportunities opened by the Internet for them to influence policy makers. In her article, “Activism and The Online Mediation Opportunity Structure: Attempts to Impact Global Climate Change Policies?” Julie Uldam (2013) discusses the tactics used by London-based environmental activists to influence policy making during the 17th UN climate conference (COP17) in 2011. Based on ethnographic research, Uldam traces the relationship between online modes of action and problem identification and demands. She also discusses the differences between radical and reformist activists in both their preferences for online action and their attitudes towards policy makers. Drawing on Cammaerts’ (2012) framework of the mediation opportunity structure, Uldam shows that radical activists preferred online tactics that aimed at disrupting the conference, since they viewed COP17 as representative of an unjust system. However, their lack of technical skills and resources prevented them from disrupting the conference in the virtual realm. Reformist activists, on the other hand, considered COP17 as a legitimate adversary, and attempted to influence its politics mainly through the diffusion of alternative information online.

The article by Ariadne Vromen and William Coleman (2013), “Online Campaigning Organizations and Storytelling Strategies: GetUp! in Australia,” also investigates a climate change campaign but shifts the focus to the new ‘hybrid’ collective actors, who use the Internet extensively for campaigning. Based on a case study of GetUp!, Vromen and Coleman examine the storytelling strategies employed by the organisation in two separate campaigns, one around climate change, the other around mental health. The authors investigate the factors that led one campaign to be successful and the other to have limited resonance. They also skilfully highlight the difficulties that new collective actors encounter in gaining legitimacy and influencing policy making. In this respect, GetUp! used storytelling to set itself apart from traditional party-based politics and to emphasise its identity as an organiser and representative of grassroots communities, rather than as an insider lobbyist or disruptive protestor.

Romain Badouard and Laurence Monnoyer-Smith (2013), in their article “Hyperlinks as Political Resources: The European Commission Confronted with Online Activism,” explore some of the more structured ways in which citizens use online tools to engage with policy makers. They investigate the political opportunities offered by the e-participation and e-government platforms of the European Commission for activists wishing to make their voice heard in the European policy-making sphere. They focus particularly on strategic uses of technical web resources and hyperlinks, which allow citizens to refine their proposals and thus increase their influence on European policy.

Finally, Jo Bates’ (2013) article “The Domestication of Open Government Data Advocacy in the UK: A Neo-Gramscian Analysis” provides a pertinent framework that facilitates our understanding of the policy challenges posed by the issue of open data. The digitisation of data offers new opportunities for increasing transparency, traditionally considered a fundamental public good. By focusing on the Open Government Data initiative in the UK, Bates explores the policy challenges generated by increasing transparency via new Internet platforms by applying the established theoretical instruments of Gramscian ‘Trasformismo.’ This article frames the open data debate in terms consistent with the literature on collective action, and provides empirical evidence as to how citizens have taken an active role in the debate on this issue, thereby challenging the policy debate on public transparency.

Taken together, these articles advance our understanding of the interface between online collective action and policy making. They introduce innovative theoretical frameworks and provide empirical evidence around the new forms of collective action, tactics, and contentious politics linked with the emergence of the Internet. If, as Melucci (1996) argues, contemporary social movements are sensors of new challenges within current societies, they can be an enriching resource for the policy debate arena. Gaining a better understanding of how the Internet might strengthen this process is a valuable line of enquiry.

Read the full article at: Calderaro, A. and Kavada, A. (2013) “Challenges and Opportunities of Online Collective Action for Policy Change”, Policy and Internet 5(1).

Twitter: @AnastasiaKavada / @andreacalderaro
Web: Anastasia’s Personal Page / Andrea’s Personal Page


Badouard, R., and Monnoyer-Smith, L. 2013. Hyperlinks as Political Resources: The European Commission Confronted with Online Activism. Policy and Internet 5(1).

Bates, J. 2013. The Domestication of Open Government Data Advocacy in the UK: A Neo-Gramscian Analysis. Policy and Internet 5(1).

Breindl, Y., and Briatte, F. 2013. Digital Protest Skills and Online Activism Against Copyright Reform in France and the European Union. Policy and Internet 5(1).

Cammaerts, Bart. 2012. “Protest Logics and the Mediation Opportunity Structure.” European Journal of Communication 27(2): 117–134.

Kriesi, Hanspeter. 1995. “The Political Opportunity Structure of New Social Movements: Its Impact on Their Mobilization.” In The Politics of Social Protest, eds. J. C. Jenkins and B. Klandermans. London: UCL Press, pp. 167–198.

Melucci, Alberto. 1996. Challenging Codes: Collective Action in the Information Age. Cambridge: Cambridge University Press.

Milan, S., and Hintz, A. 2013. Networked Collective Action and the Institutionalized Policy Debate: Bringing Cyberactivism to the Policy Arena? Policy and Internet 5(1).

Uldam, J. 2013. Activism and the Online Mediation Opportunity Structure: Attempts to Impact Global Climate Change Policies? Policy and Internet 5(1).

Vromen, A., and Coleman, W. 2013. Online Campaigning Organizations and Storytelling Strategies: GetUp! in Australia. Policy and Internet 5(1).