Governance & Security, Interviews, Methods

How policy makers can extract meaningful public opinion data from social media to inform their actions

by David Sutcliffe 07/07/2017

Examining how the information available on social media can support the actions of politicians and bureaucrats along the policy cycle.

Social media analysis can provide insight into the mobilisation processes of stakeholders in response to government actions. Image of No-TAV protestors by Darren Johnson (Flickr: CC BY-NC-ND 2.0).

The role of social media in fostering the transparency of governments and strengthening the interaction between citizens and public administrations has been widely studied. Scholars have highlighted how online citizen-government and citizen-citizen interactions favour debates on social and political matters, and positively affect citizens’ interest in political processes, like elections, policy agenda setting, and policy implementation. However, while top-down social media communication between public administrations and citizens has been widely examined, the bottom-up side of this interaction has been largely overlooked.

In their Policy & Internet article “The ‘Social Side’ of Public Policy: Monitoring Online Public Opinion and Its Mobilisation During the Policy Cycle,” Andrea Ceron and Fedra Negri aim to bridge the gap between knowledge and practice, by examining how the information available on social media can support the actions of politicians and bureaucrats along the policy cycle. Policymakers, particularly politicians, have always been interested in knowing citizens’ preferences, in measuring their satisfaction and in receiving feedback on their activities.

Using the technique of Supervised Aggregated Sentiment Analysis, the authors show that meaningful information on public services, programmes, and policies can be extracted from the unsolicited comments posted by social media users, particularly those posted on Twitter. They use this technique to extract and analyse citizen opinion on two major public policies (on labour market reform and school reform) that drove the agenda of the Matteo Renzi cabinet in Italy between 2014 and 2015. They show how online public opinion reacted to the different policy alternatives formulated and discussed during the adoption of the policies. They also demonstrate how social media analysis allows monitoring of the mobilisation and de-mobilisation processes of rival stakeholders in response to the various amendments adopted by the government, with results comparable to those of a survey and a public consultation that were undertaken by the government.

We caught up with the authors to discuss their findings:

Ed.: You say that this form of opinion…

Interviews, Politics & Government, Social Data Science

Did you consider Twitter’s (lack of) representativeness before doing that predictive study?

by David Sutcliffe 10/04/2017

Do Twitter users share identical characteristics with the population of interest? For what populations are Twitter data actually appropriate?

Twitter data have many qualities that appeal to researchers, but are probably not suitable for research where representativeness is important. Image: Bernard Goldbach (Flickr).

Twitter data have many qualities that appeal to researchers. They are extraordinarily easy to collect. They are available in very large quantities. And with a simple 140-character text limit they are easy to analyse. As a result of these attractive qualities, over 1,400 papers have been published using Twitter data, including many attempts to predict disease outbreaks, election results, film box office gross, and stock market movements solely from the content of tweets.

Easy availability of Twitter data links nicely to a key goal of computational social science. If researchers can find ways to impute user characteristics from social media, then the capabilities of computational social science would be greatly extended. However, few papers consider the digital divide among Twitter users. Yet the question of who uses Twitter has major implications for research that attempts to use the content of tweets for inference about population behaviour. Do Twitter users share identical characteristics with the population of interest? For what populations are Twitter data actually appropriate?

A new article by Grant Blank published in Social Science Computer Review provides a multivariate empirical analysis of the digital divide among Twitter users, comparing Twitter users and nonusers with respect to their characteristic patterns of Internet activity and to certain key attitudes. It thereby fills a gap in our knowledge about an important social media platform, and it joins a surprisingly small number of studies that describe the population that uses social media.

Comparing British (OxIS survey) and US (Pew) data, Grant finds that generally, British Twitter users are younger, wealthier, and better educated than other Internet users, who in turn are younger, wealthier, and better educated than the offline British population. American Twitter users are also younger and wealthier than the rest of the population, but they are not better educated. Twitter users are disproportionately members of elites in both countries. Twitter users also differ from other groups in their online activities and their attitudes.…

Methods, Politics & Government

How easy is it to research the Chinese web?

by Han-Teng Liao 18/02/2014

The research expectations seem to be that control and intervention by Beijing will be most likely on political and cultural topics, not likely on economic or entertainment ones.

Access to data from the Chinese Web, like other Web data, depends on platform policies, the level of data openness, and the availability of data intermediaries and tools. Image of a Chinese Internet cafe by Hal Dick.

Ed: How easy is it to request or scrape data from the “Chinese Web”? And how much of it is under some form of government control?

Han-Teng: Access to data from the Chinese Web, like other Web data, depends on the policies of platforms, the level of data openness, and the availability of data intermediaries and tools. All these factors have direct impacts on the quality and usability of data. Since government control takes many forms and serves many intentions, and increasingly extends beyond the websites inside mainland China under Chinese jurisdiction to the Chinese “soft power” institutions and individuals telling the “Chinese story” or “Chinese dream” (as opposed to “American dreams”), case-by-case research is required to determine the extent and level of government control and intervention. Based on my own research on Chinese user-generated encyclopaedias and Chinese-language Twitter and Weibo, the research expectations seem to be that control and intervention by Beijing will be most likely on political and cultural topics, not likely on economic or entertainment ones.

This observation is linked to how various forms of government control and intervention are executed, which often requires massive data and human operations to filter, categorise and produce content, often on the basis of keywords. It is particularly true for Chinese websites in mainland China (behind the Great Firewall, excluding Hong Kong and Macao), where private website companies execute these day-to-day operations under the directives and memos of various Chinese party and government agencies.

Of course there is an extra layer of challenges if researchers try to request content and traffic data from the major Chinese websites for research, especially regarding censorship. Nonetheless, since most Web content data is open, researchers such as Professor Fu at Hong Kong University manage to scrape data samples from Weibo, helping researchers like me to access the data more easily. These openly collected data can then be used to measure potential government control, as has…

Mapping, Methods, Politics & Government, Social Data Science

Mapping collective public opinion in the Russian blogosphere

by Olessia Koltsova 10/02/2014

The Russian language blogosphere counts about 85 million blogs—an amount far beyond the capacities of any government to control—and is thereby able to function as a mass medium of “public opinion” and also to exercise influence.

Widely reported as fraudulent, the 2011 Russian Parliamentary elections provoked mass street protest action by tens of thousands of people in Moscow and cities and towns across Russia. Image by Nikolai Vassiliev.

Blogs are becoming increasingly important for agenda setting and formation of collective public opinion on a wide range of issues. In countries like Russia where the Internet is not technically filtered, but where the traditional media is tightly controlled by the state, they may be particularly important. The Russian language blogosphere counts about 85 million blogs (an amount far beyond the capacities of any government to control), and the Russian search engine Yandex, with its blog rating service, serves as an important reference point for Russia’s educated public in its search for authoritative and independent sources of information. The blogosphere is thereby able to function as a mass medium of “public opinion” and also to exercise influence.

One topic that was particularly salient over the period we studied concerned the Russian Parliamentary elections of December 2011. Widely reported as fraudulent, they provoked immediate and mass street protest action by tens of thousands of people in Moscow and cities and towns across Russia, as well as corresponding activity in the blogosphere. Protesters made effective use of the Internet to organise a movement that demanded cancellation of the parliamentary election results, and the holding of new and fair elections. These protests continued until the following summer, gaining widespread national and international attention.

Most of the political and social discussion blogged in Russia is hosted on the blog platform LiveJournal. Some of these bloggers can claim a certain amount of influence; the top thirty bloggers have over 20,000 “friends” each, representing a good circulation for the average Russian newspaper. Part of the blogosphere may thereby resemble the traditional media; the deeper into the long tail of average bloggers, however, the more it functions as pure public opinion. This “top list” effect may be particularly important in societies (like Russia’s) where popularity lists exert a visible influence on bloggers’ competitive behaviour and on public perceptions of their significance. Given the influence of these top…

Methods, Social Data Science

The physics of social science: using big data for real-time predictive modelling

by Taha Yasseri 21/11/2013

There are very interesting examples of using big data to make predictions about disease outbreaks, financial moves in the markets, social interactions based on human mobility patterns, election results, etc.

Ed: You are interested in analysis of big data to understand human dynamics; how much work is being done in terms of real-time predictive modelling using these data?

Taha: The socially generated transactional data that we call “big data” have been available only very recently; the amount of data we now produce about human activities in a year is comparable to the amount that used to be produced in decades (or centuries). And this is all due to recent advancements in ICTs. Despite the short period of availability of big data, their use in different sectors including academia and business has been significant. However, in many cases, the use of big data is limited to monitoring and post hoc analysis of different patterns. Predictive models have rarely been used in combination with big data. Nevertheless, there are very interesting examples of using big data to make predictions about disease outbreaks, financial moves in the markets, social interactions based on human mobility patterns, election results, etc.

Ed: What were the advantages of using Wikipedia as a data source for your study, as opposed to Twitter, blogs, Facebook or traditional media, etc.?

Taha: Our results have shown that the predictive power of Wikipedia page view and edit data outperforms similar box office-prediction models based on Twitter data. This can partially be explained by considering the different nature of Wikipedia compared to social media sites. Wikipedia is now the number one source of online information, and Wikipedia article page view statistics show how much Internet users have been interested in knowing about a specific movie. And the edit counts, even more importantly, indicate the level of interest of the editors in sharing their knowledge about the movies with others. Both indicators are much stronger than what you could measure on Twitter, which is mainly the reaction of the users after watching or reading about the movie. The cost of participation in Wikipedia’s editorial process…