Articles

How does the topic modelling algorithm ‘discover’ the topics within the context of everyday sexism?

We recently announced the start of an exciting new research project that will involve the use of topic modelling in understanding the patterns in stories submitted to the Everyday Sexism website. Here, we briefly explain our text analysis approach, “topic modelling”. At its very core, topic modelling is a technique that seeks to automatically discover the topics contained within a group of documents. ‘Documents’ in this context could refer to text items as lengthy as individual books, or as short as sentences within a paragraph. Let’s take the idea of sentences-as-documents as an example: Document 1: I like to eat kippers for breakfast. Document 2: I love all animals, but kittens are the cutest. Document 3: My kitten eats kippers too. Assuming that each sentence contains a mixture of different topics (and that a ‘topic’ can be understood as a collection of words (of any part of speech) that have different probabilities of appearance in passages discussing the topic), how does the topic modelling algorithm ‘discover’ the topics within these sentences? The algorithm is initiated by setting the number of topics that it needs to extract. Of course, it is hard to guess this number without having an insight into the topics, but one can think of this as a resolution tuning parameter. The smaller the number of topics is set, the more general the bag of words in each topic would be, and the looser the connections between them. The algorithm loops through all of the words in each document, assigning every word to one of our topics in a temporary and semi-random manner. This initial assignment is arbitrary and it is easy to show that different initialisations lead to the same results in the long run. Once each word has been assigned a temporary topic, the algorithm then re-iterates through each word in each document to update the topic assignment using two criteria: 1) How prevalent is the word in question across topics? And 2) How prevalent are the…
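The loop described in this excerpt can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual code: the excerpt does not name the algorithm and cuts off before stating the second criterion, so the sketch assumes a collapsed Gibbs sampler of the kind commonly used for latent Dirichlet allocation, with the second criterion taken to be how prevalent each topic is within the current document. The function name `gibbs_topics`, the smoothing values `alpha` and `beta`, the iteration count, and the crude whitespace tokenisation are all hypothetical choices made for the example.

```python
import random
from collections import Counter

# The three example sentences, lower-cased and split on whitespace
# (a deliberately crude tokenisation, assumed for illustration).
DOCS = [
    "i like to eat kippers for breakfast".split(),
    "i love all animals but kittens are the cutest".split(),
    "my kitten eats kippers too".split(),
]

def gibbs_topics(docs, n_topics=2, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Assign every word a topic, then repeatedly resample each assignment."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Step 1: temporary, semi-random topic for every word.
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]

    # Count tables used by the two criteria.
    doc_topic = [[0] * n_topics for _ in docs]          # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]   # word counts per topic
    topic_total = [0] * n_topics                        # total words per topic
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = assign[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    # Step 2: re-iterate over every word and update its topic.
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the word's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Criterion 1: how prevalent is this word in each topic?
                # Criterion 2 (assumed): how prevalent is each topic in this document?
                weights = [
                    (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
                    * (doc_topic[d][t] + alpha)
                    for t in range(n_topics)
                ]

                # Draw a new topic in proportion to the combined weights.
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return assign, doc_topic, topic_word

if __name__ == "__main__":
    _, doc_topic, topic_word = gibbs_topics(DOCS)
    for t, counts in enumerate(topic_word):
        print(f"topic {t}:", [w for w, c in counts.most_common(4) if c > 0])
    print("topic counts per document:", doc_topic)
```

With only three short sentences the resulting topics are noisy and depend on the random seed; the point of the sketch is the shape of the loop (random start, then repeated reassignment driven by the two counts), not the quality of the output.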

Homejoy was slated to become the Uber of domestic cleaning services. It was a platform that allowed customers to summon a cleaner as easily as they could hail a ride. Why did it fail to achieve success?

Homejoy CEO Adora Cheung appears on stage at the 2014 TechCrunch Disrupt Europe/London, at The Old Billingsgate on October 21, 2014 in London, England. Image: TechCrunch (Flickr)

Platforms that enable users to come together and buy/sell services with confidence, such as Uber, have become remarkably popular, with the companies often transforming the industries they enter. In this blog post the OII’s Vili Lehdonvirta analyses why the domestic cleaning platform Homejoy failed to achieve such success. He argues that when buyers and sellers enter into repeated transactions they can communicate directly, and as such often abandon the platform. Homejoy was slated to become the Uber of domestic cleaning services. It was a platform that allowed customers to summon a cleaner as easily as they could hail a ride. Regular cleanups were just as easy to schedule. Ratings from previous clients attested to the skill and trustworthiness of each cleaner. There was no need to go through a cleaning services agency, or scour local classifieds to find a cleaner directly: the platform made it easy for both customers and people working as cleaners to find each other. Homejoy made its money by taking a cut out of each transaction. Given how incredibly successful Uber and Airbnb had been in applying the same model to their industries, Homejoy was widely expected to become the next big success story. It was to be the next step in the inexorable uberisation of every industry in the economy. On 17 July 2015, Homejoy announced that it was shutting down. Usage had grown slower than expected, revenues remained poor, technical glitches hurt operations, and the company was being hit with lawsuits on contractor misclassification. Investors’ money and patience had finally run out. Journalists wrote interesting analyses of Homejoy’s demise (Forbes, TechCrunch, Backchannel). The root causes of any major business failure (or indeed success) are complex and hard to pinpoint. However, one of the possible explanations identified in these stories stands out, because it corresponds strongly with what theory on platforms and markets could have predicted.
Homejoy wasn’t growing and making money because clients and cleaners were taking their relationships off-platform:…

Exploring the complexities of policing the web for extremist material, and its implications for security, privacy and human rights.

In terms of counter-speech there are different roles for government, civil society, and industry. Image by Miguel Discart (Flickr).

The Internet serves not only as a breeding ground for extremism, but also offers myriad data streams which potentially hold great value to law enforcement. The report by the OII’s Ian Brown and Josh Cowls for the VOX-Pol project: Check the Web: Assessing the Ethics and Politics of Policing the Internet for Extremist Material explores the complexities of policing the web for extremist material, and its implications for security, privacy and human rights. Josh Cowls discusses the report with blog editor Bertie Vidgen.* *please note that the views given here do not necessarily reflect the content of the report, or those of the lead author, Ian Brown. Ed: Josh, could you let us know the purpose of the report, outline some of the key findings, and tell us how you went about researching the topic? Josh: Sure. In the report we take a step back from the ground-level question of ‘what are the police doing?’ and instead ask, ‘what are the ethical and political boundaries, rationale and justifications for policing the web for these kinds of activity?’ We used an international human rights framework as an ethical and legal basis to understand what is being done. We also tried to further the debate by clarifying a few things: what has already been done by law enforcement, and, really crucially, what the perspectives are of all those involved, including lawmakers, law enforcers, technology companies, academia and many others. We derived the insights in the report from a series of workshops, one of which was held as part of the EU-funded VOX-Pol network. The workshops involved participants who were quite high up in law enforcement, the intelligence agencies, the tech industry, civil society, and academia. We followed these up with interviews with other individuals in similar positions and conducted background policy research. Ed: You highlight that many extremist groups (such as Isis) are making really significant use of online platforms to organise,…

For data sharing between organisations to be straightforward, there needs to be a common understanding of basic policy and practice.

Many organisations are coming up with their own internal policy and guidelines for data sharing. However, for data sharing between organisations to be straightforward, there needs to be a common understanding of basic policy and practice. During her time as an OII Visiting Associate, Alison Holt developed a pragmatic solution in the form of a Voluntary Code, anchored in the developing ISO standards for the Governance of Data. She discusses the voluntary code, and the need to provide urgent advice to organisations struggling with policy for sharing data. Collecting, storing and distributing digital data is significantly easier and cheaper now than ever before, in line with predictions from Moore, Kryder and Gilder. Organisations are incentivised to collect large volumes of data with the hope of unleashing new business opportunities or maybe even new businesses. Consider the likes of Uber, Netflix, and Airbnb and the other data mongers who have built services based solely on digital assets. The use of this new abundant data will continue to disrupt traditional business models for years to come, and there is no doubt that these large data volumes can provide value. However, they also bring associated risks (such as unplanned disclosure and hacks) and they come with constraints (for example in the form of privacy or data protection legislation). Hardly a week goes by without a data breach hitting the headlines. Even if your telecommunications provider didn’t inadvertently share your bank account and sort code with hackers, and your child wasn’t one of the hundreds of thousands of children whose birthdays, names, and photos were exposed by a smart toy company, you might still be wondering exactly how your data is being looked after by the banks, schools, clinics, utility companies, local authorities and government departments that are so quick to collect your digital details.
Then there are the companies who have invited you to sign away the rights to your data and possibly your…

Government involvement in crowdsourcing efforts can actually be used to control and regulate volunteers from the top down—not just to “mobilise them”.

RUSSIA, NEAR RYAZAN - 8 MAY 2011: Piled-up wood in the forest one winter after the huge forest fires in Russia in 2010. Image: Max Mayorov (Flickr).

There is a great deal of interest in the use of crowdsourcing tools and practices in emergency situations. Gregory Asmolov’s article Vertical Crowdsourcing in Russia: Balancing Governance of Crowds and State–Citizen Partnership in Emergency Situations (Policy and Internet 7,3) examines crowdsourcing of emergency response in Russia in the wake of the devastating forest fires of 2010. Interestingly, he argues that government involvement in these crowdsourcing efforts can actually be used to control and regulate volunteers from the top down—not just to “mobilise them”. My interest in the role of crowdsourcing tools and practices in emergency situations was triggered by my personal experience. In 2010 I was one of the co-founders of the Russian “Help Map” project, which facilitated volunteer-based response to wildfires in central Russia. When I was working on this project, I realised that a crowdsourcing platform can bring the participation of the citizen to a new level and transform sporadic initiatives by single citizens and groups into large-scale, relatively well coordinated operations. What was also important was that both the needs and the forms of participation required in order to address these needs be defined by the users themselves. To some extent the citizen-based response filled the gap left by the lack of a sufficient response from the traditional institutions.[1] This suggests that the role of ICTs in disaster response should be examined within the political context of the power relationship between members of the public who use digital tools and the traditional institutions. My experience in 2010 was the first time I was able to see that, while we would expect that in a case of natural disaster both the authorities and the citizens would be mostly concerned about the emergency, the actual situation might be different. 
Apparently the emergence of independent, citizen-based collective action in response to a disaster was regarded as some type of threat by the institutional actors. First, it was a threat to…

Exploring how involvement in the citizen initiatives affects attitudes towards democracy

Crowdsourcing legislation is an example of a democratic innovation that gives citizens a say in the legislative process. In their Policy and Internet journal article ‘Does Crowdsourcing Legislation Increase Political Legitimacy? The Case of Avoin Ministeriö in Finland’, Henrik Serup Christensen, Maija Karjalainen and Laura Nurminen explore how involvement in the citizen initiatives affects attitudes towards democracy. They find that crowdsourcing citizen initiatives can potentially strengthen political legitimacy, but both outcomes and procedures matter for the effects. Crowdsourcing is a recent buzzword that describes efforts to use the Internet to mobilise online communities to achieve specific organisational goals. While crowdsourcing serves several purposes, the most interesting potential from a democratic perspective is the ability to crowdsource legislation. By giving citizens the means to affect the legislative process more directly, crowdsourcing legislation is an example of a democratic innovation that gives citizens a say in the legislative process. Recent years have witnessed a scholarly debate on whether such new forms of participatory governance can help cure democratic deficits such as a declining political legitimacy of the political system in the eyes of the citizenry. However, it is still not clear how taking part in crowdsourcing affects the political attitudes of the participants, and the potential impact of such democratic innovations therefore remains unclear. In our study, we contribute to this research agenda by exploring how crowdsourcing citizens’ initiatives affected political attitudes in Finland. The non-binding Citizens’ Initiative instrument in Finland was introduced in spring 2012 to give citizens the chance to influence the agenda of political decision making.
In particular, we zoom in on people active on the Internet website Avoin Ministeriö (Open Ministry), which is a site based on the idea of crowdsourcing where users can draft citizens’ initiatives and deliberate on their contents. As is frequently the case for studies of crowdsourcing, we find that only a small portion of the users are actively involved in the crowdsourcing process. The option to deliberate…

Discussing the digitally crowdsourced law for same-sex marriage that was passed in Finland and analysing how the campaign created practices that affect democratic citizenship.

There is much discussion about a perceived “legitimacy crisis” in democracy. In his article The Rise of the Mediating Citizen: Time, Space, and Citizenship in the Crowdsourcing of Finnish Legislation, Taneli Heikka (University of Jyväskylä) discusses the digitally crowdsourced law for same-sex marriage that was passed in Finland in 2014, analysing how the campaign used new digital tools and created practices that affect democratic citizenship and power making. Ed: There is much discussion about a perceived “legitimacy crisis” in democracy. For example, less than half of the Finnish electorate under 40 choose to vote. In your article you argue that Finland’s 2012 Citizens’ Initiative Act aimed to address this problem by allowing for the crowdsourcing of ideas for new legislation. How common is this idea? (And indeed, how successful?) Taneli: The idea that digital participation could counter the “legitimacy crisis” is a fairly common one. Digital utopians have nurtured that idea from the early years of the internet, and have often been disappointed. A couple of things stand out in the Finnish experiment that make it worth a closer look. First, the digital crowdsourcing system with strong digital identification is a reliable and potentially viral campaigning tool. Most civic initiative systems I have encountered rely on manual or otherwise cumbersome, and less reliable, signature collection methods. Second, in the Finnish model, initiatives that break the threshold of 50,000 names must be treated in the Parliament equally to an initiative from a group of MPs. This gives the initiative constitutional and political weight. Ed: The Act led to the passage of Finland’s first equal marriage law in 2014. In this case, online platforms were created for collecting signatures as well as drafting legislation. An NGO created a well-used platform, but it subsequently had to shut it down because it couldn’t afford the electronic signature system. 
Crowds are great, but not a silver bullet if something as prosaic as authentication is impossible. Where should the…

How do you increase the quality of feedback without placing citizens on different-level playing fields from the outset—particularly where technology is concerned?

Ed: Given the “crisis in democratic accountability”, methods to increase citizen participation are in demand. To this end, your team developed some interactive crowdsourcing technologies to collect public opinion around an urban renovation project in Oulu, Finland. What form did the consultation take, and how did you assess its impact? Simo: Over the years we’ve deployed various types of interactive interfaces on a network of public displays. In this case it was basically a network of interactive screens deployed in downtown Oulu, next to where a renovation project was happening that we wanted to collect feedback about. We deployed an app on the screens that allowed people to type feedback directly on the screens (on-screen soft keyboard), and submit feedback to city authorities via SMS, Twitter and email. We also had a smiley-based “rating” system there, which people could use to leave quick feedback about certain aspects of the renovation project. We ourselves could not, and did not even want to, assess the impact; that’s why we did this in partnership with the city authorities. Then, together with the city folks we could better evaluate if what we were doing had any real-world value whatsoever. And, as we discuss, in the end it did! Ed: How did you go about encouraging citizens to engage with touch screen technologies in a public space, particularly the non-digitally literate, or maybe people who are just a bit shy about participating? Simo: Actually, the whole point was that we did not deliberately encourage them by advertising the deployment or by “forcing” anyone to use it. Quite to the contrary: we wanted to see if people voluntarily used it, and the technologies that are an integral part of the city itself. This is kind of the future vision of urban computing, anyway. The screens had been there for years already, and what we wanted to see is if people find this type of service on their own when…

Assessing the extent to which crowdsourcing represents an emerging opportunity of participation in global public policymaking.

What are the linkages between multistakeholder governance and crowdsourcing? Both are new (trendy, if you will) approaches to governance premised on the potential of collective wisdom, bringing together diverse groups in policy-shaping processes. Their interlinkage has remained underexplored so far. Our article recently published in Policy and Internet sought to investigate this in the context of Internet governance, in order to assess the extent to which crowdsourcing represents an emerging opportunity of participation in global public policymaking. We examined two recent Internet governance initiatives which incorporated crowdsourcing with mixed results: the first one, the ICANN Strategy Panel on Multistakeholder Innovation, received only limited support from the online community; the second, NETmundial, had a significant number of online inputs from global stakeholders who had the opportunity to engage using a platform for political participation specifically set up for the drafting of the outcome document. The study builds on these two cases to evaluate how crowdsourcing was used as a form of public consultation aimed at bringing the online voice of the “undefined many” (as opposed to the “elected few”) into Internet governance processes. From the two cases, it emerged that the design of the consultation processes conducted via crowdsourcing platforms is key in overcoming barriers of participation. For instance, in the NETmundial process, the ability to submit comments and participate remotely via www.netmundial.br attracted inputs from all over the world very early on, from the preparatory phase of the meeting onwards. In addition, substantial public engagement was obtained from the local community in the drafting of the outcome document, through a platform for political participation (www.participa.br) that gathered comments in Portuguese.
In contrast, the outreach efforts of the ICANN Strategy Panel on Multistakeholder Innovation remained limited; the crowdsourcing platform they used only gathered input (exclusively in English) from a small group of people, insufficient to attribute to online public input a significant role in the reform of ICANN’s multistakeholder processes. Second, questions around how crowdsourcing should…