topic modelling

Exploring what sorts of reactions people might have to examples of assault and how they might differ online and offline.

Conversation between Laura Bates, Judy Wajcman (speaking) and Helen Margetts, at the Everyday Sexism Datahack, organised by the OII to encourage creative engagement with the textual data gathered by the Everyday Sexism project.

To encourage new ways of thinking about the problem of sexism in daily life, the OII’s recent Everyday Sexism Datahack brought together twenty people from a range of disciplinary backgrounds to analyse the written accounts of sexism and harassment gathered by the Everyday Sexism project. Founded by Laura Bates in 2012, Everyday Sexism has gathered more than 120,000 accounts submitted by members of the public. A research team at the OII has already been analysing the content, and provided the datahack participants with cleaned data that could be analysed through qualitative and quantitative methods.

Following an introduction to the project by Laura Bates, an outline of the dataset by Taha Yasseri, and a speed-networking session led by Kathryn Eccles, we fell into two teams to work with the data. Our own group wanted to examine the question of how people interact with the threat of public space. We were also interested in how public space is divided between online and offline, and the social perception of being online versus offline. We wanted to explore what sorts of reactions people might have to examples of assault, what strategies they might adopt in response to something happening to them, and how these might differ online and offline.

We spent the first hour collecting keywords that might indicate reactions to either online or offline harassment, including identifying a perceived threat and coping with it. We then searched the raw data for responses like “I tried to ignore it,” “I felt safe/unsafe,” “I identified a risk,” “I was feeling worried, feeling anxious or nervous”; and also looked at online versus offline actions. For online actions we were looking for specific platforms being named, and for people using words like “comment, response, delete, remove” in relation to social media posts. For offline actions we were looking for things like “I carried a [specific item]” or “I hid or avoided certain areas” or “I walked faster”…
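A keyword search of this kind could be sketched roughly as follows. This is only an illustration, not the code used at the datahack: the keyword lists are a guess based on the phrases quoted above, the function name `tag_account` is hypothetical, and the two sample sentences are made up for the example rather than taken from the Everyday Sexism data.

```python
import re

# Hypothetical keyword lists, loosely based on the phrases quoted in the post.
KEYWORDS = {
    "online": ["comment", "response", "delete", "remove"],
    "offline": ["carried", "hid", "avoided", "walked faster"],
    "reaction": ["ignore", "safe", "unsafe", "risk", "worried", "anxious", "nervous"],
}


def compile_patterns(keywords):
    """Build one case-insensitive regex per label.

    Each keyword must start at a word boundary; the trailing \\w* also
    catches simple suffixes such as "deleted" or "comments".
    """
    return {
        label: re.compile(
            r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\w*",
            re.IGNORECASE,
        )
        for label, terms in keywords.items()
    }


PATTERNS = compile_patterns(KEYWORDS)


def tag_account(text, patterns=PATTERNS):
    """Return, for each label, the sorted set of keywords found in the text."""
    return {
        label: sorted({match.group(0).lower() for match in pattern.finditer(text)})
        for label, pattern in patterns.items()
    }


# Made-up sample sentences, not real submissions.
samples = [
    "I tried to ignore it and walked faster.",
    "I deleted the comment and felt unsafe.",
]

for sample in samples:
    print(sample, tag_account(sample))
```

Plain keyword matching like this is crude: it ignores negation and context, so the matches are best treated as candidates for closer reading, not as final labels.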

How does the topic modelling algorithm ‘discover’ the topics within the context of everyday sexism?

We recently announced the start of an exciting new research project that will use topic modelling to understand the patterns in the stories submitted to the Everyday Sexism website. Here, we briefly explain our text analysis approach, “topic modelling”. At its very core, topic modelling is a technique that seeks to automatically discover the topics contained within a group of documents. ‘Documents’ in this context could refer to text items as lengthy as individual books, or as short as sentences within a paragraph. Let’s take the idea of sentences-as-documents as an example:

Document 1: I like to eat kippers for breakfast.
Document 2: I love all animals, but kittens are the cutest.
Document 3: My kitten eats kippers too.

Assume that each sentence contains a mixture of different topics, and that a ‘topic’ can be understood as a collection of words, of any part of speech, that have different probabilities of appearing in passages discussing the topic. How does the topic modelling algorithm ‘discover’ the topics within these sentences? The algorithm is initiated by setting the number of topics that it needs to extract. Of course, it is hard to guess this number without some insight into the topics, but one can think of it as a resolution tuning parameter. The smaller the number of topics, the more general the bag of words in each topic will be, and the looser the connections between those words. The algorithm loops through all of the words in each document, assigning every word to one of our topics in a temporary and semi-random manner. This initial assignment is arbitrary and it is easy to show that different initialisations lead to the same results in the long run. Once each word has been assigned a temporary topic, the algorithm then re-iterates through each word in each document to update the topic assignment using two criteria: 1) How prevalent is the word in question across topics? And 2) How prevalent are the…
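To make the loop described above more concrete, here is a minimal sketch in Python, run on the three example sentences. It is not the project's actual implementation. Because the excerpt breaks off before the second criterion is spelled out, the update rule is an assumption: it uses the standard collapsed Gibbs sampling update for latent Dirichlet allocation, which weighs how prevalent the word is in each topic against how prevalent each topic is in the current document. The function name `gibbs_lda` and the smoothing parameters `alpha` and `beta` are illustrative choices, and no stemming is applied, so "kitten" and "kittens" count as different words.

```python
import random
from collections import Counter

docs = [
    "I like to eat kippers for breakfast",
    "I love all animals but kittens are the cutest",
    "My kitten eats kippers too",
]


def gibbs_lda(docs, num_topics=2, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative sketch only)."""
    rng = random.Random(seed)
    tokens = [doc.lower().split() for doc in docs]
    vocab_size = len({word for doc in tokens for word in doc})

    doc_topic = [[0] * num_topics for _ in tokens]       # topic counts per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # total words per topic

    # Step 1: give every word a temporary, random topic.
    assignments = []
    for d, doc in enumerate(tokens):
        doc_assignments = []
        for word in doc:
            z = rng.randrange(num_topics)
            doc_assignments.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word] += 1
            topic_total[z] += 1
        assignments.append(doc_assignments)

    # Step 2: revisit each word repeatedly and resample its topic.
    for _ in range(iterations):
        for d, doc in enumerate(tokens):
            for i, word in enumerate(doc):
                # Remove the word's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][word] -= 1
                topic_total[z] -= 1

                # Weight for topic k = (prevalence of the word in topic k)
                #                    * (prevalence of topic k in this document).
                weights = [
                    (topic_word[k][word] + beta) / (topic_total[k] + vocab_size * beta)
                    * (doc_topic[d][k] + alpha)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Record the new assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][word] += 1
                topic_total[z] += 1

    return tokens, assignments, doc_topic, topic_word


tokens, assignments, doc_topic, topic_word = gibbs_lda(docs)
for k, counts in enumerate(topic_word):
    top_words = [word for word, n in counts.most_common(5) if n > 0]
    print(f"Topic {k}: {top_words}")
for d, counts in enumerate(doc_topic):
    print(f"Document {d + 1} topic counts: {counts}")
```

With only three short sentences the resulting topics are not very meaningful, and they vary with the random seed; the sketch is meant to show the mechanics of the assign-then-update loop, not to produce a useful model.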

What are the most common types of sexism globally, and (how) do they relate to each other? Do experiences of sexism change from one country to another?

When barrister Charlotte Proudman recently spoke out regarding a sexist comment that she had received on the professional networking website LinkedIn, hundreds of women praised her actions in highlighting the issue of workplace sexism, and many of them began to tell similar stories of their own. It soon became apparent that Proudman was not alone in experiencing this kind of sexism, a fact further corroborated by Laura Bates of the Everyday Sexism Project, who asserted that workplace harassment is “the most reported kind of incident” on the project’s UK website. Proudman’s experience and Bates’ comments on the number of submissions to her site concerning harassment at work provoke a conversation about the nature of sexism, not only in the UK but also at a global level. We know that since its launch in 2012, the Everyday Sexism Project has received over 100,000 submissions in more than 13 different languages, concerning a variety of topics. But what are these topics? As Bates has stated, in the UK, workplace sexism is the most commonly discussed subject on the website, but is this also the case for the Everyday Sexism sites in France, Japan, or Brazil? What are the most common types of sexism globally, and (how) do they relate to each other? Do experiences of sexism change from one country to another? The multi-lingual reports submitted to the Everyday Sexism project are undoubtedly a gold mine of crowdsourced information with great potential for answering important questions about instances of sexism worldwide, as well as drawing an overall picture of how sexism is experienced in different societies. So far, much of the research relating to the Everyday Sexism project has focused on qualitative content analysis, and has been limited to the submissions written in English. Along with Principal Investigators Taha Yasseri and Kathryn Eccles, I will be acting as Research Assistant on a new project funded by the John Fell Oxford University Press…