Recap: Digital Methods Colloquium (December 7, 2023)

Digital and computational data collection and analysis methods such as mobile/internet tracking, experience sampling, web scraping, text mining, machine learning, and image recognition have become more relevant than ever in the social sciences. While these methods enable new avenues of inquiry, they also present many challenges. It is important to share and discuss research, experiences, and challenges surrounding these methods with other researchers to exchange ideas and to learn from experiences.

For this reason, Roland Toth from the Methods Lab and research fellow Douglas Parry organized the Digital Methods Colloquium that took place on December 7 at the Weizenbaum Institute. They invited researchers from all over Germany who had used such methods before. The focus lay on sharing not only successes, but – even more so – the challenges that they had experienced in the research process.

In the first part of the colloquium, participants presented recent or past research projects for which they had used digital methods. The presentations covered various methods, including experience sampling, mobile logging/tracking, multimodal content classification, network analysis, and large language models. All presentations were received very well and led to high engagement with many questions and exchanges from the participants.

The second part of the colloquium was designed to facilitate interactive discussion and knowledge sharing among the participants. They were assigned to one of two discussion groups that focused on either data collection or data analysis in the context of digital methods. In each group, participants followed prompts and discussed urgent issues and possible solutions, which they then visualized using posters. Finally, both groups sat together and presented the posters to each other, leading to a final discussion. After a short wrap-up, some participants joined the hosts at the Christmas Market for a well-deserved hot beverage.

The hosts would like to thank all participants for attending and engaging in the Digital Methods Colloquium. Bringing together researchers from different fields demonstrated that there are more commonalities than differences when it comes to the challenging and exciting field of digital methods. We are looking forward to more exchange and, possibly, Part 2 of the Digital Methods Colloquium sometime in the future.

Workshop Recap: A Practical Introduction to Text Analysis (November 30, 2023)

On November 30th, 2023, the Methods Lab organized a workshop on quantitative text analysis. The workshop was conducted by Douglas Parry (Stellenbosch University) and covered the whole process of text analysis from data preparation to the visualization of sentiments or topics identified.

In the first half of the workshop, Douglas covered the first steps involved in text analysis, such as tokenization (the transformation of texts into smaller parts like single words or consecutive words), the removal of “stop words” (words that do not contain meaningful information), and the aggregation of content by meta-information (authors, books, chapters, etc.). Apart from the investigation of the frequency with which terms occur, sentiment analysis using existing dictionaries was also addressed. This technique involves assigning values to each word representing certain targeted characteristics (e.g., emotionality/polarity), which in turn allows for comparing overall sentiments between different corpora. Finally, the visualization of word occurrences and sentiments was covered. After this introduction, participants had the chance to apply their knowledge using the programming language R by solving tasks with texts Douglas provided.
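The steps described above can be sketched in a few lines. The workshop itself used R; the following is a minimal Python illustration, not the workshop material. The stop-word list, the sentiment dictionary, and the two example texts are hypothetical toy data, and all function names are made up for this sketch.

```python
import re
from collections import Counter

# Hypothetical toy resources; a real analysis would load an established
# stop-word list and sentiment lexicon instead.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "of", "to", "but"}
SENTIMENT = {"great": 1, "good": 1, "happy": 1, "bad": -1, "sad": -1, "terrible": -1}


def tokenize(text):
    """Split a text into lowercase single-word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bigrams(tokens):
    """Pair each token with its successor (tokens of two consecutive words)."""
    return list(zip(tokens, tokens[1:]))


def remove_stop_words(tokens):
    """Drop tokens that carry little meaningful information."""
    return [t for t in tokens if t not in STOP_WORDS]


def sentiment_score(tokens):
    """Sum the dictionary values of all tokens; unknown words count as 0."""
    return sum(SENTIMENT.get(t, 0) for t in tokens)


# Texts aggregated by meta-information (here: a hypothetical author label).
corpora = {
    "author_a": "The workshop was great and the examples were good.",
    "author_b": "The room was bad, but the coffee was terrible.",
}

for author, text in corpora.items():
    tokens = remove_stop_words(tokenize(text))
    # Term frequencies and an overall sentiment score per author.
    print(author, Counter(tokens).most_common(3), sentiment_score(tokens))
```

Comparing the summed scores per author is the simplest form of the corpus-level comparison mentioned above; in practice, scores are usually normalized by text length first.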

In the second half of the workshop, Douglas focused on different methods of topic modeling, which ultimately attempt to assign texts to latent topics based on the words they contain. In comparison to simpler procedures covered in the first half of the workshop, topic models can also consider the context of words within the texts. Specifically, Douglas introduced participants to Latent Dirichlet Allocation (LDA), Correlated Topic Modeling (CTM), and Structural Topic Modeling (STM). One of the most important decisions to be made for any such model is the number of topics to emerge: too few may dilute nuances within topics and too many may lead to redundancies. The visualization and – most importantly – limitations of topic modeling were also discussed before participants performed topic modeling themselves with the data provided earlier. Finally, Douglas concluded with a summary of everything covered and an overview of advanced subjects in text analysis.

The workshop was very well received and prepared all participants for text analysis in the future. Douglas struck a good balance between lecture-style sections and well-prepared, hands-on application, and he provided all materials in a way that let participants focus on the tasks at hand while following a logical structure throughout. We would like to thank him for this great introduction to text analysis!

First Research Fellow at the Methods Lab

The Methods Lab is excited to welcome its first research fellow, who arrived at the Weizenbaum Institute on November 20: Douglas Parry from Stellenbosch University, South Africa. His research focuses on Socio-Informatics in the areas of Communication Science, Human-Computer Interaction, and Media/CyberPsychology.

During his 4-week stay, Douglas Parry will contribute to work at the Methods Lab in different ways. On November 30, he will hold the workshop A Practical Introduction to Text Analysis, where he covers all important steps, from pre-processing text to visualizing the results of topic modeling, in a single day. On December 7, he will host a Digital Methods Colloquium together with Roland Toth, where German researchers focusing on digital methods will get together, present recent work, and discuss challenges and opportunities in the field.

Furthermore, Douglas Parry is collaborating on two research projects with the Methods Lab during his stay, both of which involve the processing of complex data surrounding smartphone usage that were collected using multiple methods earlier this year.

The Methods Lab is happy to host Douglas Parry and is looking forward to the results of this exciting partnership – stay tuned!

Workshop: A Practical Introduction to Text Analysis

We are excited to announce our upcoming workshop, “A Practical Introduction to Text Analysis”, on Thursday, November 30, at the Weizenbaum Institute. Led by visiting fellow Dr. Douglas Parry (Stellenbosch University, South Africa), this workshop offers a comprehensive introduction to text analysis using the R programming language. Topics covered include text pre-processing (formats, tokenization, stemming, stop words, regex), dictionary analysis (lexicons, tf-idf, sentiment), topic modeling (LDA, CTM, STM), and data visualization. By the end of the workshop, participants will be equipped to tackle real-world text-mining tasks and have a solid foundation to move on to more advanced analysis techniques. While a basic understanding of R programming is expected, prior experience in text analysis is not necessary.

For more details about the workshop, visit our program page. We look forward to your participation!

Workshop Recap: Theory Construction – Building and Advancing Theories for Empirical Social Science (September 14, 2023)

On September 14th, 2023, the Methods Lab organized a workshop on the rationale and methodology of theory building in empirical research. The workshop was conducted by Adrian Meier (U of Erlangen-Nürnberg) and aimed to provide participants with an orientation for working with theories in a meaningful way that provides a foundation for empirical research.

In the first section of the workshop, Adrian outlined what theories are and how they relate to the overarching mission of science. The introduction focused on the differentiation between theories, concepts, constructs, and models and addressed the interplay between theories and empirical research.

After this introduction, the focus shifted to challenges and problems of social scientific theorizing. Participants were given the opportunity to add issues and questions they identified in the past when working with theories. Most prominently, they mentioned confusion due to different terminology that is used for specific concepts (i.e., synonymy and ambiguity), the “moving target” problem (as phenomena are changing while they are being studied), and the lack of incentivization to focus on theory in the formalized infrastructure of empirical research. Adrian elaborated on some of the underlying issues uniting many of these challenges: Theories are underdetermined by evidence, concepts and measurement instruments are rarely validated, and manipulations in experimental research are not precise enough.

In the last section of the workshop, participants learned about a recently proposed Theory Construction Methodology (Borsboom et al., 2021) and took part in an accompanying exercise. They were asked to read a one-pager summarizing crucial elements of the Mood Management Theory, a popular theory in the field of media psychology. Within this text, they were asked to identify statements about the phenomena the theory is supposed to explain, data that supported it (or not), and the theoretical statements themselves (e.g., premises, propositions). The exercise was meant to increase participants’ sensitivity in differentiating between these elements in their own work. Lastly, Adrian gave an outlook on how theories can be formalized and how non-confirmatory research practices can play a crucial role in fostering theory construction.

The workshop was a great and unconventional addition to this year’s series of workshops organized by the Methods Lab. Adrian structured and executed it brilliantly and gave participants – who were associated with various fields of research and very engaged – lots of room for discussions.

We would like to thank Adrian for his thorough and inspiring workshop and hope he will contribute to the Methods Lab program again in the future. In the meantime, we recommend following him on X for updates on his research!

Workshop postponed – Interdisciplinarity in Action: Methods for Fruitful Teamwork

The announced workshop on interdisciplinary (practical) methods has been postponed to 2024. The exact date and program will be announced in due time, so stay tuned. A shorter, slightly modified online version of the workshop will be offered on Friday, 6 October 2023. If you are interested in participating, please contact Sara Saba (sara.saba@weizenbaum-institut.de) or Stephanie Bouré (stephanie.boure@weizenbaum-institut.de) directly.

Workshop Recap: Whose Data Is It Anyway? Ethical, Practical, and Methodological Challenges of Data Donation in Messenger Groups Research (August 30, 2023)

On August 30th, 2023, the Methods Lab and Olga Pasitselska (U of Groningen) organized a workshop on data donation in messaging groups research. The workshop was intended to tackle practical and ethical issues behind data collection, processing, and dissemination in the research of closed messaging groups. We asked four colleagues to share their experiences and struggles and to present their solutions for research on closed chat groups. The invited speakers, Sérgio Barbosa (U Coimbra), Katharina Knop-Hülß (HMTMH Hannover), Connie Moon Sehat (Hacks/Hackers), and Julian Kohne (GESIS), paved the way for a better conceptualization of messaging groups and the application of tailor-made ethical and practical solutions. The workshop allowed for a cross-field discussion of ad-hoc developments in closed groups research and provided many insights for the audience, speakers, and organizers.

Sérgio Barbosa explained his approach of joining activist WhatsApp groups in Brazil. Sérgio suggested that informed consent cannot be assumed as a one-off solution: instead, one should go beyond the check-list of ethical guidelines and learn by doing and negotiating with the group members. When joining these types of groups, researchers should clearly state the purposes of the research and disclose their identity, and also share the outcome of the research and promote it in the local community as well. Different approaches should be taken, depending on the type of groups: for example, pro-democracy groups and extremist groups should be treated differently, independent of the group size.

Dr. Katharina Knop-Hülß shared insights about studying non-professional secondary groups (e.g., choir, sport, volunteer groups) with her highly unobtrusive yet highly invasive research approach of scraping chats’ content. Since these groups are representative of intimate environments of everyday communication, they can be considered “safe spaces”, closed off from the public eye. To account for the sensitive nature of the data collection, Katharina used an opt-in approach, provided pseudonymized chat logs to the participants before they consented to participate, and complied with the requirement not to share this data with anyone beyond the research team, even after the data was pseudonymized.

Julian Kohne introduced his digital platform for WhatsApp data donation that automatically cleans and anonymizes the data, reducing researchers’ exposure to and intervention in the raw data. In his research, Julian takes a participant-centered approach: the data collection tool is designed to maximize usability and control of the data for research participants. They can pre-process the data in a way that allows them to review the chat logs and decide what exactly they want to donate, deleting undesirable pieces of data, up to the possibility of deleting time stamps and other meta-data. The tool also allows researchers to track how much and what types of data were deleted.
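To make the general idea of participant-controlled donation concrete, here is a minimal Python sketch. It is not the platform's actual implementation: the message structure (a list of dictionaries with `sender`, `time`, and `text` keys), the function names, and the example chat are all hypothetical assumptions chosen for illustration.

```python
def pseudonymize(messages):
    """Replace each sender name with a stable placeholder label."""
    labels = {}
    result = []
    for msg in messages:
        # The same sender always maps to the same label within one chat.
        label = labels.setdefault(msg["sender"], f"Person_{len(labels) + 1}")
        result.append({**msg, "sender": label})
    return result


def apply_donation_choices(messages, drop_indices, drop_timestamps=False):
    """Keep only what the participant agreed to donate.

    Returns the donated messages plus a small report, so that researchers
    can see how much was removed without seeing the removed content.
    """
    kept = []
    for i, msg in enumerate(messages):
        if i in drop_indices:
            continue  # the participant chose to withhold this message
        msg = dict(msg)
        if drop_timestamps:
            msg.pop("time", None)
        kept.append(msg)
    report = {
        "total": len(messages),
        "deleted": len(messages) - len(kept),
        "timestamps_removed": drop_timestamps,
    }
    return kept, report


# Hypothetical example chat.
chat = [
    {"sender": "Alice", "time": "2023-08-01 10:00", "text": "Rehearsal at 7?"},
    {"sender": "Bob", "time": "2023-08-01 10:02", "text": "I'd rather keep this private."},
    {"sender": "Alice", "time": "2023-08-01 10:05", "text": "See you there!"},
]

donated, report = apply_donation_choices(
    pseudonymize(chat), drop_indices={1}, drop_timestamps=True
)
print(donated)
print(report)
```

Returning the report separately from the donated messages mirrors the point made above: the amount and type of deleted data can be tracked without exposing the deleted content itself.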

Dr. Connie Moon Sehat presented a meta-review of closed messaging apps research that aimed to determine which conditions, in terms of indexed invites, group size, discussion topics, or other aspects of closed groups, make them arguably public or private. Adding to the previous speakers’ examples of research with activist/public and hobby and friends/private types of groups, the review summarized the points discussed and provided a framework for mapping chat groups according to multiple parameters. Whether researchers scraped the groups without entering them, entered with an invitation, or disclosed their identity and research interest depended on the nature of the groups and on the public interest that can justify researchers’ intervention in closed communication spaces. Connie also stressed the possible differences in perceptions of groups’ “publicness” between users, researchers, and platforms, which should also be taken into account.

After the four presentations, we continued the discussion with the online and offline audience, addressing the generalizability of messaging data (what slice of the “natural” social interaction are we looking at here?), the role of language, and the differences between long- and short-term groups. We also discussed the role of the researcher in automated versus manual data collection, and how participants can benefit from data donation.

The workshop provided theoretical and practical insights for messaging groups research and outlined future directions for collaboration in creating the guidelines for ethical closed messaging research and data donation.

Workshop: Interdisciplinarity in Action: Methods for Fruitful Teamwork (October 4, 2023)

We are excited to announce our upcoming workshop, “Interdisciplinarity in Action: Methods for Fruitful Teamwork,” scheduled for Wednesday, October 4, at the Weizenbaum Institute. Led by Silvio Suckow and Sara Saba (both WI), this intensive one-day workshop provides practical tools and knowledge for enhancing teamwork and interdisciplinary collaboration. The workshop offers diverse perspectives and actionable advice for structuring interdisciplinary teams and their work, hands-on practice of various team-building methods, and an input presentation by an external speaker. It is open to anyone interested in interdisciplinary research, whether leading or collaborating on such projects. Please note that spots are limited and allocated on a first-come, first-served basis. A slightly modified online version of the course will be offered separately.

For more details about the workshop, visit our program page. We look forward to seeing you there!

Workshop Recap: Introduction to Topic Modeling (June 15, 2023)

On June 15, the Methods Lab organized the workshop Introduction to Topic Modeling in collaboration with the research group Platform Algorithms and Digital Propaganda. The workshop aimed to provide participants with a comprehensive understanding of topic modeling – a machine-learning technique used to determine clusters of similar words (i.e., topics) within bodies of text. The event took place at the Weizenbaum Institute in a hybrid format, bringing together researchers from various institutions.

The workshop was conducted by Daniel Matter (TU Munich) who guided the participants through basic concepts and applications of this method. Through theory, demonstrations, and practical examples, participants gained insight into commonly used algorithms such as Latent Dirichlet Allocation (LDA) and BERT-based topic models. The workshop enabled participants to assess the advantages and drawbacks of each approach, equipping them with a foundation in topic modeling while, at the same time, providing plenty of new insights to those with prior expertise.

During the workshop, Daniel explained the distinction between LDA and BERTopic, two popular topic modeling strategies. LDA, or Latent Dirichlet Allocation, a commonly used method for topic modeling, operates as a generative model and treats each document as a mixture of topics. LDA aims to determine the topic and word distributions that maximize the probability of generating the documents in the corpus. With LDA, as opposed to BERTopic, the number of topics must be specified beforehand.
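The generative view of LDA can be illustrated with a short Python sketch using only the standard library. It shows the forward process (a document is drawn as a mixture of topics), not the inference step that fits topic and word distributions to a real corpus. The vocabulary, the two topic-word distributions, the prior `alpha`, and all function names are hypothetical values chosen for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw a probability vector from a Dirichlet prior via gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(topic_word, vocab, alpha, length, rng):
    """Generate one document following the LDA generative story."""
    # 1. Each document gets its own mixture over the fixed set of topics.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # 2. For every word position, pick a topic from that mixture ...
        topic = sample_index(theta, rng)
        # 3. ... then pick a word from the chosen topic's word distribution.
        words.append(vocab[sample_index(topic_word[topic], rng)])
    return theta, words


rng = random.Random(0)
vocab = ["election", "vote", "party", "goal", "match", "team"]
# Two topics, fixed in advance: the number of topics equals len(alpha).
topic_word = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # a "politics"-like topic
    [0.0, 0.0, 0.0, 0.3, 0.3, 0.4],  # a "sports"-like topic
]
theta, words = generate_document(topic_word, vocab, alpha=[0.5, 0.5], length=8, rng=rng)
print([round(p, 2) for p in theta], words)
```

Fitting LDA runs this story in reverse: given only the observed words, it searches for the topic mixtures and topic-word distributions under which the corpus is most probable.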

BERTopic, on the other hand, belongs to the category of Embeddings-Based Topic Models (EBTM), which take a different approach. Unlike LDA, which treats words as distinct features, BERTopic incorporates semantic relationships between words. BERTopic follows a bottom-up approach, embedding documents in a semantic space and extracting topics from this transformed representation. Unlike LDA, which can be applied to short and long text corpora, BERTopic generally works better on shorter text, such as social media posts or news headlines.

When deciding between BERTopic and LDA, it is essential to consider the specific requirements of the text analysis. BERTopic’s strength lies in its flexibility and ability to handle short texts effectively, while LDA is preferred when strong interpretability is needed.

With this workshop, we at the Methods Lab hope to have provided our attendees with a solid understanding of topic modeling as a method. Having explored the concepts, applications, and advantages of each approach, researchers can use these tools to uncover hidden semantic structures within textual data and apply them in various domains, supporting tasks such as document clustering, information retrieval, and recommender systems.

A big thank you to Daniel for introducing us to the world of topic modeling and to all our participants!

Our next workshop, Whose Data is it Anyway? Ethical, Practical, and Methodological Challenges of Data Donation in Messenger Groups Research, will take place on August 30, 2023. See you there!

Workshop: Theory Construction: Building and Advancing Theories for Empirical Social Science (September 14, 2023)

We are excited to announce our upcoming workshop, Theory Construction: Building and Advancing Theories for Empirical Social Science, which will take place on Thursday, September 14 in the Kassenhalle (main hall), WI. Led by Adrian Meier (FAU Erlangen-Nürnberg) and created in collaboration with Dr. Daniel Possler (JMU Würzburg), this intensive “crash course” will equip participants with practical strategies for constructing and advancing social scientific theories. Beginning with an exploration of fundamental concepts, structure, and quality criteria of social scientific theories, Adrian will delve into hands-on techniques for building and advancing theory. The workshop will focus on the theory-building process as well as the micro-level of social analysis, offering examples from media psychology and communication science.

For more information, visit our program page. See you there!