The announced workshop on interdisciplinary (practical) methods has been postponed to 2024; the exact date and program will be announced in due time, so stay tuned. A shorter, slightly modified online version of the workshop will be offered on Friday, 6 October 2023. If you are interested in participating, please contact Sara Saba (firstname.lastname@example.org) or Stephanie Bouré (email@example.com) directly.
On June 15, 2023, the Methods Lab organized the workshop “Introduction to Topic Modeling” in collaboration with the WI research group “Platform Algorithms and Digital Propaganda”. The workshop aimed to provide participants with a comprehensive understanding of topic modeling, a machine-learning technique used to determine clusters of similar words (i.e., topics) within bodies of text. The event took place at the Weizenbaum Institute in a hybrid format, bringing together researchers from various institutions.
The workshop was conducted by Daniel Matter (TU Munich), who guided the participants through the basic concepts and applications of this method. Through theory, demonstrations, and practical examples, attendees gained insight into commonly used algorithms such as Latent Dirichlet Allocation (LDA) and BERT-based topic models. The workshop enabled participants to assess the advantages and drawbacks of each approach, equipping newcomers with a foundation in topic modeling while also providing plenty of new insights to those with prior expertise.
During the workshop, Daniel explained the distinction between LDA and BERTopic, two popular topic modeling strategies. LDA operates as a generative model and treats each document as a mixture of topics, where each topic is a probability distribution over words. It aims to determine the topic and word distributions that maximize the probability of generating the documents in the corpus. With LDA, as opposed to BERTopic, the number of topics must be specified beforehand.
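To make the generative view of LDA more concrete, here is a minimal sketch of how a single document could be generated under that model. It is not material from the workshop: the two toy topics, their word probabilities, and the names `sample_dirichlet` and `generate_document` are assumptions made purely for illustration. Fitting LDA means going the other way, that is, inferring the topics and mixtures from observed documents.

```python
import random

# Hypothetical toy topics: each topic is a probability distribution over words.
# The number of topics (here 2) is fixed up front, as LDA requires.
TOPICS = [
    {"election": 0.5, "party": 0.3, "vote": 0.2},
    {"match": 0.4, "goal": 0.4, "team": 0.2},
]


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, length, rng):
    """Generate one document following the LDA generative story."""
    # 1. Draw the document's topic mixture.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # 2. For every word position, pick a topic according to the mixture ...
        k = rng.choices(range(len(topics)), weights=theta)[0]
        # 3. ... then pick a word from that topic's word distribution.
        vocab = list(topics[k])
        weights = [topics[k][w] for w in vocab]
        words.append(rng.choices(vocab, weights=weights)[0])
    return theta, words


rng = random.Random(0)
theta, words = generate_document(TOPICS, alpha=[0.5, 0.5], length=8, rng=rng)
print("topic mixture:", theta)
print("document:", " ".join(words))
```

Note that word order plays no role in this process, which is why LDA is often described as a bag-of-words model.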
BERTopic, on the other hand, belongs to the category of Embeddings-Based Topic Models (EBTM), which take a different approach. Unlike LDA, which treats words as distinct features, BERTopic incorporates semantic relationships between words. BERTopic follows a bottom-up approach, embedding documents in a semantic space and extracting topics from this transformed representation. Unlike LDA, which can be applied to short and long text corpora, BERTopic generally works better on shorter text, such as social media posts or news headlines.
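The bottom-up idea can be sketched in a few lines: embed the documents, group documents that lie close together, and then describe each group by its most characteristic words. The sketch below is a toy illustration and not the BERTopic library itself. The two-dimensional vectors are hand-made stand-ins for embeddings that a language model would normally produce, and the greedy clustering and the word scoring are simplified assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter

# Hypothetical documents with hand-made 2-D "embeddings" (an assumption;
# a real pipeline would compute these with a sentence-embedding model).
DOCS = [
    ("new vaccine trial shows promise", (0.9, 0.1)),
    ("vaccine rollout begins in clinics", (0.8, 0.2)),
    ("team wins the cup final", (0.1, 0.9)),
    ("cup final ends with late goal", (0.2, 0.8)),
]


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def cluster(docs, threshold=0.9):
    """Greedy clustering: a document joins the first cluster whose seed
    vector is similar enough, otherwise it starts a new cluster."""
    clusters = []  # list of (seed_vector, [texts])
    for text, vec in docs:
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(text)
                break
        else:
            clusters.append((vec, [text]))
    return [members for _, members in clusters]


def top_words(members, all_texts, n=2):
    """Rank words that are frequent in this cluster and rare elsewhere."""
    in_cluster = Counter(w for t in members for w in t.split())
    overall = Counter(w for t in all_texts for w in t.split())
    scores = {w: c * c / overall[w] for w, c in in_cluster.items()}
    return sorted(scores, key=lambda w: -scores[w])[:n]


clusters = cluster(DOCS)
all_texts = [text for text, _ in DOCS]
for members in clusters:
    print(top_words(members, all_texts), "<-", members)
```

In this sketch the number of topics is not set in advance. It follows from how the documents group together in the embedding space.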
When deciding between BERTopic and LDA, it is essential to consider the specific requirements of the text analysis. BERTopic’s strength lies in its flexibility and ability to handle short texts effectively, while LDA is preferred when strong interpretability is necessary.
With this workshop, we at the Methods Lab hope to have provided our attendees with a comprehensive understanding of topic modeling as a method, with a special focus on LDA and BERTopic. Having explored the concepts, applications, and advantages of each approach, researchers can use these tools to uncover hidden semantic structures within textual data in various domains, supporting tasks such as document clustering, information retrieval, and recommender systems.
We want to thank Daniel for giving this workshop and introducing us to the world of topic modeling. We also thank all participants, both those who joined virtually and those at the institute.
Our next workshop, “Whose data is it anyway? Ethical, practical, and methodological challenges of data donation in messenger groups research”, will take place on August 30, 2023. We hope to see you there!
We are excited to announce our upcoming workshop, “Theory Construction: Building and Advancing Theories for Empirical Social Science,” which will take place on Thursday, September 14 in the Kassenhalle (main hall), WI. Led by Adrian Meier (FAU Erlangen-Nürnberg) and created in collaboration with Dr. Daniel Possler (JMU Würzburg), this intensive “crash course” will equip participants with practical strategies for constructing and advancing social scientific theories. Beginning with an exploration of fundamental concepts, structure, and quality criteria of social scientific theories, Adrian will delve into hands-on techniques for building and advancing theory. The workshop will focus on the theory-building process as well as the micro-level of social analysis, offering examples from media psychology and communication science.
You can find more about the workshop on our program page. See you there!
We hereby present the first workshop at the Institute to emerge from the methodological needs that were indicated in our institute-wide survey in December. It is titled “Web Scraping and API-based Data Collection” and takes place on March 2.
After an introduction to the topic by the Methods Lab team, Florian Primig (FU), Steffen Lepa (TU), Felix Gaisbauer (WI), and Lion Wedel (WI) will each present various use cases of these two data collection methods. You can find more information about the workshop on its program page.
In December 2022, the Methods Lab conducted an internal survey to map out the methodological experiences and needs at the Weizenbaum Institute. Thanks to everybody who participated! We have identified specific demands and requests at the institute. Even though there is already extensive expertise in a large variety of methods and tools, many Weizenbaum scholars also expressed a wish for additional support and knowledge-building in, for instance, the following areas:
- Data collection: automated observation (e.g., logging, tracking), automated content analysis, web scraping, API-based data collection, and eye tracking
- Data analysis: network analysis, deep/transfer learning, natural language processing, and classification methods
- Software/tools: R, Python, and network analysis software
With these results as our Polaris, we in the Methods Lab have embarked on the expedition of developing a future methods training and consulting program suited to your needs, which we will announce shortly. In the meantime, we hope the results of the survey will serve as a launch pad for networking amongst the scholars at the Weizenbaum Institute.
Welcome to the digital baptism of the Methods Lab blog. This blog will keep you informed about our work, future workshops, events, and other resources and materials that may be useful to you in your upcoming research.
As a unit, we are committed to three principal tasks: training, consulting, and research. We aim to assist you with all your methodological questions, issues, and needs, no matter how large or small, and to coordinate expertise at the institute. Think of us as a hub, a metaphorical Rome, if you will, where all your methods-related queries and (non-)knowledge have a space to converge. If you have any thoughts, suggestions, or concerns, don’t hesitate to contact us. We will always lend you an ear.
At the start of December, we asked you to participate in a survey in order to give us an overview of your expertise and needs regarding data collection, analysis, and software. With the help of the results, we have created a preliminary training program tailored to your wants and needs. To everyone who participated: thank you!
On that note, we are delighted to announce that our first official workshop will take place at the beginning of March. Besides that, we have two more workshops planned for spring.
So stay tuned for further announcements about many exciting things to come! We look forward to beginning this new chapter with you.