On April 18, 2024, the Methods Lab organized the workshop Research Ethics – Principles and Practice in Digitalization Research in response to the increasing relevance and complexity of ethics in digitalization research.
In the first part of the workshop, Christine Normann (WZB) introduced participants to good research practice and research ethics in line with the guidelines of the German Research Foundation (DFG). Besides the need to balance the freedom of research against data protection, she informed participants about important institutions, noted the difficulty of formulating ethics statements for funding applications before study designs are finalized, and provided practical tips for planning research.
Next, Julian Vuorimäki (WI) guided participants through the handling of research ethics at the Weizenbaum Institute. He focused on the code of conduct, the ombudspersons, the guideline for handling research data, and the newly founded review board. The latter provides ethics reviews for individual projects and studies, which can be requested through a questionnaire on the institute’s website.
Julian Vuorimäki presents the principles of good research practice at WI
In the second part of the workshop, three researchers presented practical ethical implications and lessons learned from research projects. Methods Lab lead Christian Strippel reported on a study in which user comments were annotated to enable the automatic detection of hate speech. He focused on the possible misuse of such systems for censorship, coders’ confrontation with questionable content, and the challenges of publishing the results and data with regard to copyright and framing. Tianling Yang (WI) presented ethical considerations and challenges in qualitative research. Her focus lay on consent acquisition, anonymity and confidentiality, power relations, reciprocity (i.e., incentives and support), and the protection of the researchers themselves, given the physical and emotional impact of qualitative field work. Finally, Maximilian Heimstädt (Helmut Schmidt University Hamburg) talked about ambiguous consent in ethnographic research. He gave insights into a study conducted in cooperation with the state criminal police office to predict crime for regional police agencies. Not all individuals in this research could be informed about the research endeavor, especially when the researchers accompanied the police during their shifts, which raised the question of how to balance overt and covert research.
The Methods Lab thanks all presenters and participants for this insightful workshop!
Great interest in the 13 high-density presentations
Together with Johannes Breuer, Silke Fürst, Erik Koenen, Dimitri Prandner, and Christian Schwarzenegger, Methods Lab member Christian Strippel organized a “Data, Archive & Tool Demos” session as part of the DGPuK 2024 conference at the University of Erfurt on March 14, 2024. The idea behind this session was to provide space to present and discuss datasets, archives, and software with an interested audience. The event met with great interest, and all seats were taken. After a high-density session in which all 13 projects were presented in short talks, the individual projects were discussed in more detail in the subsequent poster and demo session in the hallway.
The 13 contributions were:
CKIT: Construction KIT — Lisa Dieckmann, Maria Effinger, Anne Klammt, Fabian Offert, & Daniel Röwenstrunk
CKIT is a review journal for research tools and data services in the humanities, founded in 2022. The journal addresses the increasing use of digital tools and online databases across academic disciplines, highlighting the importance of understanding how these tools influence research design and outcomes. Despite their critical role, scholarly examination of these tools has been minimal. CKIT aims to fill this gap by providing a platform for reviews that appeal to both humanities scholars and technical experts, promoting interdisciplinary collaboration. For more details, see here.
Der Querdenken Telegram Datensatz 2020-2022 — Kilian Buehling, Heidi Schulze, & Maximilian Zehring
The Querdenken Telegram Datensatz is a dataset that represents the German-speaking anti-COVID-19 measures protest mobilization from 2020 to 2022. It includes public messages from 390 channels and 611 groups associated with the Querdenken movement and the broader COVID-19 protest movement. Unlike other datasets, it is manually classified and processed to provide a longitudinal view of this specific movement and its networking.
DOCA – Database of Variables for Content Analysis — Franziska Oehmer-Pedrazzi, Sabrina H. Kessler, Edda Humprecht, Katharina Sommer, & Laia Castro
The DOCA database collects, systematizes, and evaluates operationalizations for standardized manual and automated content analysis in communication science. It helps researchers find suitable and established operationalizations and codebooks, making them freely accessible in line with Open Method and Open Access principles. This enhances the comparability of content analytical studies and emphasizes transparency in operationalizations and quality indicators. DOCA includes variables for various areas such as journalism, fictional content, strategic communication, and user-generated content. It is supported by an open-access handbook that consolidates current research. For more info, visit the project’s website here.
A “Community Data Trustee Model” for the Study of Far-Right Online Communication — Jan Rau, Nils Jungmann, Moritz Fürneisen, Gregor Wiedemann, Pascal Siegers, & Heidi Schulze
The community data trustee model is introduced for researching sensitive areas like digital right-wing extremism. This model involves sharing lists of relevant actors and their online presences across various projects to reduce the labor-intensive data collection process. It proposes creating and maintaining these lists as a community effort, with users contributing updates back into a shared repository, facilitated by an online portal. The model aims to incentivize data sharing, ensure legal security and trust, and improve data quality through collaborative efforts.
Development and Publication of Individual Research Apps Using DIKI as an Example — Anke Stoll
DIKI is a dictionary designed for the automated detection of incivility in German-language online discussions, accessible through a web application. Developed using the Streamlit framework, DIKI allows users to perform automated content analysis via a drag-and-drop interface without needing to install any software. This tool exemplifies how modern frameworks can transform complex analytical methods into user-friendly applications, enhancing the accessibility and reuse of research instruments. By providing an intuitive graphical user interface, DIKI makes advanced analytical capabilities available to those without programming expertise, thus broadening the scope and impact of computational communication science.
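To illustrate the general idea behind dictionary-based tools like DIKI, the following minimal Python sketch scores a comment by the share of its tokens that appear in a word list. The word list, function name, and scoring rule here are invented for demonstration; DIKI’s actual dictionary and matching logic differ.

```python
# Illustrative sketch of dictionary-based incivility detection.
# The term list below is hypothetical, not taken from DIKI.
import re

INCIVILITY_TERMS = {"idiot", "troll", "dumm"}  # invented example entries

def incivility_score(comment: str) -> float:
    """Share of tokens that match the dictionary (0.0 to 1.0)."""
    tokens = re.findall(r"\w+", comment.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in INCIVILITY_TERMS)
    return hits / len(tokens)

print(incivility_score("Du bist so ein Idiot"))  # 0.2
```

A real dictionary approach would add stemming or lemmatization and weighted terms, but the core operation remains this kind of lexicon lookup.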
The FROG Tool for Gathering Telegram Data — Florian Primig & Fabian Fröschl
The FROG tool is designed to gather data from Telegram, a platform increasingly important for social science research due to its popularity and resilience against deplatforming. FROG addresses the challenges of data loss and the tedious collection process by providing a user-friendly interface capable of scraping multiple channels simultaneously. It allows users to select specific timeframes or perform full channel collections, making it suitable for both qualitative and quantitative research. The tool aims to facilitate data collection for researchers with limited coding skills and invites the community to contribute to its ongoing development. An introduction to the tool can be found here.
Mastodon-Toolbox – Decentralized Data Collection in the Fediverse — Tim Schatto-Eckrodt
The Mastodon Toolbox is a Python package designed for systematic analysis of user content and network structures on the decentralized social media platform Mastodon. Developed as an alternative to centralized platforms, Mastodon offers more privacy and control over data. The toolbox aids researchers in selecting relevant instances, filtering public posts by hashtags or keywords, collecting interactions such as replies, reblogs, and likes, and exporting data for further analysis. It is particularly useful for researchers with limited programming skills, enabling comprehensive data collection across Mastodon’s decentralized network. More info about the tool can be found here.
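The kind of collection and filtering described above rests on Mastodon’s public REST API, where each instance serves its public timeline at `/api/v1/timelines/public` and each status object carries a `tags` list. The sketch below (not the toolbox’s own API; function names and sample data are invented) shows how such statuses can be filtered by hashtag offline:

```python
# Sketch of hashtag filtering on Mastodon-style status objects.
# Assumes the standard Mastodon API shape: statuses carry a "tags" list
# of {"name": ...} dicts. Sample data below is invented.
def build_timeline_url(instance: str, local: bool = False, limit: int = 40) -> str:
    """URL for an instance's public timeline (Mastodon API v1)."""
    return f"https://{instance}/api/v1/timelines/public?local={str(local).lower()}&limit={limit}"

def filter_by_hashtag(statuses: list, tag: str) -> list:
    """Keep statuses that carry the given hashtag (case-insensitive)."""
    tag = tag.lower()
    return [s for s in statuses
            if any(t["name"].lower() == tag for t in s.get("tags", []))]

sample = [
    {"id": "1", "content": "Hello fediverse", "tags": [{"name": "introduction"}]},
    {"id": "2", "content": "Survey results", "tags": [{"name": "research"}]},
]
print(build_timeline_url("mastodon.social"))
print([s["id"] for s in filter_by_hashtag(sample, "Research")])  # ['2']
```

Because each instance exposes this endpoint independently, collecting across the Fediverse means iterating such requests over a curated list of instances, which is precisely the chore the toolbox automates.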
Open Source Transformer Models: A Simple Tool for Automated Content Analysis for (German-Speaking) Communication Science — Felix Dietrich, Daniel Possler, Anica Lammers, & Jule Scheper
The “Open Source Transformer Models” tool is designed for automated content analysis in German-language communication science. Leveraging advancements in natural language processing, it utilizes large transformer-based language models to interpret word meanings in context and adapt to specific applications like sentiment analysis and emotion classification. Hosted on the Open Source platform “Hugging Face,” the tool allows researchers to analyze diverse text types with minimal programming skills.
Meteor: A Research Platform for Political Text Data — Paul Balluff, Michele Scotto di Vettimo, Marvin Stecker, Susan Banducci, & Hajo G. Boomgaarden
Meteor is a comprehensive research platform designed to enhance the study of political texts by providing a wide range of resources, including datasets, tools, and scientific publications. It features a curated classification system and an interlinked graph structure to facilitate easy navigation and discoverability of resources. Users can contribute new resources, create personalized collections, and receive updates through a notification system. Additionally, Meteor integrates with AmCAT 4.0 to enable non-consumptive research, ensuring the protection of copyrighted materials. For more details, visit the project’s website here.
rufus – The Portal for Radio Search — Patricia F. Blume
The “rufus” tool is an online research platform developed by the Leipzig University Library (UBL) to provide easy access to broadcast information from the ZDF archive. This platform allows researchers to search production archive data from an external source for the first time, offering data from nearly 500,000 broadcasts and 2 million segments dating back to 1963. The tool features a versatile user interface with specific search instruments, enabling straightforward viewing requests to the ZDF archive. Built with open-source components, rufus not only facilitates access to valuable audiovisual heritage for communication and media researchers but also supports the integration of additional data providers. For more details, visit the project’s website here.
Weizenbaum Panel — Martin Emmer, Katharina Heger, Sofie Jokerst, Roland Toth, & Christian Strippel
The Weizenbaum Panel is an annual, representative telephone survey conducted by the Weizenbaum Institute for the Networked Society and the Institute for Journalism and Communication Studies at the Free University of Berlin. Since 2019, around 2,000 German-speaking individuals over the age of 16 have been surveyed each year about their media usage, democratic attitudes, civic norms, and social and political engagement, with a special focus on online civic interventions. The survey allows for longitudinal intra-individual analyses, and the data is made available for scientific reuse shortly after collection. More information about the panel can be found here.
WhatsR – An R Package for Processing and Analyzing WhatsApp Chat Logs — Julian Kohne
The WhatsR R-package enables researchers to process and analyze WhatsApp chat logs, addressing the gap in studying private interpersonal communication. It supports parsing, preprocessing, and anonymizing chat data from exported logs, while allowing researchers to analyze either their own data or data voluntarily donated by participants. The package includes a function to exclude data from non-consenting participants and is complemented by the ChatDashboard, an interactive R shiny app for transparent data donation and participant feedback. The package can be found here.
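WhatsR itself is an R package; as a rough illustration of the parsing step it performs on exported chat logs, the Python sketch below splits one common Android export format into timestamp, sender, and text. WhatsApp export formats vary by locale and app version, so the regular expression here is an assumption, not WhatsR’s actual pattern.

```python
# Minimal sketch of WhatsApp chat-log parsing (one common Android format:
# "DD.MM.YY, HH:MM - Sender: Message"). Formats differ by locale/version.
import re

LINE = re.compile(r"^(\d{2}\.\d{2}\.\d{2}), (\d{2}:\d{2}) - ([^:]+): (.*)$")

def parse_line(line: str):
    """Return a dict of message fields, or None for non-matching lines
    (e.g. system messages or continuations of multi-line messages)."""
    m = LINE.match(line)
    if not m:
        return None
    date, time, sender, text = m.groups()
    return {"date": date, "time": time, "sender": sender, "text": text}

msg = parse_line("12.03.24, 14:05 - Alice: See you at the workshop")
print(msg["sender"], "->", msg["text"])
```

A full parser would additionally stitch multi-line messages back together and pseudonymize sender names before analysis, which is where a dedicated package earns its keep.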
OpenQDA — Andreas Hepp & Florian Hohmann
OpenQDA is an open-source tool for qualitative data analysis and the latest product developed at the ZeMKI institute in Bremen. It is provided as free-to-use research software that enables collaborative text analysis and offers all the basic functions of other QDA software. The tool, which is currently still in beta, can be found here.
Berlin’s academic landscape is rich with diverse research endeavors, particularly in the realms of digital cultural, social, and humanities studies. However, there’s a notable gap in structured and sustained networking among key players in these fields. To address this gap, the Weizenbaum Institute and the Interdisciplinary Center for Digitality and Digital Methods at HU Berlin’s Campus Mitte are organizing a networking meeting.
Scheduled for May 31, 2024, at the Auditorium in the Grimm-Zentrum, HU Berlin, this event is open to institutions, institutionalized teams, and centers actively engaged in digital research within the humanities, social sciences, and cultural studies in Berlin. The aim of the meeting is to strengthen existing connections, identify potential common interests and goals, and spotlight further avenues for collaboration and exchange within Berlin’s vibrant digital research community.
In the first part of the meeting, every team will introduce themselves in short highlight talks. In the second part, a casual, direct exchange among all participants will take place in a World Café format, covering various questions and cross-cutting themes related to digitality and digital methods in the humanities and social sciences.
Event Details:
Date: Friday, May 31, 2024
Time: 13:00–16:00
Location: Auditorium at the Grimm-Zentrum, HU Berlin, Geschwister-Scholl-Str. 1/3
We’re excited to announce our upcoming workshop Introduction to High-Performance Computing (HPC), scheduled for Monday, May 6th at the Weizenbaum Institute. Led by Loris Bennett (FU) from the HPC service at Freie Universität Berlin, the workshop is open to members of the Weizenbaum Institute with an FU account and access to HPC resources at FU. It aims to convey the fundamentals of using HPC resources in general, using those offered by FU Berlin as an example.
For further details about the workshop, please visit our program page.
On April 10th and 11th, the Methods Lab organized the second edition of the workshop Introduction to Programming and Data Analysis with R. Led by Roland Toth from the Methods Lab, the workshop was designed to equip participants with fundamental R programming skills essential for data wrangling and analysis.
Roland Toth introduces participants to data wrangling with R
Across two days, attendees engaged in a comprehensive exploration of R fundamentals, covering topics such as RStudio, Markdown, data wrangling, and practical data analysis. Day one focused on laying the groundwork, covering the main concepts in programming, including functions, classes, objects, and vectors. Participants were also familiarized with Markdown and Quarto, which allow analysis results to be embedded directly in written documents, as well as with the key steps and techniques of data wrangling.
Participants work on their own research questions during the practical exercise
The first half of the second day was dedicated to showcasing and exploring basic data analysis and various visualization methods. Afterwards, participants had the opportunity to put the knowledge gained on the previous day into practice by working with a dataset to formulate and address their own research questions. Roland was on hand to offer assistance and guidance, addressing any challenges or concerns that arose along the way.
Christian Strippel presents first results
The workshop fostered a collaborative learning environment, with lively discussions and ample questions from all. We thank all participants for their active involvement!
We are excited to announce our next workshop, “Research Ethics – Principles and Practice in Digitalization Research”, which will take place on Thursday, April 18. This workshop will be conducted both at the Weizenbaum Institute and online, and is open to Weizenbaum Institute members as well as external participants (and the QPD). Led by Christine Normann (WZB), Julian Vuorimäki (WI), Maximilian Heimstädt (HSU), and Tianling Yang (WI), the workshop will focus on principles and best practices of research ethics. After a general introduction and an overview of the principles according to the German Research Foundation (DFG), current plans for an ethics board at the Weizenbaum Institute will be presented, and finally, three examples of ethical considerations in research practice will be shown.
For detailed information about the workshop, please visit our program page. We are looking forward to your participation!
The use of online surveys in contemporary social science research has grown rapidly due to their many benefits such as cost-effectiveness and ability to yield insights into attitudes, experiences, and perceptions. Unlike more established methods such as pen-and-paper surveys, they enable complex setups like experimental designs and seamless integration of digital media content. But despite their user-friendliness, even seasoned researchers still face numerous challenges in creating online surveys. To showcase the versatility and common pitfalls of online surveying, Martin Emmer, Christian Strippel, and Roland Toth of the Methods Lab arranged the workshop Introduction to Online Surveys on February 22, 2024.
Martin gave a presentation on the design and logic of online surveys.
In the first segment, Martin Emmer provided a theoretical overview of the design and logic of online surveys. He started by outlining the common challenges and benefits associated with interviewing, with a particular emphasis on social-psychological dynamics. Compared to online surveys, face-to-face interviews offer a more personal, engaging, and interactive experience, enabling interviewers to adjust questions and seek clarification of answers in real time. However, they can be time-consuming and expensive and may introduce biases such as the interviewer effect. On the other hand, the process of conducting online surveys presents its own set of challenges, such as limited control over the interview environment, a low drop-out threshold, and particularities connected with self-administration such as the need for detailed text-based instructions for respondents. Nevertheless, self-administered and computer-administered surveys boast numerous advantages, including cost-effectiveness, rapid data collection, the easy application of visuals and other stimuli, and accessibility to large and geographically dispersed populations. When designing an online survey, Martin stressed the importance of clear question wording, ethical considerations, and robust procedures to ensure voluntary participation and data protection.
Christian shared his insights on survey creation using online access panel providers.
In the second part of the workshop, Christian Strippel delved into the realm of online access panel providers, including the perks and pitfalls of using them in survey creation. Panel providers maintain curated pools of potential survey participants; examples include Bilendi/Respondi, YouGov, Cint, Civey, and the GESIS Panel. They oversee recruitment and management, ensuring participants are matched with surveys relevant to their demographics and interests, while also handling survey distribution and data collection. While online panels offer advantages such as access to a broad participant pool, cost-efficiency, and streamlined sampling of specific sub-groups, they also have their limitations. Online panels are, for example, not entirely representative of the general population, as they exclude non-internet users. Moreover, challenges arise from professional respondents, such as so-called speeders, who rush through surveys, and straight-liners, who consistently choose the same response in matrix questions. Strategies to combat these issues include attention checks throughout the questionnaire, the systematic exclusion of speeders and straight-liners, and quota-based screening. To conclude, Christian outlined what constitutes a good online panel provider and shared valuable insights into how to plan a survey using one effectively.
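The speeder and straight-liner checks described above can be automated after data collection. The Python sketch below is a minimal, hedged illustration; the threshold (half the median completion time) and the field names are assumptions for demonstration, not a standard any particular provider uses.

```python
# Post-hoc data-quality checks for online surveys: flag "speeders" by
# completion time and "straight-liners" by identical answers in a matrix
# battery. Thresholds and record layout are illustrative assumptions.
def is_speeder(duration_s: float, median_s: float, factor: float = 0.5) -> bool:
    """Flag respondents faster than a fraction of the median completion time."""
    return duration_s < factor * median_s

def is_straightliner(matrix_answers: list) -> bool:
    """Flag identical answers across all items of a matrix question."""
    return len(matrix_answers) > 1 and len(set(matrix_answers)) == 1

respondents = [
    {"id": 1, "duration": 610, "matrix": [3, 4, 2, 5]},
    {"id": 2, "duration": 95,  "matrix": [3, 3, 3, 3]},
]
median = 600  # seconds, computed over all completed interviews
flagged = [r["id"] for r in respondents
           if is_speeder(r["duration"], median) or is_straightliner(r["matrix"])]
print(flagged)  # [2]
```

In practice, such flags are usually combined with attention-check items before excluding cases, since a single indicator can misclassify fast but attentive respondents.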
Participants learned how to create their own survey using LimeSurvey during Roland’s live demo.
The third and final segment of the workshop featured a live demonstration by Roland Toth on how to set up an online survey using the open-source software LimeSurvey, which is hosted on the institute’s own servers. During this demonstration, he created the very evaluation questionnaire that participants completed at the end of the workshop. Roland began by providing an overview of the general setup and the settings relevant to survey creation. Subsequently, he demonstrated various ways of crafting questions with different scales and display conditions, and how to incorporate visual elements such as images. Throughout the demo, Roland addressed issues raised in the first part of the workshop concerning language and phrasing, emphasizing rules for question wording and the importance of asking for only one piece of information per question. The live demonstration wrapped up with a segment on viewing and exporting collected data. After the participants had completed the evaluation form, the workshop concluded with a Q&A session.
Level: Beginner/Intermediate Category: Data Analysis
After being well received last year, we’re happy to announce the return of our workshop Programming and Data Analysis with R for its second edition. This two-day intensive workshop led by Roland Toth (WI) will take place on Wednesday, April 10, and Thursday, April 11, at the Weizenbaum Institute.
During the first day, attendees will receive comprehensive training in programming fundamentals, essential data wrangling techniques, and Markdown integration. The second day will center around data analysis, providing participants with the chance to engage directly with a dataset and address a research topic independently. A blend of concepts, coding techniques, and smaller practical tasks will be interspersed throughout both days to reinforce hands-on learning.
During my visit to the Center for Industry 4.0, I had the opportunity to participate in the pretest of the HoloLens study and learn more about augmented reality-based learning. The goal of the study, which is a collaboration between the research groups of Gergana Vladova (Education for the Digital World) and Martin Krzywdzinski (Working with Artificial Intelligence), is twofold. In the first part, the research groups investigate the effectiveness of different Augmented Reality (AR) designs on learning and compare them to traditional paper-based methods by using eye-tracking. In the second part, they focus on participants’ decision making and disruption management, guided by suggestions from an AI-assisted system. These participants can operate in either a team-based or hierarchical setting.
The cube travels along the conveyor belt, its screen showcasing lenses for participants to identify and sort out.
The AR glasses used in the HoloLens study.
In the first part of the experiment, participants work in a simulated factory environment where they are tasked with producing lenses. Depending on the experimental condition, the team uses either AR instructions or traditional paper instructions. The AR head-mounted display guides the team through tasks such as adjusting machine settings, sorting defective lenses, and handling other simulated problems; in the paper condition, the same procedure applies, with participants relying on printed instructions instead of the AR glasses.
In the second part of the experiment, participants apply what they learned in the first part, but without using the AR glasses or the paper instructions. In addition, the errors they must solve are different from those in the previous part. When presented with a problem, participants are expected to solve it collaboratively through effective communication and with the help of AI.
To measure performance, the study uses traditional metrics such as time and error rates. Between each round, knowledge tests in the form of a questionnaire are administered to assess participants’ recall and comprehension. The hypothesis is that process-integrated learning via Augmented Reality can enhance the learning process.
Nicolas Leins and Jana Gonnerman at the Centre for Industry 4.0 Potsdam.
We are excited to announce the Methods Lab’s first workshop of the year, “Introduction to Online Surveys“, which will take place on Thursday, February 22. This workshop will be conducted both at the Weizenbaum Institute and online, and is open to Weizenbaum Institute members as well as external participants. Led by members of the Methods Lab, Martin Emmer, Christian Strippel, and Roland Toth, the workshop will focus on the use of online surveys in the context of social science research, providing participants with a theoretical foundation as well as a hands-on guide. We will cover aspects such as the logic and design of online surveys, how to work with access panel providers, and demonstrate how to effectively set up an online survey using the versatile survey tool LimeSurvey. Crucial topics such as ethics and data protection will also be discussed.
For detailed information about the workshop, please visit our program page. We look forward to your participation!