Conference Recap: “Data, Archive & Tool Demos” at DGPUK 2024 (March 14, 2024)

Together with Johannes Breuer, Silke Fürst, Erik Koenen, Dimitri Prandner, and Christian Schwarzenegger, Methods Lab member Christian Strippel organized a “Data, Archive & Tool Demos” session as part of the DGPuK 2024 conference at the University of Erfurt on March 14, 2024. The idea behind this session was to provide space to present and discuss datasets, archives, and software to an interested audience. The event met with great interest, so that all the seats were taken. After a high-density session in which all 13 projects were presented in short talks, the individual projects were discussed in more detail in the following poster and demo session in the hallway.

The 13 contributions were:

CKIT: Construction KIT
— Lisa Dieckmann, Maria Effinger, Anne Klammt, Fabian Offert, & Daniel Röwenstrunk
CKIT is a review journal for research tools and data services in the humanities, founded in 2022. The journal addresses the increasing use of digital tools and online databases across academic disciplines, highlighting the importance of understanding how these tools influence research design and outcomes. Despite their critical role, scholarly examination of these tools has been minimal. CKIT aims to fill this gap by providing a platform for reviews that appeal to both humanities scholars and technical experts, promoting interdisciplinary collaboration. For more details, see here.

Der Querdenken Telegram Datensatz 2020-2022 
Kilian Buehling, Heidi Schulze, & Maximilian Zehring
The Querdenken Telegram Datensatz is a dataset that represents the German-speaking anti-COVID-19 measures protest mobilization from 2020 to 2022. It includes public messages from 390 channels and 611 groups associated with the Querdenken movement and the broader COVID-19 protest movement. Unlike other datasets, it is manually classified and processed to provide a longitudinal view of this specific movement and its networking. 

DOCA – Database of Variables for Content Analysis
Franziska Oehmer-Pedrazzi, Sabrina H. Kessler, Edda Humprech, Katharina Sommer, & Laia Castro
The DOCA database collects, systematizes, and evaluates operationalizations for standardized manual and automated content analysis in communication science. It helps researchers find suitable and established operationalizations and codebooks, making them freely accessible in line with Open Method and Open Access principles. This enhances the comparability of content analytical studies and emphasizes transparency in operationalizations and quality indicators. DOCA includes variables for various areas such as journalism, fictional content, strategic communication, and user-generated content. It is supported by an open-access handbook that consolidates current research. For more info, visit the project’s website here.

A “Community Data Trustee Model” for the Study of Far-Right Online Communication
Jan Rau, Nils Jungmann, Moritz Fürneisen, Gregor Wiedemann, Pascal Siegers, & Heidi Schulze
The community data trustee model is introduced for researching sensitive areas like digital right-wing extremism. This model involves sharing lists of relevant actors and their online presences across various projects to reduce the labor-intensive data collection process. It proposes creating and maintaining these lists as a community effort, with users contributing updates back into a shared repository, facilitated by an online portal. The model aims to incentivize data sharing, ensure legal security and trust, and improve data quality through collaborative efforts.

Development and Publication of Individual Research Apps Using DIKI as an Example
— Anke Stoll
DIKI is a dictionary designed for the automated detection of incivility in German-language online discussions, accessible through a web application. Developed using the Streamlit framework, DIKI allows users to perform automated content analysis via a drag-and-drop interface without needing to install any software. This tool exemplifies how modern frameworks can transform complex analytical methods into user-friendly applications, enhancing the accessibility and reuse of research instruments. By providing an intuitive graphical user interface, DIKI makes advanced analytical capabilities available to those without programming expertise, thus broadening the scope and impact of computational communication science.

The FROG Tool for Gathering Telegram Data
Florian Primig & Fabian Fröschl
The FROG tool is designed to gather data from Telegram, a platform increasingly important for social science research due to its popularity and resilience against deplatforming. FROG addresses the challenges of data loss and the tedious collection process by providing a user-friendly interface capable of scraping multiple channels simultaneously. It allows users to select specific timeframes or perform full channel collections, making it suitable for both qualitative and quantitative research. The tool aims to facilitate data collection for researchers with limited coding skills and invites the community to contribute to its ongoing development. An introduction to the tool can be found here.

Mastodon-Toolbox – Decentralized Data Collection in the Fediverse
— Tim Schatto-Eckrodt
The Mastodon Toolbox is a Python package designed for systematic analysis of user content and network structures on the decentralized social media platform Mastodon. Developed as an alternative to centralized platforms, Mastodon offers more privacy and control over data. The toolbox aids researchers in selecting relevant instances, filtering public posts by hashtags or keywords, collecting interactions such as replies, reblogs, and likes, and exporting data for further analysis. It is particularly useful for researchers with limited programming skills, enabling comprehensive data collection across Mastodon’s decentralized network. More info about the tool can be found here.

Open Source Transformer Models: A Simple Tool for Automated Content Analysis for (German-Speaking) Communication Science
Felix Dietrich, Daniel Possler, Anica Lammers, & Jule Scheper
The “Open Source Transformer Models” tool is designed for automated content analysis in German-language communication science. Leveraging advancements in natural language processing, it utilizes large transformer-based language models to interpret word meanings in context and adapt to specific applications like sentiment analysis and emotion classification. Hosted on the Open Source platform “Hugging Face,” the tool allows researchers to analyze diverse text types with minimal programming skills.

Meteor: A Research Platform for Political Text Data
Paul Balluff, Michele Scotto di Vettimo, Marvin Stecker, Susan Banducci, & Hajo G. Boomgaarden
Meteor is a comprehensive research platform designed to enhance the study of political texts by providing a wide range of resources, including datasets, tools, and scientific publications. It features a curated classification system and an interlinked graph structure to facilitate easy navigation and discoverability of resources. Users can contribute new resources, create personalized collections, and receive updates through a notification system. Additionally, Meteor integrates with AmCAT 4.0 to enable non-consumptive research, ensuring the protection of copyrighted materials. For more details, visit the project’s website here.

rufus – The Portal for Radio Search
— Patricia F. Blume
The “rufus” tool is an online research platform developed by the Leipzig University Library (UBL) to provide easy access to broadcast information from the ZDF archive. This platform allows researchers to search production archive data from an external source for the first time, offering data from nearly 500,000 broadcasts and 2 million segments dating back to 1963. The tool features a versatile user interface with specific search instruments, enabling straightforward viewing requests to the ZDF archive. Built with open-source components, rufus not only facilitates access to valuable audiovisual heritage for communication and media researchers but also supports the integration of additional data providers. For more details, visit the project’s website here.

Weizenbaum Panel
— Martin Emmer, Katharina Heger, Sofie Jokerst, Roland Toth, & Christian Strippel
The Weizenbaum Panel is an annual, representative telephone survey conducted by the Weizenbaum Institute for the Networked Society and the Institute for Journalism and Communication Studies at the Free University of Berlin. Since 2019, around 2,000 German-speaking individuals over the age of 16 are surveyed each year about their media usage, democratic attitudes, civic norms, and social and political engagement, with a special focus on online civic interventions. The survey allows for longitudinal intra-individual analyses and the data is made available for scientific reuse shortly after collection. More information about the panel can be found here.

WhatsR – An R Package for Processing and Analyzing WhatsApp Chat Logs
— Julian Kohne
The WhatsR R-package enables researchers to process and analyze WhatsApp chat logs, addressing the gap in studying private interpersonal communication. It supports parsing, preprocessing, and anonymizing chat data from exported logs, while allowing researchers to analyze either their own data or data voluntarily donated by participants. The package includes a function to exclude data from non-consenting participants and is complemented by the ChatDashboard, an interactive R shiny app for transparent data donation and participant feedback. The package can be found here.

OpenQDA
— Andreas Hepp & Florian Hohmann
The OpenQDA tool is an open source qualitative data analysis tool, and the latest product developed at the ZeMKI institute in Bremen. It is provided as free-to-use research software that enables collaborative text analysis and all basic functions of other QDA software. The tool that is currently still in its beta version can be found here.

Introducing LimeSurvey at WI

Surveys are an important method for data collection. Whether it is for conducting internal assessments, gathering feedback, or collecting valuable research data, a reliable survey tool is an integral piece in the methodological toolkit of any researcher. Using different survey tools for different projects leads to differences in the quality of data collection and unnecessary licensing costs. In order to find a more sustainable solution, the Methods Lab assessed some of the most popular survey tools with the aim of finding the ideal one to cater to the specific needs of the Weizenbaum Institute’s research groups and administrative departments. Important to us was to select a user-friendly, open-source survey tool suitable for research that can be hosted on our own servers

In this blog post, we introduce our choice: LimeSurvey. It is a free, open-source survey software with a strong commitment to data protection. It offers a versatile platform for data collection, making it ideal for researchers, academic institutions, and organizations of all sizes. In doing so, we hope that the insights from our survey tool comparison will prove useful to researchers and institutions beyond our own. 

Here are some of the distinctive advantages that we were able to identify, making LimeSurvey a compelling choice for research and data collection:

  • Cost-Effective and Open Source: LimeSurvey is open source, meaning, it is available for free when hosted on your own servers, thereby eliminating the need for costly licensing fees.
  • Data Protection: LimeSurvey prioritizes data privacy – a particular advantage appreciated by our IT department due to its compliance with the GDPR. Its servers are strategically located in Germany and Finland, ensuring adherence to stringent European data protection regulations.
  • User-Friendly Integration: LimeSurvey seamlessly integrates with existing user accounts, simplifying the onboarding process without requiring additional account setup.
  • Suitable for Research: LimeSurvey is designed with research needs in mind. It offers a wide range of features, including unlimited projects and administrators/accounts. This flexibility makes it suitable for both simple and complex research projects.
  • No Artificial Limits: LimeSurvey imposes no artificial limitations on user accounts, participants, or projects.

The WI LimeSurvey installation can be accessed by members of the institute here: https://limesurvey.weizenbaum-institut.de/index.php/admin/. Its use is documented in the internal Wiki.

Happy surveying!