Editorial to Special Issue and Software Presentation

We are thrilled to announce the contributions of Methods Lab members Christian Strippel and Roland Toth to the latest issue of Publizistik: Vierteljahreshefte für Kommunikationsforschung.

Christian co-authored the editorial and served as a guest editor of this special issue on journalism. Read the editorial “Data, archives, and tools: Introducing new publication formats on infrastructures and resources for communication and media research”, here.

Roland’s research on tracking and the Experience Sampling Method (ESM) app is featured in the same journal. Dive into his article, “One App to Assess Them All – Combining surveys, experience sampling, and logging/data donation in an Android and iOS app”, here, to learn more about MART, the open-source app designed to simplify data collection in social sciences.

Workshop Recap: Whose Data Is It Anyway? Ethical, Practical, and Methodological Challenges of Data Donation in Messenger Groups Research (August 30, 2023)

On August 30, 2023, the Methods Lab and Olga Pasitselska (U of Groningen) organized a workshop on data donation in messaging groups research. The workshop set out to tackle practical and ethical issues in data collection, processing, and dissemination in research on closed messaging groups. We asked four colleagues to share their experiences, struggles, and solutions for research on closed chat groups. The invited speakers, Sérgio Barbosa (U Coimbra), Katharina Knop-Hülß (HMTMH Hannover), Connie Moon Sehat (Hacks/Hackers), and Julian Kohne (GESIS), paved the way for a better conceptualization of messaging groups and for the application of tailor-made ethical and practical solutions. The workshop allowed for a cross-field discussion of ad hoc developments in closed groups research and provided many insights for the audience, speakers, and organizers.

Sérgio Barbosa explained his approach to joining activist WhatsApp groups in Brazil. He argued that informed consent cannot be treated as a one-off solution: instead, researchers should go beyond the checklist of ethical guidelines and learn by doing and by negotiating with the group members. When joining these types of groups, researchers should clearly state the purposes of the research, disclose their identity, share the outcome of the research, and promote it in the local community. The approach should also depend on the type of group: for example, pro-democracy groups and extremist groups should be treated differently, regardless of group size.

Sérgio Barbosa shares his experience with digital ethnography in WhatsApp groups.

Dr. Katharina Knop-Hülß shared insights about studying non-professional secondary groups (e.g., choir, sport, volunteer groups) by scraping the content of their chats, an approach that is highly unobtrusive yet highly invasive. Since these groups represent intimate environments of everyday communication, they can be considered “safe spaces” that are closed to the public eye. To account for the sensitive nature of the data collection, Katharina used an opt-in approach, provided pseudonymized chat logs to the participants before they consented to participate, and complied with the requirement not to share the data with anyone beyond the research team, even after pseudonymization.

Dr. Katharina Knop-Hülß during her presentation, “The Permanently Connected Group (PeCoG)”.

Julian Kohne introduced his digital platform for WhatsApp data donation, which automatically cleans and anonymizes the data and thereby reduces researchers’ exposure to and intervention in the raw data. In his research, Julian takes a participant-centered approach: the data collection tool is designed to maximize usability and control of the data for research participants. They can review the pre-processed chat logs and decide exactly what they want to donate, deleting undesirable pieces of data, including time stamps and other metadata. The tool also allows researchers to track how much and what types of data were deleted.
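To make the idea of pseudonymizing a donated chat and tracking deletions concrete, here is a minimal sketch. It is not Julian's tool: the function names, the `Person_N` labels, and the assumed export line format (`date, time - sender: text`, which varies by locale and app version) are all illustrative assumptions.

```python
import re
from itertools import count

# Assumed export line format (it varies by locale and app version):
# "30/08/2023, 14:05 - Alice: See you at eight"
LINE_RE = re.compile(r"^(?P<ts>[^-]+?) - (?P<sender>[^:]+): (?P<text>.*)$")


def pseudonymize(lines, drop_timestamps=False):
    """Replace sender names with stable pseudonyms; optionally drop timestamps."""
    pseudonyms = {}
    counter = count(1)
    cleaned = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # this sketch skips system messages and continuation lines
        sender = match["sender"]
        if sender not in pseudonyms:
            pseudonyms[sender] = f"Person_{next(counter)}"
        record = {"sender": pseudonyms[sender], "text": match["text"]}
        if not drop_timestamps:
            record["timestamp"] = match["ts"].strip()
        cleaned.append(record)
    return cleaned


def deletion_report(before, after):
    """Summarize how many messages a participant removed before donating."""
    return {
        "messages_before": len(before),
        "messages_after": len(after),
        "messages_deleted": len(before) - len(after),
    }
```

Note that this sketch only replaces sender names; names mentioned inside the message text would need additional handling before the data could be called anonymized.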

Julian Kohne introducing his digital platform for transparent WhatsApp data donation.

Dr. Connie Moon Sehat presented a meta-review of research on closed messaging apps that aimed to determine which conditions, such as indexed invites, group size, discussion topics, or other aspects of closed groups, make them arguably public or private. Building on the previous speakers’ examples of research with activist (public) groups and hobby or friends (private) groups, the review summarized the points discussed and provided a framework for mapping chat groups along multiple parameters. Whether researchers scraped groups without entering them, entered by invitation, or disclosed their identity and research interest depended on the nature of the groups and on the public interest that can justify researchers’ intervention in closed communication spaces. Connie also stressed that users, researchers, and platforms may perceive a group’s “publicness” differently, which should be taken into account as well.

Dr. Connie Moon Sehat gave the presentation, “Ethical Approaches to Closed Messaging Research… And Data Collection?”.

After the four presentations, we continued the discussion with the online and offline audience, addressing the generalizability of messaging data (what slice of the “natural” social interaction are we looking at here?), the role of language, and the differences between long- and short-term groups. We also discussed the role of the researcher in automated versus manual data collection, and how participants can benefit from data donation.

The workshop provided theoretical and practical insights for messaging groups research and outlined future directions for collaboration in creating the guidelines for ethical closed messaging research and data donation.

Workshop: Interdisciplinarity in Action: Methods for Fruitful Teamwork (October 4, 2023)

We are excited to announce our upcoming workshop, “Interdisciplinarity in Action: Methods for Fruitful Teamwork,” scheduled for Wednesday, October 4, at the Weizenbaum Institute. Led by Silvio Suckow and Sara Saba (both WI), this intensive one-day workshop provides practical tools and knowledge for enhancing teamwork and interdisciplinary collaboration. The workshop offers diverse perspectives and actionable advice for structuring interdisciplinary teams and their work, hands-on practice of various team-building methods, and an input presentation by an external speaker. It is open to anyone interested in interdisciplinary research, whether leading or collaborating on such projects. Please note that spots are limited and allocated on a first-come, first-served basis. A slightly modified online version of the course will be offered separately.

For more details about the workshop, visit our program page. We look forward to seeing you there!

Workshop Recap: Introduction to Topic Modeling (June 15, 2023)

On June 15, 2023, the Methods Lab organized the workshop “Introduction to Topic Modeling” in collaboration with the WI research group “Platform Algorithms and Digital Propaganda”. The workshop aimed to provide participants with a comprehensive understanding of topic modeling, a machine-learning technique used to determine clusters of similar words (i.e., topics) within bodies of text. The event took place at the Weizenbaum Institute in a hybrid format, bringing together researchers from various institutions.

The workshop was conducted by Daniel Matter (TU Munich), who guided the participants through the basic concepts and applications of this method. Through theory, demonstrations, and practical examples, attendees gained insight into commonly used approaches such as Latent Dirichlet Allocation (LDA) and BERT-based topic models. The workshop enabled participants to assess the advantages and drawbacks of each approach, equipping them with a foundation in topic modeling while also providing plenty of new insights to those with prior expertise.

Daniel Matter introduces the most important aspects of topic modeling

During the workshop, Daniel explained the distinction between LDA and BERTopic, two popular topic modeling strategies. LDA operates as a generative model and treats each document as a mixture of topics. It aims to determine the topic and word distributions that maximize the probability of generating the documents in the corpus. With LDA, as opposed to BERTopic, the number of topics must be specified beforehand.
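For readers who prefer notation, the generative view of LDA can be sketched as follows. This is the standard textbook formulation rather than material taken from the workshop slides; the symbols (K topics, Dirichlet priors alpha and beta, topic assignments z, words w) are introduced here for illustration.

```latex
% K topics (fixed in advance), D documents, N_d words in document d
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), & n &= 1,\dots,N_d \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}}) \\
p(w_{d,n} = v \mid \theta_d, \phi) &= \sum_{k=1}^{K} \theta_{d,k}\,\phi_{k,v}
\end{align*}
```

The last line shows the mixture idea: the probability of seeing word v in document d is a weighted sum over the K topics, which is why K has to be chosen before fitting.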

BERTopic, on the other hand, belongs to the category of Embeddings-Based Topic Models (EBTM), which take a different approach. Unlike LDA, which treats words as distinct features, BERTopic incorporates semantic relationships between words. BERTopic follows a bottom-up approach, embedding documents in a semantic space and extracting topics from this transformed representation. Unlike LDA, which can be applied to short and long text corpora, BERTopic generally works better on shorter text, such as social media posts or news headlines.
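The topic-extraction step can be illustrated with a small sketch. It assumes the documents have already been embedded and clustered, and it scores terms per cluster with a simplified class-based TF-IDF weighting modeled on the one described in the BERTopic documentation (term frequency in the cluster times log(1 + average words per cluster / overall term frequency)). The function names are hypothetical, and the frequency normalization that the library applies is omitted here.

```python
import math
from collections import Counter


def class_tf_idf(clusters):
    """Score terms per cluster; clusters maps cluster id -> list of tokenized documents."""
    # Term frequencies per cluster (all documents of a cluster are merged).
    tf = {c: Counter(tok for doc in docs for tok in doc) for c, docs in clusters.items()}
    # Frequency of each term across all clusters.
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    # Average number of tokens per cluster.
    avg_words = sum(total.values()) / len(tf)
    return {
        c: {t: n * math.log(1 + avg_words / total[t]) for t, n in counts.items()}
        for c, counts in tf.items()
    }


def top_terms(scores, k=3):
    """Return the k highest-scoring terms per cluster (ties broken alphabetically)."""
    return {
        c: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for c, s in scores.items()
    }
```

Terms that are frequent within one cluster but rare elsewhere receive the highest scores, which is what makes them useful as topic labels.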

Daniel Matter explains the concept of hierarchical clustering.

When deciding between BERTopic and LDA, it is essential to consider the specific requirements of the text analysis. BERTopic’s strength lies in its flexibility and ability to handle short texts effectively, while LDA is preferred when strong interpretability is necessary.

With this workshop, we at the Methods Lab hope to have provided our attendees with a comprehensive understanding of topic modeling as a method, with a special focus on LDA and BERTopic. Having explored the concepts, applications, and advantages of each approach, researchers can use these tools to uncover hidden semantic structures within textual data and apply them in various domains, for tasks such as document clustering, information retrieval, and recommender systems.

The workshop was held in the Flexraum at the Weizenbaum Institute.

We want to thank Daniel for giving this workshop and introducing us to the world of topic modeling, and we also thank all participants, both online and at the institute.

Our next workshop, “Whose data is it anyway? Ethical, practical, and methodological challenges of data donation in messenger groups research”, will take place on August 30, 2023. We hope to see you there!

Workshop: Theory Construction: Building and Advancing Theories for Empirical Social Science (September 14, 2023)

We are excited to announce our upcoming workshop, “Theory Construction: Building and Advancing Theories for Empirical Social Science,” which will take place on Thursday, September 14 in the Kassenhalle (main hall), WI. Led by Adrian Meier (FAU Erlangen-Nürnberg) and created in collaboration with Dr. Daniel Possler (JMU Würzburg), this intensive “crash course” will equip participants with practical strategies for constructing and advancing social scientific theories. Beginning with an exploration of fundamental concepts, structure, and quality criteria of social scientific theories, Adrian will delve into hands-on techniques for building and advancing theory. The workshop will focus on the theory-building process as well as the micro-level of social analysis, offering examples from media psychology and communication science.

You can find more about the workshop on our program page. See you there!

Workshop Recap: From Civic Tech to Science – Reimagining Science-Society Relations (July 6, 2023)

On July 6, participants gathered in the Flexraum at the Weizenbaum Institute for the workshop “From Civic Tech to Science: Reimagining Science-Society Relations,” led by Nicolas Zehner. Civic tech encompasses a diverse array of empowering technologies that enable democratic participation by allowing citizens to engage with societal issues and contribute to positive change. What insights can science gain from civic tech initiatives? How can they contribute to inclusive knowledge creation? And how can the design of these initiatives help rethink science-society relations? Those were some of the key questions that guided this workshop.

The workshop involved three introductory position statements, each shedding light on different aspects of civic tech’s impact. The position statement on “The Journalism of Things,” exemplified by projects like “Radmesser” and “Bienenlive,” demonstrated how civic tech can influence citizen behavior, raise topic visibility, and foster transdisciplinary knowledge. Dr. Beatrice Jetto’s position statement, “Blockchain-based Civic Tech Ecosystem: Bridging the Gap Between Research and Practice Objectives”, highlighted the potential of blockchain-based civic tech to make citizen participation in urban development more inclusive and transparent. Finally, Nicolas Zehner’s position statement, “AI, Environmental Protection, and the Promise of Participation”, discussed how Artificial Intelligence (AI) can serve as a platform for reimagining science-society relations and as a gateway to thinking about more global issues by reintroducing the concept of “awareness of uncertainty” as a form of knowledge.

Following the position statements, the workshop engaged participants in group work sessions, facilitating discussions on knowledge transfer beyond conventional science communication. Collaboratively, they explored ways to create infrastructures that foster collaboration and include data subjects, avoiding the reproduction of existing power structures and ensuring equitable civic tech initiatives.

Some results of the group work

Workshop Recap: DSA – Data Access for Research (June 21, 2023)

Data is an invaluable asset for scientific research. However, accessing platform data for academic purposes has become increasingly challenging, particularly with the closure of free access to APIs like Twitter’s. Recognizing the significance of data accessibility for research, the Weizenbaum Institute organized a workshop in collaboration with the European New School of Digital Studies (ENS) titled “Datenzugang für die Forschung – Der Digital Services Act (DSA)” (“Data Access for Research: The Digital Services Act (DSA)”). This workshop, held on June 21 at the ChangeHub, aimed to explore the potential of the upcoming Digital Services Act (DSA) in facilitating data access for academic research.

The DSA is set to improve data access for researchers under Article 40. However, its regulations must be thoughtfully implemented at the national level to achieve these goals fully. With the closure of free access to Twitter’s API, there is an urgent need for robust solutions that enable researchers to access platform data for scientific inquiry. The DSA, which is expected to become fully applicable in February 2024, promises to provide avenues for researchers to obtain the data they need for their academic research. Still, it also brings its own set of challenges.

The workshop aimed to foster an open forum where researchers from diverse disciplines, particularly those who work or plan to work with platform data, could come together to provide recommendations for the effective implementation of the DSA. Organized by Ulrike Klinger (ENS) and Jakob Ohme (WI) and supported by the Stiftung Mercator, the workshop addressed crucial questions surrounding data access requests, eligible data, and the verification process by authorities and platforms.

The workshop started with a welcoming address from Ulrike Klinger. Jakob Ohme then provided an overview of the DSA’s Article 40, shedding light on its potential implications for researchers. This was followed by presentations on the DSA’s implementation in Germany by Gökhan Cetintas from the Bundesministerium für Digitales und Verkehr and Andrea Sanders-Winter from the Bundesnetzagentur, who offered insights into the data access rules under the DSA.

After a coffee break, Jessica Gabriele Walter from Aarhus University presented on DSA40 and scholarly networks in other EU countries, providing a broader perspective on data access challenges and solutions. Richard Kuchta from Democracy Reporting International later delved into “The Data Access Problem” and emphasized the necessity of a vetting process to ensure data security and accuracy.

The latter part of the workshop involved group work in which participants engaged in the discussion and expansion of a policy paper draft prepared by the Weizenbaum Institute and ENS, based on inputs from an early expert round. The goal was to develop actionable recommendations that would benefit the research community in Germany and the EU. Breakout sessions centered on topics like “Vetting Access,” “Access Modes,” and “Infrastructure,” allowing participants to delve deeper into specific aspects of data access.

The workshop brought together an interdisciplinary group of researchers with a shared vision: enabling access to platform data for academic purposes. By combining their expertise and perspectives, participants crafted recommendations for the effective implementation of the DSA, ensuring that data access for research remains equitable and secure. As the DSA comes into force and takes shape, the outcomes of this workshop are expected to serve as a significant step forward in fostering inclusive dialogue on the future of data accessibility.

Further Information
- Thursday Lunch Talk Series: Article 40 of the DSA (April 20, 2023)
- Response to the Call for Evidence DG CNECT-CNECT F2 by the European Commission
- Interview with Jakob Ohme “Researchers Fight for Data Access under the DSA”

Launch of the Weizenbaum Panel Data Explorer

We are very excited to announce the launch of the Weizenbaum Panel Data Explorer, an interactive website developed by Methods Lab member Roland Toth. The Data Explorer allows you to browse and analyze results from the annual survey conducted by the Weizenbaum Panel on media use, political participation, civic norms, and more. In the spirit of open science, it not only makes research data available but also presents them in an easy-to-use manner.

The Weizenbaum Panel aims to shed light on the complex relationship between the digital realm and political engagement. By examining phenomena such as hate speech and fake news, as well as the active commitment to a democratic culture of debate, the telephone survey offers invaluable insights into the ever-evolving dynamics of citizen participation in Germany.

With the launch of the Data Explorer, you can explore this comprehensive dataset and gain a deeper understanding of Germany’s social and political landscape. The platform offers several categories, including social media platform use, political attitudes, civic norms, political participation, and online civic intervention. Each category presents a unique perspective, allowing you to examine specific aspects of Germany’s social and political fabric.

The Weizenbaum Panel Data Explorer interface

To begin your exploration, simply select a category that piques your interest. Within each category, you will find a selection of questions to delve into. Whether you want to gauge the political news media consumption of the German public, analyze trends in the use of video platforms such as TikTok and Instagram, or find out how often people discuss political issues at work, or with friends and family, the Data Explorer will assist you in this endeavor.

For a nuanced understanding of how different groups within the population engage in social and political activities, you can group the data output by selecting the demographic factors gender, age, level of education, or residence. Moreover, to enhance your experience and facilitate data sharing, you can download any graph in .png format. Each graph includes the question, answering options, and grouping, providing a comprehensive visual representation of the desired data.
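The grouping described above boils down to computing answer shares within each demographic group. The following minimal sketch shows that computation on made-up records; the field names (`age_group`, `talks_politics_at_work`) and the function `shares_by_group` are hypothetical and are not taken from the Data Explorer's code or the Weizenbaum Panel data.

```python
from collections import Counter, defaultdict


def shares_by_group(responses, question, group_by):
    """Return, per group, the share of respondents choosing each answer option."""
    counts = defaultdict(Counter)
    for row in responses:
        counts[row[group_by]][row[question]] += 1
    return {
        group: {answer: n / sum(answers.values()) for answer, n in answers.items()}
        for group, answers in counts.items()
    }


# Made-up example records, for illustration only.
responses = [
    {"age_group": "18-29", "talks_politics_at_work": "often"},
    {"age_group": "18-29", "talks_politics_at_work": "never"},
    {"age_group": "60+", "talks_politics_at_work": "often"},
    {"age_group": "60+", "talks_politics_at_work": "often"},
]
result = shares_by_group(responses, "talks_politics_at_work", "age_group")
```

A real analysis of survey data would typically also apply survey weights, which this sketch leaves out.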

A graph downloaded in .png format

The Weizenbaum Panel Data Explorer was developed in Python on JupyterHub and deployed using Voilà, all of which are open source. It is hosted on Weizenbaum Institute servers, which ensures adequate data protection. This is not the case for typical solutions such as R Shiny combined with the deployment platform shinyapps.io. The Data Explorer will be expanded continuously; for example, the fourth wave of the Weizenbaum Panel will be integrated soon.

Whether you’re a researcher, journalist, student, or simply someone curious about Germany’s social and political landscape, the Weizenbaum Panel Data Explorer equips you with the tools to visualize data effortlessly. Happy exploring!

Thursday Lunch Talk Series: Article 40 of the DSA (April 20, 2023)

Researchers in the EU are about to have a new legislative framework to access and study data held by platforms and search engines in the form of Article 40 of the Digital Services Act (DSA) – a major milestone in platform regulation history expected to have spillover effects worldwide. As part of the Thursday Lunch Talk Series, Jakob Ohme (WI) and the Methods Lab jointly organized a talk to gain more insight into what this Article 40 means in the context of German law, and the consequences it might have on researchers’ access to platform data. Tupperware and brown paper bags in hand, hungry participants gathered in the Flexraum to listen to Jakob give the ABCs of the EU’s new data access regime and discuss some of its opportunities, limitations, and grey areas.

Here is a quick summary of Article 40:

  1. Providers of very large online platforms (VLOPs) or very large online search engines (VLOSEs) shall provide the Digital Services Coordinator (DSC) of Establishment or the Commission, at their reasoned request and within a reasonable period specified in that request, access to data necessary to monitor and assess compliance with the DSA.
  2. Data accessed can only be used for monitoring and assessing compliance while taking into account the rights and interests of the platform providers, service recipients, personal data protection, and the security of their services.
  3. Platforms must explain the design, logic, functioning, and testing of their algorithmic systems, including recommender systems, upon request.
  4. Vetted researchers can request access to data to conduct research on “systemic risks” in the EU and assess risk mitigation measures.
  5. Within 15 days, platforms can request to amend a data access request as referred to in §4 if:
    (a) they do not have access to the data
    (b) giving access to the data will lead to significant vulnerabilities in the security of their service or the protection of confidential information, particularly trade secrets.
  6. Requests for amendment pursuant to §5 should propose alternative means for providing access to appropriate and sufficient data.
  7. Platform providers or search engines shall facilitate and provide access to data pursuant to §1 and §4 through appropriate interfaces specified in the request, including online databases or application programming interfaces.
  8. Researchers can be granted the status of “vetted researchers” if they meet specific conditions, including affiliation with a research organization, independence from commercial interests, disclosure of research funding, capability to fulfill data security requirements, and commitment to making research results publicly available.
  9. Researchers can submit applications to the DSC of the Member State they are affiliated with, which conducts an initial assessment before forwarding the application to the DSC of Establishment for a final decision.
  10. The DSC can terminate data access for vetted researchers if they no longer meet the conditions. The coordinator must inform the platform provider and allow the researcher to respond before terminating access.
  11. DSCs must inform the Board about vetted researchers and their research purposes. If access to data is terminated, they must also inform the Board.
  12. Platforms must provide timely access to publicly accessible data, including real-time data, to researchers who meet the conditions and use it for research on systemic risks.
  13. With input from the Board, the Commission will adopt delegated acts to specify technical conditions for data sharing, including with researchers, while considering the rights and interests of platforms and service recipients, protection of confidential information, and maintaining service security.

Both the presenter and the audience highlighted several aspects of the infrastructure and implications of the article, which made for a vibrant, fruitful discussion. One question focused on the effort platforms might make to prevent researchers from acquiring data (§5). Though making projections at this point is challenging due to the remaining unknowns, lawyers predict that platforms will try harder to prevent researchers’ access to data in some areas than in others. One such area could be questions pertaining to algorithms, which would fall under the so-called “trade-secret exemption.” Another topic of discussion was the “systemic risk research” requirement (§4). More specifically, what do we mean when we speak of systemic risks? Because the term can be understood very broadly, it would hypothetically be possible to file a request as long as one can argue for a broad understanding of it.

Some details regarding the data vetting process and its implementation remain unclear, such as the establishment of an independent advisory mechanism and the technical conditions under which it would operate. Most of the largest platforms and search engines are based in Ireland, so the DSC of Establishment tasked with vetting researchers will likely be the Irish DSC in many cases. Researchers can also send their applications to their country’s national digital services coordinator. In terms of regulatory oversight in Germany, it is anticipated that the Bundesnetzagentur will play a significant role as the DSC regulator. The future German DSC will be able to provide an opinion about whether to grant a data access request, but the final decision will remain in the hands of the Irish DSC.

DSCs are yet to be appointed by EU member states, and complex vetting may require an independent advisory body responsible for this task. However, the establishment of an independent advisory mechanism comes with its own set of challenges. How much power will the board have? And how will the board make its decisions? During the talk, the difficulty of dealing with and assessing raw data when one does not know what to look for was identified as another potential issue. An alternative model could involve access to publicly accessible data without vetting. This approach would be similar to what the Twitter API has provided in the past, and it may prove to be an exciting option for fueling research, particularly if implemented in real time and through application programming interfaces.

This edition of the Thursday Lunch Talk Series shed light on several key aspects of Article 40, emphasizing the opportunities and challenges it could create for researchers’ access to platform data in the future. While some details, such as the data vetting process, remain uncertain, the presentation sparked valuable discussions, highlighting the complexities and considerations involved in what lies ahead for platform providers, researchers, and lawmakers in navigating our digital landscape.

Food for thought!

Further Information
- Response to the Call for Evidence DG CNECT-CNECT F2 by the European Commission
- Interview with Jakob Ohme “Researchers Fight for Data Access under the DSA”

Research stay at Universidad de Navarra (Pamplona, Spain)

From April 17 to 23, Methods Lab Data Scientist Roland Toth spent a week at the Institute for Culture and Society (ICS) at Universidad de Navarra in Pamplona, Spain. This short visiting-researcher stay was financed and carried out in the context of the ICS project Youth in Transition, for which the institute has collected data every year for four years from a representative sample of the Spanish population. These data include information on smartphone use, smartphone pervasiveness, and psychological traits.

View of the university campus from the north

Together with the researchers Aurelio Fernández, Javier García-Manglano, and Pedro de la Rosa, Roland wrote a first draft of a research article using these data. As mobile media use is typically measured using indicators of use quantity (duration and frequency) alone, the paper asks whether qualitative dimensions of mobile media use should be included in its measurement, too. Specifically, the researchers are investigating the role of gratification variety (e.g., use for information, social contact, or escapism) and situation variety (e.g., use while in a meeting, while watching a movie, or while eating). Both represent defining characteristics of mobile media devices like the smartphone, as we typically use them for various purposes, anytime, and anywhere. For conceptual validation, the researchers examine whether these two qualitative dimensions contribute substantially to predicting mobile vigilance, that is, the constant salience of mobile media devices and an urge to monitor and remain reactive to them. As such vigilance is tied to mobile media use by definition and emerged in close alignment with its development, it is bound to be associated with smartphone use. In other words: if the gratification variety and situation variety of smartphone use can explain a share of mobile vigilance that remains unexplained by the quantity of smartphone use, this indicates that both dimensions are essential to the measurement of mobile media use. The researchers are currently finalizing the article.
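One common way to formalize this kind of incremental-validity argument is a hierarchical regression, in which the qualitative dimensions are added to a model that already contains the quantity indicators. Whether the authors use exactly this specification is an assumption; the symbols below (V for mobile vigilance, D and F for duration and frequency, G and S for gratification and situation variety) are introduced here for illustration only.

```latex
\begin{align*}
\text{Model 1:}\quad V_i &= \beta_0 + \beta_1 D_i + \beta_2 F_i + \varepsilon_i \\
\text{Model 2:}\quad V_i &= \beta_0 + \beta_1 D_i + \beta_2 F_i + \beta_3 G_i + \beta_4 S_i + \varepsilon_i \\
\Delta R^2 &= R^2_{\text{Model 2}} - R^2_{\text{Model 1}}
\end{align*}
```

Under this reading, a substantial positive difference in explained variance would indicate that gratification and situation variety capture a share of mobile vigilance that use quantity alone does not.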

Inside the Facultad de Comunicación

Inviting Roland for this stay was a generous gesture by ICS, and both the researchers and the institute were very welcoming and engaged in the project during his stay. Aside from the productive cooperation, our colleague was delighted with the beautiful campus and the equally charming city of Pamplona (and Donostia-San Sebastián), where spring had already begun. We hope that the article can be published successfully and that the cooperation between ICS at Universidad de Navarra and the Methods Lab of the Weizenbaum Institute will continue in future projects!

Roland Toth (left) and Aurelio Fernández (right)