Career Tutorial: LLMs for all Expertise Levels (March 7, 2025)

In a joint effort, the Career Development and the Methods Lab are excited to announce the hybrid “Career Tutorial on LLMs for all Expertise Levels”. In this tutorial, beginning with fundamental concepts of LLMs and in-context learning, we’ll address the “Needle in the Haystack Problem” and compare ultra-long context models with RAG approaches. Through practical demonstrations, participants will gain hands-on experience with RAG’s core functionalities and understand its objectives. The session delves into scaling solutions using vector databases and advanced implementations, including chunking strategies, hybrid RAG, and graph-based RAG architectures. We conclude with an overview of emerging trends, examining agentic RAG and the integration of reasoning models in deep research applications. This comprehensive exploration equips attendees with both theoretical knowledge and practical insights into the latest developments in AI language models.

For more information, visit our program page. We are looking forward to your participation!

Workshop: Introduction to Git

Join us in our first workshop of 2025 for an Introduction to Git, held on Thursday, February 6th. This event will be taking place at the Weizenbaum Institute and welcomes Weizenbaum Institute members to participate.

Associate researcher LK Seiling, IT administrator Sascha Kostadinoski, and student assistant Quentin Bukold will be the primary instructors leading this event. Together they will guide participants through short theoretical segments, introducing fundamental Git commands and version control concepts. In addition to covering key GitLab features, the workshop includes quizzes and interactive exercises.

For further details, visit our program page. We hope to see you there!

Show and Tell Recap: OpenQDA – A Sustainable and Open Research Software for Collaborative Qualitative Data Analysis

On November 18, 2024, Karsten Wolf and Florian Hohmann from the University of Bremen presented the software OpenQDA at WI. In this Show and Tell, they gave an overview of OpenQDA and its motivations, functions, and limitations.

In the first part of the Show and Tell, Karsten Wolf presented the development and purpose of the software. It is an open-source alternative to the commercial software MaxQDA, which is a popular tool for text annotation (i.e., coding) in qualitative research. The team at the University of Bremen had been working on OpenQDA for quite some time to not only deliver a free and customizable alternative to MaxQDA, but also allow for (simultaneous) collaboration on projects. In addition, OpenQDA has a plug-in framework that will be expanded over time. For example, aTrain is already supported and can be used to transcribe audio files to text, and a plug-in that allows for implementing Python scripts is currently in the works. While OpenQDA is still under development and currently in early access, the first official release is planned for the near future. It runs on servers at the University of Bremen and can be used by anyone for free.

In the second part of the Show and Tell, Florian Hohmann gave a practical introduction to the most recent version of the software. He showed participants how to create an account, set up a new project, and create a team to work on projects collaboratively. Text content can be added manually, from documents, audio files, and soon even remote sources. These texts can then be annotated/coded using separate, color-coded categories, and it is possible to set up sub-categories for further refinement. The results can be exported in CSV format. In addition, users can create a code portrait, which illustrates the distribution of categories across the text, and a word cloud for quick visual analysis.

At the end of the Show and Tell, participants provided feedback and suggestions for future implementation. For example, the automated conversion of scanned documents to plain text using OCR, and functions like counting and automatic coding, were discussed. Some participants were willing to stay and provide further feedback even after the main event ended. Finally, the team from Bremen, the Methods Lab, and the Weizenbaum Institute IT department discussed the installation of OpenQDA on the Institute’s servers in 2025 to provide a local instance to Weizenbaum Institute researchers.

The Methods Lab would like to thank the colleagues from Bremen for their work, and all participants for providing useful feedback!

Workshop Recap: Research in Practice – Attending to Algorithms in and Around Organizations

On November 26, 2024, Maximilian Heimstädt, Professor of Digital Governance & Service Design at the Helmut Schmidt University in Hamburg, shared his experiences and expertise in applying qualitative methods to studying algorithms in organizations. This workshop was co-organized by the Methods Lab and the Research in Practice – PhD Network for Qualitative Research, coordinated by Katharina Berr and Jana Pannier.

The workshop focused on the complexities of studying algorithms from an interpretivist social science perspective; not only the potentials and risks people ascribe to them, but how they are made sense of, enacted, negotiated and integrated into everyday work settings. Drawing on joint research with Simon Egbert on predictive policing, Max shared how he gained access to public sector organizations, approached team-based multi-sited ethnographic fieldwork and learned to understand complex technologies developed and implemented across different empirical sites and over time.

Max introduced three central theoretical approaches from organization studies and critical data studies to research algorithms in practice: technology trajectories, biographies of algorithms, and data journeys, which afford different analytical lenses and offer more nuanced understandings of algorithmic systems. The approach of technology trajectories expands research on the design and use of technologies by integrating broader questions of power, ideology, and institutional change (Bailey & Barley, 2020). Approaching digitalization research from a biographies approach draws attention to the dynamic development of digital technologies, understood as ‘entangled, relational, emergent, and nested assemblages’ across different organizational contexts and time (Glaser, Pollock, & D’Adderio, 2021). Finally, the data journeys approach allows researchers to ‘focus attention on the life of data as they move through space and time, through different sites and cultures of data practice’, and offers a perspective that is attentive to the frictions of such data journeys (Bates, Lin, & Goodale, 2016). Based on an introduction of these approaches, the workshop participants explored how their own research has been (both implicitly and explicitly) informed by these approaches, and discussed their practical and epistemic potentials and limits.

The Idea Behind the ‘Research in Practice’ Workshop Series

Qualitative research often feels polished in academic publications, but the reality is that the process can be quite complex at times, and full of twists and turns. We have created this workshop series to center the ‘backstage’ of qualitative research. The goal is to hear directly from scholars about how they conduct their work – the challenges, the unexpected discoveries and unplanned adaptations, the specific methods and digital tools used, and the strategies that help them arrive at interesting and valuable findings. With this workshop format and research network, we aim to create a space for qualitative researchers within and beyond the Weizenbaum Institute to connect, collaborate, and learn from one another.

What to Expect

Each workshop session in the series brings a new perspective on qualitative (digital) research. Invited scholars walk us through their research processes, focusing on how they have handled the challenges of their work. This includes designing studies, building rapport with research participants, analyzing different kinds of qualitative data, theorizing as method, and navigating ethical considerations. The sessions are interactive, offering opportunities to ask questions, share ideas, and discuss in depth. By opening up the processes behind qualitative research, we hope to demystify the work and facilitate conversations that help researchers at all levels.

If you would like to join our network and to be informed about upcoming events, reach out to Katharina Berr and Jana Pannier.

Workshop Recap: Open Research – Principles, Practices, and Implementation

On September 3, 2024, Tobias Dienlin from the University of Vienna held the workshop Open Research – Principles, Practices, and Implementation at WI. In this workshop, he gave an overview of Open Research and its motivations, relevance, and formal and technical implementation.

In the first part of the workshop, Tobias argued that certain problems and values in science are the main reasons why researchers should practice Open Research. The problems included the replication crisis (a lack of or low quality of replication studies, especially in the social sciences), questionable research practices (p-hacking, HARKing, errors), and publication bias (journals prefer exciting, expected, and significant results). The values in question included openness as a foundation of science itself and the dedication to scientific advancement instead of emphasizing individuals that achieve it.

In the second part, the formal practices of Open Research were discussed. Tobias first clarified the differences between the terms Open Science, Open Research, and Open Scholarship. To achieve a culture of Open Research, he suggested aiming for open access, pre-/post-printing, open reviews, author contribution statements, open teaching, and citizen science. While these practices usually require additional work, the burden can be lowered by considering and preparing them in the initial stages of a research project, for instance by implementing two of the most important Open Research practices: preregistrations and registered reports.

  • In a preregistration, any details of a study that are already fixed (e.g., theoretical foundation, research questions, hypotheses, analysis methods, …) are published before conducting the study itself. After conducting the study, the preregistration is referred to in the manuscript, and possible deviations from it are explained. This procedure reduces the possibility and risk of p-hacking and HARKing, and under specific circumstances a preregistration can even take place after the data have already been collected.
  • A registered report is a more elaborate version of a preregistration. It consists of all parts of a submission that do not involve the analysis and the results. The submission can therefore be reviewed before the data and results even exist. This way, reviewers are not influenced by results and publication bias can be avoided. While a preregistration can be published anywhere, the registered report format needs to be offered by the journal itself.

In the last part of the workshop, the focus was on tools and software that help implement Open Research practices. For example, the free-to-use repository OSF can be used for pre-/post-prints, preregistrations, and online supplementary materials such as data, analysis code, or questionnaires. As an exercise, Tobias gave participants the opportunity to implement a basic preregistration or registered report on OSF for a research project they were working on already and try different features, such as linking it to a repository on GitHub. After summarizing the insights of the workshop, Tobias concluded by showing a fitting statement:

Open Science: Just Science Done Right.

During the workshop, participants had plenty of space to ask questions, discuss with everyone or in separate breakout rooms, and interact in various ways. We would like to thank Tobias for this insightful workshop and strongly encourage the implementation of Open Research.

Workshop: Open Research – Principles, Practices, and Implementation (September 3, 2024)

We’re excited to announce our upcoming workshop Open Research – Principles, Practices, and Implementation, which will take place on Tuesday, September 3. This workshop will be conducted both at the Weizenbaum Institute and online, and is open to Weizenbaum Institute members as well as external participants (and the QPD).

Led by Tobias Dienlin, Assistant Professor of Interactive Communication at the University of Vienna, this workshop will equip participants with skills in open research by covering principles of transparency, reproducibility, the replication crisis, and practical sessions on sharing research materials, data, and analyses. It will also include preregistrations, registered reports, preprints, postprints, TOP Guidelines, and initiatives like DORA, CORA, and RESQUE. Participants will engage in drafting preregistration plans and discussing the incentives and challenges of open research, aiming to integrate these practices into their work for a more transparent and robust research community.

For further details, visit our program page. We are looking forward to your participation!

Workshop Recap: Introduction to High-Performance Computing (HPC)

On May 6, 2024, Dr. Loris Bennett from FUB-IT at Freie Universität Berlin held the workshop Introduction to High-Performance Computing (HPC) at WI. In this workshop, he gave an overview of the mechanics of HPC and enabled participants to try it out themselves. While the workshop used the HPC cluster provided by FUB-IT as a practical example, most of the contents applied to HPC in general.

Dr. Bennett began with definitions of HPC and core concepts. He described HPC as a cluster of servers providing cores, memory, and storage with high-speed interconnections. These resources are shared between users and distributed by the system itself. Users send jobs consisting of one or more tasks to the HPC cluster. Each task will run on a single compute server, also called a node, and can make use of multiple cores up to the maximum available on a node. The number of tasks per node can be set for each job, but defaults to one. Lastly, an HPC cluster may provide different file systems for different purposes. For example, the file system /home is optimized for large numbers of small files used for programs, scripts, and results, while /scratch is optimized for temporary storage of small numbers of large files.

Next, Dr. Bennett proceeded with resource management. When launching a job, many parameters can be set, such as the number of CPU and GPU cores, the amount of memory, and the time used. In order to determine the resources required for jobs, users need to run a few jobs and check what was actually used. This information can then be used to set the requirements for future jobs and thus ensure that the resources are used efficiently. The priority of a job dictates when a job is likely to start and depends mainly on the amount of resources consumed by the user in the last month. A Quality of Service (QoS) can be set per job which will increase the priority of a job, but the jobs within a given QoS will be restricted in the total amount of resources they can use. In addition, it is possible to parallelize tasks by splitting them into subtasks that can be performed simultaneously. Likewise, many similar jobs can be planned efficiently using job arrays.
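The step of checking what a job actually used and sizing later requests accordingly can be illustrated with a little arithmetic. The following is a minimal Python sketch, not a tool shown in the workshop: the `JobUsage` fields, the function names, and the 25% memory headroom are assumptions chosen for illustration, and the numbers would have to come from whatever accounting output the cluster provides.

```python
import math
from dataclasses import dataclass


@dataclass
class JobUsage:
    """Requested vs. observed resources of one finished job (hypothetical fields)."""
    cores_requested: int
    elapsed_seconds: float    # wall-clock run time
    cpu_seconds_used: float   # CPU time summed over all cores
    mem_requested_mb: float
    mem_peak_mb: float        # highest memory use observed during the run


def cpu_efficiency(job: JobUsage) -> float:
    """Fraction of the reserved core time that was actually spent computing."""
    reserved = job.cores_requested * job.elapsed_seconds
    return job.cpu_seconds_used / reserved if reserved else 0.0


def memory_efficiency(job: JobUsage) -> float:
    """Fraction of the requested memory that was actually needed at peak."""
    return job.mem_peak_mb / job.mem_requested_mb


def suggest_memory_mb(jobs: list[JobUsage], headroom: float = 1.25) -> int:
    """Suggest a memory request: largest observed peak plus a safety margin."""
    return math.ceil(max(job.mem_peak_mb for job in jobs) * headroom)


# Example: a test job that reserved 4 cores and 16000 MB for one hour.
test_job = JobUsage(
    cores_requested=4,
    elapsed_seconds=3600,
    cpu_seconds_used=7200,
    mem_requested_mb=16000,
    mem_peak_mb=4000,
)
print(f"CPU efficiency:    {cpu_efficiency(test_job):.0%}")
print(f"Memory efficiency: {memory_efficiency(test_job):.0%}")
print(f"Suggested memory:  {suggest_memory_mb([test_job])} MB")
```

In this made-up example, only half of the reserved core time and a quarter of the requested memory were used, so a follow-up job could reasonably ask for fewer cores and far less memory.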

Finally, participants could log into the FUB-IT HPC cluster themselves either using the command line or graphical interface tools and request first sample jobs. They were shown how to write batch files defining job parameters, use commands to submit, show, or cancel jobs, and check the results and efficiency of a completed job.
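To give an impression of what such a batch file can contain, here is a minimal Python sketch that renders one. It assumes the cluster uses the Slurm scheduler, which this recap does not state; the `#SBATCH` options shown are standard Slurm directives, while the function `render_batch_script`, the job name, and the script `analyze.py` are hypothetical.

```python
def render_batch_script(job_name, command, *, ntasks=1, cpus_per_task=1,
                        mem_mb=1024, time_limit="00:10:00",
                        qos=None, array=None):
    """Return the text of a batch file (assumption: Slurm '#SBATCH' syntax)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",                # number of tasks in the job
        f"#SBATCH --cpus-per-task={cpus_per_task}",  # cores available to each task
        f"#SBATCH --mem={mem_mb}M",                  # memory per node in megabytes
        f"#SBATCH --time={time_limit}",              # wall-clock limit, HH:MM:SS
    ]
    if qos is not None:
        lines.append(f"#SBATCH --qos={qos}")         # optional Quality of Service
    if array is not None:
        lines.append(f"#SBATCH --array={array}")     # optional job array, e.g. 0-9
    lines += ["", command]
    return "\n".join(lines) + "\n"


# Example: ten similar jobs as one job array; each array element receives
# its own index in the environment variable SLURM_ARRAY_TASK_ID.
script = render_batch_script(
    "demo",
    "python analyze.py --part $SLURM_ARRAY_TASK_ID",  # hypothetical script
    cpus_per_task=4,
    mem_mb=5000,
    time_limit="01:00:00",
    array="0-9",
)
print(script)
```

Under the same Slurm assumption, a file written this way would be submitted with `sbatch`, listed with `squeue`, and cancelled with `scancel`.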

The Methods Lab would like to thank Dr. Bennett for his concise but comprehensive introduction to HPC!

Workshop Recap: Research Ethics – Principles and Practice in Digitalization Research

On April 18, 2024, the Methods Lab organized the workshop Research Ethics – Principles and Practice in Digitalization Research to meet the increasing relevance and complexity of ethics in digitalization research.

In the first part of the workshop, Christine Normann (WZB) introduced participants to good research practice and research ethics in alignment with the guidelines of the German Research Foundation (DFG). Besides addressing the need to balance the freedom of research and data protection, she pointed participants to important institutions, noted the difficulties of formulating ethics statements for funding applications before study designs are finalized, and provided some practical tips on where to find guidance when planning research.

Next, Julian Vuorimäki (WI) guided participants through the handling of research ethics at the Weizenbaum Institute. He focussed on the code of conduct, ombudspersons, guideline for handling research data, and the newly founded review board. The latter is in charge of providing ethics reviews for individual projects and studies, which can be applied for through a questionnaire on the institute website.

In the second part of the workshop, three researchers presented practical ethical implications and learnings from research projects. Methods Lab lead Christian Strippel reported on a study where user comments were annotated to allow for the automatic detection of hate speech. He focused on possible misuse for censorship, the confrontation of coders with questionable content, and the challenges of publishing the results and data regarding copyright and framing. Tianling Yang (WI) presented ethical considerations and challenges in qualitative research. The focus lay on consent acquisition, anonymity and confidentiality, power relations, reciprocity (i.e., incentives and support), and the protection of the researchers themselves due to the physical and emotional impact of qualitative field work. Finally, Maximilian Heimstädt (Helmut Schmidt University Hamburg) talked about ambiguous consent in ethnographic research. He gave insights into a study in cooperation with the state criminal police office to predict crime for regional police agencies. Not all individuals in this research could be informed about the research endeavor, especially when the researchers accompanied the police during their shifts, which raised the question of how to find a balance between overt and covert research.

The Methods Lab thanks all presenters and participants for this insightful workshop!

Workshop: Introduction to High-Performance Computing (HPC) (May 6, 2024)

We’re excited to announce our upcoming workshop Introduction to High-Performance Computing (HPC), scheduled for Monday, May 6th at the Weizenbaum Institute. Led by Loris Bennett (FU) from the HPC service at Freie Universität Berlin, the workshop is open to members of the Weizenbaum Institute with an FU account and access to HPC resources at FU. It aims to provide fundamentals on utilizing HPC resources in general, using those offered by FU Berlin as an example.

For further details about the workshop, please visit our program page.