Scraping – WI Methods Lab

This blog post discusses when and when not to use the official TikTok API. It also provides step-by-step instructions for a typical research scenario to help aspiring researchers get started with the API.

When and when not to use it

Although it is the official route to data access, the official TikTok API is by no means the only way to collect TikTok data in an automated fashion. Depending on the research endeavour, one of the following alternatives might be the better choice:

4CAT + Zeeschuimer: Sensible if you want to collect limited data on one or more actors, hashtags, or keywords and/or are not confident in programming for the subsequent analysis.
An unofficial TikTok API (pyktok or the Unofficial TikTok API in Python): Both are great projects that provide significantly more data points than the official API. However, this comes at a cost: lower stability and a dependence on developers reacting to changes on TikTok’s site.

But why should you use the official TikTok API if those two options are available?

Reliability. In theory, the official API data access provides more stable access than other solutions.
Legality. Depending on your country or home institution, unofficial data access might be a problem for legal reasons. With official data access, you are on the safer side. Please consult your institution regarding data access.
User-level data. Other data collection methods are often superior in terms of data points on the video level (Ruz et al. 2023). However, the official TikTok API offers a set of user-level data (User info, liked videos, pinned videos, followers, following, reposted videos), which is not as conveniently available through other data collection methods.

One fundamental limitation still needs to be kept in mind: one can make only 1,000 requests per day, each returning at most 100 records (e.g., videos, comments). Even if every request yields the full 100 records (which is rarely possible), the maximum is 100,000 records per day.
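To make the quota arithmetic concrete, here is a minimal planning sketch. It uses only the two limits stated above; the function names and the idea of an "average yield per request" are illustrative assumptions, not part of the API.

```python
import math

MAX_REQUESTS_PER_DAY = 1_000    # daily request limit stated above
MAX_RECORDS_PER_REQUEST = 100   # upper bound on records per request


def max_records_per_day(avg_records_per_request=MAX_RECORDS_PER_REQUEST):
    """Upper bound on records retrievable in one day at a given average yield."""
    if not 0 < avg_records_per_request <= MAX_RECORDS_PER_REQUEST:
        raise ValueError("average yield must be between 1 and 100 records")
    return MAX_REQUESTS_PER_DAY * avg_records_per_request


def days_needed(target_records, avg_records_per_request=MAX_RECORDS_PER_REQUEST):
    """Number of days required to collect target_records (hypothetical helper)."""
    return math.ceil(target_records / max_records_per_day(avg_records_per_request))


print(max_records_per_day())       # best case: 100000 records per day
print(days_needed(250_000))        # best case: 3 days
print(days_needed(250_000, 40))    # 40 records per request on average: 7 days
```

Since requests rarely return the full 100 records, planning with a lower average yield gives a more realistic estimate of how long a collection will take.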

To get started with the official TikTok Research API, visit the Research API page. To gain access, you need to create a developer account and submit an application form. When doing so, please record your access request in the DSA40 Data Access Tracker to contribute to an effort to track the data access that platforms provide under DSA40.

The official documentation on Research API usage (Documentation) is not intuitive, especially for newcomers. Using the API from a common programming language such as Python or R might still pose a challenge, especially for researchers who are working with an API for the first time. Because guidance on the API is currently scarce, this blog post aims to provide it without a paywall.
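To give a first impression of what a request looks like from Python, the following sketch builds (but does not send) a video query using only the standard library. The endpoint URL, the body layout, the field names, and the YYYYMMDD date format are assumptions based on TikTok's public Research API documentation and should be checked against the current version; `build_video_query` and the token placeholder are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint from TikTok's public documentation; verify before use.
VIDEO_QUERY_URL = "https://open.tiktokapis.com/v2/research/video/query/"


def build_video_query(access_token, hashtag, start_date, end_date,
                      fields=("id", "create_time", "username"), max_count=100):
    """Build a POST request for videos with a given hashtag (hypothetical helper)."""
    if not 1 <= max_count <= 100:
        raise ValueError("max_count must be between 1 and 100")
    body = {
        # Assumed query syntax: match videos whose hashtag equals `hashtag`.
        "query": {"and": [{"operation": "EQ",
                           "field_name": "hashtag_name",
                           "field_values": [hashtag]}]},
        "start_date": start_date,   # assumed format: YYYYMMDD
        "end_date": end_date,
        "max_count": max_count,     # at most 100 records per request
    }
    # The requested fields are passed as a comma-separated query parameter.
    query_string = urllib.parse.urlencode({"fields": ",".join(fields)}, safe=",")
    return urllib.request.Request(
        VIDEO_QUERY_URL + "?" + query_string,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {access_token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


request = build_video_query("YOUR_ACCESS_TOKEN", "climate", "20240101", "20240115")
print(request.full_url)
print(json.loads(request.data))
```

Sending the request, for example with `urllib.request.urlopen(request)`, requires a valid access token from an approved research application and counts towards the daily request limit.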

We are thrilled to announce the release of “Challenges and Perspectives of Hate Speech Research,” a collection of 26 texts on contemporary forms of hate speech by scholars from various disciplines and countries. The anthology is co-edited by Methods Lab members Christian Strippel and Martin Emmer, together with research colleagues Sünje Paasch-Colberg and Joachim Trebbe. Divided into three sections, it covers present-day political issues and developments, provides an overview of key concepts, terms, and definitions, and offers numerous methodological perspectives on the topic. Whether you are a fellow academic researcher or a concerned netizen, this book is a must-read for anyone interested in the dynamic field of interdisciplinary hate speech research and the future of our evolving digital landscape.

Challenges and Perspectives of Hate Speech Research is open access!

This book is the result of a conference that could not take place. It is a collection of 26 texts that address and discuss the latest developments in international hate speech research from a wide range of disciplinary perspectives. This includes case studies from Brazil, Lebanon, Poland, Nigeria, and India, theoretical introductions to the concepts of hate speech, dangerous speech, incivility, toxicity, extreme speech, and dark participation, as well as reflections on methodological challenges such as scraping, annotation, datafication, implicity, explainability, and machine learning. As such, it provides a much-needed forum for cross-national and cross-disciplinary conversations in what is currently a very vibrant field of research.

Tag: Scraping

Tutorial: When and how to use the official TikTok API

Book Launch: Challenges and Perspectives of Hate Speech Research