During her fellowship in the DigiMoD Research Group, Dr. Fabienne Lind gave an interesting talk at the Weizenbaum Institute. On September 10th, she spoke on “Studying Global Issues: Accommodating Context and Language Diversity when Employing Computational Text Analysis.”
Addressing and mitigating global issues, such as migration movements or climate change, necessitates collaborative efforts that extend beyond national borders. Understanding the communication dynamics among political actors, citizens, and organizations regarding these global issues is crucial, yet requires employing methods capable of analyzing communication in multiple languages.
In her talk, Dr. Fabienne Lind provided an overview of methodological approaches for managing multilingual datasets from diverse contexts, with a particular focus on the latest innovation: chat-based large language models (LLMs). LLMs are state-of-the-art tools for coding constructs and perform well across many types of variables. However, their effectiveness is largely rooted in extensive training on textual data, predominantly in high-resource languages such as English, German, and Spanish. This raises concerns about the inclusivity of low-resource languages in the training data, and applying these models to less-represented languages may therefore introduce biases. Drawing on results from an experiment, the talk explored how variations in text language and prompt design, including the language of the prompt and the contextual information it provides, affect GPT's coding performance.
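To illustrate the kind of experimental design described, the sketch below crosses two of the prompt-design factors mentioned (prompt language and added contextual information) into conditions for a zero-shot coding task. The templates, the migration coding question, and the `build_prompt` helper are hypothetical examples for illustration, not Dr. Lind's actual materials.

```python
# Hypothetical prompt templates: the same coding instruction
# phrased in English and in German (prompt-language factor).
PROMPT_TEMPLATES = {
    "en": "Does the following text mention migration? Answer 'yes' or 'no'.\n",
    "de": "Erwähnt der folgende Text Migration? Antworte mit 'yes' oder 'no'.\n",
}

# Optional contextual information (context factor).
CONTEXT_NOTE = (
    "Context: the text comes from political news coverage; "
    "code it as a trained content analyst would.\n"
)

def build_prompt(text: str, prompt_lang: str = "en", with_context: bool = False) -> str:
    """Assemble one prompt variant for a coding experiment."""
    instruction = PROMPT_TEMPLATES[prompt_lang]
    context = CONTEXT_NOTE if with_context else ""
    return f"{context}{instruction}Text: {text}"

# Crossing the two factors yields the conditions whose coding
# performance an experiment could then compare.
variants = [
    build_prompt("Die Zahl der Asylanträge stieg 2023.", lang, ctx)
    for lang in PROMPT_TEMPLATES
    for ctx in (False, True)
]
```

Each variant would be sent to the model with the same texts, so that differences in coding accuracy can be attributed to the prompt design rather than to the material being coded.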
We sincerely thank Dr. Fabienne Lind for sharing her valuable research with us!