AAU Student Projects - visit Aalborg University's student project portal
A master's thesis from Aalborg University

Ud af Mange, En.: En undersøgelse af effekterne af model, temperatur og prompting på varians af LLM-generede syntetisk spørgeskema data i en dansk kontekst.

[Out of Many, One.: An examination of the effects of model, temperature, and prompting on the variance of LLM-generated synthetic survey data in a Danish context.]

Author(s)

Semester

4th semester

Programme

Year of publication

2025

Submitted

2025-08-29

Number of pages

595 pages

Abstract

Surveys are a key source of data in social science research; however, declining response rates have spurred interest in Large Language Models (LLMs) as a source of synthetic data. A hurdle identified by initial research is the lack of variance in model output compared with human data. This study examines how newer models and changes in temperature and prompt affect this truncated variance. I use Silicon Sampling, prompting models to adopt personas based on Danish demographic and response data from Tryghedsmålingen 2024, to generate 1,000 personas and compare them with a real human baseline. The models tested are GPT-3.5-turbo, GPT-4.1 and o3, at temperatures 1 and 2, with a ‘flawed’ and an ‘improved’ prompt. I find that the models perform worse than initial research had led one to expect, with truncated variance and mean bias in both aggregate and item-level distributions. Newer models appear less capable of producing human-like item-level histograms, possibly because of stronger alignment. Adjusting temperature is either inapplicable, insufficient, or causes high rates of response failure. The effects of the prompt prove difficult to predict, with results varying across model and temperature. This implies that recent LLM developments have not resolved the issue of truncated variance, and that the common methods are currently inadequate. Finally, I discuss the possible implications of and issues with LLM-generated synthetic data within science and society at large.

Keywords

Colophon: This page is part of AAU Student Projects, Aalborg University's student project portal. Here you can find and download publicly available master's theses and master's projects from across the university from 2008 onwards. Student projects from before 2008 are available in print at Aalborg University Library.
