Author(s)
Semester
4th semester
Programme
Year of publication
2025
Submitted
2025-08-29
Number of pages
595 pages
Abstract
Surveys are a key source of data in social science research. However, declining response rates have spurred interest in Large Language Models as a source of synthetic data. A hurdle identified by initial research is the lack of variance in model output compared to human data. This study examines how newer models, and changes in temperature and prompt, affect this truncated variance. I use Silicon Sampling, prompting models to adopt personas based on Danish demographic and response data from Tryghedsmålingen 2024, to generate 1,000 personas and compare them with a real human baseline. The models tested are GPT-3.5-turbo, GPT-4.1 and o3, at temperatures 1 and 2, and with a ‘flawed’ and an ‘improved’ prompt. I find that the models perform worse than initial research had led one to expect, with truncated variance and mean bias in both aggregate and item-level distributions. Newer models appear less capable of producing human-like item-level histograms, possibly because of stronger alignment. Temperature is either inapplicable, insufficient, or causes high rates of response failure. The effects of the prompt prove difficult to predict, with results varying across model and temperature. This implies that recent LLM developments have not resolved the issue of truncated variance, and that the common methods are currently inadequate. Finally, I discuss the possible implications of, and issues with, LLM-generated synthetic data within science and society at large.
Keywords
Colophon: This page is part of AAU Studenterprojekter, Aalborg University's student project portal. Here you can find and download publicly available master's theses and master's projects from across the university from 2008 onwards. Student projects from before 2008 are available in print at Aalborg University Library.