Optimization and Learning-based Scoring in Monte Carlo Tree Search for Scripts of Tribute

Author

Sørensen, Malthe Søgaard

Term

4. term

Education

Computer Science (IT), Master

Publication year

2025

Abstract

This thesis investigates how learned models based on features from game states can replace the rollout phase in Monte Carlo Tree Search for Scripts of Tribute, a simulator of the two-player card game Tales of Tribute. Building on the Aau903Bot baseline, I develop MaltheMCTS, which uses a data-driven heuristic scoring function instead of rollouts to reduce reliance on hand-crafted domain knowledge compared with strong rule-based bots. Complex game states are reduced to a compact feature representation, and models are trained with ML.NET, focusing on ensemble tree methods—Random Forest, Multiple Additive Regression Trees, and LightGBM—while also comparing to linear models. The work examines the trade-off between model accuracy and inference time in a sequential, time-limited setting (single CPU with per-turn budget) and implements additional MCTS optimizations, including branch pruning, one-turn planning, and adjustments for tree reuse across turns. Models trained on expert data and on self-play are benchmarked in MaltheMCTS against the baseline, classic MCTS with rollouts, and competitive bots such as the rule-based winner BestMCTS3. Results show that ensemble-based scoring outperforms classic rollouts and yields strong play against weaker opponents, including the baseline, even without iterative play-and-train cycles, but it does not yet match BestMCTS3. The findings highlight the importance of balancing accuracy and speed and suggest further gains from improved data collection, training, and integration with MCTS.

Denne afhandling undersøger, hvordan lærte modeller baseret på træk fra spiltilstande kan erstatte rollout-fasen i Monte Carlo Tree Search for Scripts of Tribute, en simulator af det turbaserede to-spiller kortspil Tales of Tribute. Med udgangspunkt i Aau903Bot udvikles MaltheMCTS, som bruger en data-drevet heuristisk scoringsfunktion i stedet for rollouts for at reducere afhængigheden af håndkodet domæneviden i forhold til stærke regelbaserede bots. Den komplekse spiltilstand reduceres til en kompakt feature-repræsentation, og der trænes modeller i ML.NET med fokus på ensembletræ-metoderne Random Forest, Multiple Additive Regression Trees og LightGBM, samt sammenligning med lineære modeller. Arbejdet undersøger afvejningen mellem modellernes nøjagtighed og inferenstid i et sekventielt, tidsbegrænset miljø (én CPU og tidsbudget per tur) og implementerer yderligere MCTS-optimeringer, herunder grenbeskæring, én-turs planlægning og justeringer ved genbrug af søgetræet på tværs af ture. Modeller trænes på både ekspertdata og selvspil og benchmarkes i MaltheMCTS mod basen, klassisk MCTS med rollouts og konkurrerende bots som den regelbaserede vinder BestMCTS3. Resultaterne viser, at ensemblebaseret scoring forbedrer ydeevnen i forhold til klassiske rollouts og giver stærkt spil mod svagere modstandere, herunder basen, selv uden iterative spil-og-træningscyklusser, men at løsningen endnu ikke matcher BestMCTS3. Fundene understreger vigtigheden af at balancere nøjagtighed og hastighed og peger på yderligere gevinster gennem bedre dataindsamling, træning og MCTS-integration.

[This abstract has been generated with the help of AI directly from the project full text]

Documents

Download PDF
View record in AAU Student Projects

An executive master's programme thesis from Aalborg University

Optimization and Learning-based Scoring in Monte Carlo Tree Search for Scripts of Tribute