MontyBot: An intelligent agent utilizing MCTS and NN to play StarCraft II
Authors
Lausten, Tobias; Justesen, Marco Klaustrup; Kiss, Bence Imre
Term
4th term
Publication year
2025
Submitted on
2025-06-05
Pages
58
Abstract
In this project, we extended our previous economy-focused agent for StarCraft II into one that can play the full game. We divided the task into four parts: managing the economy, building an army, combat, and information gathering. For the economy, we reused our Monte Carlo Tree Search (MCTS), a planning method that explores many possible future action sequences and selects promising ones. For army development, we extended MCTS with actions for constructing production buildings and training combat units. We also added an explicit representation of the opponent to the game state; the value of each state is then based on the estimated probability of winning against that opponent. For combat, we built two components: a Combat Prediction Neural Network that estimates the chance of winning a battle, and a control module for offense and defense that uses this estimate to decide when to attack and how to split the army to defend. For scouting, we implemented a module that controls scout units and another that stores observations and infers information from them. We evaluated the agent by playing matches against nine other agents available on AI Arena. The report concludes that the agent can play full games of StarCraft II, achieving a 25% win rate on AI Arena.
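The abstract's core idea, MCTS in which leaf states are scored by an estimated probability of winning against the modelled opponent rather than by a random rollout, can be sketched as follows. This is a minimal illustrative sketch, not the report's implementation: the state interface (`legal_actions`, `apply_action`) and the `win_probability` estimator standing in for the Combat Prediction Neural Network are all hypothetical names.

```python
import math
import random

class Node:
    """One node of the search tree over game states."""

    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action      # action that led from parent to this node
        self.children = []
        self.visits = 0
        self.value = 0.0          # sum of win-probability estimates seen here

    def uct_score(self, c=1.4):
        # Standard UCT: average value plus an exploration bonus.
        if self.visits == 0:
            return float("inf")
        return (self.value / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))


def mcts(root_state, legal_actions, apply_action, win_probability, iters=200):
    """Return the most-visited root action after `iters` MCTS iterations."""
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # Selection: descend via UCT while the node is fully expanded.
        while node.children and len(node.children) == len(legal_actions(node.state)):
            node = max(node.children, key=Node.uct_score)
        # Expansion: add one child for an untried action, if any remain.
        tried = {child.action for child in node.children}
        untried = [a for a in legal_actions(node.state) if a not in tried]
        if untried:
            action = random.choice(untried)
            child = Node(apply_action(node.state, action), parent=node, action=action)
            node.children.append(child)
            node = child
        # Evaluation: instead of a random rollout, score the state with the
        # estimated probability of winning against the opponent model.
        reward = win_probability(node.state)
        # Backpropagation: push the estimate up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda child: child.visits).action


# Toy usage: states are integers, actions add to them, and "winning" is
# judged more likely the closer the state gets to 10.
actions = [1, 2, 3]
best = mcts(
    0,
    lambda s: actions,
    lambda s, a: s + a,
    lambda s: max(0.0, 1.0 - abs(10 - s) / 10),
    iters=500,
)
```

Replacing the rollout with a learned value estimate is what lets the search stay shallow: each simulated action sequence is cut off early and judged by the win-probability model instead of being played to a terminal state.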
[This summary has been rewritten with the help of AI based on the project's original abstract]