Adaptive Bayesian Networks for Zero-Sum Games
Author
Jørgensen, Thomas
Term
4th term
Education
Publication year
2000
Abstract
This thesis investigates how Bayesian networks, probabilistic graphical models that represent uncertainty, can be used to solve finite two-player zero-sum games (games in which one player's gain is the other's loss) that involve a single decision. We analyze in detail an iterative method proposed by George W. Brown in 1949 and confirm its correctness both theoretically and in practice. For our test games, we show that the solutions the method produces are Nash equilibria: stable strategy pairs in which neither player can improve their outcome by unilaterally changing strategy. Building on these principles, we construct adaptive Bayesian networks that control learning agents. When these agents play each other repeatedly, their probability distributions over actions adjust over time and converge to the game's Nash equilibrium. Finally, we show that the same training principles let Bayesian networks handle more complex games with multiple decisions, not only single-decision games.
[This abstract was generated with the help of AI]
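The iterative method attributed to Brown in the abstract is usually known as fictitious play: in each round, both players best-respond to the empirical frequencies of the opponent's past actions. The following is a minimal sketch of that idea for a plain payoff matrix, not the thesis's Bayesian-network implementation. The function name `fictitious_play`, the `rounds` parameter, the tie-breaking rule, and the matching-pennies example are all assumptions made for illustration.

```python
def fictitious_play(payoff, rounds=10000):
    """Sketch of fictitious play for a two-player zero-sum matrix game.

    payoff[i][j] is the row player's gain (and the column player's loss)
    when row plays action i and column plays action j.
    Returns both players' empirical mixed strategies and a lower and
    upper bound on the value of the game.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    # Total payoff each row action would have earned against column's history.
    row_totals = [0.0] * n_rows
    # Total payoff each column action would have conceded against row's history.
    col_totals = [0.0] * n_cols

    i, j = 0, 0  # arbitrary opening actions (an assumption of this sketch)
    for _ in range(rounds):
        row_counts[i] += 1
        col_counts[j] += 1
        for a in range(n_rows):
            row_totals[a] += payoff[a][j]
        for b in range(n_cols):
            col_totals[b] += payoff[i][b]
        # Best responses to the opponent's empirical play; ties go to the
        # lowest index.
        i = max(range(n_rows), key=row_totals.__getitem__)
        j = min(range(n_cols), key=col_totals.__getitem__)

    row_strategy = [c / rounds for c in row_counts]
    col_strategy = [c / rounds for c in col_counts]
    lower = min(col_totals) / rounds  # what row's empirical mix guarantees
    upper = max(row_totals) / rounds  # what column's empirical mix concedes at most
    return row_strategy, col_strategy, lower, upper


# Matching pennies: the row player wins 1 on a match and loses 1 otherwise.
matching_pennies = [[1, -1], [-1, 1]]
row_mix, col_mix, lower, upper = fictitious_play(matching_pennies)
print(row_mix, col_mix, lower, upper)
```

For matching pennies the unique Nash equilibrium mixes both actions equally and the game's value is 0, so both empirical strategies should drift toward (0.5, 0.5) while the two bounds close in on 0. Convergence is slow, which is why the sketch runs many rounds.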