General Game Playing as a Bandit-Arms Problem: A Multiagent Monte-Carlo Solution Exploiting Nash Equilibria
Location
King Building 239
Document Type
Presentation
Start Date
4-27-2019 4:00 PM
End Date
4-27-2019 5:20 PM
Abstract
One of the main drawbacks of game-playing programs in the field of A.I. is that the success of highly renowned programs such as AlphaGo and AlphaStar has come at the expense of generality. These systems succeed by exploiting hand-crafted heuristics or training on prior professional gameplay data, which, while impressive, requires significant human intuition about the game. This level of human intervention raises the question: is the program the originator of the unique strategies that arise, or is it simply a reflection of what humans are capable of? General game playing aims to address this lack of generality and remove the need for excessive human intervention. This project approaches general game playing by combining popular methods of stochastic tree search with a multiagent system and a new algorithm that I call the “Wise Explorer” algorithm. The goal of the system is to explore the worst branches of the game first to rule them out, and then to search the most promising branches in depth. The system continually consults the data it collects during this extensive search, and it outputs a strategic move for any given state of a game. In essence, if you’re ever in a bind during a game of tic-tac-toe, the system will tell you exactly what your best move is.
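The abstract describes the approach only at a high level, so the Python sketch below is an illustration of the general scheme rather than the project's actual code: each legal move is treated as a bandit arm scored by random Monte-Carlo playouts, a cheap screening pass rules out the worst-scoring arms first (in the spirit of “Wise Explorer”), and the remaining playout budget is spent on the promising survivors. The tic-tac-toe board encoding and the screen, deep, and keep parameters are illustrative assumptions.

import random

# Board cells are "X", "O", or "." (empty); indices 0-8, row-major.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return "X" or "O" if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == "."]

def other(player):
    return "O" if player == "X" else "X"

def playout(board, to_move, perspective):
    """Play uniformly random moves to the end; +1/-1/0 from `perspective`."""
    board = list(board)
    while True:
        w = winner(board)
        if w is not None:
            return 1 if w == perspective else -1
        moves = legal_moves(board)
        if not moves:
            return 0  # draw
        board[random.choice(moves)] = to_move
        to_move = other(to_move)

def score_move(board, move, player, n):
    """Mean result of n random playouts after `player` takes `move`."""
    child = list(board)
    child[move] = player
    return sum(playout(child, other(player), player) for _ in range(n)) / n

def best_move(board, player, screen=50, deep=500, keep=0.5):
    """Screen every arm cheaply, prune the worst, then search survivors."""
    arms = legal_moves(board)
    screened = {m: score_move(board, m, player, screen) for m in arms}
    survivors = sorted(arms, key=screened.get, reverse=True)
    survivors = survivors[:max(1, int(len(arms) * keep))]
    deep_scores = {m: score_move(board, m, player, deep) for m in survivors}
    return max(deep_scores, key=deep_scores.get)

if __name__ == "__main__":
    # X to move in a mid-game position; the playouts suggest a reply.
    board = ["X", "O", ".",
             ".", "X", ".",
             ".", ".", "O"]
    print("Suggested move for X:", best_move(board, "X"))

Because the playouts are stochastic, the suggested move can vary from run to run; spending the deep budget only on the surviving arms is what makes the screen-then-commit structure cheaper than searching every branch equally.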
Keywords
Artificial Intelligence, Machine Learning, Computer Learning
Recommended Citation
Banda, Matt, "General Game Playing as a Bandit-Arms Problem: A Multiagent Monte-Carlo Solution Exploiting Nash Equilibria" (04/27/19). Senior Symposium. 1.
https://digitalcommons.oberlin.edu/seniorsymp/2019/panel_17/1
Major
Computer Science
Advisor(s)
Roberto Hoyle, Computer Science
Project Mentor(s)
Bob Geitz, Computer Science
April 2019
Notes
Session VI, Panel 17 - Computer | Simulation
Moderator: Jason Stalnaker, Associate Professor of Physics