General Game Playing as a Bandit-Arms Problem: A Multiagent Monte-Carlo Solution Exploiting Nash Equilibria

Presenter Information

Matt Banda, Oberlin College

Location

King Building 239

Document Type

Presentation

Start Date

4-27-2019 4:00 PM

End Date

4-27-2019 5:20 PM

Abstract

One of the main drawbacks of game-playing programs in the field of A.I. is that the success of many highly renowned programs, such as AlphaGo and AlphaStar, has come at the expense of generality. These systems succeed by exploiting hand-crafted heuristics or training on prior professional gameplay data, which, while impressive, requires a significant amount of human intuition about the game. This level of human intervention raises the question: is the program the originator of the unique strategies that arise, or is it simply a reflection of what humans are capable of? General game playing aims to supply this missing generality and remove the need for excessive human intervention. This project approaches general game playing by combining popular methods of stochastic tree search with a multiagent system and an algorithm that I call the “Wise Explorer.” The system first explores the worst branches of the game tree in order to rule them out, then searches the most promising branches in depth. It continually consults the data gathered during this extensive search and outputs a strategic move for any given state of a game. In essence, if you’re ever in a bind during a game of tic-tac-toe, the system will tell you exactly what your best move is.
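The abstract describes the search only at a high level, so the sketch below is one plausible reading of it, not the presenter's actual implementation. It treats the legal moves in a state as bandit arms, screens every arm with a few uniform random rollouts so that the worst branches are ruled out first, and then spends the remaining simulation budget on the surviving arms using the standard UCB1 rule. The function names (wise_explorer, rollout, the ttt_* helpers), the choice of UCB1, and every parameter value are illustrative assumptions; states are assumed immutable.

```python
import math
import random


def rollout(state, legal_moves, apply_move, result, player):
    """Play uniformly random moves to the end of the game; return the
    terminal reward from `player`'s perspective."""
    while result(state) is None:
        state = apply_move(state, random.choice(legal_moves(state)))
    reward = result(state)
    return reward if player == 0 else -reward


def wise_explorer(state, legal_moves, apply_move, result, player,
                  screen_rollouts=20, keep_frac=0.5, budget=2000, c=1.4):
    """Two-phase bandit search: screen all moves cheaply, discard the worst,
    then spend the remaining budget on UCB1 over the survivors.
    (Hypothetical reconstruction of the 'Wise Explorer' idea.)"""
    moves = legal_moves(state)
    stats = {m: [0.0, 0] for m in moves}  # move -> [total reward, visit count]

    # Phase 1: uniform screening, so the worst branches are ruled out first.
    for m in moves:
        child = apply_move(state, m)
        for _ in range(screen_rollouts):
            stats[m][0] += rollout(child, legal_moves, apply_move, result, player)
            stats[m][1] += 1
    ranked = sorted(moves, key=lambda m: stats[m][0] / stats[m][1], reverse=True)
    survivors = ranked[:max(1, int(len(ranked) * keep_frac))]

    # Phase 2: treat the surviving moves as bandit arms and apply UCB1.
    total = sum(stats[m][1] for m in survivors)
    for _ in range(budget):
        total += 1
        arm = max(survivors,
                  key=lambda m: stats[m][0] / stats[m][1]
                  + c * math.sqrt(math.log(total) / stats[m][1]))
        child = apply_move(state, arm)
        stats[arm][0] += rollout(child, legal_moves, apply_move, result, player)
        stats[arm][1] += 1

    # Recommend the most-visited survivor, as is standard for MCTS-style search.
    return max(survivors, key=lambda m: stats[m][1])


# Tiny tic-tac-toe instantiation: a board is a tuple of nine cells
# holding 'X', 'O', or None; X always moves first.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def ttt_result(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return 1 if board[a] == 'X' else -1  # reward from X's perspective
    return 0 if None not in board else None      # 0 = draw, None = ongoing


def ttt_moves(board):
    return [i for i, cell in enumerate(board) if cell is None]


def ttt_apply(board, i):
    mark = 'X' if board.count('X') == board.count('O') else 'O'
    cells = list(board)
    cells[i] = mark
    return tuple(cells)


if __name__ == "__main__":
    empty = (None,) * 9
    print("Suggested opening move:",
          wise_explorer(empty, ttt_moves, ttt_apply, ttt_result, player=0))
```

Run as a script, this prints a recommended opening move for an empty tic-tac-toe board; swapping in a different game only requires supplying its own legal_moves, apply_move, and result functions.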

Keywords:

Artificial Intelligence, Machine Learning, Computer Learning

Notes

Session VI, Panel 17 - Computer | Simulation
Moderator: Jason Stalnaker, Associate Professor of Physics

Major

Computer Science

Advisor(s)

Roberto Hoyle, Computer Science

Project Mentor(s)

Bob Geitz, Computer Science

April 2019

