A few years ago I read the excellent book Evolved to Win, a collection of experiences creating game-playing genetic programs.
As I’ve been playing Go a bit lately, I decided to try writing a Go bot using genetic programming. The source for the project is on GitHub.
If there is one takeaway from the project, it’s that evolution tends to find amusingly lazy solutions to problems, so having well-defined evaluation criteria is important. Below I capture some of the amusing Go-playing strategies that evolved.
Genetic Programming Approach
In genetic programming you provide programming building blocks and hand them over to evolution to combine into a useful program.
In this case, each program is a tree in which every node is a function or a constant. A program scores a candidate game move: the bot computes a score for every move it could make, then plays the best-scoring one.
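A minimal sketch of how such a program might be represented and used to pick a move. The node and attribute names here are my own illustrations, not the project's actual API:

```python
# Expression-tree nodes: each node evaluates to a number for a given move.
class Const:
    def __init__(self, value):
        self.value = value
    def eval(self, move):
        return self.value

class Attr:
    """Leaf that reads a numeric attribute of a candidate move."""
    def __init__(self, name):
        self.name = name
    def eval(self, move):
        return move[self.name]

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def eval(self, move):
        return self.left.eval(move) + self.right.eval(move)

class Multiply:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def eval(self, move):
        return self.left.eval(move) * self.right.eval(move)

def best_move(program, moves):
    """Score every legal move with the evolved program; play the best."""
    return max(moves, key=program.eval)

# Example program: score = liberties * 2 + connected
program = Add(Multiply(Attr("liberties"), Const(2)), Attr("connected"))
```

Evolution then only needs to manipulate the tree; the move-selection loop stays fixed.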
I implemented some generic functions, such as Multiply, and some Go-specific ones, such as MyLiberties, which reports how a move will affect the player’s liberties (i.e. the open spaces surrounding its stones). As an example, here is a function produced by the program:
Try to increase number of connected stones. Try to increase score. Try to increase number of liberties.
Most of the resulting programs take this form: a weighted combination of attributes of the move. They frequently contain some duplication and are often much larger.
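The "combining" step in genetic programming is typically subtree crossover: graft a random subtree from one parent onto a random point in another. A sketch using nested lists as trees; the representation and helper names are illustrative, not the project's:

```python
import copy
import random

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node, depth-first."""
    yield path, tree
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(tree, path, new):
    """Return a copy of tree with the node at `path` replaced by `new`."""
    if not path:
        return copy.deepcopy(new)
    tree = list(tree)
    tree[path[0]] = replace_at(tree[path[0]], path[1:], new)
    return tree

def crossover(a, b, rng=random):
    """Graft a random subtree of b onto a random point of a."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace_at(a, path, donor)
```

Because any subtree is itself a valid program, crossover (plus an occasional random mutation) always yields syntactically valid offspring, which is what makes the tree representation convenient.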
Meet the bots
Here are some of the memorable bot personalities I encountered during the project.
RandomBot
I handcrafted this bot to assist with scoring and to give the evolved bots a simple opponent to compete against.
Every move has a random score.
CheckerBot: The Sprawling Honeycomb
I set out to evolve a bot to beat RandomBot. Using RandomBot as the benchmark, the very first run produced this simple but effective strategy: create a honeycomb structure and rack up territory points.
Stay away from own stones. The honeycomb pattern is a result of the order in which the program evaluates moves.
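That rule might be sketched as a distance-to-nearest-friendly-stone score; the Manhattan-distance form and the names are my assumptions, not the evolved program's exact terms:

```python
def checker_score(move, own_stones):
    """CheckerBot-style score: distance to the nearest friendly stone.
    Maximizing it keeps new stones away from existing ones."""
    if not own_stones:
        return 0
    return min(abs(move[0] - x) + abs(move[1] - y) for x, y in own_stones)
```

With many moves tied at the maximum distance, the fixed order in which moves are evaluated breaks the ties, which is what produces the regular honeycomb lattice.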
While it’s trivial for a human to defeat CheckerBot, RandomBot stands no chance of making the precise moves needed to take down the honeycomb. CheckerBot wins.
red: CheckerBot, black: RandomBot
SmotherBot: Machine of War
I made CheckerBot the new benchmark and began adding functions to help evolve a bot that could defeat it, including ways to detect whether a move can be captured and to count a player’s liberties.
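Liberty detection boils down to a flood fill: expand a stone's group through same-colored neighbors, collecting the empty points that touch it. A sketch of such a helper; the dict-based board (point → color) is an illustrative assumption:

```python
def group_liberties(board, start, size=9):
    """Flood-fill the group containing `start`; count empty points touching it."""
    color = board[start]
    group, frontier, libs = {start}, [start], set()
    while frontier:
        x, y = frontier.pop()
        for n in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if not (0 <= n[0] < size and 0 <= n[1] < size):
                continue
            if n not in board:
                libs.add(n)  # empty neighbor: a liberty
            elif board[n] == color and n not in group:
                group.add(n)
                frontier.append(n)
    return len(libs)
```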
The result: SmotherBot
Remove opponent liberties. Scoring is good.
SmotherBot’s strategy is to wrap around its prey like a boa constrictor. CheckerBot is no match: its fragile honeycombs crumble and fall.
red: SmotherBot, black: CheckerBot
In contrast to CheckerBot’s defensive style, SmotherBot plays very aggressively.
EmpireBot: Builder of Monoliths
Continuing the trend, I began searching for a bot that could stand up to SmotherBot.
Introducing EmpireBot: the base-building bot that creates large, well-connected structures.
Increase liberties. Increase size of stone clusters.
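A sketch of that preference: after playing the move, flood-fill the friendly cluster it joins and reward both the cluster's liberties and its size. The board representation and names are illustrative assumptions:

```python
def empire_score(board, move, me, size=9):
    """EmpireBot-style score: liberties plus size of the friendly cluster
    that `move` joins. Note the opponent's stones are never consulted."""
    after = dict(board)
    after[move] = me
    group, frontier, libs = {move}, [move], set()
    while frontier:
        x, y = frontier.pop()
        for n in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if not (0 <= n[0] < size and 0 <= n[1] < size):
                continue
            if n not in after:
                libs.add(n)  # empty neighbor: a liberty
            elif after[n] == me and n not in group:
                group.add(n)
                frontier.append(n)
    return len(libs) + len(group)
```

Since only friendly stones and empty points enter the score, the bot is structurally incapable of reacting to its opponent, exactly the obliviousness described below.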
Indeed, SmotherBot’s overly aggressive tactics leave it without the support of a home base. Its weak, disjoint structures are dominated by the EmpireBot monolith.
In fact, EmpireBot is oblivious to its opponents; it doesn’t know they exist. The monolith crushes them by mere accident.
red: EmpireBot, black: SmotherBot
At this point I wanted to avoid overspecialization against the benchmark. Some of the highest-scoring bots were excellent at exploiting the benchmark’s weaknesses while leaving the back door wide open.
The final approach I settled on is to score the bots against one another in peer matches: each bot’s fitness is its average score over 30 games against randomly selected peers.
As verification, I independently test them against RandomBot. If the bots are indeed evolving more general strategies, you would expect performance against RandomBot to increase even though RandomBot has no influence on the evolutionary path.
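A sketch of that peer-based fitness function; play_game is a hypothetical stand-in for the real game engine, returning the bot's final score:

```python
import random

def peer_fitness(bot, population, play_game, games=30, rng=random):
    """Average score over `games` matches against randomly drawn peers."""
    peers = [b for b in population if b != bot]
    total = 0.0
    for _ in range(games):
        total += play_game(bot, rng.choice(peers))
    return total / games
```

Because the opponents are drawn from the evolving population itself, a bot can no longer win by exploiting one fixed benchmark.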
Progress of the bots over a few generations. Each data point is a bot’s score: the average over 4 games against RandomBot on a 7x7 board.
While the bots are still not human-competitive, this approach produced the most challenging ones. Possible enhancements include looking ahead at future moves instead of considering only the current move, and providing more ways for bots to read the state of the board.
Battling the bots above. Red: the bot from the top of the page; black: RandomBot, CheckerBot, SmotherBot, EmpireBot.