This agent serves as a baseline for fine-tuning on human expert gameplay. It was trained for 200 episodes using a deep Q-network (DQN) with experience replay.
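For reference, below is a minimal sketch of a DQN training step with experience replay in the spirit of the setup described above. The framework (PyTorch), network architecture, class and function names, and hyperparameters are all assumptions for illustration, not the exact implementation used here; a target network and epsilon-greedy action selection are omitted for brevity.

```python
# Hypothetical DQN-with-experience-replay sketch (names and hyperparameters assumed).
import random
from collections import deque

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Small MLP mapping a game-state vector to Q-values, one per action."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, x):
        return self.net(x)

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""
    def __init__(self, capacity: int = 100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, *transition):
        self.buffer.append(transition)

    def sample(self, batch_size: int):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return (
            torch.tensor(states, dtype=torch.float32),
            torch.tensor(actions, dtype=torch.int64),
            torch.tensor(rewards, dtype=torch.float32),
            torch.tensor(next_states, dtype=torch.float32),
            torch.tensor(dones, dtype=torch.float32),
        )

    def __len__(self):
        return len(self.buffer)

def train_step(q_net, optimizer, buffer, batch_size=64, gamma=0.99):
    """One gradient step on the temporal-difference error over a sampled minibatch."""
    if len(buffer) < batch_size:
        return
    states, actions, rewards, next_states, dones = buffer.sample(batch_size)
    # Q(s, a) for the actions actually taken.
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Bootstrapped target: r + gamma * max_a' Q(s', a'), zeroed at episode end.
    with torch.no_grad():
        next_q = q_net(next_states).max(dim=1).values
        targets = rewards + gamma * next_q * (1.0 - dones)
    loss = nn.functional.mse_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In this sketch, sampling minibatches uniformly from the replay buffer breaks the correlation between consecutive game frames, which is the main motivation for experience replay in DQN training.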
The agent achieves a score of 46. It reliably avoids walls and eats food, but it also exhibits some strange behavior. For example, notice how the agent's movement is sometimes inefficient, wrapping around itself before resuming its search for food.
Early training episodes produce low scores while the agent explores candidate policies. Performance jumps sharply around episode 100, and improvements become progressively smaller as training approaches episode 200.
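The low early scores are what one would expect from an exploration schedule that starts mostly random and anneals over training. The linear epsilon-greedy schedule below is an assumed illustration, not the exact schedule used here.

```python
def epsilon_by_episode(episode: int,
                       eps_start: float = 1.0,
                       eps_end: float = 0.05,
                       decay_episodes: int = 100) -> float:
    """Linearly anneal exploration from eps_start to eps_end (assumed schedule)."""
    frac = min(episode / decay_episodes, 1.0)
    return eps_start + frac * (eps_end - eps_start)

# Early episodes act almost entirely at random (epsilon near 1.0), which keeps
# initial scores low; as epsilon decays, the agent increasingly exploits its
# learned Q-values and scores climb.
```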