Our hero Billy wants to reach his darling Sally. But the way to her is complicated and sometimes blocked by bad guys called Grubers. Luckily, Billy is clever enough to learn the maze by Q-learning. Or is he really?
This simulation was done as a project for the Machine Learning course.
So what results did we get?
Can he always find the path?
No. If there is no path, he can't. Also, if he learns with too few tries (under 100 cycles), he may not find a path.
Can he find the optimal path?
Yes, but he has to make enough tries while learning (1000 cycles are mostly enough for a map of this size).
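To make the two answers above a bit more concrete, here is a minimal sketch of tabular Q-learning on a small grid. It is not the code of the simulation itself: the 5x5 map, the reward of 1 for reaching Sally, the values of alpha, gamma and epsilon, and the names train and greedy_path are all assumptions made up for this illustration. One "cycle" is assumed to mean one training episode from the start cell.

```python
import random

# Hypothetical 5x5 map: S = Billy's start, G = Sally, # = a cell blocked by a Gruber.
MAZE = [
    "S..#.",
    ".#...",
    ".#.#.",
    "...#.",
    ".#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(symbol):
    """Return the (row, col) of the first cell containing symbol."""
    for r, row in enumerate(MAZE):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(symbol)


START, GOAL = find("S"), find("G")


def step(state, action):
    """Move one cell; walking off the map or into a Gruber leaves Billy in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#":
        state = (r, c)
    reward = 1.0 if state == GOAL else 0.0  # assumed reward scheme
    return state, reward


def train(episodes, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=200, seed=0):
    """Run epsilon-greedy Q-learning and return the Q-table as a dict."""
    rng = random.Random(seed)
    q = {}  # (state, action_index) -> value; missing entries count as 0.0
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                values = [q.get((state, i), 0.0) for i in range(len(ACTIONS))]
                best = max(values)
                # exploit, breaking ties at random so Billy does not get stuck early on
                a = rng.choice([i for i, v in enumerate(values) if v == best])
            nxt, reward = step(state, ACTIONS[a])
            future = 0.0 if nxt == GOAL else max(
                q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            # Q-learning update: move the old value toward reward + gamma * best next value
            q[(state, a)] = old + alpha * (reward + gamma * future - old)
            state = nxt
            if state == GOAL:
                break
    return q


def greedy_path(q):
    """Follow the best-known action from START; return None if Billy starts looping."""
    state, path, seen = START, [START], {START}
    while state != GOAL:
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state, _ = step(state, ACTIONS[a])
        if state in seen:
            return None
        seen.add(state)
        path.append(state)
    return path


if __name__ == "__main__":
    for episodes in (0, 10, 100, 1000):
        path = greedy_path(train(episodes))
        print(episodes, "cycles:", "no path" if path is None else f"{len(path) - 1} steps")
```

With no training at all the Q-table is empty, so the greedy walk loops right away and no path is reported. How many cycles are really needed depends on the map, the random seed and the parameters, so the numbers printed by this sketch are not meant to reproduce the results of the project.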
Can it be used in games for pathfinding?
If the goal is always known and static, it may be used. Otherwise it will be too slow for bigger maps. Besides, A* offers much better results for pathfinding.
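For comparison, here is a rough sketch of A* on the same kind of grid. It needs no learning phase and can be run again whenever the goal moves. The map, the Manhattan-distance heuristic and the function name a_star are assumptions chosen for this example, not part of the project.

```python
import heapq

# Hypothetical map: "#" marks a blocked cell, everything else is walkable.
MAZE = [
    "S..#.",
    ".#...",
    ".#.#.",
    "...#.",
    ".#..G",
]


def a_star(maze, start, goal):
    """Return a shortest list of cells from start to goal, or None if there is no path."""

    def h(cell):
        # Manhattan distance never overestimates when moves are 4-directional with cost 1
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (estimated total, cost so far, cell)
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk the parent links back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > cost[cell]:
            continue  # stale heap entry; a cheaper route to this cell was found later
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            r, c = cell[0] + dr, cell[1] + dc
            if 0 <= r < len(maze) and 0 <= c < len(maze[0]) and maze[r][c] != "#":
                new_cost = g + 1
                if new_cost < cost.get((r, c), float("inf")):
                    cost[(r, c)] = new_cost
                    came_from[(r, c)] = cell
                    heapq.heappush(open_heap, (new_cost + h((r, c)), new_cost, (r, c)))
    return None


if __name__ == "__main__":
    path = a_star(MAZE, (0, 0), (4, 4))
    print("no path" if path is None else f"{len(path) - 1} steps: {path}")
```

Unlike the Q-table, nothing here has to be retrained when Sally moves: the search is simply called again with the new goal.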
*The song sample is from "Sad Robot" by Pornophonique.