I am trying to implement a genetic algorithm for the game 'tic tac toe'. This is how I am doing it at the moment:
- Initialize 50 random networks
- Let each network play against each network.
- After that, each network will have played 98 games (two against each of the 49 other networks, one on each side)
- the fitness of each network is calculated this way:
fitness = wins + draws - losses
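To make the tournament and fitness steps concrete, here is a rough Python sketch of how I understand them. The `play_game(first, second)` callback is a hypothetical placeholder (it is assumed to return +1 if the first player wins, -1 if the second wins, and 0 for a draw); it is not my actual game code.

```python
from itertools import permutations

def round_robin_fitness(players, play_game):
    """Return (fitness, games_played), with fitness = wins + draws - losses.

    play_game(first, second) is a hypothetical callback returning
    +1 if `first` wins, -1 if `second` wins, and 0 for a draw.
    """
    fitness = [0] * len(players)
    games_played = [0] * len(players)
    # permutations(..., 2) yields every ordered pair (i, j) with i != j,
    # so each pairing is played twice, once with each side moving first.
    for i, j in permutations(range(len(players)), 2):
        result = play_game(players[i], players[j])
        if result == 0:      # draw: both sides gain a point
            fitness[i] += 1
            fitness[j] += 1
        elif result > 0:     # first player won
            fitness[i] += 1
            fitness[j] -= 1
        else:                # second player won
            fitness[i] -= 1
            fitness[j] += 1
        games_played[i] += 1
        games_played[j] += 1
    return fitness, games_played
```

With 50 players this gives each one 49 * 2 = 98 games. Note that with this formula a draw is worth exactly as much as a win.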
Apply the genetic algorithm:
5.1. Select the 5 best networks by using their fitness value.
5.2. Copy these 5 networks over the other 45 networks, so that each of the 5 best appears 10 times in the new population.
5.3. Mutate a fraction p of the weights and biases by adding a random Gaussian value multiplied by a factor s:
if (Random(0,1) < p) { weight/bias += Random.gaussian() * s; }
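As a sketch of steps 5.1 to 5.3 in Python: each network is assumed to be a flat list of floats (weights and biases together), and `next_generation`, `n_elite` and `rng` are names made up for this illustration. One more assumption that the steps above do not spell out: the 5 best networks are kept once unchanged, and only the other 45 copies are mutated.

```python
import random

def next_generation(population, fitness, n_elite=5, p=0.05, s=0.05, rng=random):
    """Keep the n_elite fittest genomes and refill with mutated copies of them."""
    # Indices of the population sorted from highest to lowest fitness.
    ranked = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    elite = [population[i] for i in ranked[:n_elite]]
    # Assumption: one unmutated copy of each elite survives.
    new_population = [list(genome) for genome in elite]
    k = 0
    while len(new_population) < len(population):
        parent = elite[k % n_elite]  # cycle through the elites in turn
        # Each value is perturbed with probability p by Gaussian noise scaled by s.
        child = [w + rng.gauss(0.0, 1.0) * s if rng.random() < p else w
                 for w in parent]
        new_population.append(child)
        k += 1
    return new_population
```

The mutation test is written as `rng.random() < p`, so with p = 0.05 roughly 5% of the values change per child. Written the other way round (`p < rng.random()`), about 95% of them would change.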
The values I am using right now are:
p = 0.05
s = 0.05
And my network is making decisions the following way:
I am using a minimax algorithm with a depth of 2 (not a lot, I know). The evaluation of the board is being done by the network.
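Roughly, the move selection looks like the following Python sketch (a negamax formulation of minimax). The board encoding (a list of 9 cells holding +1, -1 or 0) and the `evaluate(board)` callback standing in for the network are assumptions for this illustration; `evaluate` is assumed to score the board from the point of view of the +1 player. Finished games are scored directly instead of being passed to the network, which is also an assumption.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return +1 or -1 if that side has three in a row, else 0."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0

def negamax(board, player, depth, evaluate):
    """Score `board` from the point of view of `player`, the side to move."""
    w = winner(board)
    if w != 0:
        return 1000.0 * w * player  # decided games outrank any heuristic value
    moves = [i for i, cell in enumerate(board) if cell == 0]
    if not moves:
        return 0.0                  # full board without a winner: draw
    if depth == 0:
        return player * evaluate(board)  # the network scores the leaf
    best = float("-inf")
    for m in moves:
        board[m] = player
        best = max(best, -negamax(board, -player, depth - 1, evaluate))
        board[m] = 0                # undo the move
    return best

def best_move(board, player, evaluate, depth=2):
    """Pick the empty cell with the best score after `depth` plies of search."""
    moves = [i for i, cell in enumerate(board) if cell == 0]
    def score(m):
        board[m] = player
        value = -negamax(board, -player, depth - 1, evaluate)
        board[m] = 0
        return value
    return max(moves, key=score)
```

With depth 2 the search covers my own move and the opponent's reply, so even an evaluator that always returns 0 will take an immediate win and block an immediate threat.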
My problem right now is that it does not work as expected. When I play against it afterwards, it usually makes stupid moves and lets me win, or fails to win itself when I give it the chance.
I would be very grateful if someone could help me with this one.
Greetings, Finn