Improvements

Previously, time was wasted regenerating the Q-table with iterations on every run. I am now writing the learned values to a text file, separated by commas, so the learning process only needs to run once. After that, I can create agents by loading that text file as the table.
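A minimal sketch of that save/load step, assuming the Q-table is a dictionary keyed by (state, action) pairs; the file name and entry layout here are illustrative, not the exact format I use:

```python
# Sketch: persist a Q-table to a comma-separated text file and load it back,
# so training only has to run once. Assumes states and actions are simple
# tokens that contain no commas.

def save_q_table(q_table, path="q_table.txt"):
    """Write each (state, action, value) entry as one comma-separated line."""
    with open(path, "w") as f:
        for (state, action), value in q_table.items():
            f.write(f"{state},{action},{value}\n")

def load_q_table(path="q_table.txt"):
    """Rebuild the Q-table from the saved file to initialize a new agent."""
    q_table = {}
    with open(path) as f:
        for line in f:
            state, action, value = line.strip().split(",")
            q_table[(state, action)] = float(value)
    return q_table
```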

Second improvement: there was a bug in selecting the maximum Q-value. I use the standard find-the-maximum loop, but I was initializing the running maximum to 0, so whenever all the Q-values were negative the selection just stayed at 0. I fixed that.
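To illustrate the bug and one common way to fix it (seeding the maximum with the first Q-value rather than 0), here is a hypothetical sketch; the function names and the list of per-action Q-values are stand-ins, not my actual code:

```python
# q_values stands in for the Q-values of all actions in the current state.

def best_action_buggy(q_values):
    # Bug: starting the maximum at 0 means an all-negative set of Q-values
    # never beats the initial value, so the choice silently defaults to 0.
    best_value, best_index = 0, 0
    for i, v in enumerate(q_values):
        if v > best_value:
            best_value, best_index = v, i
    return best_index

def best_action_fixed(q_values):
    # Fix: initialize with the first Q-value so negative values can win.
    best_value, best_index = q_values[0], 0
    for i, v in enumerate(q_values[1:], start=1):
        if v > best_value:
            best_value, best_index = v, i
    return best_index

print(best_action_buggy([-3.0, -1.0, -2.0]))  # 0 (wrong)
print(best_action_fixed([-3.0, -1.0, -2.0]))  # 1 (correct)
```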

After all the fixes, it runs faster and performs a bit better. Below are some screenshots from the algorithm. The reward for entering an obstacle is still higher.

[Slideshow of screenshots from the algorithm.]

I think I have good results now. I will commit this to git and start creating the actual simulations. Hopefully I will be able to conduct a user study; it seems likely, but we will see.

I also need to improve the training process, but now I can simply build on the training files I already have.
