PSet 13: Q-Learning
(videos: Q-Learning Example)
Q-learning update rule:
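The rule itself does not appear in this text. For reference, a standard tabular form is given below; the video may write it with different symbols or in the equivalent weighted-average form. The symbols are assumptions, not taken from the video: \alpha is the learning rate, \gamma is the discount factor, r is the reward received, and s' is the state actually reached after taking action a in state s.

```latex
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]
% Equivalent weighted-average form:
Q(s,a) \leftarrow (1-\alpha)\, Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') \right]
```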
Pick up where I left off in the Q-learning video. Do two more trials, one where you go N,N,E,E,S,E, and one where you go N,N,E,E,E. Show your work.
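If you want to check your hand calculations, a minimal sketch like the following may help. It assumes the standard tabular update Q(s,a) <- Q(s,a) + alpha * (r + gamma * max Q(s',a') - Q(s,a)). The names q_update and run_trial, the default alpha and gamma, and the (state, action, reward, next_state) transition format are all hypothetical. Substitute the grid, rewards, starting Q values, and parameters from the video.

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place; return the new Q(s, a).

    alpha and gamma are placeholder defaults (assumptions), not the
    values used in the video.
    """
    # Value of the best action available from the state actually reached.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    # Move Q(s, a) toward the target r + gamma * best_next.
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


def run_trial(Q, transitions, actions, alpha=0.5, gamma=0.9):
    """Replay one trial, given as (s, a, r, s_next) tuples in order.

    Returns a list of ((s, a), new_value) pairs, one per step, so each
    update can be compared against the hand-worked answer.
    """
    log = []
    for s, a, r, s_next in transitions:
        new_value = q_update(Q, s, a, r, s_next, actions, alpha, gamma)
        log.append(((s, a), new_value))
    return log
```

To use it, start from a table such as Q = defaultdict(float) with the values from the end of the video filled in, write each trial as a list of transitions, and compare the returned log with your own work step by step. Unvisited entries default to 0, which is also an assumption.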
Given your results above, suppose you are in state <3,2> and output East, but accidentally go North. Which Q value would be affected, and how would it change?