Best Multi-Armed Bandit Strategy? (feat: UCB Method)
https://www.youtube.com/watch?v=FgmMK6RPU1c
Reinforcement Learning: Complete Course:
https://www.youtube.com/watch?v=4SLGEq_HZxk&list=PLnn6VZp3hqNvRrdnMOVtgV64F_O-61C1D
From the book by Sutton and Barto: http://incompleteideas.net/book/RLbook2018.pdf
https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf
A quick Google search shows that the algorithms from the book are implemented and provided at:
https://github.com/LyWangPX/Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions
Article:
https://towardsdatascience.com/a-comparison-of-bandit-algorithms-24b4adfcabb