References and further reading:
https://medium.com/@gridflowai/multi-armed-bandits-an-overview-on-classical-rl-algorithms-9a1e047cd98e
https://www.kdnuggets.com/2023/01/introduction-multiarmed-bandit-problems.html
Epsilon Greedy:
https://medium.com/opex-analytics/multi-armed-bandits-101-6f4ac62b6bd6
https://cxl.com/blog/bandit-tests
https://www.geeksforgeeks.org/epsilon-greedy-algorithm-in-reinforcement-learning