{"id":14773,"date":"2019-05-17T14:10:24","date_gmt":"2019-05-17T18:10:24","guid":{"rendered":"https:\/\/bangla.salearningschool.com\/recent-posts\/reinforcement-learning-concepts-explained-in-a-simple-way\/"},"modified":"2020-02-08T09:43:36","modified_gmt":"2020-02-08T14:43:36","slug":"reinforcement-learning-concepts-explained-in-a-simple-way","status":"publish","type":"post","link":"http:\/\/bangla.sitestree.com\/?p=14773","title":{"rendered":"Reinforcement Learning Concepts Explained in a Simple Way."},"content":{"rendered":"<p><strong>Reinforcement Learning Concepts Explained in a Simple (or not) Way.<\/strong><\/p>\n<p>This list is intended for beginners who want to learn the concepts used in Reinforcement Learning, i.e., interactive learning.<\/p>\n<p>Reinforcement Learning is also a part of Machine Learning, Data Science, and AI.<\/p>\n<p><strong>Summary of Tabular Methods in Reinforcement Learning<\/strong><\/p>\n<h2>Comparison between the different tabular methods in Reinforcement Learning<\/h2>\n<p><a href=\"https:\/\/towardsdatascience.com\/summary-of-tabular-methods-in-reinforcement-learning-39d653e904af\">https:\/\/towardsdatascience.com\/summary-of-tabular-methods-in-reinforcement-learning-39d653e904af<\/a><\/p>\n<h1>Silver\u200a\u2014\u200aLecture 6: Value Function Approximation<\/h1>\n<p><a href=\"https:\/\/medium.com\/@SeoJaeDuk\/archived-post-rl-course-by-david-silver-lecture-6-value-function-approximation-241695feeb1f\">https:\/\/medium.com\/@SeoJaeDuk\/archived-post-rl-course-by-david-silver-lecture-6-value-function-approximation-241695feeb1f<\/a><\/p>\n<h1>Going Deeper Into Reinforcement Learning: Understanding Q-Learning and Linear Function Approximation (might not be simple\/easy)<\/h1>\n<p><a 
href=\"https:\/\/danieltakeshi.github.io\/2016\/10\/31\/going-deeper-into-reinforcement-learning-understanding-q-learning-and-linear-function-approximation\/\">https:\/\/danieltakeshi.github.io\/2016\/10\/31\/going-deeper-into-reinforcement-learning-understanding-q-learning-and-linear-function-approximation\/<\/a><\/p>\n<p><strong>Reinforcement in Psychology<\/strong><\/p>\n<p>&quot;Understanding Reinforcement in Psychology. Reinforcement is a term used in operant conditioning to refer to anything that increases the likelihood that a response will occur. &#8230; Reinforcement can include anything that strengthens or increases a behavior, including specific tangible rewards, events, and situations.&quot;<\/p>\n<h3><a href=\"https:\/\/www.verywellmind.com\/what-is-reinforcement-2795414\">What Is Reinforcement in Operant Conditioning? 
&#8211; Verywell Mind<\/a><\/h3>\n<p><a href=\"https:\/\/www.verywellmind.com\/what-is-reinforcement-2795414\">https:\/\/www.verywellmind.com\/what-is-reinforcement-2795414<\/a><\/p>\n<p><strong>Introduction to Reinforcement Learning<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/introduction-to-reinforcement-learning-chapter-1-fc8a196a09e8\">https:\/\/towardsdatascience.com\/introduction-to-reinforcement-learning-chapter-1-fc8a196a09e8<\/a><\/p>\n<p><strong>Solving the Multi-Armed Bandit Problem<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/solving-the-multi-armed-bandit-problem-b72de40db97c\">https:\/\/towardsdatascience.com\/solving-the-multi-armed-bandit-problem-b72de40db97c<\/a><\/p>\n<p><strong>My Journey to Reinforcement Learning\u200a\u2014\u200aPart 2: Multi-Armed Bandit Problem<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/my-journey-to-reinforcement-learning-part-2-multi-armed-bandit-problem-eefe1afab73c\">https:\/\/towardsdatascience.com\/my-journey-to-reinforcement-learning-part-2-multi-armed-bandit-problem-eefe1afab73c<\/a><\/p>\n<p><strong>Self Learning AI-Agents Part I: Markov Decision Processes<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/self-learning-ai-agents-part-i-markov-decision-processes-baf6b8fc4c5f\">https:\/\/towardsdatascience.com\/self-learning-ai-agents-part-i-markov-decision-processes-baf6b8fc4c5f<\/a><\/p>\n<p><strong>Reinforcement Learning Demystified: Markov Decision Processes (Part 1)<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/reinforcement-learning-demystified-markov-decision-processes-part-1-bf00dda41690\">https:\/\/towardsdatascience.com\/reinforcement-learning-demystified-markov-decision-processes-part-1-bf00dda41690<\/a><\/p>\n<p><strong>Reinforcement Learning Demystified: Solving MDPs with Dynamic Programming<\/strong><\/p>\n<p><a 
href=\"https:\/\/towardsdatascience.com\/reinforcement-learning-demystified-solving-mdps-with-dynamic-programming-b52c8093c919\">https:\/\/towardsdatascience.com\/reinforcement-learning-demystified-solving-mdps-with-dynamic-programming-b52c8093c919<\/a><\/p>\n<p><strong>Planning by Dynamic Programming: Reinforcement Learning<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/planning-by-dynamic-programming-reinforcement-learning-ed4924bbaa4c\">https:\/\/towardsdatascience.com\/planning-by-dynamic-programming-reinforcement-learning-ed4924bbaa4c<\/a><\/p>\n<p><strong>Reinforcement Learning for Meal Planning based on Meeting a Set Budget and Personal Preferences (Monte Carlo)<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/reinforcement-learning-for-meal-planning-based-on-meeting-a-set-budget-and-personal-preferences-9624a520cce4\">https:\/\/towardsdatascience.com\/reinforcement-learning-for-meal-planning-based-on-meeting-a-set-budget-and-personal-preferences-9624a520cce4<\/a><\/p>\n<p><strong>Monte Carlo Simulations with Python (Part 1)<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/monte-carlo-simulations-with-python-part-1-f5627b7d60b0\">https:\/\/towardsdatascience.com\/monte-carlo-simulations-with-python-part-1-f5627b7d60b0<\/a><\/p>\n<p><strong>Monte Carlo Without the Math<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/monte-carlo-without-the-math-90630344ff7b\">https:\/\/towardsdatascience.com\/monte-carlo-without-the-math-90630344ff7b<\/a><\/p>\n<p><strong>Simple Reinforcement Learning: Temporal Difference Learning<\/strong><\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/simple-reinforcement-learning-temporal-difference-learning-53d1b3263d79\">https:\/\/towardsdatascience.com\/simple-reinforcement-learning-temporal-difference-learning-53d1b3263d79<\/a><\/p>\n<p><strong>Model-Free Prediction: Reinforcement Learning<\/strong><\/p>\n<p><a 
href=\"https:\/\/towardsdatascience.com\/model-free-prediction-reinforcement-learning-507297e8e2ad\">https:\/\/towardsdatascience.com\/model-free-prediction-reinforcement-learning-507297e8e2ad<\/a><\/p>\n<p><a href=\"https:\/\/datascience.stackexchange.com\/questions\/26938\/what-exactly-is-bootstrapping-in-reinforcement-learning\"><strong>What exactly is bootstrapping in reinforcement learning?<\/strong><\/a><\/p>\n<p><a href=\"https:\/\/datascience.stackexchange.com\/questions\/26938\/what-exactly-is-bootstrapping-in-reinforcement-learning\">https:\/\/datascience.stackexchange.com\/questions\/26938\/what-exactly-is-bootstrapping-in-reinforcement-learning<\/a><\/p>\n<p><strong>n-step bootstrapping<\/strong><\/p>\n<p><a href=\"http:\/\/ipvs.informatik.uni-stuttgart.de\/mlr\/wp-content\/uploads\/2018\/06\/18-RL-nstep.pdf\">http:\/\/ipvs.informatik.uni-stuttgart.de\/mlr\/wp-content\/uploads\/2018\/06\/18-RL-nstep.pdf<\/a><\/p>\n<p><strong>Planning and Learning with Tabular Methods<\/strong><\/p>\n<p><a href=\"https:\/\/medium.com\/@SeoJaeDuk\/archived-post-planning-and-learning-with-tabular-methods-8-1-8-4-bf8f836614d0\">https:\/\/medium.com\/@SeoJaeDuk\/archived-post-planning-and-learning-with-tabular-methods-8-1-8-4-bf8f836614d0<\/a><\/p>\n<p>Sayed Ahmed<\/p>\n<p>LinkedIn: <a href=\"https:\/\/ca.linkedin.com\/in\/sayedjustetc\">https:\/\/ca.linkedin.com\/in\/sayedjustetc<\/a><\/p>\n<p>Blog: <a href=\"http:\/\/sitestree.com\">http:\/\/sitestree.com<\/a>, <a href=\"http:\/\/bangla.salearningschool.com\">http:\/\/bangla.salearningschool.com<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Reinforcement Learning Concepts Explained in a Simple (or not) Way. This list is intended for beginners who want to learn the concepts used in Reinforcement Learning, i.e., interactive learning. 
Reinforcement Learning is also one aspect of Machine Learning, Data Science, and AI Summary of Tabular Methods in Reinforcement Learning Comparison between the different tabular methods &hellip; <\/p>\n<p><a class=\"more-link btn\" href=\"http:\/\/bangla.sitestree.com\/?p=14773\">Continue reading<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[1910,182],"tags":[],"class_list":["post-14773","post","type-post","status-publish","format-standard","hentry","category-ai-ml-ds-rl-dl-nn-nlp-data-mining-optimization","category---blog","item-wrap"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[{"id":14751,"url":"http:\/\/bangla.sitestree.com\/?p=14751","url_meta":{"origin":14773,"position":0},"title":"Applications and Research on Reinforcement Learning","author":"Sayed","date":"May 3, 2019","format":false,"excerpt":"\"WHAT ARE MAJOR REINFORCEMENT LEARNING ACHIEVEMENTS & PAPERS FROM 2018?\" Reference: https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-10 \" Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures Temporal Difference Models: Model-Free Deep RL for Model-Based Control Addressing Function Approximation Error in Actor-Critic Methods\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 
Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":19601,"url":"http:\/\/bangla.sitestree.com\/?p=19601","url_meta":{"origin":14773,"position":1},"title":"Reinforcement Learning Examples\/DQN Examples","author":"Sayed","date":"February 2, 2021","format":false,"excerpt":"What I was looking for is: A DQN (Deep Q Learning Neural Network) or a Reinforcement Learning example that can learn from existing simulation data, and then can use that learning to interactively optimize an objective. The challenge will be: Whether my data can be learned from (whether the format\/structure\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":26368,"url":"http:\/\/bangla.sitestree.com\/?p=26368","url_meta":{"origin":14773,"position":2},"title":"Reinforcement Learning Examples\/DQN Examples #Root","author":"Author-Check- Article-or-Video","date":"April 22, 2021","format":false,"excerpt":"What I was looking for is: A DQN (Deep Q Learning Neural Network) or a Reinforcement Learning example that can learn from existing simulation data, and then can use that learning to interactively optimize an objective. 
The challenge will be: Whether my data can be learned from (whether the format\/structure\u2026","rel":"","context":"In &quot;FromSitesTree.com&quot;","block_context":{"text":"FromSitesTree.com","link":"http:\/\/bangla.sitestree.com\/?cat=1917"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":24971,"url":"http:\/\/bangla.sitestree.com\/?p=24971","url_meta":{"origin":14773,"position":3},"title":"AI Implementation Platforms: Reinforcement Learning Platforms and Applications: #Root","author":"Author-Check- Article-or-Video","date":"April 14, 2021","format":false,"excerpt":"\" Gym: https:\/\/gym.openai.com\/ Gym is a toolkit for developing and comparing .... It supports teaching agents everything from walking to playing games like Pong or Pinball. \" https:\/\/gym.openai.com\/ ---- \"Project Malmo integrates (deep) reinforcement learning, cognitive science, and many ideas from artificial intelligence. \" https:\/\/www.microsoft.com\/en-us\/research\/project\/project-malmo\/ ---- DeepMind: \"DeepMind's scientific mission\u2026","rel":"","context":"In &quot;FromSitesTree.com&quot;","block_context":{"text":"FromSitesTree.com","link":"http:\/\/bangla.sitestree.com\/?cat=1917"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":14752,"url":"http:\/\/bangla.sitestree.com\/?p=14752","url_meta":{"origin":14773,"position":4},"title":"AI Implementation Platforms: Reinforcement Learning Platforms and Applications:","author":"Sayed","date":"May 3, 2019","format":false,"excerpt":"\" Gym: https:\/\/gym.openai.com\/ Gym is a toolkit for developing and comparing .... It supports teaching agents everything from walking to playing games like Pong or Pinball. \" https:\/\/gym.openai.com\/ ---- \"Project Malmo integrates (deep) reinforcement learning, cognitive science, and many ideas from artificial intelligence. 
\" https:\/\/www.microsoft.com\/en-us\/research\/project\/project-malmo\/ ---- DeepMind: \"DeepMind's scientific mission\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":24925,"url":"http:\/\/bangla.sitestree.com\/?p=24925","url_meta":{"origin":14773,"position":5},"title":"On Reinforcement Learning: #Root","author":"Author-Check- Article-or-Video","date":"April 13, 2021","format":false,"excerpt":"On Reinforcement Learning: Questions and Answers https:\/\/www.inf.ed.ac.uk\/teaching\/courses\/rl\/tutorials.html Monte Carlo: https:\/\/medium.com\/@zsalloum\/monte-carlo-in-reinforcement-learning-the-easy-way-564c53010511 TD in Reinforcement Learning, the Easy Way: Temporal Difference https:\/\/towardsdatascience.com\/td-in-reinforcement-learning-the-easy-way-f92ecfa9f3ce Implementations of TD Algorithms: https:\/\/github.com\/dennybritz\/reinforcement-learning\/tree\/master\/TD Learning and Planning: https:\/\/courses.cs.washington.edu\/courses\/csep573\/12au\/lectures\/18-rl.pdf Sayed Ahmed sayedum Linkedin: https:\/\/ca.linkedin.com\/in\/sayedjustetc Blog: http:\/\/sitestree.com, http:\/\/bangla.salearningschool.com From: http:\/\/sitestree.com\/on-reinforcement-learning\/ Categories:RootTags: Post Data:2019-04-17 12:45:47 Shop Online: https:\/\/www.ShopForSoul.com\/ (Big Data, Cloud, Security,\u2026","rel":"","context":"In 
&quot;FromSitesTree.com&quot;","block_context":{"text":"FromSitesTree.com","link":"http:\/\/bangla.sitestree.com\/?cat=1917"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/14773","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=14773"}],"version-history":[{"count":1,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/14773\/revisions"}],"predecessor-version":[{"id":16821,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/14773\/revisions\/16821"}],"wp:attachment":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=14773"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=14773"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=14773"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}