{"id":14751,"date":"2019-05-03T11:56:46","date_gmt":"2019-05-03T15:56:46","guid":{"rendered":"http:\/\/bangla.salearningschool.com\/recent-posts\/applications-and-research-on-reinforcement-learning\/"},"modified":"2019-05-03T11:56:46","modified_gmt":"2019-05-03T15:56:46","slug":"applications-and-research-on-reinforcement-learning","status":"publish","type":"post","link":"http:\/\/bangla.sitestree.com\/?p=14751","title":{"rendered":"Applications and Research on Reinforcement Learning"},"content":{"rendered":"<h1>&quot;WHAT ARE MAJOR REINFORCEMENT LEARNING ACHIEVEMENTS &amp; PAPERS FROM 2018?&quot;<\/h1>\n<p>Reference:<br \/>\n<a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-10\">https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-10<\/a><\/p>\n<p>&quot;<\/p>\n<ol>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-1\">Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor<\/a><\/li>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-2\">IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures<\/a><\/li>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-3\">Temporal Difference Models: Model-Free Deep RL for Model-Based Control<\/a><\/li>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-4\">Addressing Function Approximation Error in Actor-Critic Methods<\/a><\/li>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-5\">Learning by Playing \u2013 Solving Sparse Reward Tasks from Scratch<\/a><\/li>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-6\">Hierarchical Imitation and Reinforcement Learning<\/a><\/li>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-7\">Unsupervised Predictive Memory in a Goal-Directed Agent<\/a><\/li>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-8\">Data-Efficient Hierarchical Reinforcement Learning<\/a><\/li>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-9\">Visual Reinforcement Learning with Imagined Goals<\/a><\/li>\n<li><a href=\"https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-10\">Horizon: Facebook\u2019s Open Source Applied Reinforcement Learning Platform<\/a><\/li>\n<\/ol>\n<p>&quot;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&quot;WHAT ARE MAJOR REINFORCEMENT LEARNING ACHIEVEMENTS &amp; PAPERS FROM 2018?&quot; Reference: https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-10 &quot; Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures Temporal Difference Models: Model-Free Deep RL for Model-Based Control Addressing Function Approximation Error in Actor-Critic Methods Learning by Playing \u2013 Solving &hellip; <\/p>\n<p><a class=\"more-link btn\" href=\"http:\/\/bangla.sitestree.com\/?p=14751\">Continue reading<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[182],"tags":[],"class_list":["post-14751","post","type-post","status-publish","format-standard","hentry","category---blog","item-wrap"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[{"id":14752,"url":"http:\/\/bangla.sitestree.com\/?p=14752","url_meta":{"origin":14751,"position":0},"title":"AI Implementation Platforms: Reinforcement Learning Platforms and Applications:","author":"Sayed","date":"May 3, 2019","format":false,"excerpt":"\" Gym: https:\/\/gym.openai.com\/ Gym is a toolkit for developing and comparing .... It supports teaching agents everything from walking to playing games like Pong or Pinball. \" https:\/\/gym.openai.com\/ ---- \"Project Malmo integrates (deep) reinforcement learning, cognitive science, and many ideas from artificial intelligence. \" https:\/\/www.microsoft.com\/en-us\/research\/project\/project-malmo\/ ---- DeepMind: \"DeepMind's scientific mission\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":24971,"url":"http:\/\/bangla.sitestree.com\/?p=24971","url_meta":{"origin":14751,"position":1},"title":"AI Implementation Platforms: Reinforcement Learning Platforms and Applications: #Root","author":"Author-Check- Article-or-Video","date":"April 14, 2021","format":false,"excerpt":"\" Gym: https:\/\/gym.openai.com\/ Gym is a toolkit for developing and comparing .... It supports teaching agents everything from walking to playing games like Pong or Pinball. \" https:\/\/gym.openai.com\/ ---- \"Project Malmo integrates (deep) reinforcement learning, cognitive science, and many ideas from artificial intelligence. \" https:\/\/www.microsoft.com\/en-us\/research\/project\/project-malmo\/ ---- DeepMind: \"DeepMind's scientific mission\u2026","rel":"","context":"In &quot;FromSitesTree.com&quot;","block_context":{"text":"FromSitesTree.com","link":"http:\/\/bangla.sitestree.com\/?cat=1917"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":19601,"url":"http:\/\/bangla.sitestree.com\/?p=19601","url_meta":{"origin":14751,"position":2},"title":"Reinforcement Learning Examples\/DQN Examples","author":"Sayed","date":"February 2, 2021","format":false,"excerpt":"What I was looking for is: A DQN (Deep Q Learning Neural Network) or a Reinforcement Learning example that can learn from existing simulation data, and then can use that learning to interactively optimize an objective. The challenge will be: Whether my data can be learned from (whether the format\/structure\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":26368,"url":"http:\/\/bangla.sitestree.com\/?p=26368","url_meta":{"origin":14751,"position":3},"title":"Reinforcement Learning Examples\/DQN Examples #Root","author":"Author-Check- Article-or-Video","date":"April 22, 2021","format":false,"excerpt":"What I was looking for is: A DQN (Deep Q Learning Neural Network) or a Reinforcement Learning example that can learn from existing simulation data, and then can use that learning to interactively optimize an objective. The challenge will be: Whether my data can be learned from (whether the format\/structure\u2026","rel":"","context":"In &quot;FromSitesTree.com&quot;","block_context":{"text":"FromSitesTree.com","link":"http:\/\/bangla.sitestree.com\/?cat=1917"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":14773,"url":"http:\/\/bangla.sitestree.com\/?p=14773","url_meta":{"origin":14751,"position":4},"title":"Reinforcement Learning Concepts Explained in a Simple Way.","author":"Sayed","date":"May 17, 2019","format":false,"excerpt":"Reinforcement Learning Concepts Explained in a Simple (or not) Way. This is intended for the beginners who want to know the concepts used in Reinforcement Learning i.e. Interactive Learning. Reinforcement Learning is also one aspect of Machine Learning, Data Science, and AI Summary of Tabular Methods in Reinforcement Learning Comparison\u2026","rel":"","context":"In &quot;AI ML DS RL DL NN NLP Data Mining Optimization&quot;","block_context":{"text":"AI ML DS RL DL NN NLP Data Mining Optimization","link":"http:\/\/bangla.sitestree.com\/?cat=1910"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":19381,"url":"http:\/\/bangla.sitestree.com\/?p=19381","url_meta":{"origin":14751,"position":5},"title":"Reinforcement Learning and Deep Learning","author":"Sayed","date":"January 28, 2021","format":false,"excerpt":"An interactive notebook training Keras to play Catch https:\/\/github.com\/JannesKlaas\/sometimes_deep_sometimes_learning\/blob\/master\/reinforcement.ipynb Key Papers in Deep RL https:\/\/spinningup.openai.com\/en\/latest\/spinningup\/keypapers.html#key-papers-in-deep-rl DEEP REINFORCEMENT LEARNING https:\/\/arxiv.org\/pdf\/1810.06339v1.pdf Tesla: Deep Learning https:\/\/quantdare.com\/deep-reinforcement-trading\/ Playing Atari with Deep Reinforcement Learning https:\/\/arxiv.org\/pdf\/1312.5602.pdf AlphaGo is the first computer program to defeat a professional human Go player, the first to defeat a Go world\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/14751","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=14751"}],"version-history":[{"count":0,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/14751\/revisions"}],"wp:attachment":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=14751"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=14751"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=14751"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}