{"id":19381,"date":"2021-01-28T22:01:32","date_gmt":"2021-01-29T03:01:32","guid":{"rendered":"http:\/\/bangla.salearningschool.com\/recent-posts\/reinforcement-learning-and-deep-learning\/"},"modified":"2021-01-28T22:01:32","modified_gmt":"2021-01-29T03:01:32","slug":"reinforcement-learning-and-deep-learning","status":"publish","type":"post","link":"http:\/\/bangla.sitestree.com\/?p=19381","title":{"rendered":"Reinforcement Learning and Deep Learning"},"content":{"rendered":"<p>An interactive notebook training Keras to play Catch<\/p>\n<p><a href=\"https:\/\/github.com\/JannesKlaas\/sometimes_deep_sometimes_learning\/blob\/master\/reinforcement.ipynb\">https:\/\/github.com\/JannesKlaas\/sometimes_deep_sometimes_learning\/blob\/master\/reinforcement.ipynb<\/a><\/p>\n<h1><a href=\"https:\/\/spinningup.openai.com\/en\/latest\/spinningup\/keypapers.html#id106\">Key Papers in Deep RL<\/a><\/h1>\n<p><a href=\"https:\/\/spinningup.openai.com\/en\/latest\/spinningup\/keypapers.html#key-papers-in-deep-rl\">https:\/\/spinningup.openai.com\/en\/latest\/spinningup\/keypapers.html#key-papers-in-deep-rl<\/a><\/p>\n<p>DEEP REINFORCEMENT LEARNING<\/p>\n<p><a href=\"https:\/\/arxiv.org\/pdf\/1810.06339v1.pdf\">https:\/\/arxiv.org\/pdf\/1810.06339v1.pdf<\/a><\/p>\n<p>Tesla: Deep Learning<br \/>\n<a href=\"https:\/\/quantdare.com\/deep-reinforcement-trading\/\">https:\/\/quantdare.com\/deep-reinforcement-trading\/<\/a><\/p>\n<p>Playing Atari with Deep Reinforcement Learning<\/p>\n<p><a href=\"https:\/\/arxiv.org\/pdf\/1312.5602.pdf\">https:\/\/arxiv.org\/pdf\/1312.5602.pdf<\/a><\/p>\n<h4>AlphaGo is the first computer program to defeat a professional human Go player, the first to defeat a Go world champion, and is arguably the strongest Go player in history.<\/h4>\n<p><a href=\"https:\/\/deepmind.com\/research\/case-studies\/alphago-the-story-so-far\">https:\/\/deepmind.com\/research\/case-studies\/alphago-the-story-so-far<\/a><\/p>\n<h1>Using Deep Q-Learning in FIFA 18 to perfect the 
art of free-kicks<\/h1>\n<p><a href=\"https:\/\/towardsdatascience.com\/using-deep-q-learning-in-fifa-18-to-perfect-the-art-of-free-kicks-f2e4e979ee66\">https:\/\/towardsdatascience.com\/using-deep-q-learning-in-fifa-18-to-perfect-the-art-of-free-kicks-f2e4e979ee66<\/a><\/p>\n<p><a href=\"https:\/\/becominghuman.ai\/reinforcement-learning-with-fifa-and-keras-85ec792e25b2\">https:\/\/becominghuman.ai\/reinforcement-learning-with-fifa-and-keras-85ec792e25b2<\/a><\/p>\n<p>A saturation-balancing control method for enhancing dynamic vehicle stability<br \/>\n<a href=\"https:\/\/cecas.clemson.edu\/ayalew\/Papers\/Vehicle%20Systems%20Dynamics%20and%20Control\/Papers\/A%20Saturation%20Balancing%20Control%20Method%20for%20Enhancing%20Dynamic%20Vehicle%20Stability\/IJVD%2061_1-4_Paper%203.pdf\">https:\/\/cecas.clemson.edu\/ayalew\/Papers\/Vehicle%20Systems%20Dynamics%20and%20Control\/Papers\/A%20Saturation%20Balancing%20Control%20Method%20for%20Enhancing%20Dynamic%20Vehicle%20Stability\/IJVD%2061_1-4_Paper%203.pdf<\/a><\/p>\n<p>Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm<br \/>\n<a href=\"https:\/\/arxiv.org\/pdf\/1712.01815.pdf\">https:\/\/arxiv.org\/pdf\/1712.01815.pdf<\/a><\/p>\n<h1>Reinforcement Learning Demystified: Solving MDPs with Dynamic Programming<\/h1>\n<p> <a href=\"https:\/\/towardsdatascience.com\/reinforcement-learning-demystified-solving-mdps-with-dynamic-programming-b52c8093c919\">https:\/\/towardsdatascience.com\/reinforcement-learning-demystified-solving-mdps-with-dynamic-programming-b52c8093c919<\/a><\/p>\n<p>This game presents moves along a linear chain of states, with two actions:<\/p>\n<p><a href=\"https:\/\/github.com\/openai\/gym\/blob\/master\/gym\/envs\/toy_text\/nchain.py\">https:\/\/github.com\/openai\/gym\/blob\/master\/gym\/envs\/toy_text\/nchain.py<\/a><\/p>\n<h1>The Artificial Intelligence Wiki<\/h1>\n<p><a 
href=\"https:\/\/wiki.pathmind.com\/\">https:\/\/wiki.pathmind.com\/<\/a><\/p>\n<p>Glossary:<br \/>\n<a href=\"https:\/\/wiki.pathmind.com\/glossary\">https:\/\/wiki.pathmind.com\/glossary<\/a><\/p>\n<h1>Train DDPG Agent to Control Flying Robot<\/h1>\n<p><a href=\"https:\/\/www.mathworks.com\/help\/\/\/deeplearning\/ug\/train-ddpg-agent-to-control-flying-robot.html\">https:\/\/www.mathworks.com\/help\/\/\/deeplearning\/ug\/train-ddpg-agent-to-control-flying-robot.html<\/a><\/p>\n<h1>Why should I choose matlab deep learning toolbox over other opensource frameworks like caffe, onnx, pytorch, torch etc?<\/h1>\n<p><a href=\"https:\/\/www.mathworks.com\/matlabcentral\/answers\/421259-why-should-i-choose-matlab-deep-learning-toolbox-over-other-opensource-frameworks-like-caffe-onnx\">https:\/\/www.mathworks.com\/matlabcentral\/answers\/421259-why-should-i-choose-matlab-deep-learning-toolbox-over-other-opensource-frameworks-like-caffe-onnx<\/a><\/p>\n<p>MnasNet: Platform-Aware Neural Architecture Search for Mobile<\/p>\n<p><a href=\"https:\/\/github.com\/tensorflow\/tpu\/tree\/master\/models\/official\/mnasnet\">https:\/\/github.com\/tensorflow\/tpu\/tree\/master\/models\/official\/mnasnet<\/a><\/p>\n<h1>A visual debugger for Jupyter<\/h1>\n<p>Debugger: Jupyter<br \/>\n<a href=\"https:\/\/blog.jupyter.org\/a-visual-debugger-for-jupyter-914e61716559\">https:\/\/blog.jupyter.org\/a-visual-debugger-for-jupyter-914e61716559<\/a><\/p>\n<p>IPython: <a href=\"https:\/\/chrieke.medium.com\/jupyter-tips-and-tricks-994fdddb2057\">https:\/\/chrieke.medium.com\/jupyter-tips-and-tricks-994fdddb2057<\/a><\/p>\n<p>*** . *** *** . *** . *** . 
***<\/p>\n<p><em><strong>Courses<\/strong>: <a href=\"http:\/\/training.sitestree.com\/\">http:\/\/Training.SitesTree.com<\/a> (Big Data, Cloud, Security, Machine Learning)<\/em><br \/>\n<em><strong>Blog<\/strong>: <a href=\"http:\/\/bangla.salearningschool.com\/\">http:\/\/Bangla.SaLearningSchool.com<\/a>, <a href=\"http:\/\/sitestree.com\">http:\/\/SitesTree.com<\/a><\/em><br \/>\n<em><strong>8112223 Canada Inc.\/JustEtc<\/strong>: <a href=\"http:\/\/JustEtc.net\">http:\/\/JustEtc.net<\/a><\/em><\/p>\n<p><em><strong>Shop Online<\/strong>: <a href=\"http:\/\/www.shopforsoul.com\/\">https:\/\/www.ShopForSoul.com\/<\/a><\/em><br \/>\n<em><strong>LinkedIn<\/strong>: <a href=\"https:\/\/ca.linkedin.com\/in\/sayedjustetc\">https:\/\/ca.linkedin.com\/in\/sayedjustetc<\/a><\/em><\/p>\n<p><em><strong>Medium<\/strong>: <a href=\"https:\/\/medium.com\/@SayedAhmedCanada\">https:\/\/medium.com\/@SayedAhmedCanada<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>An interactive notebook training Keras to play Catch https:\/\/github.com\/JannesKlaas\/sometimes_deep_sometimes_learning\/blob\/master\/reinforcement.ipynb Key Papers in Deep RL https:\/\/spinningup.openai.com\/en\/latest\/spinningup\/keypapers.html#key-papers-in-deep-rl DEEP REINFORCEMENT LEARNING https:\/\/arxiv.org\/pdf\/1810.06339v1.pdf Tesla: Deep Learning https:\/\/quantdare.com\/deep-reinforcement-trading\/ Playing Atari with Deep Reinforcement Learning https:\/\/arxiv.org\/pdf\/1312.5602.pdf AlphaGo is the first computer program to defeat a professional human Go player, the first to defeat a Go world champion, and is arguably the &hellip; <\/p>\n<p><a class=\"more-link btn\" href=\"http:\/\/bangla.sitestree.com\/?p=19381\">Continue
reading<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[182],"tags":[],"class_list":["post-19381","post","type-post","status-publish","format-standard","hentry","category---blog","item-wrap"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[{"id":19382,"url":"http:\/\/bangla.sitestree.com\/?p=19382","url_meta":{"origin":19381,"position":0},"title":"Resources: Reinforcement Learning and Deep Reinforcement Learning","author":"Sayed","date":"January 28, 2021","format":false,"excerpt":"Platform: https:\/\/gym.openai.com\/ Code Examples: https:\/\/towardsdatascience.com\/using-deep-q-learning-in-fifa-18-to-perfect-the-art-of-free-kicks-f2e4e979ee66?gi=b96ce845729c https:\/\/becominghuman.ai\/reinforcement-learning-with-fifa-and-keras-85ec792e25b2 https:\/\/towardsdatascience.com\/reinforcement-learning-demystified-solving-mdps-with-dynamic-programming-b52c8093c919 https:\/\/github.com\/openai\/gym\/blob\/master\/gym\/envs\/toy_text\/nchain.py Theory https:\/\/towardsdatascience.com\/introduction-to-various-reinforcement-learning-algorithms-i-q-learning-sarsa-dqn-ddpg-72a5e0cb6287 https:\/\/cecas.clemson.edu\/ayalew\/Papers\/Vehicle%20Systems%20Dynamics%20and%20Control\/Papers\/A%20Saturation%20Balancing%20Control%20Method%20for%20Enhancing%20Dynamic%20Vehicle%20Stability\/IJVD%2061_1-4_Paper%203.pdf https:\/\/arxiv.org\/pdf\/1712.01815.pdf *** . *** *** . *** . *** . 
*** Courses: http:\/\/Training.SitesTree.com (Big Data, Cloud, Security, Machine Learning) Blog: http:\/\/Bangla.SaLearningSchool.com, http:\/\/SitesTree.com 8112223 Canada Inc.\/JustEtc: http:\/\/JustEtc.net Shop Online: https:\/\/www.ShopForSoul.com\/ Linkedin: https:\/\/ca.linkedin.com\/in\/sayedjustetc Medium: https:\/\/medium.com\/@SayedAhmedCanada","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":14751,"url":"http:\/\/bangla.sitestree.com\/?p=14751","url_meta":{"origin":19381,"position":1},"title":"Applications and Research on Reinforcement Learning","author":"Sayed","date":"May 3, 2019","format":false,"excerpt":"\"WHAT ARE MAJOR REINFORCEMENT LEARNING ACHIEVEMENTS & PAPERS FROM 2018?\" Reference: https:\/\/www.topbots.com\/most-important-ai-reinforcement-learning-research\/#ai-rl-paper-2018-10 \" Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures Temporal Difference Models: Model-Free Deep RL for Model-Based Control Addressing Function Approximation Error in Actor-Critic Methods\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":19601,"url":"http:\/\/bangla.sitestree.com\/?p=19601","url_meta":{"origin":19381,"position":2},"title":"Reinforcement Learning Examples\/DQN Examples","author":"Sayed","date":"February 2, 2021","format":false,"excerpt":"What I was looking for is: A DQN (Deep Q Learning Neural Network) or a Reinforcement Learning example that can learn from existing simulation data, and then can use 
that learning to interactively optimize an objective. The challenge will be: Whether my data can be learned from (whether the format\/structure\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":26368,"url":"http:\/\/bangla.sitestree.com\/?p=26368","url_meta":{"origin":19381,"position":3},"title":"Reinforcement Learning Examples\/DQN Examples #Root","author":"Author-Check- Article-or-Video","date":"April 22, 2021","format":false,"excerpt":"What I was looking for is: A DQN (Deep Q Learning Neural Network) or a Reinforcement Learning example that can learn from existing simulation data, and then can use that learning to interactively optimize an objective. The challenge will be: Whether my data can be learned from (whether the format\/structure\u2026","rel":"","context":"In &quot;FromSitesTree.com&quot;","block_context":{"text":"FromSitesTree.com","link":"http:\/\/bangla.sitestree.com\/?cat=1917"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":14752,"url":"http:\/\/bangla.sitestree.com\/?p=14752","url_meta":{"origin":19381,"position":4},"title":"AI Implementation Platforms: Reinforcement Learning Platforms and Applications:","author":"Sayed","date":"May 3, 2019","format":false,"excerpt":"\" Gym: https:\/\/gym.openai.com\/ Gym is a toolkit for developing and comparing .... It supports teaching agents everything from walking to playing games like Pong or Pinball. \" https:\/\/gym.openai.com\/ ---- \"Project Malmo integrates (deep) reinforcement learning, cognitive science, and many ideas from artificial intelligence. 
\" https:\/\/www.microsoft.com\/en-us\/research\/project\/project-malmo\/ ---- DeepMind: \"DeepMind's scientific mission\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":24971,"url":"http:\/\/bangla.sitestree.com\/?p=24971","url_meta":{"origin":19381,"position":5},"title":"AI Implementation Platforms: Reinforcement Learning Platforms and Applications: #Root","author":"Author-Check- Article-or-Video","date":"April 14, 2021","format":false,"excerpt":"\" Gym: https:\/\/gym.openai.com\/ Gym is a toolkit for developing and comparing .... It supports teaching agents everything from walking to playing games like Pong or Pinball. \" https:\/\/gym.openai.com\/ ---- \"Project Malmo integrates (deep) reinforcement learning, cognitive science, and many ideas from artificial intelligence. 
\" https:\/\/www.microsoft.com\/en-us\/research\/project\/project-malmo\/ ---- DeepMind: \"DeepMind's scientific mission\u2026","rel":"","context":"In &quot;FromSitesTree.com&quot;","block_context":{"text":"FromSitesTree.com","link":"http:\/\/bangla.sitestree.com\/?cat=1917"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/19381","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=19381"}],"version-history":[{"count":0,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/19381\/revisions"}],"wp:attachment":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=19381"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=19381"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=19381"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}