Deep Reinforcement Learning | AISV.802
This advanced course starts with a quick review of some deep learning architectures followed by an introduction to fundamental concepts of reinforcement learning (RL) that we illustrate with concrete examples. Next, we’ll explore the Bellman equation, policies, models, Q-learning, the SARSA algorithm, and temporal difference (TD) learning.
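As a rough illustration of how Q-learning and SARSA relate to temporal difference learning, the sketch below shows the two tabular update rules side by side. It is not taken from the course materials; the function names, the dictionary-based Q-table, and the default values for the learning rate `alpha` and discount factor `gamma` are assumptions chosen for the example.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99):
    """Off-policy TD update: bootstrap from the best action in next_state."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.99):
    """On-policy TD update: bootstrap from the action actually taken next."""
    td_error = reward + gamma * q[(next_state, next_action)] - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


def make_table():
    """Hypothetical two-state Q-table; unseen entries default to 0.0."""
    q = defaultdict(float)
    q[("s1", "left")] = 1.0
    q[("s1", "right")] = 2.0
    return q


# Same transition, two different bootstrap targets.
q_off = make_table()
q_learning_update(q_off, "s0", "right", 0.0, "s1", ["left", "right"],
                  alpha=0.5, gamma=0.9)

q_on = make_table()
sarsa_update(q_on, "s0", "right", 0.0, "s1", "left", alpha=0.5, gamma=0.9)
```

The only difference between the two rules is the bootstrap target: Q-learning uses the maximum estimated value in the next state, while SARSA uses the value of the action the agent actually selects next.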
In this deep reinforcement learning (DRL) course, you will learn how to solve common tasks in RL, including some well-known simulations, such as CartPole, MountainCar, and FrozenLake. You will be introduced to concepts such as clipping regions and policy gradients, as well as an extensive collection of algorithms, including DQN, prioritized experience replay, DDQN, D4PG, A2C, PPO, TRPO, DDPG, and SAC.
Later, the course introduces additional algorithms, such as ACER and ACKTR, as well as DRL libraries, such as Google Dopamine and TensorFlow Agents (TF-Agents). Almost all of the code samples are written in TensorFlow 2 with Keras, along with a limited number of code samples in PyTorch. The development of a wide range of DRL algorithms has improved results in diverse areas, such as natural language processing and robotics. In addition, DRL-based systems represent the state of the art in Go as well as in highly sophisticated multiplayer games (including StarCraft and Dota).
- Deep learning architectures
- Markov decision processes
- Reinforcement and deep reinforcement learning
- Policy gradients and various algorithms
- Proximal policy optimization
- Various actor/critic algorithms
- Deep RL libraries
At the conclusion of the course, the student should be able to:
- Describe how a bi-LSTM differs from a standard LSTM
- Explain how n-grams work
- Describe the BERT architecture
- Describe Q learning, models, and policies
- Define the purpose of the Bellman equation
- Discuss the advantages/disadvantages of reinforcement learning
- Explain how the epsilon-greedy algorithm differs from a pure greedy algorithm
- Discuss how deep learning enhances reinforcement learning
- Describe GANs and how they pertain to autonomous vehicles
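To make the epsilon-greedy outcome above concrete, here is a minimal sketch of the two action-selection rules. It is an illustration under stated assumptions, not code from the course: the function names and the list-of-floats representation of Q-values are hypothetical.

```python
import random


def greedy_action(q_values):
    """Pure greedy: always pick the action with the highest estimated value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """With probability epsilon explore uniformly at random; otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy_action(q_values)


# Hypothetical value estimates for three actions in a single state.
estimates = [0.1, 0.7, 0.3]
exploit_only = epsilon_greedy_action(estimates, epsilon=0.0)
sometimes_explore = epsilon_greedy_action(estimates, epsilon=0.2,
                                          rng=random.Random(0))
```

With `epsilon=0.0` the two rules coincide. A small positive `epsilon` lets the agent occasionally try actions whose current estimates are low, which a pure greedy rule would never revisit.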
Prerequisites - Please note that this course covers advanced topics, and students are expected to have completed one of the prerequisite courses or have equivalent experience.