Foundations of Reinforcement Learning: From Bandits to Deep RL

This 3-week course introduces university students to Reinforcement Learning (RL), a machine learning approach in which agents learn to make decisions through trial and error. Starting with agent-environment interaction and the multi-armed bandit problem, the course progresses to Markov Decision Processes, model-free methods such as Q-learning, and deep RL techniques. Through lecture notes, presentations, hands-on demos, and assessments, students build practical skills in Python using libraries such as Gymnasium and PyTorch. Intended for students with basic probability and programming knowledge, the course prepares learners to apply RL to real-world problems in areas such as game playing, robotics, and optimization.


Reinforcement Learning: How does an AI agent learn by trial and error through rewards and penalties from its environment?

Reinforcement Learning (RL) is a type of machine learning in which an agent learns to make decisions by interacting with an environment, receiving rewards for good actions and penalties for bad ones. Over time, the agent improves its strategy (its policy) to maximize the total reward it collects in the long run.
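
To make the trial-and-error loop concrete, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit, the first problem covered in the course. It is an illustration under stated assumptions, not course material: the function name run_bandit, the Gaussian reward model, and the default parameter values are all hypothetical choices made for this example.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit with Gaussian rewards (illustrative only)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # agent's running estimate of each arm's mean reward
    counts = [0] * n_arms       # how many times each arm has been pulled
    total_reward = 0.0

    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment answers with a noisy reward (assumed unit variance).
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental sample-average update of the chosen arm's estimate.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.1, 0.5, 0.9])
    print("estimated means:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

The agent never sees true_means directly. It only observes rewards, so its pull counts typically shift toward the better arms as its estimates improve.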

What We Offer

What's included in this online course?

A step-by-step guide

Start from the basics of Reinforcement Learning

Progress through Markov Decision Processes (MDPs)

Learn model-free methods: Temporal Difference, Q-learning, SARSA

Each week builds on the previous with notes, demos, and quizzes

Study at your own pace

Access course materials anytime, anywhere

Learn at your convenience with flexible scheduling

Revisit lectures and coding exercises as often as needed

Complete quizzes and projects at your own speed

Unique learning experience

Hands-on coding in Python with real simulations (e.g., Bandit, FrozenLake)

Interactive quizzes and assessments to test understanding

Real-world case studies and projects for practical application

Engaging mix of lectures, demos, and exercises