Foundations of Reinforcement Learning: From Bandits to Deep RL
This 3-week course introduces university students to Reinforcement Learning (RL), a machine learning approach in which agents learn to make decisions through trial and error. It starts with agent-environment interaction and the multi-armed bandit problem, then progresses to Markov Decision Processes, model-free methods such as Q-learning, and deep RL techniques. Through lecture notes, presentations, hands-on demos, and assessments, students build practical skills in Python using libraries such as Gymnasium and PyTorch. The course suits students with basic probability and programming knowledge and prepares them to apply RL to problems in game playing, robotics, and optimization.
Reinforcement Learning: How does an AI agent learn by trial and error through rewards and penalties from its environment?
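Reinforcement Learning (RL) is a type of machine learning in which an agent learns to make decisions by interacting with an environment. At each step the agent observes the environment's state, chooses an action, and receives a numerical reward: positive for good outcomes and negative (a penalty) for bad ones. Over time, the agent improves its strategy, called a policy, to maximize the cumulative reward it collects over the long run.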
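The trial-and-error loop described above can be sketched with a two-armed bandit, the simplest setting the course covers. The code below is a minimal illustration and not a reference implementation: the function name `run_bandit`, the reward probabilities, the +1/-1 reward scheme, and the epsilon-greedy rule for choosing actions are all assumptions made for this example. It uses only the Python standard library, so neither Gymnasium nor PyTorch is required to run it.

```python
import random


def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a multi-armed bandit (illustrative sketch).

    reward_probs[a] is the assumed probability that arm `a` pays +1;
    otherwise the arm pays -1 (a penalty).
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms      # how often each arm was pulled
    values = [0.0] * n_arms    # running average reward per arm
    total_reward = 0.0

    for _ in range(steps):
        # Policy: explore a random arm with probability epsilon,
        # otherwise exploit the arm with the best estimate so far.
        if rng.random() < epsilon:
            action = rng.randrange(n_arms)
        else:
            action = max(range(n_arms), key=lambda a: values[a])

        # Environment: respond to the action with a reward or a penalty.
        reward = 1.0 if rng.random() < reward_probs[action] else -1.0

        # Learning: move the estimate toward the observed reward
        # (incremental form of the sample average).
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward

    return values, counts, total_reward


if __name__ == "__main__":
    values, counts, total = run_bandit([0.2, 0.8])
    print("estimated values:", values)
    print("pull counts:", counts)
    print("total reward:", total)
```

Running the script prints the agent's value estimate for each arm, how often each arm was pulled, and the total reward. With these assumed probabilities the agent should come to prefer the second arm, because its estimated value rises above that of the first as experience accumulates.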
