Syllabus for Reinforcement Learning 2020

Open online course, Fall 2020 (PDF version)

Brief description

Reinforcement learning (RL) is a family of modern machine learning techniques that has achieved unprecedented successes on artificial intelligence benchmarks, for instance the victories of Google’s AlphaGo over human players. Using RL techniques, computers can autonomously learn to make decisions using feedback from real and/or simulated environments and data. This course gives a Master’s-level introduction to these techniques.

In particular, the general course objective is the following: to evaluate the applicability and limitations of RL approaches to a given problem, and to choose and implement the basic form of a suitable RL method.

The course is directly related to artificial intelligence and machine learning, with applications in various fields including self-driving cars and big data.


Ayca Ozcelikkale (Department of Electrical Engineering, Uppsala University; Course Responsible, Contact Person);
Per Mattsson (Department of Information Technology, Uppsala University);
André Teixeira (Department of Electrical Engineering, Uppsala University).

Starting Time and Duration

Week 45, 2020. The course runs for 6 weeks, ending in Week 50.


The course targets participants with a Bachelor’s degree in engineering, natural sciences, mathematics or a similar subject field. The course is mainly at Master’s level. Since the course covers a rapidly evolving field, a limited amount of advanced material will also be presented.

Recommended Background

Programming experience in Python; basic knowledge of linear algebra and probability.

Language of instruction


Forms of instruction

Online lecture videos, slides, self-study instructions with Jupyter notebooks (computational and analytical work), assignments, study groups, and class meetings.

Class meetings

The class meetings will be held over Zoom at the following times: Nov. 10, Tuesday, 10:15-12:00; Nov. 17, Tuesday, 10:15-12:00; Nov. 24, Tuesday, 10:15-12:00; Dec. 1, Tuesday, 10:15-12:00; Dec. 10, Thursday, 13:15-15:00.

Course Content

Markov Decision Processes, Dynamic Programming (Policy Evaluation, Policy Iteration, Value Iteration), Model-free RL (Monte-Carlo Learning, Temporal-Difference Methods), Model-based RL, Approximation Methods for RL, Policy Gradient Methods.

Course Literature

Richard S. Sutton and Andrew G. Barto, "Reinforcement Learning: An Introduction", Second Edition, MIT Press. A copy of the book is available online.

Last modified: 2021-05-14