RL: Lecture Schedule

Note: See the "Coding Proficiency Self-Check" below for details about programming requirements.

Lecture times and location:

  • Tuesdays and Fridays, 14.10–15.00
  • 40 George Square, Lecture Theatre B (Entrance in annex building, look for signage)

Lecture slides are linked in the table below. Slides may be updated; any updated version will be uploaded at least 24 hours before the lecture.

The course follows the second edition of the RL book by Richard Sutton and Andrew Barto; the 2022 version can be downloaded for free here. The required reading from the RL book for each lecture is listed on the last slide of that lecture and in the table below.

Week | Date | Topic (slides) | Required Reading
1 | 14 January '25 | Introduction | Ch. 1 (1.1–1.4); Coding Proficiency Self-Check
  | 17 January '25 | Multi-armed bandits | Ch. 2 (2.1–2.8)
2 | 21 January '25 | Markov decision processes (part 1) | Ch. 3 (3.1–3.7)
  | 24 January '25 | Lecture cancelled due to adverse weather conditions |
3 | 28 January '25 | Markov decision processes (part 2) and Dynamic programming (part 1) | Ch. 4 (4.1–4.7)
  | 31 January '25 | Monte Carlo methods (and DP part 2) | Ch. 5 (5.1–5.7)
4 | 04 February '25 | Temporal-difference learning | Ch. 6 (6.1–6.2, 6.4–6.6), Ch. 7 (7.1–7.3)
  | 07 February '25 | Tutorial lecture: Building a complete RL system | Demo code on Learn
5 | 11 February '25 | Coursework introduction | Coursework on Learn (from 11/2/25)
  | 14 February '25 | Planning and learning (incl. n-step TD) | Ch. 8 (8.1–8.3, 8.10–8.11)
  |  | Flexible learning week (no lectures) |
6 | 25 February '25 | Value function approximation | Ch. 9 (9.1–9.5), Ch. 10 (10.1), Ch. 11 (11.1)
  | 28 February '25 | Policy gradient methods | Ch. 13 (13.1–13.5)
7 | 04 March '25 | Deep reinforcement learning | These topics will not be examined.
  | 07 March '25 | Reward: Neuroscience, Reward Hypothesis, Inverse RL, Shaping |
8 | 11 March '25 | RL Beyond the Markov Property |
  | 14 March '25 (last lecture) | Multi-agent reinforcement learning |

License
All rights reserved The University of Edinburgh