Overview

Decision theory forms the basis of modern artificial intelligence and economics. This course gives a firm foundation to decision theory from mainly a statistical, but also a philosophical perspective. The course has two aims:

To give a thorough understanding of statistical decision theory, automatic methods for designing experiments and relate the theory to human decision making.
Apply the theory to practical problems in reinforcement learning and artificial intelligence, through the development of algorithms and experiments with intelligent agents.

The course may be split in two parts.

In the first part, we introduce the concepts of subjective probability and utility, and how they can be used to represent and solve decision problems. We then cover the estimation of unknown parameter and hypothesis testing. Finally, we discuss sequential sampling, sequential experiments, and more generally, sequential decision making.

The second part focuses on recent research on decision making under uncertainty, and in particular reinforcement learning and learning with expert advice. First, we examine a few representative statistical models. We then give an overview of algorithms for making optimal decisions using these models. Finally, we look at the problem of learning to act by following expert advice, which a field that has recently found many applications in online advertising, game-tree search and optimisation.

Prerequisites

Mathematical background: Good knowledge of basic calculus and probability is required. Mathematical maturity is necessary.

Computer science background: Basic courses in algorithms and datastructures are necessary for being able to develop the main algorithms. Programming experience relating to numerical problems is valuable for the homework exercises and the project. Your programs can be in any cross-platform portable language, but a level of programming maturity is expected.

Complementary courses: Previous exposure to artificial intelligence, machine learning, optimisation, or statistics, is partially beneficial but is not necessary. You are encouraged to take such courses either before or after this course in order to have a more rounded view of the field.

Assessment

Continous assessment, based on exercises, course participation and a project. The exercises require a mixture of basic applied mathematics (calculus and probability), programming and some thought. The exercises are designed so that by the end of exercise set 8, you have a basic set of algorithms that you can use in the project.

The project is organised as a competition. Students form 2-person teams. Each team creates an environment and an agent. These are presented at the end of the course. Everybody's agents are tested on everybody's environments, and the results are also shown at the end of the course.

Recommended Books

As this is an advanced course, we do not follow a particular book, but pick and mix from a number of sources. Roughly, the initial part of the course concentrates on De Groot, while the second part starts with Puterman and then some parts of Bertsekas and Tsitsiklis. If we have time, we'll also cover some parts of the Cesa-Bianchi and Lugosi book.

Statistics
De Groot, "Optimal Statistical Decisions".
Gives a good overview of Bayesian statistics and decision theory, and an introduction to Markov decision processes from a statistical perspective. Most reinforcement learning problems can be reduced to MDPs of this type.
Savage, "The foundations of statistics".
This early work gives a more verbose (but rigorous) alternative to de Groot. Most basic ideas of statistical decision theory are laid out in this book, including minimax problems.
Robert and Casella, "Monte Carlo Statistical Methods".
Gives an excellent overview of Monte Carlo methods, including theory and many useful examples.

Reinforcement learning
Puterman, "Markov Decision Processes".
Examines various types of Markov decision processes and gives detailed proofs of basic dynamic programming algorithms, as well as important cases not treated elsewhere.
Bertsekas and Tsitsiklis, "Neuro-dynamic programming".
This work connects basic MDP theory to reinforcement learning, in the case where we perform approximate dynamic programming via simulation. Many algorithms are described and analysed (in the limit).
Sutton and Barto, "Reinforcement learning: an introduction".
Gives an overview of the problem of reinforcement learning, and many important basic algorithms. Includes no analysis, but it is recommended for a first approach to the subject.
Csaba Szepesvari "Algorithms for reinforcement learning".
This is a good reference book for modern reinforcement learning algorithms. Useful for advanced students.

General sequential learning problems
Cesa-Bianchi and Lugosi, "Prediction, Learning and Games".
Considers the case where we do not necessarily have an explicit, or estimated environment model, and where we may be acting against an adversarial environment. This book requires patience, but is very rewarding.

Overview

Prerequisites

Assessment

Recommended Books

Statistics

Reinforcement learning

General sequential learning problems