Markov decision problems and reinforcement learning have been active research areas over the past decade. Despite the rapid algorithmic developments, the linear and convex programming formulations of the Markov decision problem remain less well known. In the first part of the talk, we will discuss the convex optimization formulations of Markov decision problems in the primal, dual, and primal-dual forms. In the second part of the talk, we will present two new algorithms that are inspired by these optimization formulations and exhibit exponential or even super-exponential convergence.
Pizza lunch will be provided.
We acknowledge financial support from the Pacific Institute for the Mathematical Sciences (PIMS) and the UBC Institute of Applied Mathematics (IAM).