autonlab.org

Covariant Policy Search (2003)

Drew Bagnell, Jeff Schneider

Tags

Markov Decision Processes, Optimization, Reinforcement Learning

Abstract

We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geometric methods. This leads us to propose a natural metric on controller parameterization that results from considering the manifold of probability distributions over paths induced by a stochastic controller. Investigation of this approach leads to a covariant gradient ascent rule. Interesting properties of this rule are discussed, including its relation with actor-critic style reinforcement learning algorithms. The algorithms discussed here are computationally quite efficient and on some interesting problems lead to dramatic performance improvement over non-covariant rules.
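To make "covariant" concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the natural metric can be taken to be the Fisher information of the distribution the policy induces, and it simplifies to a one-step problem in which a "path" is a single action. The names PHI, REWARDS, and the reparameterization matrix A are hypothetical and chosen only for illustration. The sketch compares an ordinary gradient step and a metric-preconditioned step when the same policy is expressed in two coordinate systems related by theta = A @ phi.

```python
import numpy as np

# Hypothetical one-step problem: 3 actions, 2 policy parameters.
# The last action's logit is pinned to 0 so the Fisher matrix is invertible.
PHI = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # per-action features
REWARDS = np.array([1.0, 0.0, 0.5])                   # reward of each action


def policy(theta):
    """Softmax policy over actions with logits PHI @ theta."""
    z = PHI @ theta
    e = np.exp(z - z.max())
    return e / e.sum()


def score_matrix(theta):
    """Row a holds grad_theta log pi(a) = phi_a - E_pi[phi]."""
    return PHI - policy(theta) @ PHI


def vanilla_gradient(theta):
    """Gradient of expected reward: sum_a pi(a) r(a) grad log pi(a)."""
    return score_matrix(theta).T @ (policy(theta) * REWARDS)


def fisher(theta):
    """Fisher information: E_pi[grad log pi grad log pi^T]."""
    S = score_matrix(theta)
    return S.T @ (policy(theta)[:, None] * S)


def natural_gradient(theta):
    """Gradient preconditioned by the inverse Fisher matrix."""
    return np.linalg.solve(fisher(theta), vanilla_gradient(theta))


def steps_in_theta_space(theta, A):
    """Compute both update directions in coordinates phi (theta = A @ phi),
    then map them back to theta so they can be compared directly."""
    g_phi = A.T @ vanilla_gradient(theta)   # chain rule for the gradient
    F_phi = A.T @ fisher(theta) @ A         # metric transforms as a tensor
    vanilla_step = A @ g_phi
    natural_step = A @ np.linalg.solve(F_phi, g_phi)
    return vanilla_step, natural_step


theta0 = np.array([0.3, -0.2])
A = np.array([[2.0, 1.0], [0.0, 0.5]])      # arbitrary invertible change of coordinates
vanilla_step, natural_step = steps_in_theta_space(theta0, A)
```

In this sketch the ordinary gradient step computed in the phi coordinates maps back to A @ A.T @ g rather than g, so the update direction depends on how the policy happens to be parameterized. The preconditioned step maps back to the same direction as natural_gradient(theta0) for any invertible A, which is the covariance property the abstract refers to.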

Approximate BibTeX Entry

@inproceedings{bagnellCovariant,
    Month = {July},
    Year = {2003},
    Booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence},
    Author = {Drew Bagnell and Jeff Schneider},
    Title = {Covariant Policy Search}
}

Copyright 2010, Carnegie Mellon University, Auton Lab. All Rights Reserved.