Inverse Reinforcement Learning

This project implements selected inverse reinforcement learning (IRL) algorithms, developed as part of COMP3710 under the supervision of Dr Mayank Daswani and Dr Marcus Hutter. My final report describes the implemented algorithms.

If you use this code in your work, you can cite it as follows:

@misc{alger16,
  author       = {Matthew Alger},
  title        = {Inverse Reinforcement Learning},
  year         = 2016,
  doi          = {10.5281/zenodo.555999},
  url          = {https://doi.org/10.5281/zenodo.555999}
}

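Algorithms implemented

Linear programming IRL (Ng & Russell, 2000), maximum entropy IRL (Ziebart et al., 2008), and deep maximum entropy IRL (Wulfmeier et al., 2015).

Additionally, the following MDP domains are implemented: gridworld and objectworld (Levine et al., 2011).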

Requirements

Module documentation

The following is a brief list of the functions and classes exported by each module. Full documentation is included in the docstring of each function or class; only functions and classes intended for use outside their module are documented here.

linear_irl

Implements linear programming inverse reinforcement learning (Ng & Russell, 2000).
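
For orientation, the linear program from Ng & Russell (2000) for a finite state space can be written as below. This is a sketch of the formulation in the paper, not a description of this module's code. The notation is assumed: N is the number of states, P_a is the transition matrix for action a with P_a(i) its i-th row, a_1 is the action chosen by the observed policy in every state (after relabelling actions), \gamma is the discount factor, \lambda is an L1 penalty coefficient, and R is the reward vector bounded by R_max.

```latex
\begin{aligned}
\text{maximise}\quad
  & \sum_{i=1}^{N} \min_{a \in A \setminus \{a_1\}}
    \left\{ \left(P_{a_1}(i) - P_a(i)\right)
            \left(I - \gamma P_{a_1}\right)^{-1} R \right\}
    - \lambda \lVert R \rVert_1 \\
\text{subject to}\quad
  & \left(P_{a_1} - P_a\right)\left(I - \gamma P_{a_1}\right)^{-1} R \succeq 0
    \quad \forall a \in A \setminus \{a_1\}, \\
  & \lvert R_i \rvert \le R_{\max}, \quad i = 1, \ldots, N.
\end{aligned}
```

The constraint makes the observed policy optimal under R. The objective prefers rewards that make deviating from that policy as costly as possible, and the L1 term favours sparse rewards.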

Functions:

maxent

Implements maximum entropy inverse reinforcement learning (Ziebart et al., 2008).
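
As a rough illustration of the quantity that maximum entropy IRL optimises, the sketch below computes the gradient from Ziebart et al. (2008) for a reward that is linear in state features: the demonstrated feature expectations minus the feature counts expected under the current reward. The function names, the representation of a trajectory as a list of state indices, and the `expected_svf` argument (expected state visitation frequencies, which the caller is assumed to compute separately) are all hypothetical and are not this module's API.

```python
def feature_expectations(feature_matrix, trajectories):
    """Average, over trajectories, of the summed features of visited states.

    feature_matrix[s] is the feature vector of state s (assumed layout).
    Each trajectory is a list of state indices (assumed layout).
    """
    n_features = len(feature_matrix[0])
    totals = [0.0] * n_features
    for trajectory in trajectories:
        for state in trajectory:
            for k in range(n_features):
                totals[k] += feature_matrix[state][k]
    return [total / len(trajectories) for total in totals]


def maxent_gradient(feature_matrix, trajectories, expected_svf):
    """Gradient of the MaxEnt log-likelihood w.r.t. linear reward weights.

    expected_svf[s] is the expected visitation frequency of state s under
    the current reward estimate; computing it is outside this sketch.
    """
    n_states = len(feature_matrix)
    n_features = len(feature_matrix[0])
    empirical = feature_expectations(feature_matrix, trajectories)
    expected = [
        sum(expected_svf[s] * feature_matrix[s][k] for s in range(n_states))
        for k in range(n_features)
    ]
    return [e - x for e, x in zip(empirical, expected)]
```

A gradient-ascent loop would repeatedly recompute `expected_svf` for the current weights and step the weights along this gradient. The gradient is zero when the expected feature counts match the demonstrations.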

Functions:

deep_maxent

Implements deep maximum entropy inverse reinforcement learning based on Ziebart et al., 2008 and Wulfmeier et al., 2015, using symbolic methods with Theano.

Functions:

value_iteration

Finds the value function associated with a policy, based on Sutton & Barto, 1998.
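
To make the idea concrete, here is a minimal sketch of iterative policy evaluation in the style of Sutton & Barto, written with the standard library only. It is an illustration and not this module's implementation. The function name `evaluate_policy`, the argument layout, and the state-based reward convention V(s) = R(s) + discount * sum over s' of P(s' | s, policy(s)) * V(s') are assumptions.

```python
def evaluate_policy(policy, transition, reward, discount, threshold=1e-6):
    """Estimate the value of each state under a deterministic policy.

    policy[s] is the action taken in state s.
    transition[s][a][s2] is the probability of moving from s to s2 under a.
    reward[s] is the reward for being in state s (assumed convention).
    Sweeps are repeated until no value changes by more than threshold.
    """
    n_states = len(reward)
    values = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(n_states):
            action = policy[s]
            expected_next = sum(
                transition[s][action][s2] * values[s2]
                for s2 in range(n_states)
            )
            new_value = reward[s] + discount * expected_next
            delta = max(delta, abs(new_value - values[s]))
            values[s] = new_value  # in-place update
        if delta < threshold:
            return values


# Two states, one action: state 0 always moves to state 1, which is absorbing.
transition = [[[0.0, 1.0]], [[0.0, 1.0]]]
values = evaluate_policy(policy=[0, 0], transition=transition,
                         reward=[0.0, 1.0], discount=0.5)
```

In this example the exact values are 1 for state 0 and 2 for state 1, and the loop converges to within roughly `threshold` of them.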

Functions:

mdp

gridworld

Implements the gridworld MDP.

Classes, instance attributes, methods:

objectworld

Implements the objectworld MDP described in Levine et al., 2011.

Classes, instance attributes, methods: