A3C Continuous Reinforcement Learning

TensorFlow implementation of the asynchronous advantage actor-critic (A3C) reinforcement learning algorithm (paper) for continuous action spaces. The code is mostly based on Morvan Zhou's implementation (github).
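For orientation, here is a minimal sketch of the quantities an A3C worker typically computes for a continuous action space: n-step discounted returns, advantages, and a Gaussian-policy actor loss with an entropy bonus. It is written in plain Python rather than TensorFlow and is not taken from this repository. The function names, the discount `gamma=0.9`, and the entropy weight `entropy_beta=0.01` are all illustrative assumptions.

```python
import math


def discounted_returns(rewards, bootstrap_value, gamma=0.9):
    """n-step returns, accumulated backwards from the critic's bootstrap value."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def gaussian_log_prob(action, mu, sigma):
    """Log-density of a scalar action under the policy N(mu, sigma^2)."""
    return (-0.5 * ((action - mu) / sigma) ** 2
            - math.log(sigma)
            - 0.5 * math.log(2.0 * math.pi))


def gaussian_entropy(sigma):
    """Differential entropy of N(mu, sigma^2); used as an exploration bonus."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


def a3c_losses(rewards, values, bootstrap_value, actions, mus, sigmas,
               gamma=0.9, entropy_beta=0.01):
    """Return (actor_loss, critic_loss) for one rollout (hypothetical helper).

    values are the critic's estimates V(s_t); mus and sigmas are the actor's
    outputs for each step. gamma and entropy_beta are assumed defaults.
    """
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    # Advantage: how much better the observed return was than the estimate.
    advantages = [ret - v for ret, v in zip(returns, values)]
    # Critic minimizes the mean squared advantage.
    critic_loss = sum(adv * adv for adv in advantages) / len(advantages)
    # Actor maximizes log-prob weighted by advantage, plus the entropy bonus.
    objective = [
        gaussian_log_prob(a, m, s) * adv + entropy_beta * gaussian_entropy(s)
        for a, m, s, adv in zip(actions, mus, sigmas, advantages)
    ]
    actor_loss = -sum(objective) / len(objective)
    return actor_loss, critic_loss
```

In a TensorFlow version, the advantage would normally be treated as a constant when differentiating the actor loss, and each worker would push the resulting gradients to the shared global network asynchronously.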

Components

Results

Pendulum environment before training:

[Image: pendulum before training]

After 1500 episodes:

[Image: pendulum after 1500 episodes]