navbot

navbot is a collection of tools for mapless robot navigation that uses RGB images as the visual input. It contains the test
environment and motion planners, and aims to realize all three levels of mapless navigation:
1. memorizing efficiently;
2. from memorizing to reasoning;
3. more powerful reasoning.

The experiment data is in the ./materials/record folder.

Environment

I built the environment as a benchmark for testing the algorithms.

It has the following properties:

Diverse complexity.
Gym-style Interface.
Supports ROS.

Quickstart example code for using this benchmark:

import numpy as np

import env

maze0 = env.GazeboMaze(maze_id=0, continuous=True)
observation = maze0.reset()
done = False
while not done:
    # Random (stochastic) strategy: sample a velocity command uniformly
    action = dict()
    action['linear_vel'] = np.random.uniform(0, 1)
    action['angular_vel'] = np.random.uniform(-1, 1)
    observation, done, reward = maze0.execute(action)
    print(action, reward)
maze0.close()
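To try the loop above without launching Gazebo, a stand-in environment with the same three calls can be handy. The sketch below is an assumption for illustration only: `DummyMaze`, its fixed episode length, its blank image observation, and its constant reward are hypothetical and are not part of this repository. Only the `reset`/`execute`/`close` call pattern and the `(observation, done, reward)` return order mirror the quickstart.

```python
import numpy as np


class DummyMaze(object):
    """Hypothetical stand-in mimicking the reset/execute/close calls used above."""

    def __init__(self, max_steps=5, image_shape=(48, 64, 3)):
        self.max_steps = max_steps      # episode ends after this many steps
        self.image_shape = image_shape  # made-up RGB observation shape
        self.steps = 0

    def reset(self):
        self.steps = 0
        return np.zeros(self.image_shape, dtype=np.uint8)

    def execute(self, action):
        # The action is ignored here; a real environment would move the robot.
        self.steps += 1
        observation = np.zeros(self.image_shape, dtype=np.uint8)
        done = self.steps >= self.max_steps
        reward = -0.01  # placeholder per-step penalty
        return observation, done, reward

    def close(self):
        pass


def run_random_episode(env, rng):
    """Run one episode with uniformly sampled velocity commands."""
    env.reset()
    done = False
    total_reward, steps = 0.0, 0
    while not done:
        action = {'linear_vel': rng.uniform(0, 1),
                  'angular_vel': rng.uniform(-1, 1)}
        _, done, reward = env.execute(action)
        total_reward += reward
        steps += 1
    env.close()
    return steps, total_reward
```

Calling `run_random_episode(DummyMaze(), np.random.RandomState(0))` exercises the same control flow as the quickstart, which is useful for checking agent code before starting the simulator.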

1. Memorizing

VAE-based planner

VAE Structure and Training

The designed VAE structure is shown in the lower left figure. It is trained in maze1 and maze2. kl_tolerance is set to 0.5 (we stop optimizing the KL loss term once it falls below a threshold, rather than letting it go to near zero) and the latent dimension is 32, so the KL term is floored at 0.5 × 32 = 16 and training aims to bring the total loss as close to 16 as possible.
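The KL floor described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the convention used in the WorldModelsExperiments code listed under Reference (per-sample KL clipped from below at `kl_tolerance * latent_dim`); the function name `vae_kl_loss` is hypothetical and the repository's own TensorFlow implementation may differ.

```python
import numpy as np


def vae_kl_loss(mu, logvar, kl_tolerance=0.5):
    """Mean KL(N(mu, exp(logvar)) || N(0, I)), floored per sample.

    mu, logvar: arrays of shape (batch, latent_dim).
    """
    latent_dim = mu.shape[1]
    # Closed-form KL divergence between a diagonal Gaussian and N(0, I).
    kl = -0.5 * np.sum(1.0 + logvar - np.square(mu) - np.exp(logvar), axis=1)
    # Once the KL term drops below the floor it no longer changes the loss,
    # so the optimizer stops pushing it toward zero.
    kl = np.maximum(kl, kl_tolerance * latent_dim)
    return float(np.mean(kl))
```

With `kl_tolerance=0.5` and 32 latent dimensions, the returned value never drops below 16, which is where the number in the paragraph above comes from.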

The following results are tested in maze3 to verify the ability of generalization.

Planner Structure

VAE-based planner & Baseline network structure

Performance

The proposed trajectory is blue and the baseline is green.
The success rate comparison in maze1.
Performance comparison

SPL	Benchmark	Proposed
maze1	0.702	0.703
maze2	0.611	0.626

That is, the proposed motion planner is not only much more sample-efficient but also achieves a higher SPL in both mazes. In fact, the shortest paths in both mazes were found by the proposed motion planner (26 timesteps in maze1 and 29 timesteps in maze2, with acceleration in simulation).
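For readers unfamiliar with the metric in the table, the sketch below assumes SPL refers to Success weighted by Path Length as commonly defined for navigation agents: the mean over episodes of `S_i * l_i / max(p_i, l_i)`, where `S_i` is the success flag, `l_i` the shortest path length, and `p_i` the length of the path actually taken. The function name and the episode values are made up for illustration, and the repository's evaluation code may compute the metric differently.

```python
def spl(successes, shortest_lengths, path_lengths):
    """Success weighted by Path Length over a list of episodes.

    successes: iterable of 0/1 flags.
    shortest_lengths, path_lengths: positive lengths in the same unit.
    """
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        if s:
            # A successful episode scores 1 only if it took the shortest path.
            total += float(l) / max(p, l)
    return total / len(successes)
```

For example, three episodes with one optimal success, one success that took twice the shortest path, and one failure give `(1 + 0.5 + 0) / 3 = 0.5`.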

2. From Memorizing to Reasoning

Stacked LSTM and network structure

Stacked LSTM

network structure

Result

Success rate in maze1

Install

Dependencies

tensorflow: 1.5.0
OS: Ubuntu 16.04
Python: 2.7
OpenCV: 3
ROS: Kinetic
Gazebo: 7
tensorforce: https://github.com/tensorforce/tensorforce

# install tensorflow-gpu after cuDNN and CUDA are installed
pip install tensorflow-gpu==1.5.0
# or use the CPU build if there is no NVIDIA GPU; it also works
pip install tensorflow==1.5.0
# install OpenCV: https://docs.opencv.org/master/d7/d9f/tutorial_linux_install.html
# install ROS: http://wiki.ros.org/kinetic/Installation/Ubuntu
# install Gazebo 
sudo apt-get install gazebo7 libgazebo7-dev
# install an old version of tensorforce that supports Python 2 from source

Run

sudo apt-get install ros-kinetic-gazebo-ros-pkgs ros-kinetic-gazebo-ros-control
sudo apt-get install ros-kinetic-turtlebot-*
sudo apt-get remove ros-kinetic-turtlebot-description
sudo apt-get install ros-kinetic-kobuki-description
# change to catkin_ws/src
git clone https://github.com/marooncn/navbot
cd ..
catkin_make
source ./devel/setup.bash
# you can change the configuration in config.py
cd src/navbot/rl_nav/scripts
# run the proposed model for memorizing
python PPO.py
# run the proposed model for reasoning
python E2E_PPO_rnn.py

Details

The default environment is maze1. To change the environment, change maze_id in both nav_gazebo.launch and config.py.
To run 01_generate_data.py to generate data, you need to comment out the goal-related code in nav_gazebo.launch and env.py.
maze1 and maze2 are sped up 10 times for training. To speed up other environments, change
0.001 1
to
0.01
in the environment file in worlds.
To reproduce the results, change the related parameters in config.py according to config.txt.
PPO is not a deterministic policy gradient algorithm: the action at every timestep is sampled from the policy distribution. This sampling acts as "noise" and is useful for exploration and generalization. To use the best strategy after the model is trained, set 'deterministic = True' in config.py and the performance will improve.
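The difference between sampled and deterministic action selection can be illustrated as follows. This is a sketch under the assumption of a diagonal Gaussian policy over the two velocity commands; `select_action` and its arguments are hypothetical names, and this is not how tensorforce implements the 'deterministic' option internally.

```python
import numpy as np


def select_action(mean, std, deterministic, rng):
    """Pick an action from a diagonal Gaussian policy head.

    mean, std: per-dimension policy outputs (e.g. linear and angular velocity).
    deterministic: if True, return the distribution mean (no exploration noise).
    rng: a numpy RandomState used for sampling.
    """
    mean = np.asarray(mean, dtype=np.float64)
    std = np.asarray(std, dtype=np.float64)
    if deterministic:
        return mean
    # Training-time behaviour: sample around the mean, which adds exploration.
    return rng.normal(mean, std)
```

During training the sampled branch is used; at evaluation time, returning the mean removes the sampling noise, which is the effect the note above attributes to 'deterministic = True'.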

Cite

If you find this work helpful in your research, please cite the following papers:

Blog

Introduction to tensorforce (Chinese)
Introduction to this work (Chinese)

Reference

tensorforce(blog)
gym_gazebo
gazebo
roslaunch python API
turtlebot_description
kobuki_description
WorldModelsExperiments(official)
WorldModels(by Applied Data Science)