SSD: Single Shot MultiBox Object Detector

SSD is a unified framework for object detection with a single network.

You can use this code to train, evaluate, and test models for object detection tasks.

Disclaimer

This is a re-implementation of the original SSD, which is based on Caffe. The official repository is available here. The arXiv paper is available here.

This example is intended to reproduce the detector while making full use of MXNet's features.

What's new

Demo results

demo1 demo2 demo3

mAP

| Model | Training data | Test data | mAP | Note |
|:------|:--------------|:----------|:----|:-----|
| VGG16_reduced 300x300 | VOC07+12 trainval | VOC07 test | 77.8 | fast |
| VGG16_reduced 512x512 | VOC07+12 trainval | VOC07 test | 79.9 | slow |
| Inception-v3 512x512 | VOC07+12 trainval | VOC07 test | 78.9 | fastest |
| Resnet-50 512x512 | VOC07+12 trainval | VOC07 test | 79.1 | fast |
| MobileNet 512x512 | VOC07+12 trainval | VOC07 test | 72.5 | super fast |
| MobileNet 608x608 | VOC07+12 trainval | VOC07 test | 74.7 | super fast |

More to be added
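
For readers unfamiliar with the metric, here is a minimal sketch of how a mean average precision figure can be computed from per-class precision-recall points. It assumes the 11-point interpolation commonly associated with the VOC07 benchmark; the repository's evaluate.py may compute the metric differently, and the function names `voc07_ap` and `mean_ap` are illustrative, not part of this code base.

```python
def voc07_ap(recalls, precisions):
    """11-point interpolated AP from parallel lists of recall and precision."""
    ap = 0.0
    for i in range(11):
        threshold = i / 10.0
        # Best precision among curve points whose recall reaches the threshold.
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        ap += (max(candidates) if candidates else 0.0) / 11.0
    return ap


def mean_ap(per_class_ap):
    """Average the per-class AP values (dict: class name -> AP)."""
    return sum(per_class_ap.values()) / len(per_class_ap)


if __name__ == "__main__":
    # Hypothetical precision-recall points for two classes.
    aps = {
        "cat": voc07_ap([0.5, 1.0], [1.0, 1.0]),
        "dog": voc07_ap([0.5], [1.0]),
    }
    print(aps, mean_ap(aps))
```

The mAP column above corresponds to the mean over the 20 VOC classes, reported as a percentage.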

Speed

| Model | GPU | CUDNN | Batch-size | FPS* |
|:------|:----|:------|:-----------|:-----|
| VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1 | 16 | 95 |
| VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1 | 8 | 95 |
| VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1 | 1 | 64 |
| VGG16_reduced 300x300 | TITAN X(Maxwell) | N/A | 8 | 36 |
| VGG16_reduced 300x300 | TITAN X(Maxwell) | N/A | 1 | 28 |

*Forward time only; data loading and drawing are excluded.
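
A forward-only throughput number of this kind can be approximated as in the sketch below. This is not the script used to produce the figures above: `time_forward` and `fps_from_timings` are hypothetical helpers, and `forward_fn` is assumed to block until the network output is actually ready (with an asynchronous engine, the timing is only meaningful if the call waits for the result).

```python
import time


def time_forward(forward_fn, batches):
    """Time each forward call on its own, so loading and drawing stay excluded."""
    timings = []
    for batch in batches:
        start = time.perf_counter()
        forward_fn(batch)  # assumed to return only when the output is ready
        timings.append(time.perf_counter() - start)
    return timings


def fps_from_timings(batch_size, forward_seconds):
    """Images per second over all timed batches."""
    total_images = batch_size * len(forward_seconds)
    return total_images / sum(forward_seconds)


if __name__ == "__main__":
    # Stand-in for a real network: sleep briefly instead of running inference.
    fake_batches = [None] * 5
    timings = time_forward(lambda batch: time.sleep(0.01), fake_batches)
    print("FPS: %.1f" % fps_from_timings(8, timings))
```

Discarding the first few batches as warm-up usually gives a steadier number.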

Getting started

Try the demo

Train the model

This example only covers training on the Pascal VOC dataset. Other datasets can be supported by adding a subclass derived from the class Imdb in dataset/imdb.py. See dataset/pascal_voc.py for an example.

Evaluate trained model
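
The shape of such a subclass might look like the sketch below. The `Imdb` class here is a simplified stand-in written for illustration, not the real base class; the attribute and method names, and the label layout `[class_id, xmin, ymin, xmax, ymax]`, are assumptions. Check dataset/imdb.py and dataset/pascal_voc.py for the actual interface before adapting it.

```python
import os


class Imdb(object):
    """Hypothetical stand-in for the base class in dataset/imdb.py."""

    def __init__(self, name):
        self.name = name
        self.classes = []
        self.image_set_index = []

    @property
    def num_images(self):
        return len(self.image_set_index)

    def image_path_from_index(self, index):
        raise NotImplementedError

    def label_from_index(self, index):
        raise NotImplementedError


class ToyDataset(Imdb):
    """Example dataset backed by an in-memory dict: image id -> list of boxes."""

    def __init__(self, root, annotations, classes):
        super(ToyDataset, self).__init__("toy")
        self.root = root
        self.classes = list(classes)
        self.annotations = annotations
        # Sort the ids so the index order is reproducible.
        self.image_set_index = sorted(annotations)

    def image_path_from_index(self, index):
        return os.path.join(self.root, self.image_set_index[index] + ".jpg")

    def label_from_index(self, index):
        # Each row is assumed to be [class_id, xmin, ymin, xmax, ymax].
        return self.annotations[self.image_set_index[index]]
```

A new dataset then mostly comes down to listing the images and parsing its annotation format into per-image label rows.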

Use:

# cd /path/to/mxnet-ssd
python evaluate.py --gpus 0,1 --batch-size 128 --epoch 0

Convert model to deploy mode

This removes all loss layers and attaches a layer that merges the results and performs non-maximum suppression. It is useful when loading the Python symbol is not available.

# cd /path/to/mxnet-ssd
python deploy.py --num-class 20
# then you can run demo with new model without loading python symbol
python demo.py --prefix model/ssd_300_deploy --epoch 0 --deploy
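
For context, non-maximum suppression keeps the highest-scoring box among heavily overlapping detections. The pure-Python sketch below shows the usual greedy variant; it is not the operator attached by deploy.py, and the box layout `(xmin, ymin, xmax, ymax)` and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS over a list of (score, box); returns survivors, best first."""
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept


if __name__ == "__main__":
    dets = [
        (0.9, (0, 0, 10, 10)),
        (0.8, (1, 1, 11, 11)),    # overlaps the first box heavily
        (0.7, (50, 50, 60, 60)),  # far away, survives
    ]
    print(nms(dets))
```

In a detector this is typically applied per class, after merging the predictions from all feature maps.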

Convert caffemodel

A converter for Caffe models is available at /path/to/mxnet-ssd/tools/caffe_converter

It is specifically modified to handle the custom layers in caffe-ssd. Usage:

cd /path/to/mxnet-ssd/tools/caffe_converter
make
python convert_model.py deploy.prototxt name_of_pretrained_caffe_model.caffemodel ssd_converted
# you will use this model in deploy mode without loading from python symbol
python demo.py --prefix ssd_converted --epoch 1 --deploy

There is no guarantee that the conversion will always work, but it works for now.

Legacy models

Since the new interface for composing networks was introduced, the old models have inconsistent weight names. You can still load a previous model by renaming the symbol to legacy_xxx.py and passing --network legacy_xxx to train.py or demo.py. For example:

python demo.py --network 'legacy_vgg16_ssd_300.py' --prefix model/ssd_300 --epoch 0

Docker

First make sure Docker is installed. The nvidia-docker plugin is required to run on NVIDIA GPUs.

Tensorboard

Tensorboard visualizations

loss AP ROC