Learning Spatial Fusion for Single-Shot Object Detection

By Songtao Liu, Di Huang, Yunhong Wang

Introduction

In this work, we propose a novel, data-driven strategy for pyramidal feature fusion, referred to as adaptively spatial feature fusion (ASFF). It learns to spatially filter conflicting information to suppress inconsistency across feature levels, improving the scale-invariance of features while adding almost no inference overhead. For more details, please refer to our arXiv paper.
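
The fusion step can be sketched roughly as follows. This is a minimal NumPy illustration, not the PyTorch code in this repository: it assumes the pyramid levels have already been resized to a common shape, and it takes the per-pixel weight logits as given, whereas ASFF predicts them with learned convolutions. The function and variable names are hypothetical.

```python
import numpy as np


def asff_fuse(features, logits):
    """Fuse same-shaped feature maps with per-pixel softmax weights.

    features: list of arrays of shape (C, H, W), already resized to one level.
    logits:   array of shape (len(features), H, W); assumed given in this
              sketch (ASFF learns them from the features).
    """
    feats = np.stack(features)                      # (N, C, H, W)
    # Softmax over the level axis, so the weights at each pixel sum to 1.
    shifted = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)   # (N, H, W)
    # Share each level's weight map across channels, then sum over levels.
    fused = (weights[:, None, :, :] * feats).sum(axis=0)
    return fused, weights


rng = np.random.default_rng(0)
levels = [rng.standard_normal((4, 8, 8)) for _ in range(3)]
fused, weights = asff_fuse(levels, rng.standard_normal((3, 8, 8)))
print(fused.shape)  # (4, 8, 8)
```

Because the weights are normalized per pixel, a location where one level is unreliable can be down-weighted there without affecting the rest of the map.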

Updates:

COCO

System                                   test-dev mAP   Time (V100)   Time (2080ti)
YOLOv3 608                               33.0           20ms          26ms
YOLOv3 608 + BoFs                        37.0           20ms          26ms
YOLOv3 608 (our baseline)                38.8           20ms          26ms
YOLOv3 608 + ASFF                        40.6           22ms          30ms
YOLOv3 608 + ASFF*                       42.4           22ms          30ms
YOLOv3 800 + ASFF*                       43.9           34ms          38ms
YOLOv3 MobileNetV1 416 + BoFs            28.6           -             22ms
YOLOv3 MobileNetV2 416 (our baseline)    29.0           -             22ms
YOLOv3 MobileNetV2 416 + ASFF            30.6           -             24ms

Citing

Please cite our paper in your publications if it helps your research:

@article{liu2019asff,
    title = {Learning Spatial Fusion for Single-Shot Object Detection},
    author = {Songtao Liu and Di Huang and Yunhong Wang},
    journal = {arXiv preprint arXiv:1911.09516},
    year = {2019}
}

Contents

  1. Installation
  2. Datasets
  3. Training
  4. Evaluation
  5. Models

Installation

Prerequisites

Demo

python demo.py -i /path/to/your/image \
--cfg config/yolov3_baseline.cfg -d COCO \
--checkpoint /path/to/your/weights --half --asff --rfb -s 608

Datasets

Note: We currently support only COCO and VOC.
To make things easy, we provide simple COCO and VOC dataset loaders that inherit from torch.utils.data.Dataset, making them fully compatible with the torchvision.datasets API.
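
As a rough illustration of what inheriting from torch.utils.data.Dataset involves, the sketch below shows a map-style dataset that implements `__len__` and `__getitem__`. The class `ToyDetectionDataset`, its in-memory samples, and the box layout are hypothetical and are not the repository's COCO or VOC loaders.

```python
try:
    from torch.utils.data import Dataset
except ImportError:  # fallback so the sketch still runs without PyTorch
    Dataset = object


class ToyDetectionDataset(Dataset):
    """Hypothetical map-style dataset; not the repository's COCO/VOC loader."""

    def __init__(self, samples, transform=None):
        # samples: list of (image, boxes) pairs, where each box is assumed
        # to be [x1, y1, x2, y2, class_id].
        self.samples = samples
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        image, boxes = self.samples[index]
        if self.transform is not None:
            image, boxes = self.transform(image, boxes)
        return image, boxes


dataset = ToyDetectionDataset([("img0", [[0, 0, 10, 10, 1]]), ("img1", [])])
print(len(dataset))  # 2
```

Any object with these two methods can be passed to torch.utils.data.DataLoader, although detection data with a varying number of boxes per image usually needs a custom collate function.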

Moreover, we also implement the Mix-up strategy from BoFs and the distributed random resizing used in YOLOv3.
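
For readers unfamiliar with Mix-up for detection, the idea can be sketched as below. This is an illustrative NumPy version under stated assumptions, not the repository's implementation: both images are pasted at the top-left corner of a shared canvas, blended with a ratio `lam`, and the boxes of both images are kept. The function name and the Beta(1.5, 1.5) sampling in the usage line are assumptions.

```python
import numpy as np


def detection_mixup(img_a, boxes_a, img_b, boxes_b, lam):
    """Blend two HxWx3 images and keep the boxes of both (illustrative only)."""
    h = max(img_a.shape[0], img_b.shape[0])
    w = max(img_a.shape[1], img_b.shape[1])
    canvas = np.zeros((h, w, 3), dtype=np.float32)
    # Both images share the top-left origin, so box coordinates stay valid.
    canvas[: img_a.shape[0], : img_a.shape[1]] += lam * img_a
    canvas[: img_b.shape[0], : img_b.shape[1]] += (1.0 - lam) * img_b
    # Boxes are arrays of shape (num_boxes, 5); an image without objects
    # should pass an array of shape (0, 5).
    boxes = np.concatenate([boxes_a, boxes_b], axis=0)
    return canvas, boxes


img_a = np.ones((2, 2, 3), dtype=np.float32)
img_b = 3.0 * np.ones((4, 3, 3), dtype=np.float32)
boxes_a = np.array([[0, 0, 1, 1, 0]], dtype=np.float32)
boxes_b = np.array([[0, 0, 2, 3, 1], [1, 1, 2, 2, 2]], dtype=np.float32)
mixed, boxes = detection_mixup(img_a, boxes_a, img_b, boxes_b, lam=0.5)
print(mixed.shape, boxes.shape)  # (4, 3, 3) (3, 5)
```

During training, `lam` would typically be drawn at random for each pair, for example with `np.random.default_rng().beta(1.5, 1.5)`.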

COCO Dataset

Download the MS COCO dataset from the official website to /path/to/coco. The default location is ./data/COCO, so a soft link is recommended:

ln -s /path/to/coco ./data/COCO

It should have this basic structure:

$COCO/
$COCO/annotations/
$COCO/images/
$COCO/images/test2017/
$COCO/images/train2017/
$COCO/images/val2017/

The current COCO release provides train2017 and val2017 sets; by default, we train our model on train2017 and evaluate on val2017.
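
Before training, it may help to confirm that the directory layout listed above is in place. The following is a small standalone sketch using only the standard library; the helper `missing_coco_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Sub-directories taken from the structure listed above.
EXPECTED_SUBDIRS = [
    "annotations",
    "images/test2017",
    "images/train2017",
    "images/val2017",
]


def missing_coco_dirs(root="./data/COCO"):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


missing = missing_coco_dirs()
if missing:
    print("Missing under ./data/COCO:", ", ".join(missing))
else:
    print("COCO directory layout looks complete.")
```

Since `Path.is_dir` follows symbolic links, the check also works when ./data/COCO is a soft link.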

VOC Dataset

Install the VOC dataset as ./data/VOC. We also recommend a soft-link:

ln -s /path/to/VOCdevkit ./data/VOC

Training

Evaluation

To evaluate a trained network, you can use the following command:

python -m torch.distributed.launch --nproc_per_node=10 --master_port=$((RANDOM + 10000)) eval.py \
--cfg config/yolov3_baseline.cfg -d COCO --distributed --ngpu 10 \
--checkpoint /path/to/your/weights --half --asff --rfb -s 608

By default, it directly outputs the mAP results on COCO val2017 or VOC 2007 test.

Models