Introduction

vedaseg is an open source semantic segmentation toolbox based on PyTorch.

Features

License

This project is released under the Apache 2.0 license.

Benchmark and model zoo

Note: All models are trained only on the PASCAL VOC 2012 trainaug dataset and evaluated on the PASCAL VOC 2012 val dataset.

Architecture   Backbone    OS   MS & Flip   mIOU
DeepLabv3plus  ResNet-101  16   True        79.80%
DeepLabv3plus  ResNet-101  16   False       78.19%
DeepLabv3      ResNet-101  16   True        78.94%
DeepLabv3      ResNet-101  16   False       77.07%
FPN            ResNet-101  2    True        75.42%
FPN            ResNet-101  2    False       73.65%
PSPNet         ResNet-101  8    True        74.68%
PSPNet         ResNet-101  8    False       73.71%
U-Net          ResNet-101  1    True        73.09%
U-Net          ResNet-101  1    False       70.98%

OS: Output stride used during evaluation
MS: Multi-scale inputs during evaluation
Flip: Adding left-right flipped inputs during evaluation

The models listed above are available on Google Drive.
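If you prefer the command line, here is a minimal download sketch using the third-party gdown tool; the file ID is whatever the shared Google Drive link points to and is shown here only as a placeholder.

pip install gdown
gdown "https://drive.google.com/uc?id=<file_id>"   # replace <file_id> with the ID from the Drive link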

Installation

Requirements

We have tested vedaseg with the following versions of OS and software:

Install vedaseg

a. Create a conda virtual environment and activate it.

conda create -n vedaseg python=3.7 -y
conda activate vedaseg

b. Install PyTorch and torchvision following the official instructions, e.g.,

conda install pytorch torchvision -c pytorch
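
Before moving on, it is worth checking that PyTorch was installed with CUDA support and can see your GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"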

c. Clone the vedaseg repository.

git clone https://github.com/Media-Smart/vedaseg.git
cd vedaseg
vedaseg_root=${PWD}

d. Install dependencies.

pip install -r requirements.txt

Prepare data

Download PASCAL VOC 2012 and the augmented PASCAL VOC 2012 annotations, which together provide 10,582 training images (trainaug) and 1,449 validation images.

cd ${vedaseg_root}
mkdir ${vedaseg_root}/data
cd ${vedaseg_root}/data

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz

tar xf VOCtrainval_11-May-2012.tar
tar xf benchmark.tgz

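# encode the ground-truth labels of SBD (benchmark_RELEASE) and VOC 2012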
python ../tools/encode_voc12_aug.py
python ../tools/encode_voc12.py

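# gather the encoded labels from both datasets into EncodeSegmentationClass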
mkdir VOCdevkit/VOC2012/EncodeSegmentationClass
#cp benchmark_RELEASE/dataset/encode_cls/* VOCdevkit/VOC2012/EncodeSegmentationClass
(cd benchmark_RELEASE/dataset/encode_cls; cp * ${vedaseg_root}/data/VOCdevkit/VOC2012/EncodeSegmentationClass)
#cp VOCdevkit/VOC2012/EncodeSegmentationClassPart/* VOCdevkit/VOC2012/EncodeSegmentationClass
(cd VOCdevkit/VOC2012/EncodeSegmentationClassPart; cp * ${vedaseg_root}/data/VOCdevkit/VOC2012/EncodeSegmentationClass)

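# build the trainaug split: images in (SBD train + SBD val + VOC 2012 train) but not in VOC 2012 val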
comm -23 <(cat benchmark_RELEASE/dataset/{train,val}.txt VOCdevkit/VOC2012/ImageSets/Segmentation/train.txt | sort -u) <(cat VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt | sort -u) > VOCdevkit/VOC2012/ImageSets/Segmentation/trainaug.txt

To avoid these tedious operations, you can save the above Linux commands in a shell script and execute it.
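
For example, save the commands as prepare_voc12.sh (the script name is arbitrary) with a small header like the one below, then run bash prepare_voc12.sh.

#!/usr/bin/env bash
set -ex                         # stop at the first error and echo each command
vedaseg_root=/path/to/vedaseg   # adjust to where you cloned the repository
# data preparation commands from above go here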

Train

a. Config

Modify the configuration as needed in the config file, e.g. configs/deeplabv3plus.py.
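
If you want to keep the stock config untouched, you can work from a copy and point the commands below at it; the new file name here is just an example.

cp configs/deeplabv3plus.py configs/deeplabv3plus_voc.py
# edit configs/deeplabv3plus_voc.py with your editor of choice, then pass its path to tools/trainval.py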

b. Run

python tools/trainval.py configs/deeplabv3plus.py

Snapshots and logs will be generated at ${vedaseg_root}/workdir.

Test

a. Config

Modify the configuration as needed in the config file, e.g. configs/deeplabv3plus.py.

b. Run

python tools/test.py configs/deeplabv3plus.py path_to_deeplabv3plus_weights

Contact

This repository is currently maintained by Hongxiang Cai (@hxcai) and Yichao Xiong (@mileistone).

Credits

We borrowed a lot of code from mmcv and mmdetection; thanks to open-mmlab.