Introduction

vedaseg is an open source semantic segmentation toolbox based on PyTorch.

Features

License

This project is released under the Apache 2.0 license.

Benchmark and model zoo

Note: All models are trained only on the PASCAL VOC 2012 trainaug dataset and evaluated on the PASCAL VOC 2012 val dataset.

Architecture   Backbone    OS   MS & Flip   mIOU
DeepLabv3plus  ResNet-101  16   True        79.80%
DeepLabv3plus  ResNet-101  16   False       78.19%
DeepLabv3      ResNet-101  16   True        78.94%
DeepLabv3      ResNet-101  16   False       77.07%
FPN            ResNet-101  2    True        75.42%
FPN            ResNet-101  2    False       73.65%
PSPNet         ResNet-101  8    True        74.68%
PSPNet         ResNet-101  8    False       73.71%
U-Net          ResNet-101  1    True        73.09%
U-Net          ResNet-101  1    False       70.98%

OS: Output stride used during evaluation
MS: Multi-scale inputs during evaluation
Flip: Adding left-right flipped inputs during evaluation

The models listed above are available on Google Drive.
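If you prefer the command line, here is a minimal download sketch using the third-party gdown tool; the file ID is whatever the shared Google Drive link points to and is shown here only as a placeholder.

pip install gdown
gdown "https://drive.google.com/uc?id=<file_id>"   # replace <file_id> with the ID from the Drive link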

Installation

Requirements

We have tested vedaseg with the following versions of OS and software:

Install vedaseg

a. Create a conda virtual environment and activate it.

conda create -n vedaseg python=3.7 -y
conda activate vedaseg

b. Install PyTorch and torchvision following the official instructions, e.g.,

conda install pytorch torchvision -c pytorch
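
Before moving on, it is worth checking that PyTorch was installed with CUDA support and can see your GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"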

c. Clone the vedaseg repository.

git clone https://github.com/Media-Smart/vedaseg.git
cd vedaseg
vedaseg_root=${PWD}

d. Install dependencies.

pip install -r requirements.txt

Prepare data

Download PASCAL VOC 2012 and the augmented PASCAL VOC 2012 annotations, which together provide 10,582 training images (trainaug) and 1,449 validation images.

cd ${vedaseg_root}
mkdir ${vedaseg_root}/data
cd ${vedaseg_root}/data

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz

tar xf VOCtrainval_11-May-2012.tar
tar xf benchmark.tgz

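# encode the ground-truth labels of SBD (benchmark_RELEASE) and VOC 2012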
python ../tools/encode_voc12_aug.py
python ../tools/encode_voc12.py

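# gather the encoded labels from both datasets into EncodeSegmentationClass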
mkdir VOCdevkit/VOC2012/EncodeSegmentationClass
#cp benchmark_RELEASE/dataset/encode_cls/* VOCdevkit/VOC2012/EncodeSegmentationClass
(cd benchmark_RELEASE/dataset/encode_cls; cp * ${vedaseg_root}/data/VOCdevkit/VOC2012/EncodeSegmentationClass)
#cp VOCdevkit/VOC2012/EncodeSegmentationClassPart/* VOCdevkit/VOC2012/EncodeSegmentationClass
(cd VOCdevkit/VOC2012/EncodeSegmentationClassPart; cp * ${vedaseg_root}/data/VOCdevkit/VOC2012/EncodeSegmentationClass)

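# build the trainaug split: images in (SBD train + SBD val + VOC 2012 train) but not in VOC 2012 val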
comm -23 <(cat benchmark_RELEASE/dataset/{train,val}.txt VOCdevkit/VOC2012/ImageSets/Segmentation/train.txt | sort -u) <(cat VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt | sort -u) > VOCdevkit/VOC2012/ImageSets/Segmentation/trainaug.txt

To avoid these tedious operations, you can save the above Linux commands in a shell script and execute it.
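
For example, save the commands as prepare_voc12.sh (the script name is arbitrary) with a small header like the one below, then run bash prepare_voc12.sh.

#!/usr/bin/env bash
set -ex                         # stop at the first error and echo each command
vedaseg_root=/path/to/vedaseg   # adjust to where you cloned the repository
# data preparation commands from above go here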

Train

a. Config

Modify the configuration as needed in the config file, e.g. configs/deeplabv3plus.py.
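
If you want to keep the stock config untouched, you can work from a copy and point the commands below at it; the new file name here is just an example.

cp configs/deeplabv3plus.py configs/deeplabv3plus_voc.py
# edit configs/deeplabv3plus_voc.py with your editor of choice, then pass its path to tools/trainval.py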

b. Run

python tools/trainval.py configs/deeplabv3plus.py

Snapshots and logs will be generated at ${vedaseg_root}/workdir.

Test

a. Config

Modify the configuration as needed in the config file, e.g. configs/deeplabv3plus.py.

b. Run

python tools/test.py configs/deeplabv3plus.py path_to_deeplabv3plus_weights

Contact

This repository is currently maintained by Hongxiang Cai (@hxcai) and Yichao Xiong (@mileistone).

Credits

We borrowed a lot of code from mmcv and mmdetection; thanks to open-mmlab.