Collaborative Learning for Weakly Supervised Object Detection

If you use this code in your research, please cite:

@inproceedings{ijcai2018-135,
  title     = {Collaborative Learning for Weakly Supervised Object Detection},
  author    = {Jiajie Wang and Jiangchao Yao and Ya Zhang and Rui Zhang},
  booktitle = {Proceedings of the Twenty-Seventh International Joint Conference on
               Artificial Intelligence, {IJCAI-18}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  pages     = {971--977},
  year      = {2018},
  month     = {7},
  doi       = {10.24963/ijcai.2018/135},
  url       = {https://doi.org/10.24963/ijcai.2018/135},
}

Installation

  1. Clone the repository

    git clone
  2. Choose the -arch option that matches your GPU for steps 3 and 4.

    GPU model                   Architecture
    TitanX (Maxwell/Pascal)     sm_52
    GTX 960M                    sm_50
    GTX 1080 (Ti)               sm_61
    Grid K520 (AWS g2.2xlarge)  sm_30
    Tesla K80 (AWS p2.xlarge)   sm_37

    Note: You are welcome to contribute the settings on your end if you have made the code work properly on other GPUs.
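
    If you are unsure which architecture your card supports, PyTorch can report the compute capability directly. A minimal sketch, assuming PyTorch is installed with CUDA support:

    import torch

    # Compute capability of GPU 0, e.g. (5, 2) -> use -arch=sm_52
    major, minor = torch.cuda.get_device_capability(0)
    print("use -arch=sm_%d%d" % (major, minor))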

  3. Build RoiPooling module

    cd pytorch-faster-rcnn/lib/layer_utils/roi_pooling/src/cuda
    echo "Compiling roi_pooling kernels by nvcc..."
    # source/output filenames below are assumed from the standard pytorch-faster-rcnn layout
    nvcc -c -o roi_pooling.cu.o roi_pooling_kernel.cu -x cu -Xcompiler -fPIC -arch=sm_52
    cd ../../
    python build.py  # assumed: wraps the compiled kernel into a Python extension
    cd ../../../
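
    The op compiled above implements RoI max pooling. As a reference for what it computes, torchvision ships an equivalent operator (this sketch uses torchvision's op, not the repo's own module, and assumes a recent torchvision):

    import torch
    from torchvision.ops import roi_pool

    feat = torch.randn(1, 512, 38, 50)  # backbone feature map, N x C x H x W
    # Each RoI is (batch_index, x1, y1, x2, y2) in input-image coordinates.
    rois = torch.tensor([[0., 48., 32., 320., 240.]])
    # spatial_scale=1/16 maps image coordinates onto the stride-16 feature map;
    # every RoI is max-pooled to a fixed 7x7 grid regardless of its size.
    pooled = roi_pool(feat, rois, output_size=(7, 7), spatial_scale=1.0 / 16)
    print(pooled.shape)  # torch.Size([1, 512, 7, 7])
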
  4. Build NMS

    cd lib/nms/src/cuda
    echo "Compiling nms kernels by nvcc..."
    # source/output filenames below are assumed from the standard pytorch-faster-rcnn layout
    nvcc -c -o nms_kernel.cu.o nms_kernel.cu -x cu -Xcompiler -fPIC -arch=sm_52
    cd ../../
    python build.py  # assumed: wraps the compiled kernel into a Python extension
    cd ../../
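
    As with RoI pooling, torchvision's built-in NMS mirrors the semantics of the kernel compiled here (a reference sketch, not the repo's module; assumes a recent torchvision):

    import torch
    from torchvision.ops import nms

    # Three boxes in (x1, y1, x2, y2) format; the first two overlap heavily.
    boxes = torch.tensor([[10., 10., 100., 100.],
                          [12., 12., 102., 102.],
                          [200., 200., 300., 300.]])
    scores = torch.tensor([0.9, 0.8, 0.7])
    # Suppress any box whose IoU with a higher-scoring box exceeds 0.5.
    keep = nms(boxes, scores, iou_threshold=0.5)
    print(keep)  # tensor([0, 2]) -- the lower-scoring duplicate is dropped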

Setup data

Please follow the instructions of py-faster-rcnn here to set up VOC. The steps involve downloading the data and optionally creating soft links in the data folder. Since Faster R-CNN does not rely on pre-computed proposals, it is safe to ignore the steps that set up proposals.

If you find it useful, the data/cache folder created on Xinlei's side is also shared here.
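
Once the data is in place, a quick layout check can save debugging time later. A minimal sketch, assuming VOC2007 and the standard py-faster-rcnn directory names:

    import os

    # Expected layout: data/VOCdevkit2007 (often a symlink) containing VOC2007
    # with annotations, image-set lists and the JPEG images.
    root = os.path.join("data", "VOCdevkit2007", "VOC2007")
    for sub in ("Annotations", "ImageSets", "JPEGImages"):
        path = os.path.join(root, sub)
        print(path, "OK" if os.path.isdir(path) else "MISSING")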

Train your own model

  1. Download pre-trained models and weights. The download link for the pre-trained WSDDN model can be found here. Other pre-trained models, such as VGG16 and ResNet V1, are provided by pytorch-vgg and pytorch-resnet (the ones with caffe in the name). Download them into the data/imagenet_weights folder. For example, the VGG16 model can be set up like this:

    mkdir -p data/imagenet_weights
    cd data/imagenet_weights
    python  # open python in a terminal and run the following Python code
    import torch
    from torch.utils.model_zoo import load_url
    from torchvision import models

    sd = load_url("")  # URL of the caffe-style VGG16 weights from pytorch-vgg (link missing in the original)
    sd['classifier.0.weight'] = sd['classifier.1.weight']
    sd['classifier.0.bias'] = sd['classifier.1.bias']
    del sd['classifier.1.weight']
    del sd['classifier.1.bias']
    sd['classifier.3.weight'] = sd['classifier.4.weight']
    sd['classifier.3.bias'] = sd['classifier.4.bias']
    del sd['classifier.4.weight']
    del sd['classifier.4.bias']
    torch.save(sd, "vgg16.pth")
    cd ../..
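
    Why the key surgery: the caffe-converted checkpoint stores the two hidden fully-connected layers at classifier.1 and classifier.4, while torchvision-style VGG16 indexes its Linear layers at classifier.0, classifier.3 and classifier.6, so the weights are remapped before saving. A minimal sanity check that the remapped file loads, using torchvision's VGG16 as a stand-in for the detector backbone:

    import torch
    from torchvision import models

    # After the remap, the hidden Linear layers should sit at
    # classifier.0 / classifier.3, matching torchvision's indexing.
    sd = torch.load("vgg16.pth")
    model = models.vgg16()
    result = model.load_state_dict(sd, strict=False)
    print(result)  # in recent PyTorch this lists missing/unexpected keys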

    For ResNet101, you can set it up like this:

    mkdir -p data/imagenet_weights
    cd data/imagenet_weights
    # download from my gdrive (link in pytorch-resnet)
    mv resnet101-caffe.pth res101.pth
    cd ../..
  2. Train (and test/evaluate)

    ./experiments/scripts/ [GPU_ID] [DATASET] [NET] [WSDDN_PRETRAINED]
    # Examples:
    ./experiments/scripts/ 0 pascal_voc vgg16 path_to_wsddn_pretrained_model
  3. Visualization with Tensorboard

    tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=7001 &
  4. Test and evaluate

    ./experiments/scripts/ [GPU_ID] [DATASET] [NET] [WSDDN_PRETRAINED]
    # Examples:
    ./experiments/scripts/ 0 pascal_voc vgg16 path_to_wsddn_pretrained_model

By default, trained networks are saved under:


Test outputs are saved under:


Tensorboard information for train and validation is saved under:


Our results can be found here.