Masking GAN - Generating image attribute masks (PyTorch)

Motivation

When I first approached the semantic image manipulation problem, there were no solutions like CycleGAN or its successors. Even now, all of these methods, including the newest ones, produce artifacts when changing image content.

Approach

[Architecture diagram]

  1. Use a generator architecture with built-in segmentation.
  2. Mix the original image with new patches through the segmentation mask (see the sketch after this list).
  3. Train the whole network end-to-end.
  4. Use an L1 identity loss to constrain the generator and reduce unnecessary changes.
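
A minimal PyTorch sketch of steps 2 and 4. The layer shapes and names here are illustrative assumptions, not the repo's actual architecture: the generator emits a 3-channel patch plus a 1-channel soft mask, blends them with the input, and the L1 identity loss is computed on the blended output.

import torch
import torch.nn as nn

class MaskingGenerator(nn.Module):
    # Toy stand-in for a generator with built-in segmentation: a
    # single conv produces 4 channels, 3 for the RGB patch and 1
    # for the soft segmentation mask.
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 4, kernel_size=3, padding=1)

    def forward(self, x):
        out = self.net(x)
        patch = torch.tanh(out[:, :3])    # candidate new content
        mask = torch.sigmoid(out[:, 3:])  # soft mask in [0, 1]
        # Step 2: mix the original image with the new patch
        # through the mask; unmasked pixels pass through intact.
        return mask * patch + (1.0 - mask) * x

G = MaskingGenerator()
x = torch.randn(4, 3, 64, 64)             # dummy image batch
y = G(x)

# Step 4: the L1 identity loss pulls the output toward the input,
# discouraging changes outside the attribute region; it is added
# to the usual adversarial loss with some weight.
identity_loss = nn.functional.l1_loss(y, x)

Because the output is a convex blend of input and patch, the identity loss directly pushes the mask toward zero wherever no change is needed.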

Instructions

I use the CelebA dataset to train the model. You need two files to reproduce the results: img_align_celeba.zip and list_attr_celeba.txt

You can download them from http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html and put them into {PROJECT_DIR}/data.
After that, initialize the data and train the model by running:

sh init_data.sh
python train.py
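
If you want to sanity-check the attribute file, here is a hedged sketch for reading list_attr_celeba.txt with pandas. It assumes the standard CelebA layout (an image-count line, a header of 40 attribute names, then one +1/-1 row per image); the repo's init_data.sh may process it differently.

import pandas as pd

# Skip the first line (image count); the second line holds the 40
# attribute names. Data rows have one extra leading column (the
# file name), which pandas then uses as the index.
attrs = pd.read_csv("data/list_attr_celeba.txt", skiprows=1, sep=r"\s+")

# Labels are +1/-1; pick the images labeled as smiling, for example.
smiling = attrs.index[attrs["Smiling"] == 1]
print(len(smiling), "smiling images")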

Results

[Picked sample]

  1. If a person is already smiling, the model makes no changes at all.
  2. It handles some extreme head angles poorly.
  3. There are still artifacts.

Advice

Consider the following advice if you want to build this kind of model:

  1. Make sure your GAN model converges without applying the mask and the L1 identity loss (see the sketch below).
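
One way to wire in such a check, as a hedged sketch (the switch below is hypothetical, not a flag the repo provides): force the mask to all-ones so the blend returns the raw generator output, and set the identity-loss weight to zero, which reduces the model to a plain GAN.

import torch

def blend(x, patch, mask, debug_plain_gan=False):
    # With debug_plain_gan=True the mask is forced to all-ones, so
    # the output is just the generator patch and the model trains
    # like a vanilla GAN for the convergence check.
    if debug_plain_gan:
        mask = torch.ones_like(mask)
    return mask * patch + (1.0 - mask) * x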

Acknowledgments

The code is inspired by pytorch-CycleGAN-and-pix2pix. The paper GANimation: Anatomically-aware Facial Animation from a Single Image (arXiv:1807.09251) describes a similar training scheme.