Recent projects:

- [pix2pix]: Torch implementation for learning a mapping from input images to output images.
- [CycleGAN]: Torch implementation for learning image-to-image translation (i.e., pix2pix-style mapping) without input-output pairs.
- [pytorch-CycleGAN-and-pix2pix]: PyTorch implementation for both unpaired and paired image-to-image translation.
iGAN (short for interactive GAN) is the authors' implementation of the interactive image generation interface described in:
"Generative Visual Manipulation on the Natural Image Manifold"
Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros
In European Conference on Computer Vision (ECCV) 2016
Given a few user strokes, our system can produce photo-realistic samples that best satisfy the user edits in real time. The system is based on deep generative models, in particular Generative Adversarial Networks (GANs) such as DCGAN, and serves two purposes: an intelligent drawing interface for automatically generating images inspired by the color and shape of the brush strokes, and an interactive visual debugging tool for understanding and visualizing deep generative models.
Please cite our paper if you find this code useful in your research. (Contact: Jun-Yan Zhu, junyanz at mit dot edu)
1. Install the Python libraries (see Requirements).
2. Download the code from GitHub:

```bash
git clone https://github.com/junyanz/iGAN
cd iGAN
```

3. Download the model (see Model Zoo for details):

```bash
bash ./models/scripts/download_dcgan_model.sh outdoor_64
```

4. Run the Python script:

```bash
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_main.py --model_name outdoor_64
```
The code is written in Python 2 and requires the following third-party libraries:

```bash
sudo apt-get install python-opencv
sudo pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git
sudo apt-get install python-qt4
sudo pip install qdarkstyle
sudo pip install dominate
```

For Python 3 users, replace `pip` with `pip3` and install the Python 3 Qt bindings:

```bash
sudo apt-get install python3-pyqt4
```
See the [Youtube] video at 2:18 for interactive image generation demos.
Brush tools:

- `Coloring Brush`: changes the color of a specific region. Right-click to select a color; hold left-click to paint; scroll the mouse wheel to adjust the width of the brush.
- `Sketching Brush`: outlines the shape. Hold left-click to sketch.
- `Warping Brush`: modifies the shape more explicitly. Right-click to select a square region; hold left-click to drag the region; scroll the mouse wheel to adjust the size of the square. We recommend using the coloring and sketching brushes before the warping brush.

Control panel:

- `Play`: play the interpolation sequence.
- `Fix`: use the current result as additional constraints for further editing.
- `Restart`: restart the system.
- `Save`: save the result to a webpage.
- `Edits`: check the box to display or hide the user edits on top of the generated image.

Keyboard shortcuts: `P` for `Play`, `F` for `Fix`, `R` for `Restart`, `S` for `Save`, `E` for `Edits`, and `Q` to quit the program.

Download the Theano DCGAN model (e.g., `outdoor_64`). Before using our system, please compare the random real images with the DCGAN-generated samples to see what kind of images a given model can produce:
```bash
bash ./models/scripts/download_dcgan_model.sh outdoor_64
```
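The `Play` button steps through an interpolation sequence between latent vectors, decoding each intermediate point with the generator. A minimal NumPy sketch of the latent interpolation itself (illustrative only; `interpolate_latents` is a hypothetical helper, not part of iGAN, and the 100-D uniform prior is the common DCGAN convention):

```python
import numpy as np

def interpolate_latents(z0, z1, n_steps=8):
    """Linearly interpolate between two latent vectors z0 and z1."""
    ts = np.linspace(0.0, 1.0, n_steps)
    return np.stack([(1.0 - t) * z0 + t * z1 for t in ts])

# Toy 100-D latent vectors, sampled from the uniform prior.
rng = np.random.RandomState(0)
z0 = rng.uniform(-1, 1, size=100)
z1 = rng.uniform(-1, 1, size=100)

frames = interpolate_latents(z0, z1, n_steps=8)
print(frames.shape)  # (8, 100): one latent vector per animation frame
```

Each row of `frames` would then be fed through the generator to render one frame of the animation.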
(The `hed_shoes_64` model in the Model Zoo is meant to be used with the `--shadow` flag.)

We provide a simple script to generate samples from a pre-trained DCGAN model. You can run this script to test whether Theano, CUDA, and cuDNN are configured properly before running our interface:
```bash
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python generate_samples.py --model_name outdoor_64 --output_image outdoor_64_dcgan.png
```
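Conceptually, a sampling script like this draws random latent vectors, decodes them, and tiles the results into a single grid image. A hedged NumPy sketch of the tiling step, with random arrays standing in for generator outputs (the `make_grid` helper is an assumption for illustration, not code from this repository):

```python
import numpy as np

def make_grid(images, rows, cols):
    """Tile a batch of HxWx3 images into one (rows*H) x (cols*W) x 3 grid."""
    n, h, w, c = images.shape
    assert n == rows * cols
    grid = images.reshape(rows, cols, h, w, c)
    # Interleave row/column axes so each image lands in its grid cell.
    grid = grid.transpose(0, 2, 1, 3, 4).reshape(rows * h, cols * w, c)
    return grid

rng = np.random.RandomState(0)
fake_samples = rng.rand(16, 64, 64, 3)  # stand-ins for 64x64 DCGAN outputs
grid = make_grid(fake_samples, rows=4, cols=4)
print(grid.shape)  # (256, 256, 3)
```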
Type `python iGAN_main.py --help` for a complete list of the arguments. Some important ones:

- `--model_name`: the name of the model (e.g., `outdoor_64`, `shoes_64`, etc.).
- `--model_type`: currently only `dcgan_theano` is supported.
- `--model_file`: the file that stores the generative model; if not specified, `model_file = './models/%s.%s' % (model_name, model_type)`.
- `--top_k`: the number of candidate results being displayed.
- `--average`: show an average image in the main window. Inspired by AverageExplorer, the average image is a weighted average of multiple generated results, with the weights reflecting user-indicated importance. Press `A` to switch between average mode and normal mode.
- `--shadow`: enable a sketching-assistance mode that guides the freeform drawing of objects, inspired by ShadowDraw.
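Two of the behaviors above can be sketched in plain Python/NumPy, as an illustration rather than the project's actual code: the default `--model_file` path construction quoted above, and an AverageExplorer-style weighted average in which user-assigned weights blend the candidate results:

```python
import numpy as np

# Default model path, as described above.
model_name, model_type = 'outdoor_64', 'dcgan_theano'
model_file = './models/%s.%s' % (model_name, model_type)
print(model_file)  # ./models/outdoor_64.dcgan_theano

# Weighted average of candidate results: the weights reflect
# user-indicated importance and are normalized to sum to 1.
candidates = np.random.RandomState(0).rand(5, 64, 64, 3)  # 5 generated images
weights = np.array([0.1, 0.4, 0.2, 0.2, 0.1])
average = np.tensordot(weights, candidates, axes=1)       # sum_i w_i * img_i
print(average.shape)  # (64, 64, 3)
```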
To use the `--shadow` mode, download the `hed_shoes_64` model and run:

```bash
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_main.py --model_name hed_shoes_64 --shadow --average
```
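As a rough illustration of the ShadowDraw idea, one way to form a guiding "shadow" is to blend many candidate edge maps into a faint grayscale hint, so contours where the candidates agree show up more strongly. This is an assumption about the general technique, not iGAN's actual implementation:

```python
import numpy as np

def shadow_image(edge_maps, strength=0.5):
    """Blend many binary edge maps into one faint guidance image.

    edge_maps: array of shape (n, H, W) with values in {0, 1}.
    Returns an HxW float image in [0, strength].
    """
    mean_edges = edge_maps.mean(axis=0)  # agreement across candidates
    return strength * mean_edges         # dim it so it reads as a hint

rng = np.random.RandomState(0)
maps = (rng.rand(10, 64, 64) > 0.9).astype(float)  # 10 sparse toy edge maps
shadow = shadow_image(maps)
print(shadow.shape)  # (64, 64)
```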
See more details here.
We provide a script to project an image into the latent space (i.e., x -> z).

First, download the pre-trained AlexNet model (`conv4`):

```bash
bash models/scripts/download_alexnet.sh conv4
```

Then run the projection script with a model (e.g., `shoes_64.dcgan_theano`) and an input image (e.g., `./pics/shoes_test.png`):

```bash
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_predict.py --model_name shoes_64 --input_image ./pics/shoes_test.png --solver cnn_opt
```
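The projection x -> z can be viewed as minimizing a reconstruction loss over z, either with a feed-forward predictor or by gradient descent. A toy NumPy sketch with a linear stand-in "generator" (purely illustrative; the real system optimizes through a DCGAN with deep-feature losses):

```python
import numpy as np

rng = np.random.RandomState(0)
G = rng.randn(64, 8)            # toy linear "generator": x = G @ z
z_true = rng.randn(8)
x = G @ z_true                  # the "image" we want to project

z = np.zeros(8)                 # initial guess for the latent vector
lr = 1e-3
for _ in range(5000):           # gradient descent on ||G z - x||^2
    grad = 2.0 * G.T @ (G @ z - x)
    z -= lr * grad

print(np.allclose(z, z_true, atol=1e-3))  # True: the latent was recovered
```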
The result is saved to `./pics/shoes_test_cnn_opt.png`.

Three solvers are available: `opt` uses an optimization method; `cnn` uses a feed-forward network (fastest); `cnn_opt` is a hybrid of the two (default and best). Type `python iGAN_predict.py --help` for a complete list of the arguments.

We also provide a standalone script that works without the UI. Given user constraints (i.e., a color map, a color mask, and an edge map), the script generates multiple images that best satisfy the user constraints. See `python iGAN_script.py --help` for more details:
```bash
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_script.py --model_name outdoor_64
```
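The constraints the script takes can be pictured as a masked reconstruction loss: the color map says what the pixels should look like, the color mask says where that applies, and the edge map constrains contours. A hedged NumPy sketch of how candidates might be ranked against a color constraint (the helper name and weighting are assumptions for illustration, not the script's actual internals):

```python
import numpy as np

def color_constraint_loss(image, color_map, color_mask):
    """Mean squared color error, counted only where the mask is on."""
    diff = (image - color_map) ** 2
    masked = diff * color_mask[..., None]   # broadcast the HxW mask over RGB
    return masked.sum() / max(color_mask.sum(), 1.0)

rng = np.random.RandomState(0)
color_map = np.zeros((64, 64, 3))
color_map[:, :, 2] = 1.0                    # user wants this region blue
color_mask = np.zeros((64, 64))
color_mask[20:40, 20:40] = 1.0              # constraint applies only here

candidates = rng.rand(5, 64, 64, 3)         # stand-ins for generated results
losses = [color_constraint_loss(c, color_map, color_mask) for c in candidates]
best = int(np.argmin(losses))               # candidate best satisfying the edit
```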
```
@inproceedings{zhu2016generative,
  title={Generative Visual Manipulation on the Natural Image Manifold},
  author={Zhu, Jun-Yan and Kr{\"a}henb{\"u}hl, Philipp and Shechtman, Eli and Efros, Alexei A.},
  booktitle={Proceedings of European Conference on Computer Vision (ECCV)},
  year={2016}
}
```
If you love cats and love reading cool graphics, vision, and learning papers, please check out our Cat Paper Collection:
[Github] [Webpage]