SynthText

Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

Synthetic Scene-Text Image Samples Synthetic Scene-Text Samples

The code in the master branch is for Python2. Python3 is supported in the python3 branch.

The main dependencies are:

pygame, opencv (cv2), PIL (Image), numpy, matplotlib, h5py, scipy

Generating samples

python gen.py --viz

This will download a data file (~56M) to the data directory. This data file includes:

This script will generate random scene-text image samples and store them in an h5 file in results/SynthText.h5. If the --viz option is specified, the generated output will be visualized as the script is being run; omit the --viz option to turn-off the visualizations. If you want to visualize the results stored in results/SynthText.h5 later, run:

python visualize_results.py

Pre-generated Dataset

A dataset with approximately 800000 synthetic scene-text images generated with this code can be found here.

Adding New Images

Segmentation and depth-maps are required to use new images as background. Sample scripts for obtaining these are available here.

For an explanation of the fields in dset.h5 (e.g.: seg,area,label), please check this comment.

Pre-processed Background Images

The 8,000 background images used in the paper, along with their segmentation and depth masks, have been uploaded here: http://www.robots.ox.ac.uk/~vgg/data/scenetext/preproc/<filename>, where, <filename> can be:

filenames size description md5 hash
imnames.cp 180K names of images which do not contain background text
bg_img.tar.gz 8.9G images (filter these using imnames.cp) 3eac26af5f731792c9d95838a23b5047
depth.h5 15G depth maps af97f6e6c9651af4efb7b1ff12a5dc1b
seg.h5 6.9G segmentation maps 1605f6e629b2524a3902a5ea729e86b2

Note: due to large size, depth.h5 is also available for download as 3-part split-files of 5G each. These part files are named: depth.h5-00, depth.h5-01, depth.h5-02. Download using the path above, and put them together using cat depth.h5-0* > depth.h5.

use_preproc_bg.py provides sample code for reading this data.

Note: I do not own the copyright to these images.

Generating Samples with Text in non-Latin (English) Scripts

Further Information

Please refer to the paper for more information, or contact me (email address in the paper).