Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.
Synthetic Scene-Text Image Samples
The code in the `master` branch is for Python2. Python3 is supported in the `python3` branch.
The main dependencies are:
pygame, opencv (cv2), PIL (Image), numpy, matplotlib, h5py, scipy
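Before running anything, it can help to confirm that these packages are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (`cv2` for opencv, `PIL` for the imaging library) are taken from the list above. It is written for Python 3.

```python
import importlib.util

# Import names for the dependencies listed above.
REQUIRED = ("pygame", "cv2", "PIL", "numpy", "matplotlib", "h5py", "scipy")


def missing_modules(names=REQUIRED):
    """Return the names from `names` that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the modules can be located; it does not check their versions.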
To generate samples, run:
python gen.py --viz
This will download a data file (~56M) to the `data` directory. This data file includes:
- Fonts (listed in `fonts/fontlist.txt` with their paths).
- A text source (look inside `text_utils.py` to see how the text inside this file is used by the renderer).
- Font-size conversion data (see `invert_font_size.py`).

This script will generate random scene-text image samples and store them in an h5 file in `results/SynthText.h5`. If the `--viz` option is specified, the generated output will be visualized as the script is being run; omit the `--viz` option to turn off the visualizations. If you want to visualize the results stored in `results/SynthText.h5` later, run:
python visualize_results.py
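To look inside `results/SynthText.h5` without the visualizer, one option is to walk the file and list its datasets. The sketch below is not taken from `visualize_results.py`: the helper names `summarize` and `print_summary` are hypothetical, and it makes no assumption about which groups or attributes the file contains. It needs Python 3 and `h5py`.

```python
def summarize(node, path=""):
    """Yield (path, shape, attribute names) for every dataset below `node`.

    `node` is anything with a dict-like .items(), such as an open h5py.File
    or h5py.Group. Children with a .shape are treated as datasets; all other
    children are treated as groups and walked recursively.
    """
    for name, child in node.items():
        child_path = f"{path}/{name}"
        if hasattr(child, "shape"):  # dataset-like leaf
            yield child_path, tuple(child.shape), sorted(child.attrs.keys())
        else:  # group-like container
            yield from summarize(child, child_path)


def print_summary(h5_path="results/SynthText.h5"):
    """Print one line per dataset stored in the given h5 file."""
    import h5py  # imported here so summarize() itself does not need h5py

    with h5py.File(h5_path, "r") as db:
        for dataset_path, shape, attr_names in summarize(db):
            print(dataset_path, shape, attr_names)
```

Calling `print_summary()` after a run of `gen.py` should show where the images are stored and which attributes accompany each of them.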
A dataset with approximately 800,000 synthetic scene-text images generated with this code can be found here.
Segmentation and depth-maps are required to use new images as background. Sample scripts for obtaining these are available here.
- `predict_depth.m`: MATLAB script to regress a depth mask for a given RGB image; uses the network of Liu et al. However, more recent works (e.g., this) might give better results.
- `run_ucm.m` and `floodFill.py`: for getting segmentation masks using gPb-UCM.

For an explanation of the fields in `dset.h5` (e.g., `seg`, `area`, `label`), please check this comment.
The 8,000 background images used in the paper, along with their segmentation and depth masks, have been uploaded here: `http://www.robots.ox.ac.uk/~vgg/data/scenetext/preproc/<filename>`, where `<filename>` can be:
filenames | size | description | md5 hash |
---|---|---|---|
imnames.cp | 180K | names of images which do not contain background text | |
bg_img.tar.gz | 8.9G | images (filter these using imnames.cp) | 3eac26af5f731792c9d95838a23b5047 |
depth.h5 | 15G | depth maps | af97f6e6c9651af4efb7b1ff12a5dc1b |
seg.h5 | 6.9G | segmentation maps | 1605f6e629b2524a3902a5ea729e86b2 |
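Given the file sizes, it is worth checking a download against the md5 hashes in the table before using it. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `is_intact` are hypothetical and not part of this repository. The file is read in chunks so that a 15G file does not have to fit in memory.

```python
import hashlib

# MD5 hashes copied from the table above (no hash is listed for imnames.cp).
EXPECTED_MD5 = {
    "bg_img.tar.gz": "3eac26af5f731792c9d95838a23b5047",
    "depth.h5": "af97f6e6c9651af4efb7b1ff12a5dc1b",
    "seg.h5": "1605f6e629b2524a3902a5ea729e86b2",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of the file at `path`, read chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, filename):
    """Compare the file at `path` against the hash listed for `filename`."""
    return md5_of_file(path) == EXPECTED_MD5[filename]
```

For example, `is_intact("seg.h5", "seg.h5")` returns `True` only if the local copy matches the listed hash.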
Note: due to its large size, `depth.h5` is also available for download as 3-part split-files of 5G each. These part files are named: `depth.h5-00`, `depth.h5-01`, `depth.h5-02`. Download using the path above, and put them together using `cat depth.h5-0* > depth.h5`.
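On systems without `cat`, the same concatenation can be done in Python. This is a sketch under the assumption that the part files sit in the current directory and sort correctly by name (`-00`, `-01`, `-02`); the function name `join_parts` is hypothetical.

```python
import glob
import shutil


def join_parts(pattern="depth.h5-0*", output="depth.h5"):
    """Concatenate all files matching `pattern`, in sorted order, into `output`."""
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError("no part files match " + pattern)
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Streams the data in blocks, so the 5G parts are never
                # loaded into memory in one piece.
                shutil.copyfileobj(src, out)
    return parts
```

The returned list shows which parts were joined and in what order, which makes a missing part easy to spot.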
`use_preproc_bg.py` provides sample code for reading this data.
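Independently of `use_preproc_bg.py`, the filtering step mentioned in the table (keeping only the images named in `imnames.cp`) might look like the sketch below. It rests on an assumption this README does not confirm: that `imnames.cp` is a pickled list of image file names. The function names are hypothetical, and the code targets Python 3.

```python
import os
import pickle


def load_clean_names(path="imnames.cp"):
    """Load the names of images that contain no background text.

    ASSUMPTION: the file is a pickled list of file names. The latin1
    encoding lets Python 3 read a pickle that was written by Python 2.
    """
    with open(path, "rb") as f:
        names = pickle.load(f, encoding="latin1")
    return set(names)


def keep_clean_images(image_dir, clean_names):
    """Return the sorted file names in `image_dir` that are in `clean_names`."""
    return sorted(name for name in os.listdir(image_dir) if name in clean_names)
```

If the assumption about the file format does not hold, `use_preproc_bg.py` remains the reference for how the data is meant to be read.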
Note: I do not own the copyright to these images.
Please refer to the paper for more information, or contact me (email address in the paper).