Table Extractor From Image

This repository contains code that extracts a table from an image and exports it to an Excel file. The image is first "read" by an OCR tool, which produces JSON output; that JSON is the input to the program. The program then arranges the cells into rows and columns according to the JSON input.

NOTE: Only the cells that the OCR actually reads will appear in the Excel file.
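
The row-and-column arrangement can be sketched roughly as follows. This is a minimal illustration, not the code in JSON-to-Excel.py: it assumes each OCR cell is a dict with zero-based "row" and "col" indices and a "text" value, which is a hypothetical schema, since the real JSON layout depends on the OCR used. The function name cells_to_grid is likewise made up for this example.

```python
def cells_to_grid(cells):
    """Arrange OCR cells into a row-major grid (list of rows).

    Assumes each cell is a dict with zero-based "row" and "col" keys and a
    "text" key. This schema is an assumption made for illustration only.
    """
    if not cells:
        return []
    n_rows = max(cell["row"] for cell in cells) + 1
    n_cols = max(cell["col"] for cell in cells) + 1
    # Start from an empty grid so positions the OCR missed stay blank.
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for cell in cells:
        grid[cell["row"]][cell["col"]] = cell["text"]
    return grid


cells = [
    {"row": 0, "col": 0, "text": "Name"},
    {"row": 0, "col": 1, "text": "Qty"},
    {"row": 1, "col": 0, "text": "Apple"},
    # Position (1, 1) was not read by the OCR, so it is left empty.
]
for row in cells_to_grid(cells):
    print(row)
```

A cell the OCR did not report is simply never written into the grid, which matches the note above: only cells that were read show up in the output.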

Modules Required

os
copy
pandas==0.22.0
openpyxl==2.4.9
You can also install the packages from requirements.txt, for example with pip install -r requirements.txt.

Flow

Image -> JSON -> Excel

Steps

  1. Install the packages listed in requirements.txt.
  2. To "read" an image, use an OCR tool that outputs its result as JSON.
  3. In the program, change the input path and output path to match your setup.
  4. Run the program (JSON-to-Excel.py).
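
As a rough sketch of what steps 3 and 4 amount to, the snippet below reads a JSON file from an input path and writes an Excel file to an output path using pandas. It is not the actual contents of JSON-to-Excel.py. The function names and the cell schema ("row", "col", "text") are assumptions made for illustration, and pandas needs an Excel engine such as openpyxl installed to write .xlsx files.

```python
import json


def load_cells(input_path):
    """Read the OCR's JSON output (assumed here to be a list of cell dicts)."""
    with open(input_path, encoding="utf-8") as handle:
        return json.load(handle)


def json_to_excel(input_path, output_path):
    """Load the cells from input_path and write them to an Excel file."""
    # Imported here so load_cells stays usable without pandas installed.
    import pandas as pd

    cells = load_cells(input_path)
    frame = pd.DataFrame(cells)
    # One spreadsheet row per "row" value and one column per "col" value;
    # positions the OCR did not report become empty cells.
    table = frame.pivot(index="row", columns="col", values="text")
    table.to_excel(output_path, header=False, index=False)
```

With this sketch, a call such as json_to_excel("table.json", "table.xlsx") would play the role of editing the two paths and running the program; both file names are placeholders.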

Sample Test Case


Input Image:

Input Image

Its Corresponding JSON:

JSON

Excel Output:

Excel Output