Team DeepThings (Mez Gebre and I) won the Best Product category at the Deep Learning Hackathon in San Francisco. In three days we built a real-time system that identifies objects and speaks what it sees, conceived as a tool to make navigation easier for the visually impaired. The proof of concept ran on a laptop; the final model ran on Android.
This is only the first prototype for Windows.
- Grab the webcam feed without bottlenecks.
- Recognize images using Inception v3.
- Convert text to speech with the Google TTS API.
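Reading the webcam without bottlenecks usually means grabbing frames on a background thread and keeping only the most recent one, so the (slower) classifier never blocks on a backlog of stale frames. Here is a minimal sketch of that pattern; the `LatestFrameGrabber` class and the fake frame source are illustrative, and the real script would presumably wrap `cv2.VideoCapture` instead:

```python
import threading
import time

class LatestFrameGrabber:
    """Continuously pull frames from a source on a background thread,
    keeping only the most recent one. The consumer always sees the
    freshest frame instead of waiting on a queue of stale ones."""

    def __init__(self, read_frame):
        self._read_frame = read_frame  # callable returning the next frame, or None when done
        self._lock = threading.Lock()
        self._latest = None
        self._running = True
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while self._running:
            frame = self._read_frame()
            if frame is None:  # source exhausted
                break
            with self._lock:
                self._latest = frame

    def read(self):
        """Return the most recent frame seen so far (may be None at startup)."""
        with self._lock:
            return self._latest

    def stop(self):
        self._running = False
        self._thread.join()

# Demo with a fake numeric source standing in for the webcam.
frames = iter(range(100))
grabber = LatestFrameGrabber(lambda: next(frames, None))
time.sleep(0.1)        # let the background thread drain the fake source
grabber.stop()
print(grabber.read())  # the last frame produced, not the first one queued
```

The key design choice is overwrite-on-write rather than a growing queue: dropping frames is fine for live classification, since only the current view matters.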
- Build a functional model.
- Tune the parameters.
- Display the results visually.
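One parameter worth tuning in a classify-and-speak loop is the gating policy: speak only when the top prediction is confident enough and differs from what was last announced, so the TTS doesn't repeat itself every frame. The sketch below is an illustrative policy, not the project's exact one; `should_announce` and the threshold value are assumptions:

```python
CONF_THRESHOLD = 0.35  # hypothetical value; in practice tuned by trial and error

def should_announce(label, confidence, last_label, threshold=CONF_THRESHOLD):
    """Announce a label only if the prediction clears the confidence
    threshold and is different from the last label spoken."""
    return confidence >= threshold and label != last_label

# Simulated stream of (label, confidence) predictions from the classifier:
stream = [("cup", 0.90), ("cup", 0.92), ("dog", 0.20), ("laptop", 0.80)]
spoken = []
last = None
for label, conf in stream:
    if should_announce(label, conf, last):
        spoken.append(label)  # the real loop would call the TTS engine here
        last = label
print(spoken)  # → ['cup', 'laptop']
```

Repeated "cup" frames are suppressed and the low-confidence "dog" is ignored, which keeps the audio output sparse and useful.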
This module requires:
Just run:

```shell
python classify_real_time_v2.py
```
The output should look like this:
For more information, check out my Medium post here.
This project is Copyright © 2016-2017 Lucas Gago. It is free software and may be redistributed under the terms specified in the MIT License.