This is a coding exercise that extends a TensorFlow ImageNet classifier to Active Learning, for use in a job interview or similar context.
The code runs, but it IS DELIBERATELY A BAD IMPLEMENTATION OF ACTIVE LEARNING.
Your task is to fix this code.
A company that wants to classify a large set of sports images according to the ImageNet set of labels has approached you. However, there are several problems:
This is a real-world situation that occurs regularly. For this exercise, we will use the open set of sports images from CrowdFlower's Data for Everyone program:
https://www.crowdflower.com/data-for-everyone/
ImageNet uses a classification scheme based on WordNet, where words are grouped by synonyms, called 'synsets'. Each 'synset' is a group of closely related words. These synsets are the labels for this task, which you will see when you run the code. For example, the label for 'racing car' is 'racer, race car, racing car'.
The output from the classifier will therefore look something like this:
['candle, taper, wax light', 0.079653569], ['wreck', 0.055132806], ['tow truck, tow car, wrecker', 0.038218945], ...
This indicates that the image being classified has a 0.07965 probability of being a 'candle, taper, wax light', a 0.05513 probability of being a 'wreck', etc.
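For concreteness, the example output above can be read as a list of (label, probability) pairs, sorted from most to least probable. The sketch below uses that representation; the actual data structure in the starter code may differ:

```python
# (label, probability) pairs, sorted descending by probability.
# The values are copied from the example output above.
predictions = [
    ('candle, taper, wax light', 0.079653569),
    ('wreck', 0.055132806),
    ('tow truck, tow car, wrecker', 0.038218945),
]

# The first entry is the classifier's best guess for the image.
top_label, top_prob = predictions[0]
print(top_label)  # 'candle, taper, wax light'
```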
While the classifier is flat, WordNet itself organizes the synsets in hierarchies. For example 'baseball' and 'cricket' could be types of 'sport', and in turn 'sport' could be a type of 'activity'. Generally, items that are closer in the hierarchy tend to be closer in real-life. For example, 'sports car' and 'racing car' are both types of 'cars' in WordNet/ImageNet, and are also closely related in real-life. By contrast, 'sports car' and 'pine tree' are not closely related in WordNet/ImageNet, or in real-life.
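To make the hierarchy idea concrete, here is a toy sketch of distance in such a tree. The parent map below is a hypothetical fragment invented for illustration, not the real WordNet graph; distance is counted as edges through the lowest common ancestor:

```python
# Hypothetical fragment of a label hierarchy (child -> parent).
PARENT = {
    'sports car': 'car',
    'racing car': 'car',
    'car': 'vehicle',
    'vehicle': 'entity',
    'pine tree': 'tree',
    'tree': 'plant',
    'plant': 'entity',
}

def ancestors(node):
    """Return the path from node up to the root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def hierarchy_distance(a, b):
    """Count edges from a to b through their lowest common ancestor."""
    path_a = ancestors(a)
    path_b = ancestors(b)
    for depth_a, node in enumerate(path_a):
        if node in path_b:
            return depth_a + path_b.index(node)
    return float('inf')  # no shared ancestor

print(hierarchy_distance('sports car', 'racing car'))  # 2 (via 'car')
print(hierarchy_distance('sports car', 'pine tree'))   # 6 (only via 'entity')
```

As in real WordNet, the two kinds of car are a short hop apart, while 'sports car' and 'pine tree' only meet near the root.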
A (250MB) subset of the images is available at:
http://www.robertmunro.com/research/test_images.tar
The starter code is in the same directory as this readme:
active_learning_for_images.py
Your exercise is to improve the code so that it:
Steps 1 and 2 are implemented (but could be improved).
Step 3 currently orders the images from the most-to-least confidently classified, which is a bad strategy.
Most of the code to be edited is within order_images_for_active_learning()
, but you may edit any code that you think will improve the output.
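As one simple alternative to the most-confident-first ordering, 'least confidence' sampling surfaces the images whose top prediction has the lowest probability. This is a minimal sketch with invented filenames and scores, not the starter code's actual data:

```python
def order_by_least_confidence(image_scores):
    """Sort (image, top_probability) pairs so the LEAST confidently
    classified images come first -- those are the most valuable to label."""
    return sorted(image_scores, key=lambda item: item[1])

# Invented example scores: each image's highest predicted probability.
scores = [('a.jpg', 0.92), ('b.jpg', 0.11), ('c.jpg', 0.40)]
print(order_by_least_confidence(scores))
# [('b.jpg', 0.11), ('c.jpg', 0.40), ('a.jpg', 0.92)]
```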
There are many possible extensions to this code, from a 1-hour exercise to improve how confidence is used to order the images, to a multiple-week exercise that could include retraining all or parts of the model and providing interfaces that are optimal for different kinds of human labeling.
For the 2-hour exercise, there is a coding component and a written component.
It is recommended that you take 30 minutes to become familiar with the problem and decide on your approach, 60 minutes for the coding exercise, and 30 minutes for the writing exercise.
You may use any resources that are available to you on your machine or on the internet. You can ask the instructor any clarification questions, but please complete this as a solo exercise without live input from other people.
Reimplement order_images_for_active_learning() so that it uses a better strategy to determine which images will be the most valuable to classify. Your strategy should identify images as highly valuable if they meet one or more of the following:
Aim to have working code. It's likely that you won't have time to implement everything that you would like, so the writing exercise allows you to talk about the other strategies that you thought about:
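Two other common uncertainty measures from the Active Learning literature are sketched below, over invented probability vectors. (Note that the classifier's output may list only the top few labels, so the values may not sum to 1.)

```python
import math

def margin(probs):
    """Difference between the two highest probabilities.
    A small margin means the classifier is torn between labels."""
    top2 = sorted(probs, reverse=True)[:2]
    return top2[0] - top2[1]

def entropy(probs):
    """Shannon entropy of the distribution.
    High entropy means the probability mass is spread out."""
    return -sum(p * math.log(p) for p in probs if p > 0)

confident = [0.97, 0.02, 0.01]  # invented example values
uncertain = [0.40, 0.35, 0.25]

# An uncertain classification has a small margin and a high entropy:
assert margin(confident) > margin(uncertain)
assert entropy(confident) < entropy(uncertain)
```

Either measure can replace the raw top-probability score as the sort key when ordering images for human labeling.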
For the writing exercise, pretend that you are addressing the customer about what you have implemented and what you are proposing to build. You can keep the style casual: assume it's a professional email sent to their technical team, not a formal proposal for their executives.
First, write a 1-paragraph description of what you implemented in the coding exercise, justifying each decision. There are many possible solutions that can be implemented in about 60 minutes, so this is more about your reasoning than your exact strategy.
Second, please write a few paragraphs or bullet points proposing other strategies that you might pursue with the customer, covering:
Please email the updated code and written exercise to the instructor when you are finished.
The company's starter code is in this same directory:
active_learning_for_images.py
The code is in the style of the TensorFlow tutorials and is adapted, with thanks to the original authors, from:
https://github.com/tensorflow/models/blob/master/tutorials/image/imagenet/classify_image.py
The code can be run from within the same tutorial folder (although this is not required):
https://github.com/tensorflow/models/tree/master/tutorials/image/imagenet
To install TensorFlow and for more context on this problem, see:
https://www.tensorflow.org/tutorials/image_recognition
In short, you can clone the TensorFlow models repository:
git clone https://github.com/tensorflow/models
And then change into the directory of the tutorial on which this code is based:
cd models/tutorials/image/imagenet
This tutorial is not required reading for the exercise, but it will give you more context if you are not familiar with TensorFlow or ImageNet.
If you are on a Mac, you might need to install TensorFlow with the following command:
sudo -H pip install tensorflow --upgrade --ignore-installed
Usage:
python active_learning_for_images.py --directory=DIRECTORY_OF_IMAGES
Where DIRECTORY_OF_IMAGES is the directory containing the JPGs that you want to apply Active Learning to.