Medium post on building the first version from scratch: https://becominghuman.ai/extract-a-feature-vector-for-any-image-with-pytorch-9717561d1d4c
Tested on Python 3.6
Requires PyTorch: http://pytorch.org/
pip install img2vec_pytorch
from img2vec_pytorch import Img2Vec
from PIL import Image
# Initialize Img2Vec with GPU
img2vec = Img2Vec(cuda=True)
# Read in an image
img = Image.open('test.jpg')
# Get a vector from img2vec, returned as a torch FloatTensor
vec = img2vec.get_vec(img, tensor=True)
# Or submit a list
vectors = img2vec.get_vec(list_of_PIL_images)
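The `list_of_PIL_images` name above is not defined in the snippet. Here is a minimal sketch of one way to build such a list, assuming the images sit in a single folder. The helper names `find_image_paths` and `load_images` and the extension set are hypothetical and not part of img2vec. The conversion to RGB is an assumption that the pretrained models expect three-channel input.

```python
from pathlib import Path

# Assumed set of extensions; adjust to the formats you actually use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_image_paths(folder):
    """Return image file paths in `folder` (non-recursive), sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def load_images(paths):
    """Open each path with Pillow and convert it to RGB."""
    from PIL import Image  # imported lazily so find_image_paths works without Pillow

    return [Image.open(p).convert("RGB") for p in paths]
```

With these helpers, the list call could look like `vectors = img2vec.get_vec(load_images(find_image_paths('images/')))`, where `images/` is a placeholder folder name.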
pip install Pillow
pip install scikit-learn
git clone https://github.com/christiansafka/img2vec.git
cd img2vec/example
python test_img_to_vec.py
Which filename would you like similarities for?
cat.jpg
0.72832 cat2.jpg
0.641478 catdog.jpg
0.575845 face.jpg
0.516689 face2.jpg
Which filename would you like similarities for?
face2.jpg
0.668525 face.jpg
0.516689 cat.jpg
0.50084 cat2.jpg
0.484863 catdog.jpg
Try adding your own photos!
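The scores above rank every other image against the one you name. The example installs scikit-learn, presumably for its cosine similarity. The sketch below assumes cosine similarity is the metric and implements it in plain Python so it runs without extra dependencies. The function names and the three-dimensional toy vectors are invented for illustration; real vectors from the default model would have 512 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_similar(query_name, vectors):
    """Return (score, name) pairs for every other image, best match first."""
    query = vectors[query_name]
    scores = [
        (cosine_similarity(query, vec), name)
        for name, vec in vectors.items()
        if name != query_name
    ]
    return sorted(scores, reverse=True)


# Toy vectors, invented for illustration only.
toy_vectors = {
    "cat.jpg": [1.0, 0.0, 0.0],
    "cat2.jpg": [0.9, 0.1, 0.0],
    "face.jpg": [0.0, 1.0, 0.0],
}

for score, name in rank_similar("cat.jpg", toy_vectors):
    print(f"{score:.4f} {name}")
```

In practice the dictionary would map each filename to the vector returned by `get_vec` for that image.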
cuda = (True, False) # Run on GPU? default: False
model = ('resnet-18', 'alexnet') # Which model to use? default: 'resnet-18'
layer = 'layer_name' or int # For advanced users, which layer of the model to extract the output from. default: 'avgpool'
layer_output_size = int # Size of the output of your selected layer
Defaults for resnet-18: (layer = 'avgpool', layer_output_size = 512)
Layer parameter must be a string naming one of the layers below
conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
bn1 = nn.BatchNorm2d(64)
relu = nn.ReLU(inplace=True)
maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
layer1 = self._make_layer(block, 64, layers[0])
layer2 = self._make_layer(block, 128, layers[1], stride=2)
layer3 = self._make_layer(block, 256, layers[2], stride=2)
layer4 = self._make_layer(block, 512, layers[3], stride=2)
avgpool = nn.AvgPool2d(7)
fc = nn.Linear(512 * block.expansion, num_classes)
Defaults for alexnet: (layer = 2, layer_output_size = 4096)
Layer parameter must be an integer identifying one of the numbered layers below
alexnet.classifier = nn.Sequential(
7. nn.Dropout(), <- output_size = 9216
6. nn.Linear(256 * 6 * 6, 4096), <- output_size = 4096
5. nn.ReLU(inplace=True), <- output_size = 4096
4. nn.Dropout(), <- output_size = 4096
3. nn.Linear(4096, 4096), <- output_size = 4096
2. nn.ReLU(inplace=True), <- output_size = 4096
1. nn.Linear(4096, num_classes), <- output_size = num_classes
)
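Choosing a non-default AlexNet layer means passing both `layer` and a matching `layer_output_size`. The lookup below is a small sketch that keeps the two in sync for layers 2 to 7 of the listing above. `ALEXNET_LAYER_SIZES` and `alexnet_options` are hypothetical helpers, not part of img2vec; only the keyword names `cuda`, `model`, `layer`, and `layer_output_size` come from the options documented earlier.

```python
# Output sizes for alexnet.classifier layers, numbered as in the listing above.
# Layer 1 (the class-score output) is left out on purpose.
ALEXNET_LAYER_SIZES = {
    7: 9216,  # nn.Dropout() on the flattened 256 * 6 * 6 features
    6: 4096,  # nn.Linear(256 * 6 * 6, 4096)
    5: 4096,  # nn.ReLU(inplace=True)
    4: 4096,  # nn.Dropout()
    3: 4096,  # nn.Linear(4096, 4096)
    2: 4096,  # nn.ReLU(inplace=True), the documented default
}


def alexnet_options(layer=2, cuda=False):
    """Build keyword arguments for Img2Vec with a consistent output size."""
    if layer not in ALEXNET_LAYER_SIZES:
        raise ValueError(f"layer must be one of {sorted(ALEXNET_LAYER_SIZES)}")
    return {
        "cuda": cuda,
        "model": "alexnet",
        "layer": layer,
        "layer_output_size": ALEXNET_LAYER_SIZES[layer],
    }
```

The result can then be unpacked into the constructor, for example `img2vec = Img2Vec(**alexnet_options(layer=7))`.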