Benchmarks of CNN inference across several popular deep learning frameworks.
Currently, we support five deep learning frameworks: Caffe, Caffe2, PyTorch, MXNet, and TensorFlow. Some commonly used ImageNet models (i.e., alexnet, resnet50, resnet101, and resnet152) are ready to test. For convenience, we provide all the code and network definition files here. There is no need to download pre-trained weights because the weights are initialized randomly.
To exclude the impact of different storage devices and different IO implementations across these deep learning frameworks, we generate all data randomly in advance. The time measured in the benchmark experiments includes only the CPU-memory to GPU-memory data copy time and the GPU forward time.
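The measurement described above can be sketched roughly as follows. This is a minimal, framework-agnostic illustration, not the benchmark code from this repository: the `benchmark` function, its `copy_to_device`, `forward`, and `synchronize` parameters, and the warm-up step are all assumptions made for the example. The CPU-only stand-ins let the sketch run anywhere.

```python
import random
import time


def benchmark(copy_to_device, forward, synchronize, batches, batch_size, warmup=2):
    """Return images/s, timing only the host-to-device copy and the forward pass."""
    # Warm-up iterations are run but not timed (an assumption of this sketch).
    for batch in batches[:warmup]:
        forward(copy_to_device(batch))
    synchronize()

    timed = batches[warmup:]
    start = time.perf_counter()
    for batch in timed:
        forward(copy_to_device(batch))
    synchronize()  # wait for queued device work before stopping the clock
    elapsed = time.perf_counter() - start
    return len(timed) * batch_size / elapsed


# Data is generated up front, so its creation cost never enters the timed region.
batch_size = 4
batches = [[random.random() for _ in range(batch_size * 8)] for _ in range(10)]

# Hypothetical CPU-only stand-ins; a real run would plug in the framework's
# own copy, forward, and synchronize calls.
speed = benchmark(
    copy_to_device=list,       # stand-in for the CPU-to-GPU copy
    forward=sum,               # stand-in for the network's forward pass
    synchronize=lambda: None,  # stand-in for a device synchronization call
    batches=batches,
    batch_size=batch_size,
)
print(f"{speed:.1f} images/s")
```

The explicit synchronization before reading the clock matters because GPU work is typically queued asynchronously; without it, the timer may stop before the forward pass has actually finished.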
I may add benchmark code for more networks (e.g., inception-bn, inception-v3) and deep learning frameworks in the future, but no specific plans have been made yet. Anyone is welcome to submit PRs.
Edit run.sh to select the GPU device you want to use. (To get accurate results, please select a GPU with no other processes running on it.)
Run the benchmark: sh run.sh
Results are written to cache/results/${DLLIB}_${NETWORK}_${BATCH_SIZE}.txt. The columns in these files are, in order: framework name, network, batch size, speed (images/s), and GPU memory (MB).
cache/results/${NETWORK}_speed.png shows the network's inference speed at different batch sizes in different frameworks.
cache/results/${NETWORK}_gpu_memory.png shows the network's GPU memory cost at different batch sizes in different frameworks.
This project is licensed under an Apache-2.0 license.
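For post-processing the per-run result files, a small parser along these lines may be handy. It is only a sketch: the `Result` type and `parse_result_line` function are hypothetical names, and the column delimiter is an assumption (the parser accepts commas and/or whitespace), since the file format is described here only by its column order. The sample line is made up for illustration and is not a measured result.

```python
import re
from typing import NamedTuple


class Result(NamedTuple):
    framework: str
    network: str
    batch_size: int
    speed: float       # images/s
    gpu_memory: float  # MB


def parse_result_line(line: str) -> Result:
    # The delimiter is an assumption: split on commas and/or whitespace.
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    framework, network, batch_size, speed, gpu_memory = fields
    return Result(framework, network, int(batch_size), float(speed), float(gpu_memory))


# Made-up sample line for illustration only, not a measured result.
sample = "pytorch resnet50 16 100.0 2000"
result = parse_result_line(sample)
print(result.network, result.batch_size, result.speed)
```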