Key Features • Stream Processing Pipeline • How To Use • Configuration • Examples • Scaling Performance • Credits • Contact
Scalable - Get the desired frame rate over multiple cameras by simply spinning up more consumer nodes, or more consumer processes on the same node. The producers and consumers are designed as Python processes, implemented as subclasses of multiprocessing.Process.
Stream Processing in Python - This app processes the stream of frames in Python, consuming from the "raw frames" topic and publishing to the "predicted frames" topic. The Kafka Streams API is not yet available in Python; future work includes implementing the frame processing with the Streams API in Scala. This system design can extend to other stream-processing applications as well.
Modular approach - Replace the face-recognition model with the image-processing model of your choice to detect whatever entities your use case requires.
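The process model behind the "Scalable" point above can be sketched in a few lines. This is a minimal, hypothetical illustration (the class name `FrameProducer` and the use of a local queue in place of a Kafka topic are assumptions, not the app's actual code):

```python
import multiprocessing as mp


class FrameProducer(mp.Process):
    """Hypothetical sketch of a per-camera producer process."""

    def __init__(self, camera_id, frame_queue):
        super().__init__()
        self.camera_id = camera_id
        self.frame_queue = frame_queue  # stands in for a Kafka topic here

    def run(self):
        # The real app would read frames from a camera and publish them to
        # the "raw frames" Kafka topic; here we emit a few dummy frames to
        # illustrate the process model.
        for frame_no in range(3):
            self.frame_queue.put((self.camera_id, frame_no))


if __name__ == "__main__":
    queue = mp.Queue()
    producers = [FrameProducer(cam, queue) for cam in ("cam0", "cam1")]
    for p in producers:
        p.start()
    for p in producers:
        p.join()
    frames = [queue.get() for _ in range(6)]
    print(len(frames))  # prints 6: 3 frames from each of 2 cameras
```

Scaling out then amounts to starting more such processes, on the same node or across nodes.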
To clone and run this application, you'll need Git, Python 3 (with pip), and Kafka (v1.0.0 or v1.1.0, built with Scala v2.11 or v2.12; all combinations work) installed on your cluster. I used Pegasus for the cluster setup on AWS, with the environment setup modified to this custom setup file.
# Clone this repository
$ git clone https://github.com/rrqq/eye_of_sauron.git
# Go into the repository
$ cd eye_of_sauron
# Install dependencies
$ sudo pip3 install -r requirements.txt
# Change permissions
$ chmod +x run_producers.py
# Run the app
$ ./run_producers.py
or
# Run the app
$ python3 run_producers.py
Note: If you're running under Linux Bash, you may need to convert the run files to Unix line endings:
$ sudo apt-get install dos2unix
$ dos2unix run_producers.py
$ dos2unix run_consumers.py
# Clone this repository
$ git clone https://github.com/rrqq/eye_of_sauron.git
# Install dependencies
$ sudo pip3 install -r requirements.txt
# Run consumers
$ python3 run_consumers.py
SET_PARTITIONS sets the number of partitions for FRAME_TOPIC and PROCESSED_FRAME_TOPIC; this controls the level of parallelism. A rule of thumb when latency is a key factor is to keep the number of partitions below 100 x b x r, where b is the number of brokers in the cluster and r is the replication factor. Multiple partitions help with fault tolerance, scaling (up or down), and reassignment scenarios: if one or more consumers stop mid-process, the assignor takes this into account and reassigns the unconsumed partitions to the remaining valid consumers.
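The rule of thumb above is simple arithmetic, so it can be checked directly. A small sketch (the function name is illustrative, not part of the app):

```python
def max_partitions(brokers: int, replication_factor: int) -> int:
    """Latency-oriented upper bound on partition count: 100 * b * r."""
    return 100 * brokers * replication_factor


# e.g. a 3-broker cluster with replication factor 2:
print(max_partitions(brokers=3, replication_factor=2))  # → 600
```

So on such a cluster you would keep SET_PARTITIONS well under 600 if latency matters.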
ROUND_ROBIN - set to True to partition messages with the RoundRobinPartitioner; otherwise the Murmur2Partitioner (the better option) is used.
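For intuition, round-robin partitioning can be written as a small stateful callable. This is a sketch, not the app's code; the `(key_bytes, all_partitions, available_partitions)` signature is the shape kafka-python accepts for a custom `partitioner` option on KafkaProducer:

```python
import itertools


def make_round_robin_partitioner():
    """Return a round-robin partitioner callable that cycles through the
    available partitions regardless of message key."""
    counter = itertools.count()

    def partitioner(key_bytes, all_partitions, available_partitions):
        partitions = available_partitions or all_partitions
        return partitions[next(counter) % len(partitions)]

    return partitioner


rr = make_round_robin_partitioner()
parts = [0, 1, 2]
print([rr(None, parts, parts) for _ in range(6)])  # → [0, 1, 2, 0, 1, 2]
```

A murmur2-based partitioner instead hashes the message key, so all frames from one camera land on the same partition and stay ordered, which is why it is usually the better option here.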
The ConsumeFrames class (a subclass of multiprocessing.Process) consumes messages containing encoded frames, each timestamped and keyed. It processes each frame (detecting faces, specifically their locations, and computing face encodings) and pushes the result to PROCESSED_FRAME_TOPIC.
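The per-frame work described above can be sketched as a pure function. The function and field names below are illustrative assumptions, and `detect_faces` stands in for the real detector (e.g. the face_recognition package):

```python
def process_frame_message(message: dict, detect_faces) -> dict:
    """Sketch of the per-frame step in ConsumeFrames: take a keyed,
    timestamped frame message, run face detection on the frame, and build
    the message to publish to PROCESSED_FRAME_TOPIC."""
    locations, encodings = detect_faces(message["frame"])
    return {
        "camera": message["camera"],
        "timestamp": message["timestamp"],
        "frame": message["frame"],
        "face_locations": locations,
        "face_encodings": encodings,
    }


# Usage with a stub detector that "finds" one face:
def stub_detector(frame):
    return [(0, 10, 10, 0)], [[0.1, 0.2, 0.3]]


msg = {"camera": "cam0", "timestamp": 1234.5, "frame": "<jpeg bytes>"}
out = process_frame_message(msg, stub_detector)
print(out["face_locations"])  # → [(0, 10, 10, 0)]
```

Keeping detection behind a single callable like this is also what makes the "Modular approach" point work: swapping the model only changes `detect_faces`.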
The PredictFrames class (a subclass of multiprocessing.Process) consumes messages containing encoded frames along with the detected face locations and encodings. The process waits for user input, i.e. the query or target faces to look for, matches the detected faces against the query face, and publishes the result to the respective camera topic, ready to be consumed by the stream app for viewing. The results can also be pushed to a database for analysis.
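The matching step can be sketched as a distance comparison between encodings. This is an illustration of the idea, not the app's code; it mirrors how face_recognition's compare_faces works, with 0.6 being that library's common default tolerance:

```python
import math


def match_faces(query_encoding, detected_encodings, tolerance=0.6):
    """For each detected face encoding, return True if its Euclidean
    distance to the query encoding is within the tolerance."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return [distance(query_encoding, enc) <= tolerance
            for enc in detected_encodings]


query = [0.0, 0.0, 0.0]
detected = [[0.1, 0.1, 0.1], [1.0, 1.0, 1.0]]
print(match_faces(query, detected))  # → [True, False]
```

Real face encodings are 128-dimensional vectors, but the comparison is the same; the boolean per face is what gets attached to the message published to the camera topic.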
This software uses the following open-source packages.