kubeface

Python library for parallel maps that run directly on Kubernetes. Intended for running many expensive tasks (on the order of minutes each). Alpha stage; currently supports only Google Cloud.

Overview

Kubeface aims for reasonably efficient execution of many long-running Python tasks with medium-sized (up to a few gigabytes) inputs and outputs. Design choices and assumptions:

The primary motivating application has been neural network model selection for the MHCflurry project.

See example.py for a simple working example.
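
A minimal sketch of that usage looks roughly like the following. How the Client is constructed is left as a placeholder here, since its arguments depend on your backend and storage configuration; example.py has the complete, working version.

import kubeface

# Placeholder construction: see example.py for how to configure the
# backend, storage location, and worker image.
client = kubeface.Client(...)

# Submit a job of 10 tasks, one per input value.
results = client.map(lambda x: x ** 2, range(10))

# The return value is a generator; results become available as tasks complete.
for value in results:
    print(value)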

Nomenclature

Backends

Life of a job

If a user calls (where client is a kubeface.Client instance):

client.map(lambda x: x**2, range(10))

This creates a job containing 10 tasks. The return value is a generator that yields the squares of the numbers 0-9. The job is executed as follows:

Docker images

Kubeface tasks execute in the context of a particular docker image, since they run in a kubernetes pod. You can use any docker image with python installed. If your docker image does not have kubeface installed, then by default kubeface will try to install itself using pip. This is inefficient, since the installation runs for every task. If you plan on running many tasks, it's a good idea to create your own docker image with kubeface installed, along the lines sketched below.
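
For example, a minimal worker image could be built like this. The base image and registry path below are placeholders (any image with python will do), and the pip install line assumes kubeface is pip-installable in your environment; otherwise, copy a checkout into the image and pip install it there.

# Dockerfile
FROM continuumio/anaconda3
RUN pip install kubeface

Build it, push it somewhere your cluster can pull from, and pass it to jobs with --kubeface-worker-image:

docker build -t gcr.io/<your-project>/kubeface-worker .
docker push gcr.io/<your-project>/kubeface-worker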

Inspecting job status

Kubeface writes out HTML and JSON status pages to cloud storage and logs to stdout. However, the best way to figure out what's going on with your job is to use kubernetes directly, via kubectl get pods and kubectl logs <pod-name>.
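
For example (these are standard kubectl commands, nothing kubeface-specific):

kubectl get pods                  # list worker pods and their current states
kubectl logs <pod-name>           # stdout/stderr of one worker
kubectl logs -f <pod-name>        # follow a running worker's output
kubectl describe pod <pod-name>   # scheduling events, e.g. image pull or out-of-memory errors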

Installation

From a checkout:

pip install -e .

To run the tests:

# Setting this environment variable is optional.
# If you set it, the tests will run against a real google storage bucket.
# You need Application Default Credentials to write to the bucket; see
# https://developers.google.com/identity/protocols/application-default-credentials#howtheywork
export KUBEFACE_STORAGE=gs://kubeface-test  # tests will write to gs://kubeface-test.

# Run tests:
nosetests

Shell Example

The kubeface-run command runs a job from the shell, which is useful for testing or simple tasks.

If you don’t already have a kubernetes cluster running, use a command like this to start one:

gcloud config set compute/zone us-east1-c
gcloud components install kubectl  # if you haven't already installed kubectl
gcloud container clusters create kubeface-cluster-$(whoami) \
    --scopes storage-full \
    --zone us-east1-c \
    --num-nodes=2 \
    --enable-autoscaling --min-nodes=1 --max-nodes=100 \
    --machine-type=n1-standard-16

You should see your cluster listed here: https://console.cloud.google.com/kubernetes/list

Then run this to set it as the default for your session:

gcloud config set container/cluster kubeface-cluster-$(whoami)
gcloud container clusters get-credentials kubeface-cluster-$(whoami)
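
To check that kubectl is now pointed at the new cluster:

kubectl get nodes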

Now launch a job:

kubeface-run \
    --expression 'value**2' \
    --generator-expression 'range(10)' \
    --kubeface-max-simultaneous-tasks 10 \
    --kubeface-backend kubernetes \
    --kubeface-worker-image continuumio/anaconda3 \
    --kubeface-kubernetes-task-resources-cpu 1 \
    --kubeface-kubernetes-task-resources-memory-mb 500 \
    --verbose \
    --out-csv /tmp/result.csv
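
The squared values are written to /tmp/result.csv, which you can inspect with, for example:

head /tmp/result.csv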

If you kill the above command, you can run this to kill all the running pods in your cluster:

kubectl delete pods --all

When you’re done working, delete your cluster:

gcloud container clusters delete kubeface-cluster-$(whoami)