Stateful Functions

Stateful Functions is an Apache Flink library that simplifies building distributed stateful applications. It's based on functions with persistent state that can interact dynamically with strong consistency guarantees.

Stateful Functions makes it possible to combine a powerful approach to state and composition with the elasticity, rapid scaling/scale-to-zero and rolling upgrade capabilities of FaaS implementations like AWS Lambda and modern resource orchestration frameworks like Kubernetes. With these features, it addresses two of the most cited shortcomings of many FaaS setups today: consistent state and efficient messaging between functions.

This README is meant as a brief walkthrough of the core concepts and of how to set things up to get started with Stateful Functions.

For fully detailed documentation, please visit the official docs.

For code examples, please take a look at the examples.



Core Concepts

Abstraction

A Stateful Functions application consists of the following primitives: stateful functions, ingresses, routers and egresses.

Stateful functions

If you know Apache Flink’s DataStream API, you can think of stateful functions a bit like a lightweight KeyedProcessFunction. The function type is the process function transformation, while the ID is the key. The difference is that functions are not assembled in a Directed Acyclic Graph (DAG) that defines the flow of data (the streaming topology), but rather send events arbitrarily to all other functions using addresses.
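As a sketch of what this looks like in the embedded (Java) SDK — the function type, state name, and greeting logic here are illustrative, not the exact code of any shipped example:

```java
import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.FunctionType;
import org.apache.flink.statefun.sdk.StatefulFunction;
import org.apache.flink.statefun.sdk.annotations.Persisted;
import org.apache.flink.statefun.sdk.state.PersistedValue;

public class GreetFunction implements StatefulFunction {

  // The logical type of this function; together with an ID it forms an address.
  public static final FunctionType TYPE = new FunctionType("example", "greet");

  // Fault-tolerant state, scoped to each individual address (type + ID).
  @Persisted
  private final PersistedValue<Integer> seenCount =
      PersistedValue.of("seen-count", Integer.class);

  @Override
  public void invoke(Context context, Object input) {
    int seen = seenCount.getOrDefault(0) + 1;
    seenCount.set(seen);
    // context.self() is this function's own address; a function can message
    // any other function by address via context.send(...).
    System.out.println("Hello " + context.self().id() + " for the " + seen + ". time!");
  }
}
```

Each distinct ID gets its own logical instance of the function, with its own state, while all instances share the same code.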

Ingresses and Egresses

Ingresses are the points where events enter a Stateful Functions application, for example from a Kafka topic. Each incoming event is turned into a message and handed to a router, which decides the address (function type and ID) it should be delivered to. Egresses are the counterpart: functions can send messages to an egress, for example a Kafka topic, to emit results to the outside world.

Modules

A module is the entry point for adding the core building block primitives to a Stateful Functions application, i.e. ingresses, egresses, routers and stateful functions.

A single application may be a combination of multiple modules, each contributing a part of the whole application. This allows different parts of the application to be contributed by different modules; for example, one module may provide ingresses and egresses, while other modules may individually contribute specific parts of the business logic as stateful functions. This facilitates working in independent teams, but still deploying into the same larger application.
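In the embedded (Java) SDK, a module implements the StatefulFunctionModule interface and contributes its primitives through a binder. A sketch, where the GreetingIO constants, GreetRouter, and GreetFunction are illustrative names rather than real classes in this repository:

```java
import java.util.Map;
import org.apache.flink.statefun.sdk.spi.StatefulFunctionModule;

public class GreetingModule implements StatefulFunctionModule {

  @Override
  public void configure(Map<String, String> globalConfiguration, Binder binder) {
    // A module contributes only the primitives it owns; other modules in the
    // same application can contribute the remaining pieces.
    binder.bindIngress(GreetingIO.NAMES_INGRESS_SPEC);
    binder.bindIngressRouter(GreetingIO.NAMES_INGRESS_ID, new GreetRouter());
    binder.bindFunctionProvider(GreetFunction.TYPE, unused -> new GreetFunction());
    binder.bindEgress(GreetingIO.GREETINGS_EGRESS_SPEC);
  }
}
```

Embedded modules are discovered via Java's ServiceLoader mechanism, i.e. by listing the module class in a META-INF/services/org.apache.flink.statefun.sdk.spi.StatefulFunctionModule file inside the JAR.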

Runtime

The Stateful Functions runtime is designed to provide a set of properties similar to what characterizes serverless functions, but applied to stateful problems.

The runtime is built on Apache Flink®, with the following design principles:

Logical Compute/State Co-location: Messaging, state access/updates and function invocations are managed tightly together, giving the same consistency guarantees as Flink's keyed state.

Physical Compute/State Separation: Functions can be executed remotely, with message and state access provided as part of the invocation request. This way, functions can be managed like stateless processes.

Language Independence: Function invocations use a simple protocol, so functions can be implemented in various languages beyond the JVM.

This makes it possible to execute functions on a Kubernetes deployment, a FaaS platform or behind a (micro)service, while providing consistent state and lightweight messaging between functions.

Getting Started

Follow the steps here to get started right away with Stateful Functions.

This guide walks you through setting up a Stateful Functions (Java) application for development and testing, and through running an existing example. If you prefer to get started with Python, have a look at the StateFun Python SDK and the Python Greeter example.

Project Setup

Prerequisites:

Apache Maven 3.5.x or above
Java 8

You can quickly get started building Stateful Functions applications using the provided quickstart Maven archetype:

mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=statefun-quickstart \
  -DarchetypeVersion=2.2-SNAPSHOT

This lets you name your newly created project: it will interactively ask you for the GroupId, ArtifactId, and package name. A new directory with the same name as your ArtifactId will be created.

We recommend importing this project into your IDE to develop and test it. IntelliJ IDEA supports Maven projects out of the box. If you use Eclipse, the m2e plugin lets you import Maven projects. Some Eclipse bundles include that plugin by default; others require you to install it manually.

Building the Project

If you want to build/package your project, go to your project directory and run the mvn clean package command. You will find a JAR file that contains your application, plus any libraries that you may have added as dependencies to the application: target/<artifact-id>-<version>.jar.

Running from the IDE

To test out your application, you can directly run it in the IDE without any further packaging or deployments.

Please see the Harness example on how to do that.

Running a full example

As a simple demonstration, we will be going through the steps to run the Greeter example.

Before anything else, make sure that you have locally built the project as well as the base Stateful Functions Docker image. Then, follow the next steps to run the example:

cd statefun-examples/statefun-greeter-example
docker-compose build
docker-compose up

This example contains a very basic stateful function with a Kafka ingress and a Kafka egress.
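Conceptually, a router sits between the names ingress and the function: it forwards each incoming record to the function instance whose ID is the name itself, so every distinct name gets its own stateful instance. A rough sketch in the embedded SDK (the class names are illustrative and details of the real example may differ):

```java
import org.apache.flink.statefun.sdk.io.Router;

public class GreetRouter implements Router<String> {

  @Override
  public void route(String name, Downstream<String> downstream) {
    // Deliver the message to the greeter function addressed by (type, id).
    downstream.forward(GreetFunction.TYPE, name, name);
  }
}
```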

To see the example in action, send some messages to the names topic and see what comes out of the greetings topic:

docker-compose exec kafka-broker kafka-console-producer.sh \
     --broker-list localhost:9092 \
     --topic names
docker-compose exec kafka-broker kafka-console-consumer.sh \
     --bootstrap-server localhost:9092 \
     --isolation-level read_committed \
     --from-beginning \
     --topic greetings 

Deploying Applications

Stateful Functions applications can be packaged as either standalone applications or Flink jobs that can be submitted to a Flink cluster.

Deploying with a Docker image

Below is an example Dockerfile for building a Stateful Functions image with an embedded module (Java) for an application called statefun-example.

FROM flink-statefun[:version-tag]

RUN mkdir -p /opt/statefun/modules/statefun-example

COPY target/statefun-example*jar /opt/statefun/modules/statefun-example/

Deploying as a Flink job

If you prefer to package your Stateful Functions application as a Flink job to submit to an existing Flink cluster, simply include statefun-flink-distribution as a dependency of your application:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>statefun-flink-distribution</artifactId>
    <version>2.2-SNAPSHOT</version>
</dependency>

It includes all the runtime dependencies and configures the application's main entry-point. You do not need to take any action beyond adding the dependency to your POM file.

Attention: The distribution must be bundled in your application fat JAR so that it is on Flink's user code class loader. The resulting JAR can then be submitted like any other Flink job:

${FLINK_DIR}/bin/flink run ./statefun-example.jar

Contributing

There are multiple ways to enhance the Stateful Functions API for different types of applications; the runtime and operations will also evolve with the developments in Apache Flink.

You can learn more about how to contribute on the Apache Flink website. For code contributions, please carefully read the Contributing Code section and check the Stateful Functions component in Jira for an overview of ongoing community work.

License

The code in this repository is licensed under the Apache Software License 2.0.