DataEnum

DataEnum allows you to work with algebraic data types in Java.

You can think of it as an enum where every individual value can have different data associated with it.

What problem does it solve?

The idea of algebraic data types is not new and already exists in many other programming languages, for example:

It is possible to represent such algebraic data types using subclasses: the parent class is the "enumeration" type, and each child class represents a case of the enumeration with its associated parameters. However, this requires you either to spread your business logic across all the subclasses, or to cast the parent type to the child class manually to access the parameters, taking care to cast only when you know for sure that the object is of the right type.

The goal of DataEnum is to help you generate all these classes and give you a fluent API for easily accessing their data in a type-safe manner.

The primary use-case we had when designing DataEnum was to execute different business logic depending on an incoming message. And as mentioned above, we wanted to keep all that business logic in one place, and not spread it out in different classes. With plain Java, you’d have to write something like this:

if (message instanceof Login) {
    Login login = (Login) message;
    // login logic here
} else if (message instanceof Logout) {
    Logout logout = (Logout) message;
    // logout logic here
}

There are a number of things here that developers tend to not like: repeated if-else statements, manual instanceof checks and safe-but-noisy typecasting. On top of that it doesn't look very idiomatic and there's a high risk that mistakes get introduced over time. If you use DataEnum, you can instead write the same expression like this:

message.match(
   login -> { /* login logic; the 'login' parameter is 'message' but cast to the type Login. */ },
   logout -> { /* logout logic; the 'logout' parameter is 'message' but cast to the type Logout. */ }
);

In this example only one of the two lambdas will be executed depending on the message type, just like with the if-statements. match is just a method that takes functions as arguments, but if you write expressions with linebreaks like in the example above it looks quite similar to a switch-statement, a match-expression in Scala, or a when-expression in Kotlin. DataEnum makes use of this similarity to make match-statements look and feel like a language construct.
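To make the "match is just a method that takes functions as arguments" point concrete, here is a minimal hand-written sketch of the pattern in plain Java 8. It is not the code DataEnum generates: the names Message, MatchSketch and describe are made up for illustration, and it uses java.util.function.Consumer rather than DataEnum's own interfaces.

```java
import java.util.function.Consumer;

// Hand-written illustration of the pattern; NOT the code DataEnum generates.
abstract class Message {
    // One callback per case; each subclass invokes exactly one of them.
    abstract void match(Consumer<Login> login, Consumer<Logout> logout);

    static final class Login extends Message {
        final String userName;

        Login(String userName) {
            this.userName = userName;
        }

        @Override
        void match(Consumer<Login> login, Consumer<Logout> logout) {
            login.accept(this); // 'this' is already typed as Login, so no cast is needed
        }
    }

    static final class Logout extends Message {
        @Override
        void match(Consumer<Login> login, Consumer<Logout> logout) {
            logout.accept(this);
        }
    }
}

final class MatchSketch {
    // Builds a short description of the message; only one lambda runs per call.
    static String describe(Message message) {
        StringBuilder out = new StringBuilder();
        message.match(
            login -> out.append("login by ").append(login.userName),
            logout -> out.append("logout"));
        return out.toString();
    }
}
```

Calling MatchSketch.describe(new Message.Login("petter")) runs only the first lambda, and the compiler rejects a call to match that omits a case. Writing this dispatch by hand for every case is the boilerplate that an annotation processor can take over.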

There are many compelling use-cases for using an algebraic data type to represent values. To name a few:

Status

DataEnum is in Beta status, meaning it is used in production in Spotify Android applications, but we may keep making changes relatively quickly.

It is currently built for Java 7 (because Android doesn't support Java 8 well yet), hence the duplication of some concepts defined in java.util.function (Consumer, Function, Supplier).
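As a rough illustration of what such a duplicated concept looks like, the sketch below declares a single-method interface with the same shape as java.util.function.Function and uses it through an anonymous class, which is how callers on Java 7 would have to supply it. The interface and class names here are hypothetical stand-ins, not DataEnum's actual types.

```java
// Hypothetical stand-in with the same shape as java.util.function.Function,
// which is not available on Java 7.
interface Function<T, R> {
    R apply(T value);
}

final class Java7Sketch {
    // Without lambdas, a caller supplies an anonymous class instead.
    static final Function<String, Integer> LENGTH = new Function<String, Integer>() {
        @Override
        public Integer apply(String value) {
            return value.length();
        }
    };
}
```

Because the interface has a single abstract method, code compiled for Java 8 or later can pass a lambda such as value -> value.length() wherever it is expected.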

Using it in your project

The latest version of DataEnum is available through Maven Central (replace LATEST_RELEASE below with the latest released version number):

Gradle

implementation 'com.spotify.dataenum:dataenum:LATEST_RELEASE'
annotationProcessor 'com.spotify.dataenum:dataenum-processor:LATEST_RELEASE'

Maven

<dependencies>
  <dependency>
    <groupId>com.spotify.dataenum</groupId>
    <artifactId>dataenum</artifactId>
    <version>LATEST_RELEASE</version>
  </dependency>
  <dependency>
    <groupId>com.spotify.dataenum</groupId>
    <artifactId>dataenum-processor</artifactId>
    <version>LATEST_RELEASE</version>
    <scope>provided</scope>
  </dependency>
</dependencies>

Alternatively, you may be able to use the annotationProcessorPaths configuration option of the maven-compiler-plugin rather than declaring the processor as a provided-scope dependency.

How do I create a DataEnum type?

First, you define all the cases and their parameters in an interface like this:

@DataEnum
interface MyMessages_dataenum {
    dataenum_case Login(String userName, String password);
    dataenum_case Logout();
    dataenum_case ResetPassword(String userName);
}

Then, you apply the dataenum-processor annotation processor to that code, and your DataEnum case classes will be generated for you.

Some things to note:

Using the generated DataEnum class

Some usage examples, based on the @DataEnum specification above:

// Instantiate by passing in the required parameters. 
// You’ll get something that is of the super type - this is to help Java’s 
// not-always-great type inference do the right thing in many common cases.
MyMessages message = MyMessages.login("petter", "s3cr3t");

// If you actually needed the subtype you can easily cast it using the as-methods.
Logout logout = MyMessages.logout().asLogout();

// For every as-method there is also an is-method to check the type of the message.
assertThat(message.isLogin(), is(true));

// Apply different business logic to different message types. Note how getters are generated (but not
// setters, DataEnum case types should be considered immutable).
message.match(
    login -> Logger.debug("got a login request from user: {}", login.userName()),
    logout -> Logger.debug("user logged out"),
    resetPassword -> Logger.debug("password reset requested for user: {}", resetPassword.userName())
);

// So far we've been looking at 'match', but there is also the very useful 'map' which is used to
// transform values. When using 'map' you define how the message should be transformed in each case.
int passwordLength = message.map(
    login -> login.password().length(),
    logout -> 0,
    resetPassword -> -1);

// There are some utility methods provided that allow you to deal with unimplemented or illegal cases:
int passwordLength = message.map(
    login -> login.password().length(),
    logout -> Cases.illegal("logout message does not contain a password"), // throws IllegalStateException
    resetPassword -> Cases.todo()); // throws UnsupportedOperationException

// Sometimes, only a minority of cases are handled differently, in which case a 'map' or 'match'
// can lead to duplication:
int passwordLength = message.map(
    login -> handleLogin(login),
    logout -> Cases.illegal("only login is allowed"),
    resetPassword -> Cases.illegal("only login is allowed")
    // This could really get bad if there are many cases here
);

// For those scenarios you can just use regular language control structures (like if-else):
if (message.isLogin()) {
  return handleLogin(message.asLogin()); // Technically just a cast but easier to read than manual casting.
} else {
  throw new IllegalStateException("only login is allowed");
}
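To show how map, the is-methods and the as-methods used above could fit together, here is a hand-written approximation in plain Java 8. It is a sketch under assumptions, not DataEnum's generated output: SketchMessages and MapSketch are hypothetical names, only two cases are modelled, and java.util.function.Function stands in for DataEnum's own function interface.

```java
import java.util.function.Function;

// Hand-written approximation of the shape used above; the real generated code may differ.
abstract class SketchMessages {
    // Static factories return the super type, as in the usage examples.
    static SketchMessages login(String userName, String password) {
        return new Login(userName, password);
    }

    static SketchMessages logout() {
        return new Logout();
    }

    // Each case applies the function for its own type and returns the result.
    abstract <R> R map(Function<Login, R> login, Function<Logout, R> logout);

    final boolean isLogin() {
        return this instanceof Login;
    }

    final Login asLogin() {
        return (Login) this; // throws ClassCastException for any other case
    }

    static final class Login extends SketchMessages {
        private final String userName;
        private final String password;

        Login(String userName, String password) {
            this.userName = userName;
            this.password = password;
        }

        // Getters only; case types are treated as immutable.
        String userName() { return userName; }
        String password() { return password; }

        @Override
        <R> R map(Function<Login, R> login, Function<Logout, R> logout) {
            return login.apply(this);
        }
    }

    static final class Logout extends SketchMessages {
        @Override
        <R> R map(Function<Login, R> login, Function<Logout, R> logout) {
            return logout.apply(this);
        }
    }
}

final class MapSketch {
    // Transforms a message into a value, with one function per case.
    static int passwordLength(SketchMessages message) {
        return message.map(
            login -> login.password().length(),
            logout -> 0);
    }
}
```

With this sketch, MapSketch.passwordLength(SketchMessages.login("petter", "s3cr3t")) evaluates the first lambda and MapSketch.passwordLength(SketchMessages.logout()) evaluates the second. Adding a third case to map would break every call site at compile time until it handles the new case, which is the main safety benefit over if-else chains.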

Features

Configuration

DataEnum currently has a single configurable setting, which determines the visibility of constructors in generated code. Generally speaking, private is best, as it ensures there is a single way of creating case instances (the generated static factory methods like MyMessages.login(String, String) above). However, for Android development you want to keep the method count to a minimum, and private constructors lead to synthetic constructors being generated, which increases the method count. Since that is an important use case for us, we've chosen package-private as the default. You can change this by adding a @ConstructorAccess annotation to a package-info.java file. See the javadocs for more information.

Known weaknesses of DataEnum

Alternatives

An alternative implementation of algebraic data types for Java is ADT4J. We feel DataEnum has the advantage of being less verbose than ADT4J, although ADT4J is more flexible in terms of customising your generated types.

Features that might be added in the future

Why is it called DataEnum?

The name ‘DataEnum’ comes from the fact that it’s used similarly to an enum, but you can easily and type-safely have different data attached to each enum value.

Code of Conduct

This project adheres to the Open Code of Conduct. By participating, you are expected to honor this code.