jMetalSP: A framework for Big Data Optimization with multi-objective metaheuristics

jMetalSP is a software platform for dynamic multi-objective Big Data Optimization. It combines the features of the jMetal multi-objective optimization framework with the Apache Spark cluster computing system. jMetal provides the optimization infrastructure for implementing both the Big Data optimization problems and the dynamic metaheuristic to solve them; Spark allows to manage the streaming data sources, allowing to effectively use the computing power of Hadoop clusters when processing large amounts of data.

Please, note that jMetalSP is a project in continuous development. If you have any question or suggestion of desirable features to be included, feel free to contact us. The current version is jMetalSP 2.0.

Current status

We are currently working on a redesign of the framework with the following ideas in mind:

Spark, Flink and Kafka are uncoupled in separate modules, so users only interested in non-Big Data dynamic optimization problems can use the core of jMetal without Spark.
The architecture is being refactored:
- The have introduced the observer pattern to link the stream data sources and algorithm outputs (the observables) with the problems and data consumers (the observers).
- Unnecessary classes (i.e. problem and algorithm builders) have been removed.
- Five different runtime systems can be used: plain Java, Java+Spark, Java+SparkStructured, Java+Flink and Java+KafkaStreams.
We are refactoring the example published in the MOD 2016 paper because the original Web service to obtain traffic data has changed.
Algorithms included:
- Dynamic versions of NSGA-II, NSGA-III, R-NSGA-II, MOCell, SMPSO, SMPSO/RP, WASF-GA
- Dynamic version of WASF-GA, algorithm including a preference articulation mechanism based on a reference point.
- A new algorithm called InDM2 (Interactive Dynamic Multi-Objective Decision Making).
- Interactive versions of R-NSGA-II, SMPSO/RP,WASF-GA
A component to draw the Pareto front approximations in a chart during the algorithm execution.
Problems included: bi-objective TSP, FDA benchmark, DF benchmark.

Architecture

The architecture of the current development version of jMetalSP (Version 1.1) is depicted in the following UML class diagram:

jMetalSP architecture

A jMetalSPApplication is composed of:

An instance of the DynamicProblem class (the problem to be solved).
An instance of the DynamicAlgorithm class (the algorithm to use).
One or more StreamingDataSource objects, which process the incoming data and as a result they change make some change in the DynamicProblem.
One or more AlgorithmDataConsumer objects, which receive the results being generated by the DynamicAlgorithm.
A StreamingRuntime object to configure the streaming engine.

The implementation of jMetalSP applies Java generics to ensure that all the componentes are compatible. The declaration of the jMetalSPApplication class and its main componentes is the following:

public class JMetalSPApplication<
        S extends Solution<?>,
        P extends DynamicProblem<S, ?>,
        A extends DynamicAlgorithm<?, ? extends ObservedData<?>>> {

  private List<StreamingDataSource<?>> streamingDataSourceList;
  private List<DataConsumer<?>> algorithmDataConsumerList;
  private StreamingRuntime streamingRuntime;

  private P problem;
  private A algorithm;
  ...
}

This way, by using generics the Java compiler can check that all the components fit together.

InDM2

InDM2 a new dynamic multi-objective optimization algorithm that allows the preferences of the decision maker (DM) to be incorporated into the search process. When solving a dynamic multi-objective optimization problem with InDM2, the DM can not only express her/his preferences by means of one or more reference points, which define the desired region of interest, but also those points can be also modified interactively.

Examples

The following example applications are included in the current development version:

DynamicContinuousApplication. Example of using NSGA-II, MOCell, SMPSO or WASF-GA to solve the FDA problems using the default streaming runtime, i.e. without Spark
DynamicContinuousApplicationWithSpark. Example of using NSGA-II, MOCell, SMPSO or WASF-GA to solve the FDA problems using Spark.
DynamicTSPApplication. Example of using NSGA-II or MOCell or SMPSO to solve a bi-objective TSP problem using the default streaming runtime, i.e. without Spark. The streaming data source simulates changes in the cost matrix (no external data source is used). This is a simplied version the algorithm presented in MOD 2016.
NSGAIIRunner. This example, included in the paper to be presented in EMO 2017, shows how to configure the standard NSGA-II algorithm to solve a modified version of the ZDT1 problem using the Spark evaluator to evaluate the solutions of the population in parallel.
InDM2RunnerForContinuousProblems. This example, included in the paper published in SWEVO 2018, shows how to configure the InDM2 with Interactive R-NSGA-II and Interactive WASFGA algorithms to solve FDA problems.
InDM2RunnerForContinuousProblems3D. This example, included in the paper published in SWEVO 2018, shows how to configure the InDM2 with Interactive R-NSGA-II and Interactive WASFGA algorithms to solve FDA problems, showing results in a 3D plot.
InDM2RunnerForNYTSP. This example, included in the paper to be presented in SWEVO 2018, shows how to configure the InDM2 with Interactive R-NSGA-II and Interactive WASFGA algorithms to solve TSP problem with traffic data from New York.
InDM2RunnerForContinuousAutoUpdateProblems. This example is as InDM2RunnerForContinuousProblems but in this case the problem is updated after 25000 algorithm evaluations.
DynamicContinuousApplicationAVRO. Example of using NSGA-II, MOCell, SMPSO or WASF-GA to solve the FDA problems using the default streaming runtime, i.e. without Spark and AVRO is used for interchange communications.
DynamicContinuousApplicationFlink. This example is the same that DynamicContinuousApplication but with Flink as streaming runtime.
DynamicContinuousApplicationFlinkkafka. This example is the same that DynamicContinuousApplicationFlink but using kafka for communicating components.
DynamicContinuousApplicationFlinkkafkaAVRO. This example is the same that DynamicContinuousApplicationFlinkKafka but using AVRO adn Kafka for communicating components.
DynamicContinuousApplicationWithSparkkafkaAVRO. Example of using NSGA-II, MOCell, SMPSO or WASF-GA to solve the FDA problems using Spark and Kafka-AVRO for communicating components.
DSMPSORunnerForContinuousProblems. This example, shows how to configure the Dynamic SMPSO to solve FDA problems.
RNSGAIIRunnerForContinuousProblems. This example, shows how to configure the Dynamic R-NSGA-II to solve FDA problems.
SMPSORPRunnerForContinuousProblems. This example, shows how to configure the Dynamic SMPSO/RP to solve DF problems.

How to execute the Spark example

In order to run the example DynamicContinuousApplicationWithSpark, it is necessary to run first the CounterProvider program whose goal is to store continuously in a directory files containing the data (i.e., the value of a counter) that will read by Spark in streaming.

Requirements

To run the examples that do not use Spark you need:

Java JDK 8
Apache Maven
jMetal 5.7

To execute the codes with Spark:

Spark 2.3.0 or later

References

José A. Cordero, Antonio J. Nebro, Juan J. Durillo, José García-Nieto, Ismael Navas-Delgado, José F. Aldana-Montes: "Dynamic Multi-Objective Optimization With jMetal and Spark: a Case Study". MOD 2016 (DOI).
Cristóbal Barba-González, José García-Nieto, Antonio J. Nebro and José F. Aldana-Montes. Multi-Objective Big Data Optimization with jMetal and Spark. EMO 2017 (DOI).
Cristóbal Barba-González, Antonio J. Nebro, José A. Cordero, José García-Nieto, Juan J. Durillo, Ismael Navas-Delgado, José F. Aldana-Montes. "JMetalSP: a Framework for Dynamic Multi-Objective Big Data Optimization". Applied Soft Computing. Available online May 2017. (DOI)
Antonio J. Nebro, Ana B. Ruíz, Cristóbal Barba-González, José García-Nieto, José F. Aldana, Mariano Luque. InDM2: Interactive Dynamic Multi-Objective Decision Making using Evolutionary Algorithms. Swarm and Evolutionary Computation. Volume 40, June 2018, Pages 184-195. 2018. (DOI)