Need

Businesses and governments hold large amounts of data and want to learn about the structures and patterns within it. This may include making predictions based on the data.

Use case examples

Time-series analysis

Companies, research organizations, and governments often collect timestamped observations. It is often useful to find trends or patterns in such data over time and to forecast future trends. These types of analyses fall under the umbrella of time series analysis.
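As a rough illustration of finding a trend and forecasting from it, the sketch below fits a straight line to evenly spaced observations by ordinary least squares and extends it forward. The function names and the sample readings are hypothetical, and a linear trend is only one of many possible models; it is an assumption made here for brevity, not a method this project has chosen.

```python
from statistics import mean


def fit_linear_trend(values):
    """Fit y = intercept + slope * t by least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    times = range(n)
    t_mean = mean(times)
    y_mean = mean(values)
    # Slope is the covariance of (t, y) divided by the variance of t.
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    var = sum((t - t_mean) ** 2 for t in times)
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return slope, intercept


def forecast(values, steps):
    """Extend the fitted line `steps` time points past the last observation."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]


# Hypothetical, evenly spaced readings (for example, one per day).
observations = [10.0, 12.0, 14.0, 16.0, 18.0]
slope, intercept = fit_linear_trend(observations)
next_values = forecast(observations, steps=2)
```

Real time series usually need more than a straight line (seasonality, irregular sampling, noise), so this only shows the shape of the task.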

Specific examples include:

Anomaly detection

We can search medium or large data sets for values that fall outside the normal range or variance. Such 'abnormal' data may point to problems or unusual conditions that need attention. Anomaly detection algorithms can help decision makers quickly find unusual segments of data.
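One simple way to flag values outside the normal range is a z-score test: measure how many standard deviations each value lies from the mean and report those beyond a threshold. The sketch below assumes this technique; the function name, the threshold of 2.0, and the sample readings are all hypothetical choices for illustration.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical sensor readings with one obvious outlier at the end.
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
anomalies = find_anomalies(readings)
```

Note that a large outlier also inflates the standard deviation it is measured against, so the threshold needs tuning per data set, and more robust methods exist.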

Specific examples include:

Related resources

Situation

There are myriad tools to help people design Machine Learning workflows. However, there does not appear to be a visual programming environment with machine learning primitives.

Goal

Build a general-purpose machine learning programming environment that is accessible through a REST API and a web user interface.

Roadmap

Design

The design will most likely consist of a REST API and User Interface, developed as separate components.

API

A REST API would make it easy to use Machine Learning algorithms, since users would not have to install or maintain the ML software.
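To make the idea concrete, here is a minimal sketch of a JSON-over-HTTP endpoint built only on the Python standard library (WSGI). The `/summaries` route, the `summarize` handler, and the request shape are hypothetical placeholders and not a proposed API design; a real service would likely use one of the REST frameworks surveyed later and expose actual ML operations.

```python
import io
import json
from statistics import mean
from wsgiref.simple_server import make_server


def summarize(values):
    """Stand-in for an ML task: basic statistics over a list of numbers."""
    return {"count": len(values), "mean": mean(values)}


# Hypothetical routing table: (HTTP method, path) -> handler.
ROUTES = {("POST", "/summaries"): summarize}


def app(environ, start_response):
    """Minimal WSGI application that dispatches JSON requests to handlers."""
    key = (environ["REQUEST_METHOD"], environ.get("PATH_INFO", ""))
    handler = ROUTES.get(key)
    if handler is None:
        status, payload = "404 Not Found", {"error": "unknown endpoint"}
    else:
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = json.loads(environ["wsgi.input"].read(length) or b"{}")
            status, payload = "200 OK", handler(body["values"])
        except (KeyError, TypeError, ValueError):
            status = "400 Bad Request"
            payload = {"error": 'expected JSON like {"values": [1, 2, 3]}'}
    data = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(data))),
    ]
    start_response(status, headers)
    return [data]


def call(method, path, payload=None):
    """Invoke the app in-process (no network), returning (status, JSON body)."""
    raw = json.dumps(payload).encode("utf-8") if payload is not None else b""
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    }
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


def serve(port=8000):
    """Serve the app over HTTP until interrupted (not called automatically)."""
    with make_server("", port, app) as server:
        server.serve_forever()
```

Calling `call("POST", "/summaries", {"values": [1, 2, 3]})` exercises the endpoint without starting a server, while `serve()` exposes it over HTTP. The point is that the client only sends data and receives results; the computation and its dependencies stay on the server.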

The API might be structured to mirror the Orange3 User Interface. Specifically, the Orange3 UI has the following structure:

UI

The user interface for machine learning algorithms will make it easy for people with little programming experience to build machine learning services. The UI should include interfaces for interacting with data, sequencing ML tasks, and accessing output. It might also include basic visualizations to give users insight into data (histograms, etc.).

UI Mockup

UI contains features such as:

Existing tools

It is worth building on top of existing tools, to make our work more focused. This section outlines relevant tools for building the idea as easily as possible.

REST framework(s)

Machine Learning Framework(s)

Machine Learning User Interface(s)

Orange3

Orange3: machine learning user interface with drag and drop modelling, visualization, data management and more.

While Orange3 has a user interface, it is based on the Qt framework. This design decision means Orange3 is primarily limited to desktop usage. It may be desirable to build a web-native user interface, so that end users need nothing beyond a web browser to use the software.

Machine Learning REST Interface(s)

General UI widgets

To build out the overall user interface, we can select an existing JS UI framework, such as:

Graph/data flow widgets

Following the conventions in the Orange3 user interface, ML sequences can be modeled as data flows. To facilitate this type of modelling and interaction, we can build on an existing JavaScript UI framework such as the following:
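Independently of the widget library chosen, the underlying model of a data flow can be sketched as a directed acyclic graph: nodes are processing steps, edges carry each step's output to the next. The example below is a minimal illustration using Python's standard `graphlib` module (Python 3.9+); the `run_flow` function, node names, and edge format are hypothetical and do not reflect how Orange3 represents workflows internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_flow(nodes, edges):
    """Run a data flow.

    nodes: dict mapping node name -> callable
    edges: list of (source, target) pairs; each target receives the outputs
           of its sources as positional arguments, in edge order.
    Returns a dict mapping node name -> output.
    """
    upstream = {name: [] for name in nodes}
    for source, target in edges:
        upstream[target].append(source)

    # static_order() yields every node after all of its predecessors and
    # raises graphlib.CycleError if the graph contains a cycle.
    results = {}
    for name in TopologicalSorter(upstream).static_order():
        args = [results[src] for src in upstream[name]]
        results[name] = nodes[name](*args)
    return results


# Hypothetical three-step flow: load data, sort it, take the largest value.
nodes = {
    "load": lambda: [3, 1, 2],
    "sort": lambda data: sorted(data),
    "top": lambda data: data[-1],
}
edges = [("load", "sort"), ("sort", "top")]
results = run_flow(nodes, edges)
```

A visual editor would let users draw the `nodes` and `edges`; a structure like this is what the backend would need in order to execute the drawing.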

Flow-based programming environments

Some programming environments support a flow-based visual workflow. The following examples are open source and run in a web browser:

Visualization

For the same reasons as the user interface, the visualization framework should be based on web standards.

A discussion was opened in the Orange3 repository related to open-source, web-based data visualization frameworks.

Proposals for the data visualization framework include:

Resources