Easy Machine Learning

What is Easy Machine Learning

Machine learning algorithms have become key components of many big data applications. However, the full potential of machine learning is still far from being realized because using machine learning algorithms is hard, especially on distributed platforms such as Hadoop and Spark. The key barriers come not only from implementing the algorithms themselves, but also from the process of applying them to real applications, which often involves multiple steps and different algorithms.

Our platform, Easy Machine Learning, is a general-purpose dataflow-based system that eases the process of applying machine learning algorithms to real-world tasks. In the system, a learning task is formulated as a directed acyclic graph (DAG) in which each node represents an operation (e.g. a machine learning algorithm) and each edge represents the flow of data from one node to its descendants. A task can be defined manually or cloned from existing tasks or templates. After a task is submitted to the cloud, each node is automatically scheduled for execution according to the DAG. A graphical user interface lets users create, configure, submit, and monitor a task in a drag-and-drop manner. Advantages of the system include:

  1. Lowering the barriers to defining and executing machine learning tasks;

  2. Sharing and re-using the implementations of the algorithms, the job DAGs, and the experimental results;

  3. Seamlessly integrating the stand-alone algorithms as well as the distributed algorithms in one task.
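To make the DAG formulation more concrete, here is a minimal Python sketch of how nodes could be ordered for execution so that each one runs only after all of its upstream nodes have finished. It is an illustration only and is not taken from the EasyML code base: the function `schedule`, the node names, and the edge list are all hypothetical, and the ordering uses a standard topological sort (Kahn's algorithm) in place of whatever scheduler the system actually uses.

```python
from collections import deque


def schedule(nodes, edges):
    """Return an order in which every node comes after all of its upstream nodes.

    nodes: iterable of node names (hypothetical operation labels)
    edges: iterable of (upstream, downstream) pairs describing data flow
    """
    nodes = list(nodes)
    downstream = {n: [] for n in nodes}
    pending = {n: 0 for n in nodes}  # count of unfinished upstream nodes
    for src, dst in edges:
        downstream[src].append(dst)
        pending[dst] += 1

    # Nodes with no upstream dependencies can start immediately.
    ready = deque(n for n in nodes if pending[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)  # a real scheduler would launch the job here
        for nxt in downstream[node]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(nodes):
        raise ValueError("task graph contains a cycle, so it is not a DAG")
    return order


# Hypothetical task: the names below are illustrative, not EasyML node types.
task_nodes = ["load_data", "split", "train", "predict", "evaluate"]
task_edges = [
    ("load_data", "split"),
    ("split", "train"),
    ("split", "predict"),
    ("train", "predict"),
    ("predict", "evaluate"),
]
print(schedule(task_nodes, task_edges))
```

In this sketch, `predict` waits for both `split` and `train` because it has two incoming edges, which mirrors the idea that an edge carries data from a node to its descendants.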

The system consists of three major components:

How to get involved in our project

Pull the whole project and prepare the necessary environment and development tools. Follow the steps in Quick-start.md to set up the system on your own computer.

How to use Easy Machine Learning Studio

After you have run Easy ML, you can log in at http://localhost:18080/EMLStudio.html with our official account [email protected] and password bdaict. For the best user experience, we recommend using Chrome.

How to experience our system

We provide an online service so that you can experience our system. You can register your own account or use our official account to log in. The website of the system is as follows:

If you have any advice or run into problems while using our system, you are welcome to contact us. You can leave us a message or send an email to [email protected]. Thank you for your advice!

Papers and Presentations

  1. EasyML: Ease the Process of Machine Learning with Data Flow. SOSP AI System Workshop, Shanghai, Oct. 28, 2017.
  2. Tianyou Guo, Jun Xu, Xiaohui Yan, Jianpeng Hou, Ping Li, Zhaohui Li, Jiafeng Guo, and Xueqi Cheng. Ease the Process of Machine Learning with Dataflow. Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM '16), Indianapolis, USA, pp. 2437-2440, 2016.

Acknowledgements

The following people contributed to the development of the EasyML project: