Lale

Lale is a Python library for semi-automated data science. It makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-safe fashion. If you are a data scientist who wants to experiment with automated machine learning, this library is for you!

Lale adds value beyond scikit-learn along three dimensions: automation, correctness checks, and interoperability. For automation, Lale provides a consistent high-level interface to existing pipeline search tools, including Hyperopt, GridSearchCV, and SMAC. For correctness checks, Lale uses JSON Schema to catch mismatches between hyperparameters and their types, or between data and operators. For interoperability, Lale has a growing library of transformers and estimators from popular libraries such as scikit-learn, XGBoost, and PyTorch. Lale can be installed just like any other Python package, and Lale code can be edited with off-the-shelf Python tools such as Jupyter notebooks.
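To make the correctness-check idea concrete, here is a minimal sketch of validating hyperparameters against a JSON-Schema-style description before any training runs. It uses only the Python standard library and supports just a tiny subset of JSON Schema (`type`, `enum`, `minimum`). The schema contents, `HYPERPARAM_SCHEMA`, and `check_hyperparams` are hypothetical names invented for this illustration; this is not Lale's implementation or API.

```python
# Illustrative only: a hand-rolled check covering a small subset of JSON Schema.
# The schema below is a made-up example, not one shipped with Lale.
HYPERPARAM_SCHEMA = {
    "type": "object",
    "properties": {
        "solver": {"enum": ["lbfgs", "liblinear", "saga"]},
        "C": {"type": "number", "minimum": 0.0},
        "max_iter": {"type": "integer", "minimum": 1},
    },
}

# Map JSON Schema type names to the Python types that satisfy them.
_JSON_TYPES = {"number": (int, float), "integer": int, "string": str}


def check_hyperparams(hyperparams, schema):
    """Return a list of error messages; the list is empty if everything is valid."""
    errors = []
    properties = schema["properties"]
    for name, value in hyperparams.items():
        if name not in properties:
            errors.append(f"unknown hyperparameter '{name}'")
            continue
        spec = properties[name]
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}={value!r} is not one of {spec['enum']}")
        if "type" in spec:
            expected = _JSON_TYPES[spec["type"]]
            # bool is a subclass of int in Python but is not a JSON number.
            if isinstance(value, bool) or not isinstance(value, expected):
                errors.append(f"{name}={value!r} is not of type {spec['type']}")
                continue  # skip range checks on a value of the wrong type
        if "minimum" in spec and value < spec["minimum"]:
            errors.append(f"{name}={value!r} is below the minimum {spec['minimum']}")
    return errors


if __name__ == "__main__":
    # A misspelled solver and a string where an integer is expected.
    for message in check_hyperparams({"solver": "adam", "max_iter": "100"},
                                     HYPERPARAM_SCHEMA):
        print(message)
```

The point of the sketch is that a declarative schema lets mistakes surface as readable messages at configuration time rather than as failures deep inside a training run.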

The name Lale, pronounced laleh, comes from the Persian word for tulip. Like popular machine-learning libraries such as scikit-learn, Lale is just a Python library, not a new stand-alone programming language. It does not require users to install new tools or learn new syntax.

The following paper provides a technical deep dive:

@Article{arxiv19-lale,
  author = "Hirzel, Martin and Kate, Kiran and Shinnar, Avraham and Roy, Subhrajit and Ram, Parikshit",
  title = "Type-Driven Automated Learning with {Lale}",
  journal = "CoRR",
  volume = "abs/1906.03957",
  year = 2019,
  month = may,
  url = "https://arxiv.org/abs/1906.03957" }

The schemas of the operators defined in the lale.lib.autogen module were automatically generated from the source code of 115 scikit-learn operators. The following paper describes the schema extractor:

@InProceedings{baudart_et_al_2020,
  title = "Mining Documentation to Extract Hyperparameter Schemas",
  author = "Baudart, Guillaume and Kirchner, Peter and Hirzel, Martin and Kate, Kiran",
  booktitle = "ICML Workshop on Automated Machine Learning (AutoML@ICML)",
  year = 2020,
  url = "https://arxiv.org/abs/2006.16984" }

Lale is distributed under the terms of the Apache 2.0 License (see LICENSE.txt). It is currently in an Alpha release and comes without warranties of any kind.