Machine Learning Glossary

Looking for fellow maintainers!

Apologies for my non-responsiveness. :( I've been heads down at Cruise, building ML infra for self-driving cars, and haven't reviewed this repo in forever. Looks like we're getting 54k monthly active users now, and I think the repo deserves more attention. Let me know if you would be interested in joining as a maintainer with privileges to merge PRs.

View The Glossary

How To Contribute

  1. Clone Repo

    git clone https://github.com/bfortuner/ml-glossary.git
  2. Install Dependencies

    # Assumes you have the usual suspects installed: numpy, scipy, etc.
    pip install sphinx sphinx-autobuild
    pip install sphinx_rtd_theme
    pip install recommonmark

    If you have Python 3.x installed, use:

    pip3 install sphinx sphinx-autobuild
    pip3 install sphinx_rtd_theme
    pip3 install recommonmark
  3. Preview Changes

    If you are building with make:

    cd ml-glossary
    cd docs
    make html

    On Windows:

    cd ml-glossary
    cd docs
    build.bat html
  4. Verify your changes by opening the index.html file in _build/

  5. Submit Pull Request
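
For the verification step above, a small helper can locate the built page and open it in a browser. This is only a sketch: the function name `find_index` is hypothetical, and it assumes the build writes an `index.html` somewhere under `_build/` (the exact subdirectory depends on the build configuration, so the helper searches for it instead of hard-coding a path).

```python
from pathlib import Path
from typing import Optional
import webbrowser


def find_index(build_dir: str = "_build") -> Optional[Path]:
    """Return the shallowest index.html under build_dir, or None if there is none."""
    root = Path(build_dir)
    if not root.is_dir():
        return None
    # Prefer the top-level page over any index.html nested in subfolders.
    candidates = sorted(root.rglob("index.html"), key=lambda p: len(p.parts))
    return candidates[0] if candidates else None


if __name__ == "__main__":
    index = find_index()
    if index is None:
        print("No index.html found under _build/; run the build step first.")
    else:
        # Open the page with the system default browser.
        webbrowser.open(index.resolve().as_uri())
```

Run it from the docs directory after building.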

Short on time?

Feel free to raise an issue to correct errors or contribute content without a pull request.

Style Guide

Each entry in the glossary MUST include the following at a minimum:

  1. Concise explanation - as short as possible, but no shorter
  2. Citations - Papers, Tutorials, etc.

Excellent entries will also include:

  1. Visuals - diagrams, charts, animations, images
  2. Code - python/numpy snippets, classes, or functions
  3. Equations - Formatted with LaTeX
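
As a rough illustration of the "Code" item, a snippet accompanying an entry might look like the following. The choice of the sigmoid function and the name `sigmoid` are assumptions made for this example, not taken from an existing glossary entry; the point is a short, commented, self-contained function.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid 1 / (1 + e^(-z)), squashing any real z into the range [0, 1]."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z) to avoid overflow.
    ez = math.exp(z)
    return ez / (1.0 + ez)
```

The matching equation in an entry would be written in LaTeX as `\sigma(z) = \frac{1}{1 + e^{-z}}`.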

The goal of the glossary is to present content in the most accessible way possible, with a heavy emphasis on visuals and interactive diagrams. That said, in the spirit of rapid prototyping, it's okay to submit a "rough draft" without visuals or code. We expect other readers will enhance your submission over time.

Why RST and not Markdown?

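RST has more features, such as directives and cross-references, and it is the native markup of Sphinx, which builds this glossary. For large and complex documentation projects, it's the logical choice.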

Top Contributors

We're big fans of Distill and we like their idea of offering prizes for high-quality submissions. We don't have as much money as they do, but we'd still like to reward contributors in some way for contributing to the glossary. For instance, a cheatsheet cryptocurrency where tokens equal commits ;). Let us know if you have better ideas. In the end, this is an open-source project and we hope contributing to a repository of concise, accessible machine learning knowledge is enough incentive on its own!

Tips and Tricks

Resources