numerflow

Data workflows for the numer.ai machine learning competition

Tasks

Currently implemented:

Task Documentation

FetchAndExtractData

Fetches the dataset zipfile and extracts the contents to output-path.
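The fetch-and-extract step can be sketched with the standard library alone. This is a minimal illustration, not the task's actual code: the function names `fetch_zip` and `extract_zip` and the `url` argument are hypothetical, and in the project this logic would sit inside the luigi task.

```python
import io
import urllib.request
import zipfile
from pathlib import Path


def fetch_zip(url):
    """Download the archive at `url` (hypothetical argument) and return its bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_zip(zip_bytes, output_path):
    """Extract an in-memory zip archive into output_path.

    Creates output_path if needed and returns the archive's member names.
    """
    target = Path(output_path)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(target)
        return archive.namelist()
```

Keeping the download and the extraction separate makes the extraction testable without network access.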

Parameters

TrainAndPredict

Trains a Bernoulli Naïve Bayes classifier and predicts the targets. The output file is saved at output-path with a custom, timestamped file name.
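One way to build a timestamped output file name is sketched below. The `predictions` prefix, the `.csv` extension and the timestamp format are assumptions for illustration; the task's real naming scheme may differ, and the classifier training itself is left out.

```python
from datetime import datetime
from pathlib import Path


def timestamped_output_file(output_path, now=None, prefix="predictions"):
    """Return a path inside output_path whose name carries a timestamp.

    `prefix` and the timestamp format are illustrative assumptions.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%dT%H%M%S")
    return Path(output_path) / f"{prefix}_{stamp}.csv"
```

Passing `now` explicitly keeps the function deterministic, which helps when checking whether a given run's output already exists.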

Parameters

UploadPredictions

Uploads the predictions if not already uploaded.
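The "only if not already uploaded" behaviour could be approximated with a marker file next to the predictions, as in this sketch. The helper `upload_if_new`, the `.uploaded` suffix and the `upload` callable are hypothetical; the actual task may track upload state differently.

```python
from pathlib import Path


def upload_if_new(predictions_file, upload):
    """Call upload(predictions_file) unless a marker file shows it was done.

    `upload` is any callable that performs the upload (hypothetical here).
    Returns True if an upload happened, False if it was skipped.
    """
    predictions = Path(predictions_file)
    marker = predictions.with_name(predictions.name + ".uploaded")
    if marker.exists():
        return False
    upload(predictions)
    # Only record success after the upload call returned without raising.
    marker.touch()
    return True
```

Because the marker is written only after `upload` returns, a failed upload is retried on the next run.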

Parameters

Usage

Prepare the project:

pip install -r requirements.txt --ignore-installed

If not already done, create an API key here with at least the following permissions:

To run the complete pipeline:

env PYTHONPATH='.' luigi --local-scheduler --module workflow Workflow --secret="YOURSECRET" --public-id="YOURPUBLICID"