This repo contains custom algorithms for use with the Splunk Machine Learning Toolkit. The repo itself is also a Splunk app. Custom algorithms can be added to the Splunk Machine Learning Toolkit by adhering to the ML-SPL API. The API is a thin wrapper around machine learning estimators provided by libraries such as:
and custom algorithms.
Note that this repo is a collection of custom algorithms only, and not any libraries. Any libraries required should only be added to live environments manually and not to this repo.
A comprehensive guide to using the ML-SPL API can be found here.
A very simple example:
from base import BaseAlgo


class CustomAlgorithm(BaseAlgo):
    def __init__(self, options):
        # Option checking & initializations here
        pass

    def fit(self, df, options):
        # Fit an estimator to df, a pandas DataFrame of the search results
        pass

    def partial_fit(self, df, options):
        # Incrementally fit a model
        pass

    def apply(self, df, options):
        # Apply a saved model
        # Modify df, a pandas DataFrame of the search results
        return df

    @staticmethod
    def register_codecs():
        # Add codecs to the codec manager
        pass
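To make the skeleton more concrete, here is a minimal sketch of an algorithm that fills in fit and apply. The class name MeanCenter and the centered_<field> output columns are hypothetical, and the fallback BaseAlgo class is only a stand-in so the sketch can run outside Splunk; inside the ML Toolkit the real base module would be imported instead. The sketch ignores the contents of options, since their layout is not described here.

```python
import pandas as pd

try:
    from base import BaseAlgo  # importable when running inside the ML Toolkit
except ImportError:
    class BaseAlgo(object):  # stand-in (assumption) so the sketch runs standalone
        pass


class MeanCenter(BaseAlgo):
    """Hypothetical algorithm: subtract per-field means learned in fit()."""

    def __init__(self, options):
        # Nothing to validate in this sketch; a real algorithm would check options here
        self.means = None

    def fit(self, df, options):
        # Learn the mean of every numeric field in the search results
        self.means = df.select_dtypes(include="number").mean()

    def apply(self, df, options):
        # Add one centered_<field> column per field seen during fit()
        out = df.copy()
        for field, mean in self.means.items():
            out["centered_" + field] = out[field] - mean
        return out


# Stand-in for search results; in Splunk, df is provided by the ML Toolkit
df = pd.DataFrame({"host": ["a", "b", "c"], "cpu": [10.0, 20.0, 30.0]})
algo = MeanCenter(options={})
algo.fit(df, options={})
result = algo.apply(df, options={})
print(result)
```

Returning a copy from apply keeps the input DataFrame untouched, which makes the sketch easier to test; whether a real algorithm should modify df in place or return a new frame is a design choice this example does not settle.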
To use the custom algorithms contained in this app, you must also have installed:
This repository contains public contributions, and Splunk is not responsible for guaranteeing the correctness or validity of the algorithms. Splunk is in no way responsible for vetting the contents of contributed algorithms.
To use the custom algorithms in this repository, you must deploy them as a Splunk app.
There are two ways to do this.
You can simply copy the following directories under src:
to:
OR
You will need to install tox. See Test Prerequisites
tox -e package-macos # if on Mac
tox -e package-linux # if on Linux
The resulting package is written to the target directory (e.g. target/SA_mltk_contrib_app.tgz).
The output location can be overridden with the BUILD_DIR environment variable.
The app name defaults to SA_mltk_contrib_app, but this can be overridden by the APP_NAME environment variable.
Run tox -e clean to remove the target directory.
Install the package in the ${SPLUNK_HOME}/etc/apps directory.
This repository was specifically made for your contributions! See Contributing for more details.
To start developing, you will need to have Splunk installed. If you don't, read more here.
git clone https://github.com/splunk/mltk-algo-contrib.git
cd mltk-algo-contrib
Symlink the src directory to the apps folder in Splunk and restart splunkd:
ln -s "$(pwd)/src" $SPLUNK_HOME/etc/apps/SA_mltk_contrib_app
$SPLUNK_HOME/bin/splunk restart
Add your new algorithm(s) to src/bin/algos_contrib. (See SVR.py for an example.)
Add a new stanza to src/default/algos.conf:
[<your_algo>]
package=algos_contrib
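As a quick sanity check on the stanza format above, the following sketch lists the stanzas in an algos.conf-style text whose package setting is algos_contrib. It assumes the file is simple enough to be read by Python's configparser, which is not a full Splunk .conf parser; the helper name registered_algos and the sample stanza names are hypothetical.

```python
import configparser


def registered_algos(conf_text, package="algos_contrib"):
    """Return the stanza names whose package setting equals `package`."""
    # interpolation=None keeps any '%' characters in values literal
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return [
        name
        for name in parser.sections()
        if parser.get(name, "package", fallback=None) == package
    ]


# Made-up sample content in the same shape as the stanza shown above
sample = """
[MyAlgo]
package=algos_contrib

[OtherAlgo]
package=some_other_package
"""

print(registered_algos(sample))  # ['MyAlgo']
```

To check a real file, read src/default/algos.conf into a string and pass it to the helper; a missing stanza for a new algorithm shows up as a missing name in the result.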
Add your tests to src/bin/algos_contrib/tests/test_<your_algo>.py. (See test_svr.py for an example.)
Install tox:
pip install tox
Install tox-pip-extensions:
pip install tox-pip-extensions
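Note: tox-pip-extensions is only needed if you do not want to recreate the virtualenv(s) manually with tox -r every time you update a requirements*.txt file, but it is recommended for convenience.
You must also have the following environment variable set to your Splunk installation directory (e.g. /opt/splunk):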
To run all tests, run the following command in the root source directory:
tox
To run a single test, you can provide the directory or a file as a parameter:
tox src/bin/algos_contrib/tests/
tox src/bin/algos_contrib/tests/test_example_algo.py
...
Basically, any arguments passed to tox will be passed through as arguments to the pytest command. To pass in options, use double dashes (--):
tox -- -k "example" # Run tests that match the keyword 'example'
tox -- -x # Stop after the first failure
tox -- -s # Show stdout/stderr (i.e. disable capturing)
...
$ python # from src/bin directory
>>> # Add the MLTK to our sys.path
>>> from link_mltk import add_mltk
>>> add_mltk()
>>>
>>> # Import our algorithm class
>>> from algos_contrib.ExampleAlgo import ExampleAlgo
... (some warning from Splunk may show up)
>>>
>>> # Use utilities to catch common mistakes
>>> from test.contrib_util import AlgoTestUtils
>>> AlgoTestUtils.assert_algo_basic(ExampleAlgo, serializable=False)
Files and packages under the test directory should avoid having names that conflict with files or directories directly under:
$SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin
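One way to check for such conflicts is to compare the names at the top level of both directories. The sketch below is an illustration only: the helper conflicting_names is hypothetical, it looks only at entries directly under each directory, it treats a file and a directory with the same base name as a clash, and it skips dunder entries such as __init__.py on the assumption that those are expected in both places.

```python
import os
import tempfile
from pathlib import Path


def conflicting_names(test_dir, mltk_bin_dir):
    """Return module/package names present directly under both directories."""
    def top_level_names(directory):
        # Strip extensions so that util.py and a util/ package count as the same name
        names = {Path(entry).stem for entry in os.listdir(directory)}
        # Ignore dunder entries such as __init__.py and __pycache__
        return {name for name in names if not name.startswith("__")}

    return sorted(top_level_names(test_dir) & top_level_names(mltk_bin_dir))


# Demonstration on throwaway directories (the file names are made up)
with tempfile.TemporaryDirectory() as root:
    test_dir = Path(root, "test")
    mltk_bin = Path(root, "Splunk_ML_Toolkit", "bin")
    test_dir.mkdir()
    mltk_bin.mkdir(parents=True)
    for name in ("__init__.py", "contrib_util.py", "util.py"):
        (test_dir / name).touch()
    (mltk_bin / "__init__.py").touch()
    (mltk_bin / "base.py").touch()
    (mltk_bin / "util").mkdir()

    clashes = conflicting_names(test_dir, mltk_bin)

print(clashes)  # ['util']
```

Against a real installation, the second argument would be os.path.join(os.environ["SPLUNK_HOME"], "etc/apps/Splunk_ML_Toolkit/bin"), and the first would be the path to this repo's test directory.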
Once you've finished what you're adding, make a pull request.
Please file issues with any information that might be needed to:
The algorithms hosted, as well as the app itself, are licensed under the permissive Apache 2.0 license.
Any additions to this repository must be under one of these licenses: