cd data
./download_data.sh
This takes about 30 minutes, depending on your internet connection. It downloads the inside TCPDUMP files from the dataset (~18 GB), organized into training and test sets, as well as a sample of the KDD dataset.
A description of how evaluation is performed for the DARPA dataset, as well as the ground truth files, can be found on the DARPA Dataset Documentation page.
Our various experiments are organized as Python files in the root of the repository. Each of the experiments is explained below.
gmm.py - Mixture of Gaussians experiment
IPv4_ttl packet field as a feature.
check_results.py is a simple script used for checking the results of each
usage: check_results.py [-h] [--thresh THRESH] [--plot] [--table TABLE]
                        results_file attacks_file

positional arguments:
  results_file     the results.csv file
  attacks_file     the actual attacks file

optional arguments:
  -h, --help       show this help message and exit
  --thresh THRESH  range of thresholds to try. Format:
                   start:stop:num_points, default: 0.5:0.5:1
  --plot           make plots
  --table TABLE    make table using the specified threshold
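To illustrate the --thresh format, here is a minimal sketch of how a start:stop:num_points string could be expanded into a list of thresholds. The function name parse_thresh is hypothetical, and the sketch assumes evenly spaced points with both endpoints included (as numpy.linspace does); the actual parsing in check_results.py may differ.

```python
def parse_thresh(spec):
    """Expand 'start:stop:num_points' into evenly spaced thresholds.

    Hypothetical helper, not taken from check_results.py. Assumes both
    endpoints are included, as numpy.linspace would do.
    """
    start_s, stop_s, num_s = spec.split(":")
    start, stop, num = float(start_s), float(stop_s), int(num_s)
    if num < 1:
        raise ValueError("num_points must be at least 1")
    if num == 1:
        # A single point: only the start value is used.
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


# The default value yields a single threshold of 0.5.
print(parse_thresh("0.5:0.5:1"))
# Five thresholds spread between 0.1 and 0.9.
print(parse_thresh("0.1:0.9:5"))
```

Under this reading, the default 0.5:0.5:1 evaluates the results at exactly one threshold, while a wider range such as 0.1:0.9:5 sweeps several.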
The plots we used in the poster and paper were generated using the scripts in
To run tests locally, run
python -m unittest discover
from the root folder of the repository.
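For reference, unittest discovery picks up files matching test*.py by default and runs every unittest.TestCase subclass found in them. The sketch below shows the shape of such a module. The file name test_example.py and the is_anomalous helper are hypothetical and are not part of this repository.

```python
# test_example.py (hypothetical file name; discovery matches test*.py by default)
import unittest


def is_anomalous(score, thresh=0.5):
    # Hypothetical helper for illustration only: flags a score that is
    # strictly above the threshold.
    return score > thresh


class TestIsAnomalous(unittest.TestCase):
    def test_score_above_threshold_is_flagged(self):
        self.assertTrue(is_anomalous(0.9))

    def test_score_at_threshold_is_not_flagged(self):
        self.assertFalse(is_anomalous(0.5))

    def test_custom_threshold(self):
        self.assertTrue(is_anomalous(0.3, thresh=0.2))
```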