PDP Framework for Neural Constraint Satisfaction Solving

The PDP framework is a generic framework based on the idea of Propagation, Decimation and Prediction (PDP) for learning and implementing message-passing-based solvers for constraint satisfaction problems (CSPs). In particular, it provides an elegant unsupervised framework for training neural solvers based on the idea of energy minimization. Our SAT solver adaptation of the PDP framework, referred to as SATYR, supports a wide spectrum of solvers, from fully neural architectures to classical inference-based techniques (such as Survey Propagation), with hybrid methods in between. For further theoretical details of the framework, please refer to our paper:

Saeed Amizadeh, Sergiy Matusevych and Markus Weimer, PDP: A General Neural Framework for Learning Constraint Satisfaction Solvers, arXiv preprint arXiv:1903.01969, 2019.

@article{amizadeh2019pdp,
  title={PDP: A General Neural Framework for Learning Constraint Satisfaction Solvers},
  author={Amizadeh, Saeed and Matusevych, Sergiy and Weimer, Markus},
  journal={arXiv preprint arXiv:1903.01969},
  year={2019}
}

We also note that the present work is still far from competing with modern industrial solvers; nevertheless, we believe it is a significant step in the right direction for machine learning-based methods. Hence, we are glad to open-source our code for researchers in related fields, including the neuro-symbolic community as well as the classical SAT community.

SATYR

SATYR is the adaptation of the PDP framework for training and deploying neural Boolean Satisfiability solvers. In particular, SATYR implements:

  1. Fully or partially neural SAT solvers that can be trained toward solving SAT for a specific distribution of problem instances. The training is based on unsupervised energy minimization and can be performed on an infinite stream of unlabeled, random instances sampled from the target distribution (see the illustrative sketch after this list).

  2. Non-learnable classical solvers based on message passing in graphical models (e.g. Survey Propagation). Even though these solvers are non-learnable, they still benefit from the embarrassingly parallel GPU implementation provided by the PDP framework.
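
To give a concrete (and heavily simplified) picture of the unsupervised objective mentioned in item 1, the sketch below defines a smoothed "energy" for a CNF formula: the expected number of unsatisfied clauses under independent soft assignments, which can be minimized by gradient descent. This is only an illustration of the energy-minimization idea; the actual PDP loss, architecture and training loop are described in the paper, and the names used here (smoothed_energy, logits) are placeholders of our own, not part of the SATYR code.

    import torch

    def smoothed_energy(assignment, clauses):
        # assignment: tensor of shape (num_vars,) with entries in (0, 1), the
        #             probability that each variable is True.
        # clauses:    list of clauses; each clause is a list of signed 1-based
        #             literals, e.g. [[1, -2], [2, 3]] encodes (x1 v ~x2) & (x2 v x3).
        # Returns the expected number of unsatisfied clauses, a differentiable
        # proxy for the discrete energy (count of violated clauses).
        energy = assignment.new_zeros(())
        for clause in clauses:
            p_all_false = assignment.new_ones(())
            for lit in clause:
                p_true = assignment[abs(lit) - 1]
                p_all_false = p_all_false * (1.0 - p_true if lit > 0 else p_true)
            energy = energy + p_all_false
        return energy

    # Minimizing this energy by gradient descent pushes the soft assignment
    # toward a satisfying one (for this toy formula).
    clauses = [[1, -2], [2, 3]]
    logits = torch.zeros(3, requires_grad=True)
    optimizer = torch.optim.Adam([logits], lr=0.1)
    for _ in range(200):
        optimizer.zero_grad()
        loss = smoothed_energy(torch.sigmoid(logits), clauses)
        loss.backward()
        optimizer.step()
    print(torch.sigmoid(logits) > 0.5)  # a candidate Boolean assignment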

It should be noted that all the SATYR solvers try to find a satisfying assignment for input SAT formulas. However, if a SATYR solver cannot find a satisfying assignment for a given problem within its iteration budget, that does NOT necessarily mean that the input problem is UNSAT. In other words, none of the SATYR solvers provide a proof of unsatisfiability.

Setup

Prerequisites

Run:

> python setup.py

Usage

The SATYR solvers can be used in two main modes: (1) applying an already-trained model or a non-ML algorithm to test data, and (2) training/testing new models.

Running a (Trained) SATYR Solver

The usage for running a SATYR solver against a set of SAT problems (represented in Conjunctive Normal Form (CNF)) is:

> python satyr.py [-h] [-b BATCH_REPLICATION] [-z BATCH_SIZE]
                [-m MAX_CACHE_SIZE] [-l TEST_BATCH_LIMIT]
                [-w LOCAL_SEARCH_ITERATION] [-e EPSILON] [-v] [-c] [-d]
                [-s RANDOM_SEED] [-o OUTPUT]
                model_config test_path test_recurrence_num

The commandline arguments are:
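
For illustration, a typical invocation might look like the following, where the model configuration, the test set and the output path are placeholders and the trailing positional argument is the test_recurrence_num from the synopsis above:

> python satyr.py -z 2000 -s 42 -o results.json model_config.json test_problems.json 100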

Training/Testing a SATYR Solver

The usage for training/testing new SATYR models is:

> python satyr-train-test.py [-h] [-t] [-l LOAD_MODEL] [-c] [-r] [-g]
                           [-b BATCH_REPLICATION]
                           config

The commandline arguments are:
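
For illustration (the configuration path below is a placeholder), training a new model from a configuration file might look like the following; the meaning of the individual flags is available via -h:

> python satyr-train-test.py train_config.json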

Input/Output Formats

Input

SATYR effectively works with the standard DIMACS format for representing CNF formulas. However, to make data ingestion more efficient, the solvers themselves consume input CNF data in an intermediate JSON format rather than the DIMACS representation. A key feature of this intermediate JSON format is that an entire set of DIMACS files can be represented by a single JSON file, where each line of the JSON file corresponds to one DIMACS file.
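
For reference, a small CNF formula such as (x1 OR NOT x2) AND (x2 OR x3), with 3 variables and 2 clauses, is written in the standard DIMACS format as:

    c example CNF with 3 variables and 2 clauses
    p cnf 3 2
    1 -2 0
    2 3 0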

The train/test script assumes the train/validation/test sets are already in the JSON format. In order to convert a set of DIMACS files into a single JSON file, we have provided the following script:

> python dimacs2json.py [-h] [-s] [-p] in_dir out_file

where the commandline arguments are:
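
For example (the directory and file names below are placeholders), converting a directory of DIMACS files into a single JSON file:

> python dimacs2json.py benchmarks/uf50 benchmarks/uf50.json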

The solver script, however, does not require the input problems to be in the JSON format; they can be in the DIMACS format as long as the -d option is used. Nevertheless, for repeated applications of the solver script to the same input set, we recommend converting the input DIMACS files into the JSON format once and consuming only the JSON file afterwards.
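
For instance (the paths below are placeholders), running the solver directly on DIMACS input would look like:

> python satyr.py -d -o results.json model_config.json path/to/dimacs_input 100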

Output

The output of the solver script is a JSON file where each line corresponds to one input CNF instance and is a dictionary with the following key:value pairs:

Main Contributors

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Extending The PDP Framework

The PDP framework supports a wide range of solvers, from fully neural solvers to hybrid neuro-symbolic models, all the way to classical, non-learnable algorithms. So far, we have implemented only six different types, but there is definitely room for more. Therefore, we highly encourage contributions in the form of other types of PDP-based solvers. Furthermore, we welcome contributions with PDP-based adaptations for other types of constraint satisfaction problems beyond SAT.

References

  1. Mezard, M. and Montanari, A. Information, Physics, and Computation. Oxford University Press, 2009.
  2. Chavas, J., Furtlehner, C., Mezard, M., and Zecchina, R. Survey-propagation decimation through distributed local computations. Journal of Statistical Mechanics: Theory and Experiment, 2005(11):P11016, 2005.
  3. Hoos, H. H. On the run-time behaviour of stochastic local search algorithms for SAT. In AAAI/IAAI, pp. 661-666, 1999.
  4. Giraldez-Cru, J. and Levy, J. Generating SAT instances with community structure. Artificial Intelligence, 238:119–134, 2016.