Semantic Textual Similarity Toolkits

This repository contains the code that the ECNU team submitted to the SemEval STS task.

Installation

# download the repo
git clone https://github.com/rgtjf/Semantic-Texual-Similarity-Toolkits.git
# download the dataset and the Stanford CoreNLP tools
sh download.sh
# run the demo
python demo.py

Results

You can configure sts_model.py to see how different features perform on the STSBenchmark dataset.

STSBenchmark

Methods                 Dev       Test
RF                      0.8333    0.7993
GB                      0.8356    0.8022
EN-seven                0.8466    0.8100
----------------------  --------  --------
aligner                 0.6991    0.6379
idf_aligner             0.7969    0.7622
BOWFeature-True         0.7584    0.6472
BOWFeature-False        0.7788    0.6874
nGramOverlapFeature     0.7817    0.7453
BOWFeature              0.7639    0.6847
AlignmentFeature        0.8163    0.7748
WordEmbeddingFeature    0.8011    0.7128
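
The table does not name its metric. STSBenchmark results are conventionally reported as the Pearson correlation between predicted and gold similarity scores, so the sketch below assumes that is what the Dev and Test columns show. The `pearson` helper and the sample scores are illustrative and are not taken from this repository.

```python
import math


def pearson(predicted, gold):
    """Pearson correlation between two equal-length lists of scores."""
    if len(predicted) != len(gold) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_g = sum(gold) / n

    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)

    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_p * var_g)


# Hypothetical gold and predicted scores on the 0-5 STS scale.
gold_scores = [0.0, 2.5, 5.0, 3.0]
predicted_scores = [0.4, 2.0, 4.6, 3.5]
print(round(pearson(predicted_scores, gold_scores), 4))
```

If SciPy is available, `scipy.stats.pearsonr` computes the same quantity and also returns a p-value.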

Reference

STSBenchmark board

Contacts

If you have any questions, please feel free to contact us: rgtjf1 AT 163 DOT com