Lexical Substitution Evaluation

This code was used to perform the lexical substitution evaluation described in the following papers:

[1] A Simple Word Embedding Model for Lexical Substitution. Oren Melamud, Omer Levy, Ido Dagan. Workshop on Vector Space Modeling for NLP (VSM), 2015.

[2] context2vec: Learning Generic Context Embedding with Bidirectional LSTM. Oren Melamud, Jacob Goldberger, Ido Dagan. CoNLL, 2016.

Requirements

Datasets

This repository contains preprocessed data files based on the datasets introduced by the following papers:

[3] SemEval-2007 Task 10: English Lexical Substitution Task. Diana McCarthy, Roberto Navigli. SemEval, 2007.
(files with the prefix 'lst' under the 'dataset' directory)

[4] What Substitutes Tell Us - Analysis of an "All-Words" Lexical Substitution Corpus. Gerhard Kremer, Katrin Erk, Sebastian Padó, Stefan Thater. EACL, 2014.
(files with the prefix 'coinco' under the 'dataset' directory)
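
As a quick check that both sets of files are present, the prefix convention above can be used to group the contents of the 'dataset' directory. This is only a minimal sketch: the helper name `group_dataset_files` is hypothetical and not part of this repository, and it assumes the script is run from the repository root so that 'dataset' resolves as a relative path.

```python
from pathlib import Path


def group_dataset_files(dataset_dir="dataset", prefixes=("lst", "coinco")):
    """Map each dataset prefix to the sorted names of the files that start with it.

    Hypothetical helper: it relies only on the naming convention described
    above ('lst' and 'coinco' prefixes under the 'dataset' directory).
    """
    root = Path(dataset_dir)
    grouped = {prefix: [] for prefix in prefixes}
    if not root.is_dir():
        # Directory not found: return empty lists rather than raising.
        return grouped
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue
        for prefix in prefixes:
            if path.name.startswith(prefix):
                grouped[prefix].append(path.name)
                break  # assign each file to at most one prefix
    return grouped


if __name__ == "__main__":
    for prefix, names in group_dataset_files().items():
        print(prefix, names)
```

An empty list for either prefix suggests the script was run from the wrong directory or that the corresponding files are missing.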

Evaluating the word embedding model [1]

Evaluating the context2vec model [2]

License

Apache 2.0