Implementation of TextRank with the option of using cosine similarity of word vectors from pre-trained Word2Vec embeddings as the similarity metric.
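As a rough illustration of the idea (not this repository's code), the sketch below scores words with a weighted PageRank iteration in which edge weights are cosine similarities between word vectors. The function names, the fully connected graph, the clipping of negative similarities to zero, and the three-dimensional toy vectors are all assumptions made for the example; real Word2Vec vectors have far more dimensions, and TextRank implementations often link only words that co-occur within a window.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_words(vectors, damping=0.85, iterations=50):
    """Weighted PageRank over a fully connected word graph (illustrative only)."""
    words = list(vectors)
    # Edge weights: cosine similarity, clipped at zero so weights stay non-negative.
    weight = {
        (a, b): max(0.0, cosine_similarity(vectors[a], vectors[b]))
        for a in words for b in words if a != b
    }
    # Total outgoing weight per word, used to normalise each word's votes.
    out_sum = {a: sum(weight[(a, b)] for b in words if b != a) for a in words}
    score = {w: 1.0 for w in words}
    for _ in range(iterations):
        score = {
            b: (1 - damping) + damping * sum(
                weight[(a, b)] / out_sum[a] * score[a]
                for a in words if a != b and out_sum[a] > 0
            )
            for b in words
        }
    # Highest-scoring words first.
    return sorted(score.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy vectors standing in for pre-trained embeddings.
toy_vectors = {
    "graph": [1.0, 0.2, 0.0],
    "network": [0.9, 0.3, 0.1],
    "node": [0.8, 0.1, 0.2],
    "banana": [0.0, 0.1, 1.0],
}

for word, score in rank_words(toy_vectors):
    print(f"{word}: {score:.3f}")
```

Words whose vectors point in a similar direction reinforce each other, so a word that is semantically unrelated to the rest of the text ends up near the bottom of the ranking.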
Store the text from which keywords are to be extracted in sample.txt, then run main.py to extract them:
python3 main.py --data sample.txt
To use the extractor from Python code:

    from keyword_extractor import KeywordExtractor

    text = "sample text goes here"
    word2vec = "path to pre-trained Word2Vec embeddings (None if pre-trained embeddings are not available)"

    extractor = KeywordExtractor(word2vec=word2vec)
    keywords = extractor.extract(text, ratio=0.2, split=True, scores=True)
    for keyword in keywords:
        print(keyword)
Requirements: gensim and nltk. Use python3.