TextRank

An implementation of TextRank keyword extraction that can optionally use the cosine similarity between pre-trained Word2Vec word vectors as the similarity metric.
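
To illustrate the idea, here is a minimal, dependency-free sketch of ranking words by a PageRank-style iteration over a graph whose edge weights are cosine similarities between word vectors. It is not the code in this repository: `cosine_similarity`, `rank_words`, the damping factor of 0.85, the fully connected graph, and the toy 2-d vectors (standing in for real Word2Vec embeddings) are all assumptions made for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_words(vectors, damping=0.85, iterations=50):
    """Hypothetical helper: weighted PageRank-style scores over a fully connected word graph."""
    words = list(vectors)
    # Edge weight = cosine similarity, clipped at 0 so weights stay non-negative.
    weight = {
        (a, b): max(0.0, cosine_similarity(vectors[a], vectors[b]))
        for a in words for b in words if a != b
    }
    # Total outgoing weight per word, used to normalise each word's contribution.
    out_sum = {a: sum(weight[(a, b)] for b in words if b != a) for a in words}
    scores = {w: 1.0 for w in words}
    for _ in range(iterations):
        new_scores = {}
        for a in words:
            incoming = sum(
                weight[(b, a)] / out_sum[b] * scores[b]
                for b in words if b != a and out_sum[b] > 0
            )
            new_scores[a] = (1 - damping) + damping * incoming
        scores = new_scores
    return scores


# Toy 2-d vectors standing in for real Word2Vec embeddings (assumed values).
toy_vectors = {
    "graph": [1.0, 0.1],
    "network": [0.9, 0.2],
    "banana": [0.0, 1.0],
}
scores = rank_words(toy_vectors)
for word in sorted(scores, key=scores.get, reverse=True):
    print(word, round(scores[word], 3))
```

Words whose vectors are close to many other words accumulate a higher score, so the outlier in the toy data ends up ranked last.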

Instructions:

Store the text to extract keywords from in sample.txt, then run main.py on it:

python3 main.py --data sample.txt

Usage:

from keyword_extractor import KeywordExtractor

text = "sample text goes here"
word2vec = "path to pre-trained Word2Vec embeddings"  # or None if pre-trained embeddings are not available

extractor = KeywordExtractor(word2vec=word2vec)

keywords = extractor.extract(text, ratio=0.2, split=True, scores=True)
for keyword in keywords:
    print(keyword)
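
The `ratio`, `split` and `scores` arguments are not described here. Assuming they follow the convention of the same-named arguments of `gensim.summarization.keywords` in gensim 3.x (`ratio` is the fraction of candidate words to keep, `split` returns a list instead of a newline-joined string, `scores` returns `(keyword, score)` pairs), the selection step could be sketched as below. `select_keywords` and the sample scores are hypothetical and may not match what `KeywordExtractor.extract` actually does.

```python
def select_keywords(scored, ratio=0.2, split=False, scores=False):
    """Hypothetical helper: keep the top `ratio` fraction of words by score."""
    # Highest-scoring words first.
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    # Keep at least one word whenever any candidates exist (an assumption).
    keep = max(1, int(len(ranked) * ratio))
    top = ranked[:keep]
    if scores:
        return top  # list of (keyword, score) pairs
    words = [word for word, _ in top]
    return words if split else "\n".join(words)


# Made-up scores, for illustration only.
sample_scores = {"graph": 1.4, "ranking": 1.1, "keyword": 0.9, "text": 0.6, "the": 0.2}
for keyword in select_keywords(sample_scores, ratio=0.4, split=True, scores=True):
    print(keyword)
```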

Dependencies:

gensim
nltk

Requires Python 3.

Reference: