summarize.py


A simple, multi-language text summarizer written in Python on top of NLTK.

Installation

$ pip install pysummarize

Setup

Before first use, make sure the NLTK stopwords and punkt packages are downloaded:

import nltk
nltk.download(['stopwords', 'punkt'])
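If the setup step runs on every start of a script, it may be convenient to download only what is missing. The following is a minimal sketch, not part of this package: `ensure_nltk_data` is a hypothetical helper name, and the resource paths `corpora/stopwords` and `tokenizers/punkt` are assumed to be where NLTK stores these two packages. It relies on `nltk.data.find` raising `LookupError` for a resource that is not installed.

```python
def ensure_nltk_data(resources=None, find=None, download=None):
    """Download each NLTK resource only if it is not already installed.

    `resources` maps an NLTK data path to its download name.
    `find` and `download` default to nltk.data.find and nltk.download;
    they are parameters only so the helper can be tried with stand-ins.
    Returns the list of names that had to be downloaded.
    """
    if resources is None:
        # Assumed locations of the two packages this README asks for.
        resources = {
            "corpora/stopwords": "stopwords",
            "tokenizers/punkt": "punkt",
        }
    if find is None or download is None:
        import nltk  # imported lazily, only when the real functions are needed
        find = find or nltk.data.find
        download = download or nltk.download

    fetched = []
    for path, name in resources.items():
        try:
            find(path)          # raises LookupError when the resource is absent
        except LookupError:
            download(name)
            fetched.append(name)
    return fetched
```

Calling `ensure_nltk_data()` with no arguments before importing `summarize` would then fetch the two packages on the first run and do nothing afterwards.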

Quick start

from summarize import summarize
summarize("Alice and Bob are friends. Alice is fun and cuddly."
          " Bob is cute and quirky. Together they go on wonderful"
          " adventures in the land of tomorrow. Alice's cuddliness"
          " and Bob's cuteness allow them to reach their goals."
          " But before they get to them, they have to go past their"
          " mortal enemy - Mr. Boredom. He is ugly and mean. They"
          " will surely defeat him. He is no match for their abilities.")

Usage

summarize(text[, sentence_count=5, language='english'])

Supported languages

In theory, any language with full support in NLTK (stemming, sentence tokenization and stopwords) should work.
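To illustrate why those three components matter, here is a rough, dependency-free sketch of an extractive summarizer. It is not this package's implementation and may not match the algorithm it uses: the frequency-based scoring is a simple stand-in, and `sketch_summarize`, `split_sentences`, `stem`, `content_stems` and `STOPWORDS` are hypothetical names. Each crude helper marks the spot where a language-specific NLTK component would normally plug in.

```python
import re
from collections import Counter

# Tiny hand-written list; a stand-in for NLTK's per-language stopword corpus.
STOPWORDS = {"a", "an", "and", "are", "but", "for", "he", "in", "is",
             "of", "on", "the", "their", "them", "they", "to", "will"}


def split_sentences(text):
    """Crude splitter on ., ! or ?; a stand-in for NLTK's punkt tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def stem(word):
    """Naive suffix stripping; a stand-in for a real language-aware stemmer."""
    for suffix in ("ness", "ing", "ly", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def content_stems(sentence):
    """Lowercase, drop stopwords, and stem the remaining words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [stem(w) for w in words if w not in STOPWORDS]


def sketch_summarize(text, sentence_count=5):
    """Keep the highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    frequencies = Counter(s for sent in sentences for s in content_stems(sent))

    def score(sentence):
        stems = content_stems(sentence)
        # Average stem frequency, so long sentences are not favoured by length alone.
        return sum(frequencies[s] for s in stems) / len(stems) if stems else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:sentence_count])
    return " ".join(sentences[i] for i in keep)
```

The regex splitter breaks on every period, so it would cut "Mr. Boredom" from the quick-start text in two. That is the kind of case a trained sentence tokenizer is meant to handle, and the same applies to stemming and stopwords in languages other than English.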

Working

Supported by NLTK, but untested results

Online demo

http://summarize.plansfortheday.org/ (source)

Credits

Original description of the algorithm: https://web.archive.org/web/20170825024342/http://engineering.flipboard.com/2014/10/summarization/