cc.mallet.pipe.TokenSequence2FeatureSequence Java Examples

The following examples show how to use cc.mallet.pipe.TokenSequence2FeatureSequence.
Example #1
Source File: TopicModelPipe.java    From baleen with Apache License 2.0
/**
 * Construct a topic model pipe with the given stopwords and alphabet
 *
 * @param stopwords to be removed
 * @param alphabet to use
 */
public TopicModelPipe(Collection<String> stopwords, Alphabet alphabet) {
  // @formatter:off
  super(
      ImmutableList.of(
          new CharSequenceLowercase(),
          new CharSequence2TokenSequence(Pattern.compile("\\p{L}[\\p{L}\\p{P}]+\\p{L}")),
          new RemoveStopwords(stopwords),
          new TokenSequence2FeatureSequence(alphabet)));
  // @formatter:on
}
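
The token pattern in Example #1 is easier to follow with a concrete input. The sketch below uses only java.util.regex, not MALLET; the class name TokenPatternDemo and the sample sentence are made up for illustration. It lowercases the text first, as the pipe does, and then collects every match of the same pattern. Because the pattern requires a letter, at least one letter or punctuation character, and a closing letter, tokens shorter than three characters are dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenPatternDemo {
    // Same pattern as in Example #1: a letter, one or more letters or
    // punctuation characters, then a final letter (at least 3 characters).
    static final Pattern TOKEN = Pattern.compile("\\p{L}[\\p{L}\\p{P}]+\\p{L}");

    // Hypothetical helper: lowercase the text, then collect every pattern match.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text.toLowerCase(Locale.ROOT));
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "is" and "OK" are too short to match; the hyphenated word stays whole.
        System.out.println(tokenize("The state-of-the-art model is OK"));
        // prints [the, state-of-the-art, model]
    }
}
```

Two-letter words never reach the stopword filter, so a stopword list used with this pattern only needs entries of three or more characters.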
 
Example #2
Source File: ReferencesClassifierTrainer.java    From bluima with Apache License 2.0
static List<Pipe> getPipes() {
    List<Pipe> pipes = newArrayList();
    // Map the target string to a Label
    pipes.add(new Target2Label());
    // Project-specific tokenizer (defined in bluima)
    pipes.add(new MyInput2RegexTokens());

    // pipes.add(new PrintInputAndTarget());

    // Tokens -> feature indices -> feature vector
    pipes.add(new TokenSequence2FeatureSequence());
    pipes.add(new FeatureSequence2FeatureVector());
    return pipes;
}
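
The last two pipes in Example #2 can be pictured with a small sketch that does not depend on MALLET. It is a conceptual stand-in, not MALLET's implementation: the class FeatureSequenceSketch and its methods are hypothetical, and the counting step assumes the non-binary behaviour of FeatureSequence2FeatureVector. The first method assigns each distinct token an integer index in order of first appearance, and the second collapses the index sequence into per-feature counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FeatureSequenceSketch {
    // Hypothetical stand-in for a growing alphabet:
    // each unseen token is given the next free index.
    private final Map<String, Integer> alphabet = new LinkedHashMap<>();

    // Roughly the role of TokenSequence2FeatureSequence: tokens -> indices.
    int[] toFeatureSequence(List<String> tokens) {
        int[] indices = new int[tokens.size()];
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            Integer index = alphabet.get(token);
            if (index == null) {
                index = alphabet.size();
                alphabet.put(token, index);
            }
            indices[i] = index;
        }
        return indices;
    }

    // Roughly the role of FeatureSequence2FeatureVector: indices -> counts.
    double[] toCountVector(int[] sequence) {
        double[] counts = new double[alphabet.size()];
        for (int index : sequence) {
            counts[index] += 1.0;
        }
        return counts;
    }

    public static void main(String[] args) {
        FeatureSequenceSketch sketch = new FeatureSequenceSketch();
        int[] seq = sketch.toFeatureSequence(Arrays.asList("topic", "model", "topic"));
        System.out.println(Arrays.toString(seq));                       // prints [0, 1, 0]
        System.out.println(Arrays.toString(sketch.toCountVector(seq))); // prints [2.0, 1.0]
    }
}
```

Word order survives in the index sequence but is lost in the count vector, which is why topic models such as Example #3 stop at the feature sequence while the classifier pipeline in Example #2 goes one step further.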
 
Example #3
Source File: LDA.java    From topic-detection with Apache License 2.0
/**
 * Creates a list of Mallet instances from a list of documents
 * @param texts a list of documents
 * @return a list of Mallet instances
 * @throws IOException
 */
private InstanceList createInstanceList(List<String> texts) throws IOException
{
	ArrayList<Pipe> pipes = new ArrayList<Pipe>();
	pipes.add(new CharSequence2TokenSequence());
	pipes.add(new TokenSequenceLowercase());
	pipes.add(new TokenSequenceRemoveStopwords());
	pipes.add(new TokenSequence2FeatureSequence());
	InstanceList instanceList = new InstanceList(new SerialPipes(pipes));
	instanceList.addThruPipe(new ArrayIterator(texts));
	return instanceList;
}