Java Code Examples for org.apache.lucene.analysis.TokenStream#getAttributeImplsIterator()

The following examples show how to use org.apache.lucene.analysis.TokenStream#getAttributeImplsIterator().
Example 1
Source File: Retriever.java    From lucene4ir with Apache License 2.0
/**
 * Returns the list of tokens extracted from the query string using the specified analyzer.
 *
 * @param field document field.
 *
 * @param queryTerms query string.
 *
 * @param distinctTokens if true, return the distinct tokens in the query string.
 *
 * @return the list of tokens extracted from the given query.
 *
 * @throws IOException if the analyzer fails while reading the query string.
 */
List<String> getTokens(String field, String queryTerms, boolean distinctTokens) throws IOException {

    List<String> tokens = new ArrayList<>();
    Set<String> seenTokens = new HashSet<>();

    // TokenStream is Closeable, so try-with-resources closes it even if analysis fails.
    try (TokenStream tok = analyzer.tokenStream(field, new StringReader(queryTerms))) {
        tok.reset();
        while (tok.incrementToken()) {
            // Assumes the first AttributeImpl is the term attribute, whose toString()
            // yields the term text; this depends on the analyzer in use.
            Iterator<AttributeImpl> atts = tok.getAttributeImplsIterator();
            AttributeImpl token = atts.next();
            String text = token.toString();
            // add() returns false for a token already seen; skip it when deduplicating.
            if (!seenTokens.add(text) && distinctTokens) {
                continue;
            }
            tokens.add(text);
        }
        tok.end(); // finish end-of-stream processing before the stream is closed
    }

    return tokens;
}
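
The distinctTokens switch in Example 1 can be studied apart from Lucene. The following is a minimal sketch under stated assumptions: the class TokenDedup and its collect method are hypothetical names, not part of lucene4ir or Lucene, and the input list stands in for the terms an analyzer would emit. When distinct is true it keeps the first occurrence of each token in stream order; otherwise it returns every token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of lucene4ir or Lucene.
class TokenDedup {

    // Returns the tokens in input order, dropping repeats when distinct is true.
    static List<String> collect(List<String> rawTokens, boolean distinct) {
        List<String> tokens = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String text : rawTokens) {
            // Set.add returns false when the token was already present.
            boolean firstOccurrence = seen.add(text);
            if (distinct && !firstOccurrence) {
                continue;
            }
            tokens.add(text);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Stand-in for analyzer output; a real caller would read these from a TokenStream.
        List<String> raw = Arrays.asList("lucene", "token", "lucene", "stream");
        System.out.println(collect(raw, true));  // prints [lucene, token, stream]
        System.out.println(collect(raw, false)); // prints [lucene, token, lucene, stream]
    }
}
```

Folding the membership check and the insertion into a single Set.add call avoids a second lookup per token. A hash-based set suffices here because the result order comes from the list, not from the set.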
 
Example 2
Source File: AnalysisImpl.java    From lucene-solr with Apache License 2.0
private List<TokenAttribute> copyAttributes(TokenStream tokenStream, CharTermAttribute charAtt) {
  List<TokenAttribute> attributes = new ArrayList<>();
  // Visit every attribute implementation currently registered on the stream.
  Iterator<AttributeImpl> itr = tokenStream.getAttributeImplsIterator();
  while (itr.hasNext()) {
    AttributeImpl att = itr.next();
    // LinkedHashMap keeps the keys in the order the attribute reports them.
    Map<String, String> attValues = new LinkedHashMap<>();
    att.reflectWith((attClass, key, value) -> {
      // Skip unset values so the map holds only populated fields.
      if (value != null) {
        attValues.put(key, value.toString());
      }
    });
    attributes.add(new TokenAttribute(att.getClass().getSimpleName(), attValues));
  }
  return attributes;
}
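
The callback pattern in Example 2 (an attribute reports its fields one at a time and the caller keeps only the non-null ones) can be tried without Lucene on the classpath. The sketch below is an illustration under assumptions: Reflector and ToyOffsetAttribute are hypothetical stand-ins for Lucene's AttributeReflector and AttributeImpl, and the field names and values are made up for the demonstration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins; the real types are Lucene's AttributeReflector and AttributeImpl.
class ReflectSketch {

    // Mirrors the three-argument callback shape used in Example 2.
    interface Reflector {
        void reflect(Class<?> attClass, String key, Object value);
    }

    // A toy attribute that reports its fields one by one through the callback.
    static class ToyOffsetAttribute {
        final int startOffset = 0;
        final int endOffset = 6;
        final String payload = null; // deliberately null to exercise the filter

        void reflectWith(Reflector reflector) {
            reflector.reflect(ToyOffsetAttribute.class, "startOffset", startOffset);
            reflector.reflect(ToyOffsetAttribute.class, "endOffset", endOffset);
            reflector.reflect(ToyOffsetAttribute.class, "payload", payload);
        }
    }

    // Collects the non-null reported values as strings, in reporting order.
    static Map<String, String> describe(ToyOffsetAttribute att) {
        Map<String, String> values = new LinkedHashMap<>();
        att.reflectWith((attClass, key, value) -> {
            if (value != null) {
                values.put(key, value.toString());
            }
        });
        return values;
    }

    public static void main(String[] args) {
        System.out.println(describe(new ToyOffsetAttribute())); // prints {startOffset=0, endOffset=6}
    }
}
```

The null payload never reaches the map, which is the same filtering that copyAttributes applies before building each TokenAttribute.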