Java Code Examples for org.apache.nutch.crawl.CrawlDatum#setScore()

The following examples show how to use org.apache.nutch.crawl.CrawlDatum#setScore(). Each example lists its source file, the project it comes from, and that project's license.
Example 1
Source File: OPICScoringFilter.java    From anthelion with Apache License 2.0    5 votes
/** Increase the score by a sum of inlinked scores. */
public void updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List inlinked) throws ScoringFilterException {
  float adjust = 0.0f;
  for (int i = 0; i < inlinked.size(); i++) {
    CrawlDatum linked = (CrawlDatum)inlinked.get(i);
    adjust += linked.getScore();
  }
  if (old == null) old = datum;
  datum.setScore(old.getScore() + adjust);
}
 
Example 2
Source File: OPICScoringFilter.java    From nutch-htmlunit with Apache License 2.0    5 votes
/** Increase the score by a sum of inlinked scores. */
public void updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked) throws ScoringFilterException {
  float adjust = 0.0f;
  for (int i = 0; i < inlinked.size(); i++) {
    CrawlDatum linked = inlinked.get(i);
    adjust += linked.getScore();
  }
  if (old == null) old = datum;
  datum.setScore(old.getScore() + adjust);
}
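
The update rule in Examples 1 and 2 can be tried without a Nutch installation. The following is a minimal sketch, not the Nutch implementation: Datum is a hypothetical stand-in for CrawlDatum that models only getScore() and setScore(), and ScoreUpdateSketch and updateScore are illustrative names that do not appear in the original projects.

```java
import java.util.Arrays;
import java.util.List;

public class ScoreUpdateSketch {
    /** Hypothetical stand-in for CrawlDatum; only the score accessors are modeled. */
    static class Datum {
        private float score;
        Datum(float score) { this.score = score; }
        float getScore() { return score; }
        void setScore(float score) { this.score = score; }
    }

    /** Same rule as above: new score = previous score + sum of inlink scores. */
    static void updateScore(Datum old, Datum datum, List<Datum> inlinked) {
        float adjust = 0.0f;
        for (Datum linked : inlinked) {
            adjust += linked.getScore();
        }
        // No previous entry: fall back to the datum's own current score.
        if (old == null) old = datum;
        datum.setScore(old.getScore() + adjust);
    }

    public static void main(String[] args) {
        Datum old = new Datum(1.0f);
        Datum datum = new Datum(0.0f);
        updateScore(old, datum, Arrays.asList(new Datum(0.25f), new Datum(0.5f)));
        System.out.println(datum.getScore()); // prints 1.75
    }
}
```

Note that when old is null the method reads the score already stored in datum, so the value set at initialization is the baseline for the first update.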
 
Example 3
Source File: OPICScoringFilter.java    From anthelion with Apache License 2.0    4 votes
/** Set to 0.0f (unknown value) - inlink contributions will bring it to
 * a correct level. Newly discovered pages have at least one inlink. */
public void initialScore(Text url, CrawlDatum datum) throws ScoringFilterException {
  datum.setScore(0.0f);
}
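
The comment above says a new page starts at 0.0f and that inlink contributions raise it afterwards. A minimal sketch of that sequence follows, under stated assumptions: Datum is a hypothetical stand-in for CrawlDatum, and InitialScoreSketch and applyInlinks are illustrative names, not part of the Nutch API.

```java
import java.util.Collections;
import java.util.List;

public class InitialScoreSketch {
    /** Hypothetical stand-in for CrawlDatum; only the score accessors are modeled. */
    static class Datum {
        private float score;
        float getScore() { return score; }
        void setScore(float score) { this.score = score; }
    }

    /** Same rule as initialScore above: a newly discovered page starts at 0.0f. */
    static void initialScore(Datum datum) {
        datum.setScore(0.0f);
    }

    /** Adds the inlink scores to the page's current score. */
    static void applyInlinks(Datum datum, List<Datum> inlinked) {
        float adjust = 0.0f;
        for (Datum linked : inlinked) {
            adjust += linked.getScore();
        }
        datum.setScore(datum.getScore() + adjust);
    }

    public static void main(String[] args) {
        Datum page = new Datum();
        initialScore(page);

        // The single inlink through which the page was discovered.
        Datum inlink = new Datum();
        inlink.setScore(0.5f);

        applyInlinks(page, Collections.singletonList(inlink));
        System.out.println(page.getScore()); // prints 0.5
    }
}
```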
 
Example 4
Source File: LinkAnalysisScoringFilter.java    From anthelion with Apache License 2.0    4 votes
/** Newly discovered pages start with a score of 0.0f. */
public void initialScore(Text url, CrawlDatum datum)
  throws ScoringFilterException {
  datum.setScore(0.0f);
}
 
Example 5
Source File: OPICScoringFilter.java    From nutch-htmlunit with Apache License 2.0    4 votes
/** Set to 0.0f (unknown value) - inlink contributions will bring it to
 * a correct level. Newly discovered pages have at least one inlink. */
public void initialScore(Text url, CrawlDatum datum) throws ScoringFilterException {
  datum.setScore(0.0f);
}
 
Example 6
Source File: LinkAnalysisScoringFilter.java    From nutch-htmlunit with Apache License 2.0    4 votes
/** Newly discovered pages start with a score of 0.0f. */
public void initialScore(Text url, CrawlDatum datum)
  throws ScoringFilterException {
  datum.setScore(0.0f);
}