Java Code Examples for org.apache.nutch.crawl.CrawlDatum#getScore()

The following examples show how to use org.apache.nutch.crawl.CrawlDatum#getScore().
Example 1
Source File: OPICScoringFilter.java    From anthelion with Apache License 2.0
/** Increase the score by a sum of inlinked scores. */
public void updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List inlinked) throws ScoringFilterException {
  float adjust = 0.0f;
  for (int i = 0; i < inlinked.size(); i++) {
    CrawlDatum linked = (CrawlDatum)inlinked.get(i);
    adjust += linked.getScore();
  }
  if (old == null) old = datum;
  datum.setScore(old.getScore() + adjust);
}
 
Example 2
Source File: OPICScoringFilter.java    From nutch-htmlunit with Apache License 2.0
/** Increase the score by a sum of inlinked scores. */
public void updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked) throws ScoringFilterException {
  float adjust = 0.0f;
  for (int i = 0; i < inlinked.size(); i++) {
    CrawlDatum linked = inlinked.get(i);
    adjust += linked.getScore();
  }
  if (old == null) old = datum;
  datum.setScore(old.getScore() + adjust);
}
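
The update in Examples 1 and 2 can be tried without Nutch on the classpath. The following is a minimal sketch, not the project's code: `Datum` is a hypothetical stand-in for `CrawlDatum` that models only `getScore()` and `setScore(float)`, and the `url` parameter and `ScoringFilterException` are left out for brevity.

```java
import java.util.Arrays;
import java.util.List;

public class OpicUpdateSketch {

  /** Hypothetical stand-in for CrawlDatum; only the score accessors are modeled. */
  static class Datum {
    private float score;
    Datum(float score) { this.score = score; }
    float getScore() { return score; }
    void setScore(float score) { this.score = score; }
  }

  /** Mirrors the update above: new score = previous score + sum of inlinked scores. */
  static void updateDbScore(Datum old, Datum datum, List<Datum> inlinked) {
    float adjust = 0.0f;
    for (Datum linked : inlinked) {
      adjust += linked.getScore();
    }
    // With no previous entry, the new datum's own score is the starting point.
    if (old == null) old = datum;
    datum.setScore(old.getScore() + adjust);
  }

  public static void main(String[] args) {
    Datum datum = new Datum(1.0f);
    updateDbScore(null, datum, Arrays.asList(new Datum(0.25f), new Datum(0.5f)));
    System.out.println(datum.getScore()); // 1.0 + 0.25 + 0.5
  }
}
```

When `old` is null, the new datum's own score serves as the base, so a first-seen URL still accumulates the scores of its inlinks.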
 
Example 3
Source File: OPICScoringFilter.java    From anthelion with Apache License 2.0
/** Use {@link CrawlDatum#getScore()}. */
public float generatorSortValue(Text url, CrawlDatum datum, float initSort) throws ScoringFilterException {
  return datum.getScore() * initSort;
}
 
Example 4
Source File: AnthelionScoringFilter.java    From anthelion with Apache License 2.0
/**
 * Returns the score used to select the URLs that will be fetched next.
 * This implementation ignores initSort.
 */
@Override
public float generatorSortValue(Text url, CrawlDatum datum, float initSort) throws ScoringFilterException {
	// Sort purely by the datum's current score.
	return datum.getScore();
}
}
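
The `generatorSortValue` implementations on this page come in two variants: most multiply the score by `initSort`, while the one in Example 4 returns the score alone. The sketch below contrasts the two using plain floats. The class and method names are hypothetical and are not part of Nutch.

```java
public class SortValueSketch {

  /** Variant that scales the incoming sort value by the datum's score. */
  static float scaled(float score, float initSort) {
    return score * initSort;
  }

  /** Variant that returns the score and ignores initSort. */
  static float scoreOnly(float score, float initSort) {
    return score;
  }

  public static void main(String[] args) {
    float score = 2.0f;
    float initSort = 0.5f;
    System.out.println(scaled(score, initSort));    // 2.0 * 0.5
    System.out.println(scoreOnly(score, initSort)); // initSort has no effect
  }
}
```

The practical difference is that the score-only variant discards whatever sort value was passed in, so `initSort` has no influence on the result.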
 
Example 5
Source File: LinkAnalysisScoringFilter.java    From anthelion with Apache License 2.0
public float generatorSortValue(Text url, CrawlDatum datum, float initSort)
  throws ScoringFilterException {
  return datum.getScore() * initSort;
}
 
Example 6
Source File: LinkAnalysisScoringFilter.java    From anthelion with Apache License 2.0
public float indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum,
  CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)
  throws ScoringFilterException {
  // normalizedScore is a field of the enclosing filter class (declaration not shown).
  return (normalizedScore * dbDatum.getScore());
}
 
Example 7
Source File: OPICScoringFilter.java    From nutch-htmlunit with Apache License 2.0
/** Use {@link CrawlDatum#getScore()}. */
public float generatorSortValue(Text url, CrawlDatum datum, float initSort) throws ScoringFilterException {
  return datum.getScore() * initSort;
}
 
Example 8
Source File: LinkAnalysisScoringFilter.java    From nutch-htmlunit with Apache License 2.0
public float generatorSortValue(Text url, CrawlDatum datum, float initSort)
  throws ScoringFilterException {
  return datum.getScore() * initSort;
}
 
Example 9
Source File: LinkAnalysisScoringFilter.java    From nutch-htmlunit with Apache License 2.0
public float indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum,
  CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)
  throws ScoringFilterException {
  // normalizedScore is a field of the enclosing filter class (declaration not shown).
  return (normalizedScore * dbDatum.getScore());
}