us.codecraft.webmagic.pipeline.PageModelPipeline Java Examples

The following examples show how to use us.codecraft.webmagic.pipeline.PageModelPipeline. Each example names the source file and project it was taken from.
Example #1
Source File: ModelPipeline.java    From webmagic with Apache License 2.0
@Override
public void process(ResultItems resultItems, Task task) {
    for (Map.Entry<Class, PageModelPipeline> entry : pageModelPipelines.entrySet()) {
        // Results are looked up by the canonical name of the page model class
        Object o = resultItems.get(entry.getKey().getCanonicalName());
        if (o != null) {
            Annotation annotation = entry.getKey().getAnnotation(ExtractBy.class);
            if (annotation == null || !((ExtractBy) annotation).multi()) {
                // Single model object: pass it to the pipeline directly
                entry.getValue().process(o, task);
            } else {
                // multi() is true: the result is a list, so process each element
                List<Object> list = (List<Object>) o;
                for (Object o1 : list) {
                    entry.getValue().process(o1, task);
                }
            }
        }
    }
}
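 
The dispatch rule in Example #1 can be tried out without WebMagic on the classpath. The following is a minimal sketch under stated assumptions, not the library's implementation: ModelHandler, DispatchSketch and the multiClasses parameter are hypothetical names, a plain Map stands in for ResultItems, and a Set of classes stands in for the multi flag that the real code reads from the @ExtractBy annotation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for PageModelPipeline; not the WebMagic interface.
interface ModelHandler {
    void process(Object model);
}

// Hypothetical sketch of per-class dispatch with a single-object and a list branch.
final class DispatchSketch {
    private final Map<Class<?>, ModelHandler> handlers = new LinkedHashMap<>();

    // Registers a handler for one model class; returns this for chaining.
    DispatchSketch put(Class<?> modelClass, ModelHandler handler) {
        handlers.put(modelClass, handler);
        return this;
    }

    // results: extracted objects keyed by canonical class name (assumed layout).
    // multiClasses: classes whose result is a List of models (assumed stand-in
    // for checking the multi flag on an annotation).
    void dispatch(Map<String, Object> results, Set<? extends Class<?>> multiClasses) {
        for (Map.Entry<Class<?>, ModelHandler> entry : handlers.entrySet()) {
            Object result = results.get(entry.getKey().getCanonicalName());
            if (result == null) {
                continue; // nothing was extracted for this model class
            }
            if (multiClasses.contains(entry.getKey())) {
                // List result: hand each element to the handler separately
                for (Object item : (List<?>) result) {
                    entry.getValue().process(item);
                }
            } else {
                // Single result: hand it over as-is
                entry.getValue().process(result);
            }
        }
    }
}
```

Handlers run in registration order here because the sketch uses a LinkedHashMap; whether the real ModelPipeline guarantees any order depends on the map type it declares, which the snippet above does not show.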
 
Example #2
Source File: Kr36NewsModel.java    From webmagic with Apache License 2.0
public static void main(String[] args) throws IOException, JMException {
    // Just for benchmark
    Spider spider = OOSpider.create(Site.me().setSleepTime(0), new PageModelPipeline() {
        @Override
        public void process(Object o, Task task) {
            // Intentionally empty: extracted objects are discarded
        }
    }, Kr36NewsModel.class).thread(20).addUrl("http://www.36kr.com/");
    spider.start();
    SpiderMonitor spiderMonitor = SpiderMonitor.instance();
    spiderMonitor.register(spider);
}
 
Example #3
Source File: OschinaBlog.java    From webmagic with Apache License 2.0
public static void main(String[] args) {
    Site site = Site.me()
            .setUserAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36")
            .setSleepTime(0)
            .setRetryTimes(3);
    OOSpider.create(site, new PageModelPipeline() {
        @Override
        public void process(Object o, Task task) {
            // Intentionally empty: extracted objects are discarded
        }
    }, OschinaBlog.class).thread(10).addUrl("http://my.oschina.net/flashsword/blog").run();
}
 
Example #4
Source File: OOSpider.java    From webmagic with Apache License 2.0
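/**
 * Creates a spider that extracts the given page model classes and
 * registers the pipeline for each of them.
 *
 * @param site site configuration used by the page processor
 * @param pageModelPipeline pipeline registered for every page model; if null, no pipeline is registered
 * @param pageModels page model classes to extract
 */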
public OOSpider(Site site, PageModelPipeline pageModelPipeline, Class... pageModels) {
    this(ModelPageProcessor.create(site, pageModels));
    this.modelPipeline = new ModelPipeline();
    super.addPipeline(modelPipeline);
    for (Class pageModel : pageModels) {
        if (pageModelPipeline != null) {
            this.modelPipeline.put(pageModel, pageModelPipeline);
        }
        pageModelClasses.add(pageModel);
    }
}
 
Example #5
Source File: OOSpider.java    From webmagic with Apache License 2.0
public OOSpider addPageModel(PageModelPipeline pageModelPipeline, Class... pageModels) {
    for (Class pageModel : pageModels) {
        modelPageProcessor.addPageModel(pageModel);
        modelPipeline.put(pageModel, pageModelPipeline);
    }
    return this;
}
 
Example #6
Source File: GithubRepoTest.java    From webmagic with Apache License 2.0
@Test
public void test() {
    OOSpider.create(Site.me().setSleepTime(0), new PageModelPipeline<GithubRepo>() {
        @Override
        public void process(GithubRepo o, Task task) {
            assertThat(o.getStar()).isEqualTo(86);
            assertThat(o.getFork()).isEqualTo(70);
        }
    }, GithubRepo.class)
            .addUrl("https://github.com/code4craft/webmagic")
            .setDownloader(new MockGithubDownloader())
            .test("https://github.com/code4craft/webmagic");
}
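 
The anonymous pipelines in the examples above either discard their input or assert on it. When the extracted objects are needed after the crawl, one option is a pipeline that collects them. The following is a minimal sketch, assuming a stand-in interface in place of PageModelPipeline<T>: ModelSink and CollectingSink are hypothetical names, and the Task argument of the real process method is omitted to keep the block self-contained. A thread-safe list is used because the spiders above run with several threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for PageModelPipeline<T>; the real interface also takes a Task.
interface ModelSink<T> {
    void process(T model);
}

// Hypothetical collecting pipeline: stores every model it receives.
final class CollectingSink<T> implements ModelSink<T> {
    // CopyOnWriteArrayList tolerates concurrent calls from multiple crawler threads
    private final List<T> items = new CopyOnWriteArrayList<>();

    @Override
    public void process(T model) {
        items.add(model);
    }

    // Returns a copy so callers cannot modify the internal list
    List<T> snapshot() {
        return new ArrayList<>(items);
    }
}
```

With WebMagic on the classpath, the same idea would implement PageModelPipeline<T> instead and could be passed to OOSpider.create in place of the anonymous class.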
 
Example #7
Source File: ModelPipeline.java    From webmagic with Apache License 2.0
public ModelPipeline put(Class clazz, PageModelPipeline pageModelPipeline) {
    pageModelPipelines.put(clazz, pageModelPipeline);
    return this;
}
 
Example #8
Source File: OOSpider.java    From webmagic with Apache License 2.0
public static OOSpider create(Site site, PageModelPipeline pageModelPipeline, Class... pageModels) {
    return new OOSpider(site, pageModelPipeline, pageModels);
}