ProxyPool

@Tony沈哲 on weibo Download License

1. 使用方法

对于Java工程如果使用gradle构建,由于默认没有使用jcenter(),需要在相应module的build.gradle中配置

repositories {
    mavenCentral()
    jcenter()
}

Gradle:

compile 'com.cv4j.proxy:proxypool:1.1.13'

1)本地装好MongoDB数据库

2)proxypool-web模块下的application.properties,参考配置如下:

spring.data.mongodb.uri=mongodb://localhost:27017/proxypool
spring.data.mongodb.uri=mongodb://username:[email protected]:27017/proxypool (有账号密码)

3)创建database和collection

database:proxypool
collection:Proxy_Resource、Resource_Plan、Proxy、Job_Log、Sys_Sequence

4)collection中的默认数据 Proxy_Resource:

{
    "_id" : ObjectId("5a48578737a340d5c48a84af"),
    "_class" : "com.cv4j.proxy.web.dto.ProxyResource",
    "resId" : 1,
    "webName" : "西刺国内高匿代理",
    "webUrl" : "http://www.xicidaili.com/nn/1.html",
    "pageCount" : 100,
    "prefix" : "http://www.xicidaili.com/nn/",
    "suffix" : ".html",
    "parser" : "com.cv4j.proxy.site.xicidaili.XicidailiProxyListPageParser",
    "addTime" : NumberLong(1515114009516),
    "modTime" : NumberLong(1515114009516)
}

Sys_Sequence:

{
    "_id" : ObjectId("5a4f2baf87ccb25df57b096b"),
    "colName" : "Proxy_Resource",
    "sequence" : 2
}

5)运行 按照SpringBoot项目的方式运行程序,访问web的地址如下:

6)定时抓取数据的job proxypool-web模块下目前配置了一个每隔三小时自动运行的job:

com.cv4j.proxy.web.job.ScheduleJobs.cronJob()
application.properties: cronJob.schedule = 0 0 0/3 * * ?s

2. 免费的在线演示:

3. 专业的爬虫

笔者开发的专业的爬虫框架: NetDiscovery

4. 联系方式:

Wechat:fengzhizi715

Java与Android技术栈:每周更新推送原创技术文章,欢迎扫描下方的公众号二维码并关注,期待与您的共同成长和进步。

License

Copyright (C) 2017 - present, Tony Shen.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.