939 Commits (17d8bfa907da148d5b0cc6452ce3fc3360c1fdfc)

Author SHA1 Message Date
xbynet@outlook.com c23627bf63 fix the 302 redirect issue in post/redirect/post 2017-01-17 00:07:01 +08:00
yihua.huang 7ca4ed04b4 add Gather Platform to readme zh 2017-01-14 06:25:39 +08:00
Jsbd 6d78d51fc0 Merge branch 'master' into master 2016-12-27 14:15:40 +08:00
yihua.huang e3131af856 remove user manual link 2016-12-24 07:34:25 +08:00
yihua.huang 7f0d79ccb0 readme 2016-12-19 09:38:33 +08:00
yihua.huang ab1dad3a71 remove lib 2016-12-18 11:49:20 +08:00
yihua.huang a6b2e307f3 docs for 0.6.0 2016-12-18 11:48:57 +08:00
yihua.huang d69204b919 0.6.0 2016-12-18 11:45:43 +08:00
yihua.huang 9bdb48b2d0 version 0.6.0 2016-12-18 11:20:28 +08:00
yihua.huang eeb607fd0e revert the exception throwing in Spider.processRequest() to the original logic 2016-12-18 11:04:58 +08:00
yihua.huang 97592d6720 Version 0.6.0 2016-12-18 10:58:24 +08:00
yihua.huang 42e5e623b4 remove project avalon 2016-12-18 10:52:01 +08:00
yihua.huang 00dfebbceb #424 remove guava dep and add fix docs 2016-12-18 10:45:50 +08:00
yihua.huang c2531c6817 clean dependency 2016-12-18 08:34:46 +08:00
yihua.huang a960a39c44 fix compile error for example change 2016-12-18 08:32:14 +08:00
yihua.huang a3ee9e3d08 fix example 2016-12-18 08:18:26 +08:00
yihua.huang 7476ceccee more stable test 2016-12-18 08:15:26 +08:00
yihua.huang 5ce3fdfe5a some refactor in log 2016-12-18 08:15:09 +08:00
yihua.huang 98163a3e40 update examples 2016-12-18 07:46:18 +08:00
yihua.huang 243ebc22fa #374 update httpclient version to 4.5.2 2016-12-18 07:15:40 +08:00
yihua.huang b090dcd20d specific error page for HttpClientDownloaderTest to avoid test error when local port is available 2016-12-18 07:15:06 +08:00
yihua.huang 8f942d6fe2 #419 fix the issue where threads cannot terminate when crawling https links, leaving the process running indefinitely 2016-12-18 06:56:01 +08:00
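Jsbd 1b886d48a2 add a PhantomJSDownloader constructor that supports a custom crawl.js path, because when another project depends on this jar, runtime.exec() cannot use the crawl.js inside the jar when running the phantomjs command 2016-12-08 14:29:42 +08:00
Jsbd d1f2e65e5d add a PhantomJSDownloader constructor that supports a custom crawl.js path, because when another project depends on this jar, runtime.exec() cannot use the crawl.js inside the jar when running the phantomjs command 2016-12-08 14:28:48 +08:00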
Jsbd f8a2328ead Merge pull request #1 from code4craft/master
merge
2016-12-02 17:22:55 +08:00
Yihua Huang 1987cd3ae1 Merge pull request #408 from code4craft/0.6.0
groovy demo
2016-12-02 14:20:46 +08:00
Yihua Huang 65fe2c4487 Merge pull request #407 from jsbd/master
add a new constructor to PhantomJSDownloader that supports a custom phantomjs command
2016-12-02 14:05:58 +08:00
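Jsbd ebc61363c8 add a new constructor to PhantomJSDownloader that supports a custom phantomjs command
add a new constructor to PhantomJSDownloader that supports a custom phantomjs command
 example: 
   *    phantomjs.exe: supports the Windows environment
   *    phantomjs --ignore-ssl-errors=yes: ignores some errors when the crawled address is https
   *    /usr/local/bin/phantomjs: absolute path of the command, avoiding an IOException caused by system environment variables
2016-12-02 10:17:46 +08:00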
yihua.huang fdf39eb99d open in new page 2016-11-25 18:20:59 +08:00
yihua.huang 1e74494708 add related link 2016-11-25 18:19:32 +08:00
yihua.huang b92e6b04f0 #400 fix the NPE caused by setting a custom DuplicateRemover on FileCacheQueueScheduler 2016-11-25 08:30:24 +08:00
yihua.huang dafd2b77ff fix GithubRepoPageProcessor in example 2016-11-24 08:18:06 +08:00
yihua.huang cfed860fb9 Merge branch 'master' of github.com:code4craft/webmagic 2016-11-22 17:00:27 +08:00
yihua.huang 2189aab652 fix test 2016-11-22 16:58:49 +08:00
Yihua Huang 1491033534 Merge pull request #377 from jerry-sc/monitor-bug
fix the monitor bug where the spider terminates when a seed url has a port
2016-11-19 13:01:30 +08:00
Yihua Huang 228911b58c Merge pull request #370 from gyk001/master
fixed #301 fix the issue with extracting JSON data via annotations
2016-11-19 12:59:48 +08:00
yihua.huang 507556d0aa fix test: ProxyTest.testProxy() do not load exist proxy config 2016-11-19 12:54:39 +08:00
yihua.huang 55f131e5ef #380 update fastjson to 1.2.21 2016-11-19 12:53:59 +08:00
Jerry e56b8c3efc fix the monitor bug where the spider terminates when a seed url has a port 2016-09-22 22:36:18 +08:00
郭玉昆 700898fe8a fixed #301 fix the issue with extracting JSON data via annotations 2016-08-29 17:07:37 +08:00
Yihua Huang e22d6426fc Merge pull request #343 from Salon-sai/master
add: redis scheduler with priority
2016-08-18 13:28:25 +08:00
Salon.sai f89a6a6826 add: redis scheduler with priority 2016-07-05 16:29:01 +08:00
yihua.huang 448e528140 update StringUtils to apache lang3 #314 2016-05-24 13:33:17 +08:00
yihua.huang 3e33959b7a #319 fix javadoc 2016-05-24 13:17:35 +08:00
yihua.huang 3a6e246350 Merge branch 'kapsterio-fix' 2016-05-08 20:54:57 +08:00
yihua.huang 8730e3e97a Merge branch 'fix' of git://github.com/kapsterio/webmagic into kapsterio-fix 2016-05-08 20:46:22 +08:00
Yihua Huang 37cb43b667 Merge pull request #176 from lavenderx/master
add PhantomJSDownloader
2016-05-08 20:36:17 +08:00
yihua.huang 2400ff7e1a resolve conflict 2016-05-08 20:31:43 +08:00
yihua.huang 9de64ea0f2 Merge branch 'hepan-master' 2016-05-08 20:27:53 +08:00
yihua.huang b7f3c4bba0 Merge branch 'master' of git://github.com/hepan/webmagic into hepan-master 2016-05-08 20:27:47 +08:00