Commit Graph

1118 Commits (133106a15c573a56754938d10bf3fdb56b40f810)

Author SHA1 Message Date
Sutra Zhou 133106a15c
Merge pull request #1013 from FreaxLin/Restore_the_already_crawled_example
Add an example of restoring already-crawled content
2021-04-09 17:04:18 +08:00
linweisen 76f625c02e Add an example of restoring already-crawled content 2021-04-09 17:00:00 +08:00
Sutra Zhou be6f5ff771 Add missing @Deprecated annotations. 2021-03-22 18:21:59 +08:00
Sutra Zhou 4e8a086dae Pass exception to onError. Fixes #1005. 2021-03-22 18:21:10 +08:00
Sutra Zhou dcfd238413 Polish java version setting. 2021-03-01 01:06:42 +08:00
Sutra Zhou 59fc16101b
Merge pull request #1000 from thebirdandfish/develop
Add a getter for the List<SpiderStatusMXBean> property so that subclasses of SpiderMonitor can access it.
2021-02-27 16:17:10 +08:00
wecandoitjustthink 528a8908af Add a getter for the List<SpiderStatusMXBean> property so that subclasses of SpiderMonitor can access it. 2021-02-27 19:59:05 +13:00
Sutra Zhou 71b7dfbf9a
Merge pull request #993 from yqia182/master
Add validation to the getPagePerSecond() method in SpiderStatus to avoid null pointer exceptions and division by zero.
2021-02-03 10:13:50 +08:00
JustThink 54127318a4 Add validation to the getPagePerSecond() method in SpiderStatus to avoid null pointer exceptions and division by zero. 2021-02-03 02:43:53 +13:00
Sutra Zhou d92dc8397f Upgrade htmlcleaner from 2.5 to 2.9, this is the highest version to let Xpath2Selector pass the test cases. 2021-01-11 01:46:32 +08:00
Sutra Zhou 124c52b988 Downgrade htmlcleaner from 2.24 back to 2.5, to make Xpath2Selector pass the test cases. 2021-01-11 01:25:41 +08:00
Sutra Zhou 683db09133 Complete testXPath2 assertion. 2021-01-11 00:35:22 +08:00
Sutra Zhou 2f71f7912c Fix scm tag. 2021-01-10 14:31:40 +08:00
Sutra Zhou d0e2776991 Upgrade xsoup from 0.3.1 to 0.3.2. 2021-01-10 14:10:32 +08:00
Sutra Zhou 0e01550a79 Upgrade dependencies, including the jedis from 2.9.3 to 3.4.1. 2021-01-06 03:21:10 +08:00
Sutra Zhou 0d73f08ef6 Upgrade maven plugins. 2021-01-06 02:29:34 +08:00
Sutra Zhou e14a762632 Add gitflow-maven-plugin. 2021-01-05 23:14:24 +08:00
Sutra Zhou ab6ff7f809 Revert "Modify pageCount"
This reverts commit 9a71f0ac92.
2021-01-02 20:33:32 +08:00
Sutra Zhou 30daec4803 Revert "Refresh the proxy when certain exceptions occur; the exceptions are configurable"
This reverts commit 4a6441e7c5.
2021-01-02 20:33:17 +08:00
Sutra Zhou d0843bee0d Revert "Simplify code"
This reverts commit 9cc5287743.
2021-01-02 20:32:35 +08:00
Sutra Zhou 5ceccc62e0 Revert "Refresh httpClient on exceptions (exceptions are configurable); rewrite the getHttpClient code"
This reverts commit 19465089c3.
2021-01-02 20:31:53 +08:00
Sutra Zhou 33e3fcdf22 Revert "Modify the proxy interface and provide a proxy refresh API. On a download error, the downloader provides three parameters: request, exception, and proxyProvider"
This reverts commit ba69eba669.
2021-01-02 20:27:28 +08:00
Sutra Zhou c489647c4b Revert "Downloader provides an API for refreshing components, for convenient use in the spider"
This reverts commit 2e2a0fdf3e.
2021-01-02 20:15:10 +08:00
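Sutra Zhou 4bedd97267 Revert "Refactor the proxy refresh API: the old proxy must be supplied, and a refresh happens only if the old proxy is still in use, preventing excessive refreshes caused by delayed responses"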
This reverts commit 0aa2c3949d.
2021-01-02 20:14:02 +08:00
Sutra Zhou 3f756c9325 Revert "Extend the proxy functionality, split the original proxy provider, and add lombok"
This reverts commit 33906e36f4.
2021-01-02 20:14:01 +08:00
Sutra Zhou aabc5584b8 Revert "Fix bug; add caching of results"
This reverts commit f68795d7dd.
2021-01-02 20:13:53 +08:00
Sutra Zhou 57dfc7cfb3
Merge pull request #977 from sutra/build
Remove useless imports to fix build.
2021-01-02 19:42:47 +08:00
Sutra Zhou 328c3e0d7d Remove useless imports to fix build. 2021-01-02 19:41:05 +08:00
Sutra Zhou 1d536cf705
Merge pull request #976 from yaoqiangpersonal/master
Mainly adds to and modifies the proxy functionality
2020-12-29 17:27:42 +08:00
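yao f68795d7dd Fix bug; add caching of results 2020-12-29 16:54:38 +08:00
yao 33906e36f4 Extend the proxy functionality, split the original proxy provider, and add lombok 2020-12-29 16:18:43 +08:00
yao 0aa2c3949d Refactor the proxy refresh API: the old proxy must be supplied, and a refresh happens only if the old proxy is still in use, preventing excessive refreshes caused by delayed responses 2020-12-22 18:19:37 +08:00
yao 2e2a0fdf3e Downloader provides an API for refreshing components, for convenient use in the spider 2020-12-21 18:08:55 +08:00
yao 19465089c3 Refresh httpClient on exceptions (exceptions are configurable); rewrite the getHttpClient code 2020-12-21 16:02:35 +08:00
yao 9cc5287743 Simplify code 2020-12-21 14:58:01 +08:00
yao 4a6441e7c5 Refresh the proxy when certain exceptions occur; the exceptions are configurable 2020-12-21 14:52:25 +08:00
yao ba69eba669 Modify the proxy interface and provide a proxy refresh API. On a download error, the downloader provides three parameters: request, exception, and proxyProvider 2020-12-21 14:36:44 +08:00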
Sutra Zhou b2aa5e2677
Merge pull request #974 from itranlin/master
Subtasks can use different downloaders
2020-12-19 18:12:17 +08:00
itranlin fc7ae9ce28 Subtasks can use different downloaders 2020-12-19 17:59:52 +08:00
yao 9a71f0ac92 Modify pageCount 2020-12-15 17:05:16 +08:00
Sutra Zhou 4b902270b4 Bump version number from 0.7.3 to 0.7.4. 2020-10-27 09:01:21 +08:00
Sutra Zhou 1895dba696
Merge pull request #959 from code4craft/snyk-fix-ad3d8147b4d9640ed91bfc34fbfb9e38
[Snyk] Security upgrade com.google.guava:guava from 29.0-jre to 30.0-android
2020-10-24 15:52:10 +08:00
snyk-bot c87af365d6
fix: pom.xml to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-JAVA-COMGOOGLEGUAVA-1015415
2020-10-24 03:02:20 +00:00
Sutra Zhou 541f421011
Merge pull request #957 from code4craft/snyk-fix-9aefd7a47e1f776c68e7fad3e85033a4
[Snyk] Security upgrade junit:junit from 4.13 to 4.13.1
2020-10-14 10:23:49 +08:00
snyk-bot 2223552aeb
fix: pom.xml to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-JAVA-JUNIT-1017047
2020-10-14 00:46:32 +00:00
Sutra Zhou 50c4e8ccfe
Merge pull request #955 from code4craft/snyk-fix-5cfb952010fd88492e4e2d5516c6237d
[Snyk] Security upgrade org.apache.httpcomponents:httpclient from 4.5.12 to 4.5.13
2020-10-10 11:00:09 +08:00
snyk-bot 7f737626b1
fix: pom.xml to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-JAVA-ORGAPACHEHTTPCOMPONENTS-1016906
2020-10-10 02:58:58 +00:00
Sutra Zhou b4b1df85a0 Fix TLSv1.3. Maybe we should expose an API to allow users to use org.apache.http.ssl.SSLContextBuilder. Fixes #948. 2020-09-21 17:48:59 +08:00
Sutra Zhou 94ac7ca3b6
Merge pull request #946 from code4craft/snyk-fix-84ae966502c97386fb9c384e5874967b
[Snyk] Security upgrade com.alibaba:fastjson from 1.2.68 to 1.2.69
2020-09-09 11:02:19 +08:00
snyk-bot e3b3b9afdd
fix: pom.xml to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-JAVA-COMALIBABA-570967
2020-09-09 02:59:41 +00:00