1109 Commits (d92dc8397f336c2757ce4559ac92daf7bf82aa61)

Author SHA1 Message Date
Sutra Zhou d92dc8397f Upgrade htmlcleaner from 2.5 to 2.9, this is the highest version to let Xpath2Selector pass the test cases. 2021-01-11 01:46:32 +08:00
Sutra Zhou 124c52b988 Downgrade htmlcleaner from 2.24 back to 2.5, to make Xpath2Selector pass the test cases. 2021-01-11 01:25:41 +08:00
Sutra Zhou 683db09133 Complete testXPath2 assertion. 2021-01-11 00:35:22 +08:00
Sutra Zhou 2f71f7912c Fix scm tag. 2021-01-10 14:31:40 +08:00
Sutra Zhou d0e2776991 Upgrade xsoup from 0.3.1 to 0.3.2. 2021-01-10 14:10:32 +08:00
Sutra Zhou 0e01550a79 Upgrade dependencies, including the jedis from 2.9.3 to 3.4.1. 2021-01-06 03:21:10 +08:00
Sutra Zhou 0d73f08ef6 Upgrade maven plugins. 2021-01-06 02:29:34 +08:00
Sutra Zhou e14a762632 Add gitflow-maven-plugin. 2021-01-05 23:14:24 +08:00
Sutra Zhou ab6ff7f809 Revert "Modify pageCount"
This reverts commit 9a71f0ac92.
2021-01-02 20:33:32 +08:00
Sutra Zhou 30daec4803 Revert "Refresh the proxy when certain exceptions occur; the exceptions are configurable"
This reverts commit 4a6441e7c5.
2021-01-02 20:33:17 +08:00
Sutra Zhou d0843bee0d Revert "Simplify code"
This reverts commit 9cc5287743.
2021-01-02 20:32:35 +08:00
Sutra Zhou 5ceccc62e0 Revert "Refresh httpClient on exceptions, with configurable exceptions; rewrite the getHttpClient code"
This reverts commit 19465089c3.
2021-01-02 20:31:53 +08:00
Sutra Zhou 33e3fcdf22 Revert "Modify the proxy interface and provide a refresh-proxy API. When the downloader hits a download error, provide three parameters: request, exception, proxyProvider"
This reverts commit ba69eba669.
2021-01-02 20:27:28 +08:00
Sutra Zhou c489647c4b Revert "Downloader provides an API for refreshing components, for convenient use in the spider"
This reverts commit 2e2a0fdf3e.
2021-01-02 20:15:10 +08:00
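Sutra Zhou 4bedd97267 Revert "Refactor the refresh-proxy API: the old proxy must be supplied, and a refresh happens only if the current proxy is still the old one, preventing excessive refreshes caused by delayed responses"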
This reverts commit 0aa2c3949d.
2021-01-02 20:14:02 +08:00
Sutra Zhou 3f756c9325 Revert "Extend the proxy functionality, split up the original proxy provider, add lombok"
This reverts commit 33906e36f4.
2021-01-02 20:14:01 +08:00
Sutra Zhou aabc5584b8 Revert "Bug fix; add caching for results"
This reverts commit f68795d7dd.
2021-01-02 20:13:53 +08:00
Sutra Zhou 57dfc7cfb3
Merge pull request #977 from sutra/build
Remove useless imports to fix build.
2021-01-02 19:42:47 +08:00
Sutra Zhou 328c3e0d7d Remove useless imports to fix build. 2021-01-02 19:41:05 +08:00
Sutra Zhou 1d536cf705
Merge pull request #976 from yaoqiangpersonal/master
Mainly adds to and modifies the proxy functionality
2020-12-29 17:27:42 +08:00
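yao f68795d7dd Bug fix; add caching for results 2020-12-29 16:54:38 +08:00
yao 33906e36f4 Extend the proxy functionality, split up the original proxy provider, add lombok 2020-12-29 16:18:43 +08:00
yao 0aa2c3949d Refactor the refresh-proxy API: the old proxy must be supplied, and a refresh happens only if the current proxy is still the old one, preventing excessive refreshes caused by delayed responses 2020-12-22 18:19:37 +08:00
yao 2e2a0fdf3e Downloader provides an API for refreshing components, for convenient use in the spider 2020-12-21 18:08:55 +08:00
yao 19465089c3 Refresh httpClient on exceptions, with configurable exceptions; rewrite the getHttpClient code 2020-12-21 16:02:35 +08:00
yao 9cc5287743 Simplify code 2020-12-21 14:58:01 +08:00
yao 4a6441e7c5 Refresh the proxy when certain exceptions occur; the exceptions are configurable 2020-12-21 14:52:25 +08:00
yao ba69eba669 Modify the proxy interface and provide a refresh-proxy API. When the downloader hits a download error, provide three parameters: request, exception, proxyProvider 2020-12-21 14:36:44 +08:00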
Sutra Zhou b2aa5e2677
Merge pull request #974 from itranlin/master
Subtasks can use different downloaders
2020-12-19 18:12:17 +08:00
itranlin fc7ae9ce28 Subtasks can use different downloaders... 2020-12-19 17:59:52 +08:00
yao 9a71f0ac92 Modify pageCount 2020-12-15 17:05:16 +08:00
Sutra Zhou 4b902270b4 Bump version number from 0.7.3 to 0.7.4. 2020-10-27 09:01:21 +08:00
Sutra Zhou 1895dba696
Merge pull request #959 from code4craft/snyk-fix-ad3d8147b4d9640ed91bfc34fbfb9e38
[Snyk] Security upgrade com.google.guava:guava from 29.0-jre to 30.0-android
2020-10-24 15:52:10 +08:00
snyk-bot c87af365d6
fix: pom.xml to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-JAVA-COMGOOGLEGUAVA-1015415
2020-10-24 03:02:20 +00:00
Sutra Zhou 541f421011
Merge pull request #957 from code4craft/snyk-fix-9aefd7a47e1f776c68e7fad3e85033a4
[Snyk] Security upgrade junit:junit from 4.13 to 4.13.1
2020-10-14 10:23:49 +08:00
snyk-bot 2223552aeb
fix: pom.xml to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-JAVA-JUNIT-1017047
2020-10-14 00:46:32 +00:00
Sutra Zhou 50c4e8ccfe
Merge pull request #955 from code4craft/snyk-fix-5cfb952010fd88492e4e2d5516c6237d
[Snyk] Security upgrade org.apache.httpcomponents:httpclient from 4.5.12 to 4.5.13
2020-10-10 11:00:09 +08:00
snyk-bot 7f737626b1
fix: pom.xml to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-JAVA-ORGAPACHEHTTPCOMPONENTS-1016906
2020-10-10 02:58:58 +00:00
Sutra Zhou b4b1df85a0 Fix TLSv1.3. Maybe we should expose an API to allow users to use org.apache.http.ssl.SSLContextBuilder. Fixes #948. 2020-09-21 17:48:59 +08:00
Sutra Zhou 94ac7ca3b6
Merge pull request #946 from code4craft/snyk-fix-84ae966502c97386fb9c384e5874967b
[Snyk] Security upgrade com.alibaba:fastjson from 1.2.68 to 1.2.69
2020-09-09 11:02:19 +08:00
snyk-bot e3b3b9afdd
fix: pom.xml to reduce vulnerabilities
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-JAVA-COMALIBABA-570967
2020-09-09 02:59:41 +00:00
Sutra Zhou 96ebe608cc
Merge pull request #939 from bytesgo/manage-version
build: manage plugin version & remove build WARNING
2020-08-07 17:38:17 +08:00
leeyazhou 9aab25f339 build: manage plugin version & remove build WARNING
## use the new dependency of commons-io

[WARNING] The artifact org.apache.commons:commons-io:jar:1.3.2 has been
relocated to commons-io:commons-io:jar:1.3.2

## manage plugin version of maven-jar-plugin and maven-deploy-plugin

[WARNING]
[WARNING] Some problems were encountered while building the effective
model for us.codecraft:webmagic-core:jar:0.7.3
[WARNING] 'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-jar-plugin is missing. @
us.codecraft:webmagic-parent:0.7.3, /opt/code/git/webmagic/pom.xml, line
263, column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective
model for us.codecraft:webmagic-extension:jar:0.7.3
[WARNING] 'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-jar-plugin is missing. @
us.codecraft:webmagic-parent:0.7.3, /opt/code/git/webmagic/pom.xml, line
263, column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective
model for us.codecraft:webmagic-scripts:jar:0.7.3
[WARNING] 'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-jar-plugin is missing. @ line 61, column
21
[WARNING]
[WARNING] Some problems were encountered while building the effective
model for us.codecraft:webmagic-selenium:jar:0.7.3
[WARNING] 'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-jar-plugin is missing. @
us.codecraft:webmagic-parent:0.7.3, /opt/code/git/webmagic/pom.xml, line
263, column 21
[WARNING] 'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-deploy-plugin is missing. @ line 34,
column 12
[WARNING]
[WARNING] Some problems were encountered while building the effective
model for us.codecraft:webmagic-saxon:jar:0.7.3
[WARNING] 'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-jar-plugin is missing. @
us.codecraft:webmagic-parent:0.7.3, /opt/code/git/webmagic/pom.xml, line
263, column 21
[WARNING] 'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-deploy-plugin is missing. @ line 34,
column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective
model for us.codecraft:webmagic-samples:jar:0.7.3
[WARNING] 'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-jar-plugin is missing. @
us.codecraft:webmagic-parent:0.7.3, /opt/code/git/webmagic/pom.xml, line
263, column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective
model for us.codecraft:webmagic-parent:pom:0.7.3
[WARNING] 'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-jar-plugin is missing. @ line 263, column
21
[WARNING]
[WARNING] It is highly recommended to fix these problems because they
threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support
building such malformed projects.
2020-08-07 16:36:32 +08:00
Sutra Zhou 48bc73fbff New method Proxy#create. 2020-06-24 13:43:16 +08:00
Sutra Zhou 6d3f2d9b64 Wrap URISyntaxException as IllegalArgumentException for Proxy#toURI. 2020-06-24 13:24:45 +08:00
Sutra Zhou 236e5ade44 Update Proxy#toString(). 2020-06-17 11:19:37 +08:00
Sutra Zhou 791323a5b0 Add Proxy#scheme. 2020-06-16 14:45:29 +08:00
Sutra Zhou 2413366adb Format code, no actual code changed. 2020-06-15 20:01:14 +08:00
Sutra Zhou 5d14efc50f Serialize request URL only in FileCacheQueueScheduler. 2020-06-14 00:20:39 +08:00
Sutra Zhou 7945c0612d Merge branch 'master' of github.com:code4craft/webmagic 2020-05-30 02:10:25 +08:00