Commit Graph

189 Commits (49a5efff46ec604578d6cb98015a8700bdf1fa21)

Author SHA1 Message Date
Sutra Zhou 16a4fe3e28 Use oxerr-parent instead. 2024-05-17 13:17:13 +08:00
François Gibier 2df7dca871
Changed refactor of processSingle again, this one is a better version (#1157)
* Refactor of processSingle in PageModelExtractor

* Changed my refactor of processSingle, this one is a lot better

* Changed my refactor of processSingle, this one is a lot better
2024-04-05 22:50:21 +08:00
François Gibier 05e5eefc7d
Refactor of processSingle in PageModelExtractor (#1155) 2024-04-05 21:51:08 +08:00
Sutra Zhou 4ebf48f6e3 Replace log4j 1.x with log4j 2.x, refs #534. 2024-04-03 18:26:01 +08:00
Sutra Zhou 31548deb93
Revert "Refactored code for increased optimization. (#1139)" (#1153)
This reverts commit f051d978e2.
2024-03-30 14:37:55 +08:00
Parthgajera056 f051d978e2
Refactored code for increased optimization. (#1139)
* refactoring by decompose conditional technique

* refactoring by introduction explaining variable technique

* refactoring by rename method/variable technique

* refactoring by introducing explaining variable technique

* Added Extract class refactoring to increase maintainability

* Refactoring using replace conditional with polymorphism
2024-03-30 14:28:02 +08:00
ayushi250317 9b9f173c1c
Refactored Code to increase maintainability (#1152)
* Initial Commit

* Assignment 1 Submission

* Resolving Implementation Smells

* Refactoring Code to increase maintainability
2024-03-30 14:26:41 +08:00
ayushi250317 28ac8bf9c4
Refactored Code to Resolve Implementation Code Smells (#1151)
* Initial Commit

* Assignment 1 Submission

* Resolving Implementation Smells
2024-03-29 00:45:12 +08:00
Sutra Zhou e4ab6e27e4 Optimize Request#extras, refs #1148. 2024-03-03 18:35:25 +08:00
Sutra Zhou 67644de3d9 Expose Page to onSuccess & onError. 2023-11-20 18:28:27 +08:00
Joe Zhou ef616c999e Fix warnings. 2022-11-27 02:05:31 +08:00
Sutra Zhou d2b2eed9df Pass the task to onSuccess & onError. 2022-10-19 22:15:41 +08:00
vio.ao 7a62a6cb45 Revert "Revert "Common the downloader status process and pass error information when …""
This reverts commit acfbd7b883.
2022-10-01 17:33:11 +08:00
Sutra Zhou acfbd7b883
Revert "Common the downloader status process and pass error information when …" 2022-10-01 10:37:09 +08:00
vio.ao d01f26333b Common the downloader status process and pass error information when onError 2022-10-01 00:21:17 +08:00
David Hsing 54da7af17e change dependency versions into properties
change dependency versions into properties
update commons-collections from 3.x to 4.4
2022-05-03 17:42:42 +08:00
wecandoitjustthink 528a8908af Added a getter for the List<SpiderStatusMXBean> property so that subclasses of SpiderMonitor can access it. 2021-02-27 19:59:05 +13:00
Sutra Zhou 71b7dfbf9a
Merge pull request #993 from yqia182/master
Add validation logic to the getPagePerSecond() method in SpiderStatus to avoid null pointers and division by zero.
2021-02-03 10:13:50 +08:00
JustThink 54127318a4 Add validation logic to the getPagePerSecond() method in SpiderStatus to avoid null pointers and division by zero. 2021-02-03 02:43:53 +13:00
Sutra Zhou 0e01550a79 Upgrade dependencies, including the jedis from 2.9.3 to 3.4.1. 2021-01-06 03:21:10 +08:00
Sutra Zhou c489647c4b Revert "Downloader provides an API for refreshing components, for convenient use in the spider"
This reverts commit 2e2a0fdf3e.
2021-01-02 20:15:10 +08:00
yao 2e2a0fdf3e Downloader provides an API for refreshing components, for convenient use in the spider 2020-12-21 18:08:55 +08:00
Sutra Zhou 5d14efc50f Serialize request URL only in FileCacheQueueScheduler. 2020-06-14 00:20:39 +08:00
Sutra Zhou ba1b4017a7 Mark slf4j-log4j12 as optional. 2020-05-21 19:59:29 +08:00
Sutra Zhou b98a87e45a Serialize requests in FileCacheQueueScheduler, so that the extra info of request could be restored. 2020-04-11 20:21:20 +08:00
yihua.huang c701fe8d38 #702 Refactor: rename CheckForAdditionalInfo to checkForAdditionalInfo 2017-11-30 11:50:52 +08:00
yihy 266083fa07 [Fix] #698 Repair loss of Request additional information when using redis 2017-11-29 20:19:00 +08:00
yihua.huang e5db538c19 #647 remove ThreadSafe annotation 2017-11-29 13:49:40 +08:00
yihua.huang 76766a7c77 RequestUtils for range #222 2017-06-05 17:13:33 +08:00
yihua.huang 5de6205563 use RawText as source for JsonPath #589 2017-06-04 07:24:56 +08:00
yihua.huang 211ac74389 use group 0 of targetUrl instead of group 1 #588 2017-06-04 07:11:13 +08:00
yihua.huang 9b77306098 test of ExtractByUrl #586 2017-06-04 07:03:20 +08:00
yihua.huang d8bd0637a1 add custom formatter test #586 2017-06-04 06:51:41 +08:00
yihua.huang 6cc647577a fix test for ExtractLinks #586 2017-06-03 22:38:42 +08:00
yihua.huang a6f8ed5476 complete formatter refactor by ObjectFormatterBuilder #586 2017-06-03 22:07:53 +08:00
yihua.huang b1ef61b278 add tests before refactor #586 2017-06-03 21:33:18 +08:00
yihua.huang b363ee6a9d move constant to class 2017-06-03 20:33:28 +08:00
yihua.huang df682857a7 fix test and some refactor 2017-06-03 20:19:41 +08:00
yihua.huang 0e6eb46eba add test for jsonpath in pagemodel #462 2017-06-03 16:47:21 +08:00
yihua.huang f5018d569e remove test_issue409 #409 2017-06-03 15:10:25 +08:00
yihua.huang 17ae500c77 test_issue409 #409 2017-06-03 15:09:51 +08:00
yihua.huang 4cd5b4f93e test_issue409 2017-06-03 15:02:33 +08:00
yihua.huang fb0acd710c complete SimpleHttpClientTest 2017-06-03 14:59:08 +08:00
yihua.huang d07941d900 SimpleHttpClientTest 2017-06-03 14:58:34 +08:00
yihua.huang f02f469c69 add test #570 2017-06-03 11:02:31 +08:00
yihua.huang 2d693580fc add test 2017-06-01 22:28:03 +08:00
yihua.huang b879b0eed0 fix redisscheduler #583 2017-06-01 22:25:01 +08:00
Yihua Huang 9903f0367d Merge pull request #570 from SoulZhong/master
Fix bug where parameters were not passed during formatter initialization
2017-05-29 17:49:42 +08:00
yihua.huang 2e35e149be for 0.7.1 2017-05-29 14:41:49 +08:00
yihua.huang 49de9374cd new SimpleHttpClient #576 2017-05-27 17:30:19 +08:00