198 Commits (3c1338193b1bec3e52b47466d3ec1a5b3bfc3da5)

Author SHA1 Message Date
yihua.huang 5f8c3fd5c5 update version 2014-06-04 17:33:30 +08:00
yihua.huang 928f98dd93 auto create folder in JsonFilePipeline #122 2014-05-08 15:12:17 +08:00
yihua.huang 7fbe18b8c0 implementation of PageMapper #120 2014-05-05 08:01:39 +08:00
yihua.huang 5dc9fe95a9 interface of PageMapper #120 2014-05-05 07:43:32 +08:00
yihua.huang 7668731f08 update version to snapshot 2014-05-05 07:03:55 +08:00
yihua.huang 182dd51689 Merge branch 'stable' of github.com:code4craft/webmagic 2014-05-03 06:19:11 +08:00
yihua.huang 81e6e772ac versions back to 0.5.1 2014-05-03 06:18:57 +08:00
yihua.huang feb604da87 Merge branch 'stable' of github.com:code4craft/webmagic 2014-05-03 06:14:54 +08:00
yihua.huang 358e906379 [maven-release-plugin] prepare for next development iteration 2014-05-03 00:00:13 +08:00
yihua.huang 470750fc0d [maven-release-plugin] prepare release WebMagic-0.5.1 2014-05-02 23:59:55 +08:00
yihua.huang 186b90512e refactor redisscheduler #118 2014-05-02 20:24:15 +08:00
yihua.huang d1140b9e29 add bloom filter for scheduler #118 2014-05-02 20:20:22 +08:00
yihua.huang e8d4a9be2b fix remove duplicate error #117 2014-04-29 20:32:06 +08:00
yihua.huang 04ade75606 Merge branch 'stable' of github.com:code4craft/webmagic
Conflicts:
	README.md
	pom.xml
	webmagic-avalon/pom.xml
	webmagic-core/pom.xml
	webmagic-extension/pom.xml
	webmagic-lucene/pom.xml
	webmagic-samples/pom.xml
	webmagic-saxon/pom.xml
	webmagic-scripts/pom.xml
	webmagic-selenium/pom.xml
2014-04-27 15:03:15 +08:00
yihua.huang a08d8cb167 update verion 2014-04-27 14:59:48 +08:00
yihua.huang 42a2676e8c update version 2014-04-27 14:56:21 +08:00
yihua.huang c25b32f1ca [maven-release-plugin] prepare for next development iteration 2014-04-27 12:52:27 +08:00
yihua.huang 7ff83bb11a [maven-release-plugin] prepare release WebMagic-0.5.0 2014-04-27 12:52:12 +08:00
yihua.huang 1104122979 more abstraction in scheduler 2014-04-27 09:30:01 +08:00
yihua.huang b0fb1c3e10 remove copy-dependcies plugin for m2e error 2014-04-27 08:22:33 +08:00
yihua.huang 94a67165e1 remove jmx server for simplify #98 2014-04-26 20:17:52 +08:00
yihua.huang 86a45a6643 change SpiderMonitor to singleton #98 2014-04-26 18:14:25 +08:00
yihua.huang ab4d36806e clean code 2014-04-26 11:45:21 +08:00
yihua.huang 04fde8203b add control for monitor 2014-04-26 11:44:14 +08:00
yihua.huang 2770811a10 update monitor example 2014-04-26 11:24:22 +08:00
yihua.huang 17e95f2a7f comments 2014-04-25 18:39:01 +08:00
yihua.huang 375e64e845 more monitor status 2014-04-25 18:10:14 +08:00
yihua.huang c6661899fd new thread pool #110 2014-04-25 17:33:48 +08:00
yihua.huang 179baa7a22 return when page is null 2014-04-25 16:07:41 +08:00
yihua.huang 4738ae2d14 change url find to match #94 2014-04-25 16:04:41 +08:00
yihua.huang f973889cda refactor subpageprossor etc. #94 2014-04-25 15:48:05 +08:00
yihua.huang acb63d55d7 some check and example #98 2014-04-25 13:26:08 +08:00
yihua.huang 11ba5beb42 [refactor]move monitor to webmagic-extension #98 2014-04-25 13:17:13 +08:00
yihua.huang b06aa489fb [BugFix]Only one url from sourceRegion can be extracted #107 2014-04-18 17:48:26 +08:00
yihua.huang 023c2ac84e spider config draft 2014-04-17 16:44:32 +08:00
yihua.huang a5db6cf292 some monitor and JMX support #98 2014-04-17 00:35:09 +08:00
yihua.huang aae1ab2cd6 fix compile error 2014-04-16 18:14:13 +08:00
yihua.huang 1fbfc92de2 Inherit support of Field annotation in Model #103 2014-04-16 18:13:44 +08:00
yihua.huang a03f6a8431 eclipse project 2014-04-15 07:44:43 +08:00
yihua.huang 3a79b1b64a [Bugfix]formatter property does not work when field is String#100 2014-04-13 23:02:34 +08:00
Yihua Huang cc9d319fd9 Merge pull request #94 from sebastian1118/master
update:PatternHandler
2014-04-13 13:16:20 +08:00
yihua.huang 03c251237b add Json parse support 2014-04-13 10:23:00 +08:00
Tian 99e12aafaa update:PatternHandler 2014-04-13 10:14:39 +08:00
yihua.huang c1e7207869 add FileCacheQueueScheduler support for cycleRetryTimes 2014-04-07 11:00:09 +08:00
yihua.huang 969ad1766b change logger style to slf4j for cleaner code 2014-04-06 21:32:20 +08:00
yihua.huang 9b2cb43f47 ConfigurablePageProcessor #91 2014-04-05 23:40:10 +08:00
Bo LIANG 159eeea2f5 Remove unused variable to make the project cleaner. 2014-04-05 18:32:12 +08:00
yihua.huang c143fc662c add SubPageProcessor #86 2014-04-05 18:17:48 +08:00
Yihua Huang 474f785dab Merge pull request #86 from sebastian1118/master
new feature: PatternProcessor
2014-04-04 23:41:27 +08:00
Tian 38a12f8641 new feature: PatternProcessor 2014-04-04 22:02:52 +08:00
yihua.huang dafd0b5875 [BugFix]multi model in one pageprocessor will be skipped #85 2014-04-04 20:36:31 +08:00
yihua.huang a1c7e826f7 fix dep of slf4j-log4j12 2014-04-03 23:04:31 +08:00
yihua.huang f3c2503a29 add warning of slf4j #78 2014-04-01 07:42:23 +08:00
yihua.huang 8958d774f2 add default values for @Formatter 2014-03-24 13:52:17 +08:00
yihua.huang 6c11718566 Clean project structure #70 2014-03-14 23:24:38 +08:00
yihua.huang 98e2bba099 Merge branch 'master' of github.com:code4craft/webmagic
Conflicts:
	README.md
	pom.xml
	webmagic-core/pom.xml
	webmagic-extension/pom.xml
	webmagic-scripts/pom.xml
2014-03-13 08:07:33 +08:00
yihua.huang 757cc9b942 [maven-release-plugin] prepare for next development iteration 2014-03-13 07:49:51 +08:00
yihua.huang 63ffb5c792 [maven-release-plugin] prepare release webmaigc-0.4.3 2014-03-13 07:49:27 +08:00
yihua.huang d5a978e00f update version back to 0.4.3 2014-03-13 06:55:05 +08:00
yihua.huang 0e98183f74 Change log4j to slf4j #55 2014-02-12 09:35:57 +08:00
yihua.huang fa33b15843 property loader 2014-02-11 23:07:31 +08:00
yihua.huang 362fdd0662 Merge branch 'master' of github.com:code4craft/webmagic 2014-02-11 22:23:56 +08:00
yihua.huang af809c4d55 update version to 0.5.0-snapshot 2014-02-11 22:16:01 +08:00
jon a722f9bb66 Fix NullPointerException thrown in FileCacheQueueScheduler because fileCursor was not initialized when the file was reopened 2014-01-08 21:24:58 +08:00
yihua.huang 12a6390cbd update spring4 configuration 2013-12-18 01:02:59 +08:00
yihua.huang fc97cb58c5 update lib and version 2013-12-04 00:04:29 +08:00
yihua.huang d274310cb2 [maven-release-plugin] prepare for next development iteration 2013-12-03 23:35:06 +08:00
yihua.huang e8c32a32dc [maven-release-plugin] prepare release webmagic-0.4.2 2013-12-03 23:34:57 +08:00
yihua.huang 486d9d276f #45 Remove multi in ExtractBy 2013-11-28 18:23:51 +08:00
yihua.huang e7083dc39d [maven-release-plugin] prepare for next development iteration 2013-11-28 13:04:32 +08:00
yihua.huang ae623567b3 [maven-release-plugin] prepare release webmagic-0.4.1 2013-11-28 13:04:22 +08:00
yihua.huang 18a3af4a0a add more sample for jsonpath #42 2013-11-28 09:58:22 +08:00
yihua.huang 59ad4cad27 #42 Add jsonpath in annotation mode for json result 2013-11-28 08:25:16 +08:00
yihua.huang cf62d707e0 #36 Spider does not exit when success 2013-11-27 23:33:18 +08:00
yihua.huang a01312930a #39 Parsing html after page.getHtml() 2013-11-27 22:01:34 +08:00
yihua.huang f9daae39cf [maven-release-plugin] prepare for next development iteration 2013-11-11 14:33:11 +08:00
yihua.huang fdb9441519 [maven-release-plugin] prepare release webmagic-0.4.0 2013-11-11 14:33:01 +08:00
yihua.huang 1d75ae7f5b rollback version to 0.4.0 because not deploy success 2013-11-11 11:52:56 +08:00
yihua.huang b838c4e433 #34 Close reader in FileCacheQueueScheduler 2013-11-08 14:59:09 +08:00
yihua.huang 775eb9732f [maven-release-plugin] prepare for next development iteration 2013-11-06 22:17:58 +08:00
yihua.huang 0b4fadc24d [maven-release-plugin] prepare release webmagic-0.4.0 2013-11-06 22:17:47 +08:00
yihua.huang fd6d2fd6f8 try to keepalive TCP connection 2013-11-06 21:19:14 +08:00
yihua.huang 425df08523 update version to 0.4.0 2013-11-06 12:50:45 +08:00
yihua.huang e046bb0723 remove useless code 2013-11-06 12:48:14 +08:00
yihua.huang 6e32a19f80 update api for direct download 2013-11-06 12:46:50 +08:00
yihua.huang 807aefe9df change EntityUtil to IOUtil because some encoding error 2013-11-06 07:37:34 +08:00
yihua.huang 8f774afc84 add direct download 2013-11-06 06:41:04 +08:00
yihua.huang 2e496402dc add more warning for 0.3.3 2013-10-24 13:16:48 +08:00
yihua.huang 1a2c84ea78 #27 add timeout config to site 2013-10-11 07:36:16 +08:00
yihua.huang 3b00190f99 api without implementation for #28: add specific url crawl 2013-10-10 00:40:44 +08:00
yihua.huang 4acbc19cee [maven-release-plugin] prepare for next development iteration 2013-09-23 13:12:32 +08:00
yihua.huang cc3b787991 [maven-release-plugin] prepare release webmagic-0.3.2 2013-09-23 13:12:19 +08:00
yihua.huang 6f18eec77e fix a test error 2013-09-23 13:07:33 +08:00
yihua.huang b131878123 add example 2013-09-23 13:01:28 +08:00
yihua.huang 95ab4edec3 some bugfix 2013-09-23 08:38:54 +08:00
yihua.huang 250cc5e662 change formatter to class 2013-09-23 08:17:21 +08:00
yihua.huang b18216245b add type convert 2013-09-23 07:53:33 +08:00
yihua.huang fb693a4ac4 [maven-release-plugin] prepare for next development iteration 2013-09-08 22:25:07 +08:00
yihua.huang bfaaa042b9 [maven-release-plugin] prepare release webmagic-parent-0.3.1 2013-09-08 22:24:48 +08:00
yihua.huang d7c7a78177 complete test cases 2013-09-08 22:19:02 +08:00