95 Commits (110412297925415549cca50cfc0be78c30360171)

Author SHA1 Message Date
yihua.huang 1104122979 more abstraction in scheduler 2014-04-27 09:30:01 +08:00
yihua.huang b0fb1c3e10 remove copy-dependcies plugin for m2e error 2014-04-27 08:22:33 +08:00
yihua.huang 94a67165e1 remove jmx server for simplify #98 2014-04-26 20:17:52 +08:00
yihua.huang 86a45a6643 change SpiderMonitor to singleton #98 2014-04-26 18:14:25 +08:00
yihua.huang ab4d36806e clean code 2014-04-26 11:45:21 +08:00
yihua.huang 04fde8203b add control for monitor 2014-04-26 11:44:14 +08:00
yihua.huang 2770811a10 update monitor example 2014-04-26 11:24:22 +08:00
yihua.huang 17e95f2a7f comments 2014-04-25 18:39:01 +08:00
yihua.huang 375e64e845 more monitor status 2014-04-25 18:10:14 +08:00
yihua.huang c6661899fd new thread pool #110 2014-04-25 17:33:48 +08:00
yihua.huang 179baa7a22 return when page is null 2014-04-25 16:07:41 +08:00
yihua.huang 4738ae2d14 change url find to match #94 2014-04-25 16:04:41 +08:00
yihua.huang f973889cda refactor subpageprossor etc. #94 2014-04-25 15:48:05 +08:00
yihua.huang acb63d55d7 some check and example #98 2014-04-25 13:26:08 +08:00
yihua.huang 11ba5beb42 [refactor]move monitor to webmagic-extension #98 2014-04-25 13:17:13 +08:00
yihua.huang b06aa489fb [BugFix]Only one url from sourceRegion can be extracted #107 2014-04-18 17:48:26 +08:00
yihua.huang 023c2ac84e spider config draft 2014-04-17 16:44:32 +08:00
yihua.huang a5db6cf292 some monitor and JMX support #98 2014-04-17 00:35:09 +08:00
yihua.huang aae1ab2cd6 fix compile error 2014-04-16 18:14:13 +08:00
yihua.huang 1fbfc92de2 Inherit support of Field annotation in Model #103 2014-04-16 18:13:44 +08:00
yihua.huang 3a79b1b64a [Bugfix]formatter property does not work when field is String#100 2014-04-13 23:02:34 +08:00
Yihua Huang cc9d319fd9 Merge pull request #94 from sebastian1118/master (update:PatternHandler) 2014-04-13 13:16:20 +08:00
yihua.huang 03c251237b add Json parse support 2014-04-13 10:23:00 +08:00
Tian 99e12aafaa update:PatternHandler 2014-04-13 10:14:39 +08:00
yihua.huang c1e7207869 add FileCacheQueueScheduler support for cycleRetryTimes 2014-04-07 11:00:09 +08:00
yihua.huang 969ad1766b change logger style to slf4j for cleaner code 2014-04-06 21:32:20 +08:00
yihua.huang 9b2cb43f47 ConfigurablePageProcessor #91 2014-04-05 23:40:10 +08:00
Bo LIANG 159eeea2f5 Remove unused variable to make the project cleaner. 2014-04-05 18:32:12 +08:00
yihua.huang c143fc662c add SubPageProcessor #86 2014-04-05 18:17:48 +08:00
Yihua Huang 474f785dab Merge pull request #86 from sebastian1118/master (new feature: PatternProcessor) 2014-04-04 23:41:27 +08:00
Tian 38a12f8641 new feature: PatternProcessor 2014-04-04 22:02:52 +08:00
yihua.huang dafd0b5875 [BugFix]multi model in one pageprocessor will be skipped #85 2014-04-04 20:36:31 +08:00
yihua.huang 8958d774f2 add default values for @Formatter 2014-03-24 13:52:17 +08:00
yihua.huang 6c11718566 Clean project structure #70 2014-03-14 23:24:38 +08:00
yihua.huang 0e98183f74 Change log4j to slf4j #55 2014-02-12 09:35:57 +08:00
yihua.huang fa33b15843 property loader 2014-02-11 23:07:31 +08:00
yihua.huang 362fdd0662 Merge branch 'master' of github.com:code4craft/webmagic 2014-02-11 22:23:56 +08:00
yihua.huang af809c4d55 update version to 0.5.0-snapshot 2014-02-11 22:16:01 +08:00
jon a722f9bb66 Fix the NullPointerException thrown because fileCursor in FileCacheQueueScheduler was not initialized when the file was reopened 2014-01-08 21:24:58 +08:00
yihua.huang 486d9d276f #45 Remove multi in ExtractBy 2013-11-28 18:23:51 +08:00
yihua.huang 18a3af4a0a add more sample for jsonpath #42 2013-11-28 09:58:22 +08:00
yihua.huang 59ad4cad27 #42 Add jsonpath in annotation mode for json result 2013-11-28 08:25:16 +08:00
yihua.huang cf62d707e0 #36 Spider does not exit when success 2013-11-27 23:33:18 +08:00
yihua.huang a01312930a #39 Parsing html after page.getHtml() 2013-11-27 22:01:34 +08:00
yihua.huang b838c4e433 #34 Close reader in FileCacheQueueScheduler 2013-11-08 14:59:09 +08:00
yihua.huang fd6d2fd6f8 try to keepalive TCP connection 2013-11-06 21:19:14 +08:00
yihua.huang e046bb0723 remove useless code 2013-11-06 12:48:14 +08:00
yihua.huang 6e32a19f80 update api for direct download 2013-11-06 12:46:50 +08:00
yihua.huang 807aefe9df change EntityUtil to IOUtil because some encoding error 2013-11-06 07:37:34 +08:00
yihua.huang 8f774afc84 add direct download 2013-11-06 06:41:04 +08:00