yihua.huang
|
1104122979
|
more abstraction in scheduler
|
2014-04-27 09:30:01 +08:00 |
yihua.huang
|
b0fb1c3e10
|
remove copy-dependencies plugin for m2e error
|
2014-04-27 08:22:33 +08:00 |
yihua.huang
|
94a67165e1
|
remove jmx server for simplify #98
|
2014-04-26 20:17:52 +08:00 |
yihua.huang
|
86a45a6643
|
change SpiderMonitor to singleton #98
|
2014-04-26 18:14:25 +08:00 |
yihua.huang
|
ab4d36806e
|
clean code
|
2014-04-26 11:45:21 +08:00 |
yihua.huang
|
04fde8203b
|
add control for monitor
|
2014-04-26 11:44:14 +08:00 |
yihua.huang
|
2770811a10
|
update monitor example
|
2014-04-26 11:24:22 +08:00 |
yihua.huang
|
17e95f2a7f
|
comments
|
2014-04-25 18:39:01 +08:00 |
yihua.huang
|
375e64e845
|
more monitor status
|
2014-04-25 18:10:14 +08:00 |
yihua.huang
|
c6661899fd
|
new thread pool #110
|
2014-04-25 17:33:48 +08:00 |
yihua.huang
|
179baa7a22
|
return when page is null
|
2014-04-25 16:07:41 +08:00 |
yihua.huang
|
4738ae2d14
|
change url find to match #94
|
2014-04-25 16:04:41 +08:00 |
yihua.huang
|
f973889cda
|
refactor SubPageProcessor etc. #94
|
2014-04-25 15:48:05 +08:00 |
yihua.huang
|
acb63d55d7
|
some check and example #98
|
2014-04-25 13:26:08 +08:00 |
yihua.huang
|
11ba5beb42
|
[refactor]move monitor to webmagic-extension #98
|
2014-04-25 13:17:13 +08:00 |
yihua.huang
|
b06aa489fb
|
[BugFix]Only one url from sourceRegion can be extracted #107
|
2014-04-18 17:48:26 +08:00 |
yihua.huang
|
023c2ac84e
|
spider config draft
|
2014-04-17 16:44:32 +08:00 |
yihua.huang
|
a5db6cf292
|
some monitor and JMX support #98
|
2014-04-17 00:35:09 +08:00 |
yihua.huang
|
aae1ab2cd6
|
fix compile error
|
2014-04-16 18:14:13 +08:00 |
yihua.huang
|
1fbfc92de2
|
Inherit support of Field annotation in Model #103
|
2014-04-16 18:13:44 +08:00 |
yihua.huang
|
3a79b1b64a
|
[Bugfix]formatter property does not work when field is String #100
|
2014-04-13 23:02:34 +08:00 |
Yihua Huang
|
cc9d319fd9
|
Merge pull request #94 from sebastian1118/master
update:PatternHandler
|
2014-04-13 13:16:20 +08:00 |
yihua.huang
|
03c251237b
|
add Json parse support
|
2014-04-13 10:23:00 +08:00 |
Tian
|
99e12aafaa
|
update:PatternHandler
|
2014-04-13 10:14:39 +08:00 |
yihua.huang
|
c1e7207869
|
add FileCacheQueueScheduler support for cycleRetryTimes
|
2014-04-07 11:00:09 +08:00 |
yihua.huang
|
969ad1766b
|
change logger style to slf4j for cleaner code
|
2014-04-06 21:32:20 +08:00 |
yihua.huang
|
9b2cb43f47
|
ConfigurablePageProcessor #91
|
2014-04-05 23:40:10 +08:00 |
Bo LIANG
|
159eeea2f5
|
Remove unused variable to make the project cleaner.
|
2014-04-05 18:32:12 +08:00 |
yihua.huang
|
c143fc662c
|
add SubPageProcessor #86
|
2014-04-05 18:17:48 +08:00 |
Yihua Huang
|
474f785dab
|
Merge pull request #86 from sebastian1118/master
new feature: PatternProcessor
|
2014-04-04 23:41:27 +08:00 |
Tian
|
38a12f8641
|
new feature: PatternProcessor
|
2014-04-04 22:02:52 +08:00 |
yihua.huang
|
dafd0b5875
|
[BugFix]multi model in one pageprocessor will be skipped #85
|
2014-04-04 20:36:31 +08:00 |
yihua.huang
|
8958d774f2
|
add default values for @Formatter
|
2014-03-24 13:52:17 +08:00 |
yihua.huang
|
6c11718566
|
Clean project structure #70
|
2014-03-14 23:24:38 +08:00 |
yihua.huang
|
0e98183f74
|
Change log4j to slf4j #55
|
2014-02-12 09:35:57 +08:00 |
yihua.huang
|
fa33b15843
|
property loader
|
2014-02-11 23:07:31 +08:00 |
yihua.huang
|
362fdd0662
|
Merge branch 'master' of github.com:code4craft/webmagic
|
2014-02-11 22:23:56 +08:00 |
yihua.huang
|
af809c4d55
|
update version to 0.5.0-snapshot
|
2014-02-11 22:16:01 +08:00 |
jon
|
a722f9bb66
|
Fix NullPointerException thrown because fileCursor in FileCacheQueueScheduler was not initialized when the file was reopened
|
2014-01-08 21:24:58 +08:00 |
yihua.huang
|
486d9d276f
|
#45 Remove multi in ExtractBy
|
2013-11-28 18:23:51 +08:00 |
yihua.huang
|
18a3af4a0a
|
add more sample for jsonpath #42
|
2013-11-28 09:58:22 +08:00 |
yihua.huang
|
59ad4cad27
|
#42 Add jsonpath in annotation mode for json result
|
2013-11-28 08:25:16 +08:00 |
yihua.huang
|
cf62d707e0
|
#36 Spider does not exit when success
|
2013-11-27 23:33:18 +08:00 |
yihua.huang
|
a01312930a
|
#39 Parsing html after page.getHtml()
|
2013-11-27 22:01:34 +08:00 |
yihua.huang
|
b838c4e433
|
#34 Close reader in FileCacheQueueScheduler
|
2013-11-08 14:59:09 +08:00 |
yihua.huang
|
fd6d2fd6f8
|
try to keepalive TCP connection
|
2013-11-06 21:19:14 +08:00 |
yihua.huang
|
e046bb0723
|
remove useless code
|
2013-11-06 12:48:14 +08:00 |
yihua.huang
|
6e32a19f80
|
update api for direct download
|
2013-11-06 12:46:50 +08:00 |
yihua.huang
|
807aefe9df
|
change EntityUtil to IOUtil because of some encoding error
|
2013-11-06 07:37:34 +08:00 |
yihua.huang
|
8f774afc84
|
add direct download
|
2013-11-06 06:41:04 +08:00 |