274 Commits (110412297925415549cca50cfc0be78c30360171)

Author SHA1 Message Date
yihua.huang 1104122979 more abstraction in scheduler 2014-04-27 09:30:01 +08:00
yihua.huang 2770811a10 update monitor example 2014-04-26 11:24:22 +08:00
yihua.huang 5ecd909ef2 add timeout for wait/notify #111 2014-04-25 19:41:36 +08:00
yihua.huang c7afdb516e remove thread utils #110 2014-04-25 18:44:45 +08:00
yihua.huang 17e95f2a7f comments 2014-04-25 18:39:01 +08:00
yihua.huang 05eb7831b6 refactor and comments #110 2014-04-25 18:27:40 +08:00
yihua.huang 375e64e845 more monitor status 2014-04-25 18:10:14 +08:00
yihua.huang 018061d2cd fix error in thread pool 2014-04-25 18:01:02 +08:00
yihua.huang cdc423f2bf log 2014-04-25 17:41:41 +08:00
yihua.huang c6661899fd new thread pool #110 2014-04-25 17:33:48 +08:00
yihua.huang 179baa7a22 return when page is null 2014-04-25 16:07:41 +08:00
yihua.huang 0336f4cdb4 remove IllegalStateException when download error for less error log 2014-04-25 16:06:29 +08:00
yihua.huang 11ba5beb42 [refactor]move monitor to webmagic-extension #98 2014-04-25 13:17:13 +08:00
yihua.huang d61f65cef8 update mbean to mxbean #98 2014-04-25 11:31:43 +08:00
yihua.huang ad6a273b12 update test url 2014-04-25 11:28:35 +08:00
yihua.huang 30af23d003 split monitor to server and client mode #98 2014-04-25 11:25:52 +08:00
yihua.huang ced79630d3 specify jndi and jmx #98 2014-04-25 11:11:15 +08:00
yihua.huang 95d3802e77 add formdata support for post request #108 2014-04-24 11:48:58 +08:00
yihua.huang f49bb877c8 clean some code #109 2014-04-24 11:38:13 +08:00
yihua.huang e1aaf1dd11 fix mistake of guava Table #109 2014-04-24 11:05:49 +08:00
yihua.huang 8ba2da146c request method #108 and more cookie #109 config 2014-04-24 10:51:37 +08:00
yihua.huang b06aa489fb [BugFix]Only one url from sourceRegion can be extracted #107 2014-04-18 17:48:26 +08:00
Bo LIANG 08fa3b01c1 when download error, throw an exception instead of calling onError and returning peacefully. #105 2014-04-17 17:53:12 +08:00
yihua.huang 27b37e8164 extension point and sample for JMX support #98 2014-04-17 08:12:37 +08:00
yihua.huang a5db6cf292 some monitor and JMX support #98 2014-04-17 00:35:09 +08:00
yihua.huang f39aa435cf add null check #104 2014-04-16 19:46:32 +08:00
yihua.huang 42bbe40a37 [Bugfix]Urls will be lost when call setScheduler() #104 2014-04-16 19:45:17 +08:00
Bo LIANG 163773af6b combine two try-catch block into one, make it cleaner. 2014-04-16 16:05:08 +08:00
yihua.huang ec446277b1 some refactor in httpclientdownloader 2014-04-15 15:30:37 +08:00
yihua.huang 4a035e729a extension point for LocalDuplicatedRemovedScheduler #95 2014-04-13 23:31:13 +08:00
yihua.huang b249e49748 [Bugfix]loop error when add TargetRequest #99 2014-04-13 23:04:09 +08:00
Yihua Huang da2f023c12 Merge pull request #96 from ouyanghuangzheng/master
Revised several comments in Spider and Site
2014-04-13 13:12:12 +08:00
yihua.huang f7950ebcab fix tests 2014-04-13 13:00:31 +08:00
愤怒的番茄 32ba1b8889 Fix several comment issues 2014-04-13 12:41:15 +08:00
yihua.huang 84b897f83b update AngularJSProcessor 2014-04-13 12:20:57 +08:00
yihua.huang 03c251237b add Json parse support 2014-04-13 10:23:00 +08:00
愤怒的番茄 644e8d1f72 Sync with the official source code 2014-04-12 22:32:22 +08:00
yihua.huang 969ad1766b change logger style to slf4j for cleaner code 2014-04-06 21:32:20 +08:00
yihua.huang 9b2cb43f47 ConfigurablePageProcessor #91 2014-04-05 23:40:10 +08:00
Bo LIANG b043ac76d6 change the formatter of log.
To use slf4j, we should insert {} into the formatter string.
2014-04-05 11:31:56 +08:00
yihua.huang 7aaf837e15 change logger to slf4j style for performance #84 2014-04-04 20:10:00 +08:00
yihua.huang f9b157951d Merge branch 'master' of github.com:code4craft/webmagic 2014-04-04 20:01:14 +08:00
yihua.huang 22c394e629 [doc] 2014-04-04 20:00:58 +08:00
Bo LIANG 762a3973fd Modify the log levels of LocalDuplicatedRemovedScheduler.java
The old version prints a debug-level log each time the push method is
called. So when an HTML page has multiple links to the same
page, the debug log appears more than once. It also prints a log when it
meets a duplicate URL, which is confusing.
I changed the level of that log to trace. A debug-level log is now printed
only when a URL is actually pushed into the queue.
2014-04-04 15:53:46 +08:00
yihua.huang a1c7e826f7 fix dep of slf4j-log4j12 2014-04-03 23:04:31 +08:00
yihua.huang 01848301d4 encode illegal characters in url #80 2014-04-01 22:14:30 +08:00
yihua.huang 2780423e60 enable blank space in quotes in UrlUtils.fixAllRelativeHrefs #80 2014-04-01 20:35:11 +08:00
yihua.huang 97b6f46280 Bugfix: break loop in addTargetRequests #81 2014-04-01 20:12:25 +08:00
yihua.huang 8d8194bee4 Change HashMap to LinkedHashMap in ResultItems for same order of input and output #76 2014-03-25 08:23:20 +08:00
yihua.huang 8b35d79569 Do not cache document in Selectable for selected Html element #73 2014-03-19 22:19:06 +08:00