BlogJava - SIMONE
http://www.dentisthealthcenter.com/wangxinsh55/

Node.js: converting objects to source strings, executing string code dynamically, and generating the RequireJS r.js build config on the fly
SIMONE, 2016-11-01 16:24

var path = require('path');
var fs = require('fs');
var vm = require('vm');
var os = require('os');

/**
 * Build an indentation string with the given number of tabs.
 */
function toIndent(indent) {
    var s = [];
    for (var i = 0; i < indent; i++) {
        s.push('\t');
    }
    return s.join('');
}

/**
 * Convert an array into its source-code string form.
 */
function array2string(arr, indent) {
    var s = ['[', os.EOL], hasProp = false;
    for (var i = 0; i < arr.length; i++) {
        if (!hasProp) {
            hasProp = true;
        }

        s.push(toIndent(indent + 1));

        var item = arr[i];
        var itemtp = typeof(item);
        if (itemtp === 'object') {
            if (item instanceof Array) {
                s.push(array2string(item, indent + 1));
            } else {
                s.splice(s.length - 2, 2);
                s.push(object2strng(item, indent).trim());
            }
        } else {
            s.push(JSON.stringify(item));
        }
        s.push(',');
        s.push(os.EOL);
    }
    if (hasProp) {
        s.splice(s.length - 2, 1);
    }
    s.push(toIndent(indent));
    s.push(']');
    return s.join('');
}

/**
 * Convert an object into its source-code string form.
 */
function object2strng(obj, indent) {
    var s = ['{', os.EOL], hasProp = false;

    for (var o in obj) {
        if (!hasProp) {
            hasProp = true;
        }
        s.push(toIndent(indent + 1));
        s.push(JSON.stringify(o));
        s.push(':');

        var tp = typeof(obj[o]);
        if (tp === 'object') {
            if (obj[o] instanceof Array) {
                s.push(array2string(obj[o], indent + 1));
            } else {
                s.push(object2strng(obj[o], indent + 1));
            }
        } else if (tp === 'function') {
            s.push(obj[o].toString());
        } else {
            s.push(JSON.stringify(obj[o]));
        }
        s.push(',');
        s.push(os.EOL);
    }
    if (hasProp) {
        s.splice(s.length - 2, 1);
    }
    s.push(toIndent(indent));
    s.push('}');
    return s.join('');
}

// Extract the requirejs config string from the production code and evaluate it into an object;
// adjust its values for the build step below; then write the config back out as a string to a temp file.
var mainPath = path.resolve(process.cwd(), '../js/main.js');
var mainContent = fs.readFileSync(mainPath, 'utf-8').replace(/(requirejs\.config\()?([^)]]*)(\);)?/gm, '$2');
vm.runInThisContext('var mainCfg= ' + mainContent); // turn the extracted string into the mainCfg object
mainCfg.baseUrl = '/static/js/dist/lib';
var nMainCfgStr = 'requirejs.config(' + object2strng(mainCfg, 0) + ');'; // regenerate the main.js config for the build below
var buildPath = path.resolve(process.cwd(), './main.js');
fs.writeFileSync(buildPath, nMainCfgStr);
console.log('write temp file main.js finished');

// Build configuration
var buildJson = {
    appDir: '../js',
    baseUrl: 'lib',
    mainConfigFile: './main.js',
    dir: '../js/dist',
    modules: [{
        'name': '../main',
        include: []
    }]
};
for (var p in mainCfg.paths) { // collect every dependency module so it gets bundled into main.js
    buildJson.modules[0].include.push(p);
}

var buildPath = path.resolve(process.cwd(), './build_main.json');
fs.writeFileSync(buildPath, object2strng(buildJson, 0)); // generate the build config file
console.log('write temp file build_main.json finished');

Write a batch file build.bat:

@echo off
node build.js
node r.js -o build_main.json
@pause

Run it and the bundle is produced.


Java keytool certificate tool usage summary
SIMONE, 2016-10-20 11:20
Original article:
http://www.micmiu.com/lang/java/keytool-start-guide/



Keytool is a Java key and certificate management tool. It stores keys and certificates in a file called a keystore. A keystore holds two kinds of entries: key entries (a secret key, or a private key paired with its public key under asymmetric encryption) and trusted certificate entries (which contain only a public key).
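If you ever need to inspect a keystore from Java rather than from the command line, a minimal sketch with the standard java.security.KeyStore API looks like the following (the path and passwords reuse the example values from later in this post; adjust them for your own keystore):

import java.io.FileInputStream;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.util.Enumeration;

public class KeystoreDump {
    public static void main(String[] args) throws Exception {
        KeyStore ks = KeyStore.getInstance("JKS");
        try (FileInputStream in = new FileInputStream("g:/sso/michael.keystore")) {
            ks.load(in, "michaelpwd2".toCharArray());
        }
        // Walk every entry - roughly the programmatic equivalent of "keytool -list".
        Enumeration<String> aliases = ks.aliases();
        while (aliases.hasMoreElements()) {
            String alias = aliases.nextElement();
            Certificate cert = ks.getCertificate(alias);
            System.out.println(alias + " -> " + (cert == null ? "no certificate" : cert.getType()));
        }
    }
}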
Common keytool parameters in the JDK (they differ between versions; see the official documentation links in the appendix for details):

  • -genkey: creates a default file ".keystore" in the user's home directory and an alias "mykey" containing the user's public key, private key and certificate (if no location is specified, the keystore is created in the user's default system directory).
  • -alias: sets the alias. Every keystore entry is associated with a unique alias, which is usually case-insensitive.
  • -keystore: specifies the keystore file name (the generated data then no longer goes into the default .keystore file).
  • -keyalg: specifies the key algorithm, e.g. RSA or DSA (the default is DSA); see the Java note after this list.
  • -validity: how many days the created certificate is valid (default 90).
  • -keysize: specifies the key length (default 1024).
  • -storepass: the keystore password (required to read keystore information).
  • -keypass: the password of the alias entry (the private key password).
  • -dname: specifies the certificate issuer information, in the form "CN=first and last name, OU=organizational unit, O=organization, L=city or locality, ST=state or province, C=two-letter country code".
  • -list: shows the certificate information in a keystore: keytool -list -v -keystore <keystore> -storepass <password>
  • -v: shows detailed certificate information.
  • -export: exports the certificate identified by an alias to a file: keytool -export -alias <alias to export> -keystore <keystore> -file <output certificate path and name> -storepass <password>
  • -file: the file name to export to (or read from).
  • -delete: deletes an entry from the keystore: keytool -delete -alias <alias to delete> -keystore <keystore> -storepass <password>
  • -printcert: prints an exported certificate: keytool -printcert -file g:\sso\michael.crt
  • -keypasswd: changes the password of a given entry: keytool -keypasswd -alias <alias> -keypass <old password> -new <new password> -storepass <keystore password> -keystore sage
  • -storepasswd: changes the keystore password: keytool -storepasswd -keystore g:\sso\michael.keystore (the keystore to change) -storepass pwdold (old password) -new pwdnew (new password)
  • -import: imports a signed digital certificate into the keystore: keytool -import -alias <alias for the imported entry> -keystore <keystore> -file <certificate to import>
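As a small aside (not from the original post): the -keyalg and -keysize options map directly onto the JDK's KeyPairGenerator, which is what you would use to create the same kind of key pair programmatically:

import java.security.KeyPair;
import java.security.KeyPairGenerator;

public class GenKeyPair {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA"); // same as -keyalg RSA
        kpg.initialize(1024);                                       // same as -keysize 1024
        KeyPair kp = kpg.generateKeyPair();
        System.out.println("Generated a " + kp.getPrivate().getAlgorithm() + " key pair");
    }
}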
Contents:
  1. Generating a certificate
  2. Viewing a certificate
  3. Exporting a certificate
  4. Appendix

1. Generating a certificate

Press Win+R to open the Run dialog, type cmd and press Enter to open a command prompt, then run the following command:
keytool -genkey -alias michaelkey -keyalg RSA -keysize 1024 -keypass michaelpwd -validity 365 -keystore g:\sso\michael.keystore -storepass michaelpwd2
A screenshot of the result follows in the original post.

2. Viewing a certificate

By default the -list command prints the MD5 fingerprint of a certificate. With the -v option the certificate is printed in a human-readable format; with the -rfc option it is output in a printable encoding format.

The -v command:

keytool -list  -v -keystore g:\sso\michael.keystore -storepass michaelpwd2

Pressing Enter prints the detailed certificate information.

The -rfc command:

keytool -list -rfc -keystore g:\sso\michael.keystore -storepass michaelpwd2

Pressing Enter prints the certificate in the encoded form.

3. Exporting and viewing the certificate

Command to export the certificate:

keytool -export -alias michaelkey -keystore g:\sso\michael.keystore -file g:\sso\michael.crt -storepass michaelpwd2

Press Enter to run it.

Viewing the exported certificate:

keytool -printcert -file g:\sso\michael.crt

Pressing Enter shows the certificate information.
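If you prefer to read the exported certificate from Java instead of keytool -printcert, a minimal sketch with java.security.cert.CertificateFactory looks like this (the file path is the one used in the example above):

import java.io.FileInputStream;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;

public class PrintCert {
    public static void main(String[] args) throws Exception {
        CertificateFactory cf = CertificateFactory.getInstance("X.509");
        try (FileInputStream in = new FileInputStream("g:/sso/michael.crt")) {
            X509Certificate cert = (X509Certificate) cf.generateCertificate(in);
            // Roughly what keytool -printcert shows: owner, issuer and validity.
            System.out.println("Owner:  " + cert.getSubjectDN());
            System.out.println("Issuer: " + cert.getIssuerDN());
            System.out.println("Valid:  " + cert.getNotBefore() + " - " + cert.getNotAfter());
        }
    }
}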
4. Appendix

Official documentation for the keytool command:
  • jdk1.4.2: http://docs.oracle.com/javase/1.4.2/docs/tooldocs/windows/keytool.html
  • jdk1.6: http://docs.oracle.com/javase/6/docs/technotes/tools/windows/keytool.html
  • jdk1.7: http://docs.oracle.com/javase/7/docs/technotes/tools/windows/keytool.html


Mobile web debugging with Chrome Remote Debugging
SIMONE, 2016-10-12 10:29
Source:
http://www.cnblogs.com/terrylin/p/4606277.html


Genymotion: fixing extremely slow virtual image downloads
SIMONE, 2016-10-11 14:44
Source: http://blog.csdn.net/qing666888/article/details/51622762

Genymotion is the fastest of the Android emulators, but its servers are overseas, so downloading the Android images is painfully slow.

After clicking "Add new device" the download crawls and easily fails.

The solutions are as follows:

Method 1:

1. Set an HTTP proxy: in Settings -> Network, configure your own HTTP proxy and port.

Method 2:

1. Find the download link and fetch it directly with a download manager such as Thunder (Xunlei).

If the download fails or is too slow, press Win+R, type %appdata%, go up one level (Alt+Up), find the Local folder, open Genymobile inside it, and look at the genymotion.log file there:

Find a line similar to the following:

[Genymotion] [Debug] Downloading file

"http://files2.genymotion.com/dists/4.1.1/ova/genymotion_vbox86p_4.1.1_151117_133208.ova"

Take that http://file…….ova virtual image URL and download it directly with Thunder (or its offline download feature); it finishes quickly.

2. Copy the downloaded file into C:\Users\<your user directory>\AppData\Local\Genymobile\Genymotion\ova, overwriting the image named with a random string - this is exactly the image Genymotion had just started downloading.

3. Repeat the download steps in the GUI; the image is now recognised and no longer needs to be downloaded. Once it is verified and installed it appears in the device list.

Click Start to launch the emulator and you are ready to go.



Enabling Chinese input methods in Sublime Text 3 on Ubuntu MATE
SIMONE, 2016-08-19 17:53

This procedure has been verified on a system that already has Sogou Pinyin for Linux and Sublime Text 3 installed.


Save the following code to the file sublime_imfix.c (in your home directory, ~):


#include <gtk/gtkimcontext.h>

void gtk_im_context_set_client_window (GtkIMContext *context,
         GdkWindow *window)
{
  GtkIMContextClass *klass;
  g_return_if_fail (GTK_IS_IM_CONTEXT (context));
  klass = GTK_IM_CONTEXT_GET_CLASS (context);
  if (klass->set_client_window)
    klass->set_client_window (context, window);
  g_object_set_data(G_OBJECT(context), "window", window);
  if (!GDK_IS_WINDOW (window))
    return;
  int width = gdk_window_get_width(window);
  int height = gdk_window_get_height(window);
  if (width != 0 && height != 0)
    gtk_im_context_focus_in(context);
}



Compile the code from the previous step into the shared library libsublime-imfix.so with the following commands:

cd ~

gcc -shared -o libsublime-imfix.so sublime_imfix.c  `pkg-config --libs --cflags gtk+-2.0` -fPIC


If that does not work, some required libraries may be missing; run the commands below to install them:




sudo apt-get install build-essential
sudo apt-get install libgtk2.0-dev

Then copy libsublime-imfix.so into the directory where sublime_text is installed:

sudo mv libsublime-imfix.so /opt/sublime_text/



Edit the sublime_text.desktop file.
Note: sublime_text.desktop differs between versions; adjust the path to match your installed version.
sudo vim /usr/share/applications/sublime_text.desktop

[Desktop Entry]
Version=1.0
Type=Application
Name=Sublime Text
GenericName=Text Editor
Comment=Sophisticated text editor for code, markup and prose
Exec=/usr/bin/subl %F        # change the Exec path to /usr/bin/subl; the subl file was already modified earlier
Terminal=false
MimeType=text/plain;
Icon=sublime-text
Categories=TextEditor;Development;
StartupNotify=true
Actions=Window;Document;

[Desktop Action Window]
Name=New Window
Exec=/usr/bin/subl -n       # change the Exec path to /usr/bin/subl, as above
OnlyShowIn=Unity;

[Desktop Action Document]
Name=New File
Exec=/usr/bin/subl new_file    # change the Exec path to /usr/bin/subl, as above
OnlyShowIn=Unity;

If you now launch Sublime Text by running /usr/bin/subl from the command line, the Chinese input method should work. Opening a file from the right-click menu, however, still does not pick up the input method; for that, do the following:
Open "Control Center" -> "Main Menu" -> in the "Applications" tree find "Programming", locate "Sublime Text", double-click it and change its command to /usr/bin/subl %F



Java and HTTP, part 2: using proxies
SIMONE, 2016-08-02 14:11
Source: http://www.aneasystone.com/archives/2015/12/java-and-http-using-proxy.html

In the previous post, "Simulating HTTP requests", we introduced two ways to make simulated HTTP requests: HttpURLConnection and HttpClient. So far both work well and basically cover the GET/POST requests we need to simulate. For a crawler, being able to send HTTP requests, fetch page data and parse page content means roughly 80% of the work is done - it is just that the remaining 20% will cost us the other 80% of our time. :-)

In this post we cover one of the most important pieces of that remaining 20%: how to use an HTTP proxy in Java. Proxies are an essential part of crawling. If you crawl other people's pages at scale you will inevitably affect their site, and if you push too hard you will get blocked. To avoid being blocked we can either distribute our program across a large number of machines, which is a luxury for those of us with limited money and resources, or use proxies: collect a batch of them online, free or paid, or buy a few cheap VPSes and run our own proxy servers. How to set up a proxy server is a topic for a later post; once you have a batch of proxy servers, you can use the techniques described here.

1. A simple HTTP proxy

Let's start with the simplest case. There are plenty of free proxies online - just search for "free proxy" or "HTTP proxy". Their safety, however, has been discussed in many articles, including a dedicated investigation (see the references), so I won't repeat it here. If the data your crawler collects raises no particular privacy concerns you can ignore the issue; if the crawler does things like simulated logins, then for safety's sake I recommend running your own proxy servers rather than using public free ones.

1.1 Using a proxy with HttpURLConnection

HttpURLConnection's openConnection() method accepts a Proxy argument:
Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("127.0.0.1", 9876));
URL obj = new URL(url);
HttpURLConnection con = (HttpURLConnection) obj.openConnection(proxy);

And that's it - it's that simple!

What's more, note that the first argument of the Proxy constructor is the enum value Proxy.Type.HTTP, so obviously changing it to Proxy.Type.SOCKS lets you go through a SOCKS proxy instead.
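Following the fragment style of the snippet above, a minimal sketch of the SOCKS variant would be (127.0.0.1:1080 is just a placeholder address, matching the SOCKS example further down):

Proxy socks = new Proxy(Proxy.Type.SOCKS, new InetSocketAddress("127.0.0.1", 1080));
URL obj = new URL(url);
HttpURLConnection con = (HttpURLConnection) obj.openConnection(socks);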

1.2 Using a proxy with HttpClient

Because HttpClient is extremely flexible, there are many different ways to make it use a proxy. The simplest looks like this:
HttpHost proxy = new HttpHost("127.0.0.1", 9876, "HTTP");
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet request = new HttpGet(url);
CloseableHttpResponse response = httpclient.execute(proxy, request);

This is almost identical to the HttpClient request code from the previous post, except that httpclient.execute() takes one extra argument: the first parameter is of type HttpHost, and we set it to our proxy.

Note that although new HttpHost(), just like new Proxy() above, lets you specify the protocol type, HttpClient unfortunately does not support the SOCKS protocol by default. If we write:
HttpHost proxy = new HttpHost("127.0.0.1", 1080, "SOCKS");

it throws a protocol-not-supported exception directly:

org.apache.http.conn.UnsupportedSchemeException: socks protocol is not supported

If you need HttpClient to talk through a SOCKS proxy, see "How to use Socks 5 proxy with Apache HTTP Client 4?" - it can be done with HttpClient's ConnectionSocketFactory class.
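The gist of that answer is to register a custom ConnectionSocketFactory whose sockets are opened through a java.net SOCKS Proxy. The sketch below is my illustration of that idea, not code from the original post; the class name and the "socks.address" context attribute are arbitrary:

class SocksConnectionSocketFactory extends PlainConnectionSocketFactory {
    @Override
    public Socket createSocket(HttpContext context) throws IOException {
        // The SOCKS endpoint is passed in through the HttpContext by the caller.
        InetSocketAddress socksAddr = (InetSocketAddress) context.getAttribute("socks.address");
        return new Socket(new Proxy(Proxy.Type.SOCKS, socksAddr));
    }
}

Registry<ConnectionSocketFactory> reg = RegistryBuilder.<ConnectionSocketFactory>create()
        .register("http", new SocksConnectionSocketFactory())
        .build();
CloseableHttpClient httpclient = HttpClients.custom()
        .setConnectionManager(new PoolingHttpClientConnectionManager(reg))
        .build();

HttpClientContext context = HttpClientContext.create();
context.setAttribute("socks.address", new InetSocketAddress("127.0.0.1", 1080));
CloseableHttpResponse response = httpclient.execute(new HttpGet(url), context);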

Although the approach further above is simple - just one extra argument - take a look at the code comment in HttpClient:
/*
* @param target    the target host for the request.
*                  Implementations may accept <code>null</code>
*                  if they can still determine a route, for example
*                  to a default target or by inspecting the request.
* @param request   the request to execute
*/

As you can see, the first parameter, target, is not the proxy at all; its documented role is "the target host for the request". That explanation is a bit confusing: what exactly counts as the target host of the request, and does a proxy qualify? Common sense says the target host should be the site the requested URL points to, so why does setting target to the proxy still work? I don't know yet; it would take a deeper look at the HttpClient source.

Besides the approach above (my own, and not the recommended one), the HttpClient site provides several examples, which I record here as the recommended usage.

The first way is to use the RequestConfig class:
CloseableHttpClient httpclient = HttpClients.createDefault();      
HttpGet request = new HttpGet(url);
 
request.setConfig(
    RequestConfig.custom()
        .setProxy(new HttpHost("45.32.21.237", 8888, "HTTP"))
        .build()
);
         
CloseableHttpResponse response = httpclient.execute(request);

The second way is to use the RoutePlanner class:
HttpHost proxy = new HttpHost("127.0.0.1", 9876, "HTTP");
DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
CloseableHttpClient httpclient = HttpClients.custom()
        .setRoutePlanner(routePlanner)
        .build();
HttpGet request = new HttpGet(url);
CloseableHttpResponse response = httpclient.execute(request);

2. Using the system proxy configuration

When debugging an HTTP crawler we often need to switch proxies for testing, and sometimes the simplest option is to rely on the system proxy configuration. Back when I worked on .Net projects, programs used the proxy configured in the Internet settings by default. Unfortunately, the "system" in the system proxy configuration discussed here is the JVM, not the operating system; I have not yet found a simple way to make a Java program pick up the Windows proxy settings directly.

Even so, the system proxy is easy to use. There are generally two ways to set the JVM's proxy configuration:

2.1 System.setProperty

The System class in Java is not just there for printing with System.out.println(); it has many useful static methods and properties, and System.setProperty() is one of the more commonly used.

The HTTP, HTTPS and SOCKS proxies can be set separately as follows:

// HTTP proxy; only proxies HTTP requests
System.setProperty("http.proxyHost", "127.0.0.1");
System.setProperty("http.proxyPort", "9876");

// HTTPS proxy; only proxies HTTPS requests
System.setProperty("https.proxyHost", "127.0.0.1");
System.setProperty("https.proxyPort", "9876");

// SOCKS proxy; supports both HTTP and HTTPS requests
// Note: if a SOCKS proxy is set, do not set the HTTP/HTTPS proxies as well
System.setProperty("socksProxyHost", "127.0.0.1");
System.setProperty("socksProxyPort", "1080");

Three things to note:

  1. The JVM prefers the HTTP/HTTPS proxy settings: if both an HTTP/HTTPS proxy and a SOCKS proxy are configured, the SOCKS proxy has no effect.
  2. For historical reasons there is no dot between "socks" and "Proxy" in socksProxyHost and socksProxyPort.
  3. The HTTP and HTTPS proxy settings can be combined into the shorter form below:

// proxies both HTTP and HTTPS requests
System.setProperty("proxyHost", "127.0.0.1");
System.setProperty("proxyPort", "9876");

2.2 JVM command-line arguments

Instead of calling System.setProperty(), the same settings can be passed directly as JVM command-line arguments. In Eclipse, for example:

  1. Open Window -> Preferences -> Java -> Installed JREs -> Edit
  2. In "Default VM arguments" enter: -DproxyHost=127.0.0.1 -DproxyPort=9876

(Screenshot: jvm-arguments.jpg)

2.3 Using the system proxy

Both methods above set the system proxy; how do we make our program use it automatically?

For HttpURLConnection nothing needs to change: it uses the system proxy by default. HttpClient, however, does not. To make it use the system proxy by default, configure a SystemDefaultRoutePlanner with a ProxySelector, as in the following example:
SystemDefaultRoutePlanner routePlanner = new SystemDefaultRoutePlanner(ProxySelector.getDefault());
CloseableHttpClient httpclient = HttpClients.custom()
        .setRoutePlanner(routePlanner)
        .build();
HttpGet request = new HttpGet(url);    
CloseableHttpResponse response = httpclient.execute(request);

References

  1. HttpClient Tutorial
  2. 评测告诉你: 那些免费代理悄悄做的龌龊事儿 (an investigation into what free proxies quietly do)
  3. How to use Socks 5 proxy with Apache HTTP Client 4?
  4. Using ProxySelector to choose a proxy server (使用ProxySelector选择代理服务器)
  5. Java Networking and Proxies



Complex regular expression applications
SIMONE, 2016-07-26 18:04

class ReqParam(queryString: String, val encode: String = "GBK") : HashMap<String, String>() {

    init {
        queryString.split("&+".toRegex()).filter { it.contains("=") }.forEach {
            val kv = it.split("(?<!=+)=".toRegex())
            put(kv[0], URLDecoder.decode(kv[1], encode))
        }
    }
}

The Kotlin code above splits the query string of a URL into k=v pairs and collects them into the map.

fun main(args: Array<String>) {
    val domain = "fu.area.duxiu.com"
    val subdomain = domain.replace(Regex(""".+((\.\w+){2})"""), "$1")
    println(subdomain)
}

Extracting the main (registrable) domain.

public static String cookieDomain(String domain) {
    if (domain.matches("((2[0-4]\\d|25[0-5]|[01]?\\d\\d?)\\.){3}(2[0-4]\\d|25[0-5]|[01]?\\d\\d?)")) { // if the domain is an IP address, the main domain is the IP itself
        return domain;
    }
    return domain.substring(domain.indexOf(".")); // for a cookie shared across subdomains, prefix the domain with "."
}

Detecting an IP address.
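For readers following along in Java rather than Kotlin, a quick equivalent of the subdomain snippet above (the class name is mine; the test value is the same example domain):

public class DomainDemo {
    public static void main(String[] args) {
        String domain = "fu.area.duxiu.com";
        // Keep only the last two dot-separated labels, as in the Kotlin replace above.
        String subdomain = domain.replaceAll(".+((\\.\\w+){2})", "$1");
        System.out.println(subdomain); // prints ".duxiu.com"
    }
}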
Maven dependency packaging plugins
SIMONE, 2016-07-20 09:42
                
<groupId>org.apache.maven.plugins</groupId>
                
<artifactId>maven-shade-plugin</artifactId>
                
<version>2.4.2</version>
                
<configuration>
                    
<createDependencyReducedPom>false</createDependencyReducedPom>
                
</configuration>
                
<executions>
                    
<execution>
                        
<phase>package</phase>
                        
<goals>
                            
<goal>shade</goal>
                        
</goals>
                        
<configuration>
                            
<artifactSet>
                                
<includes>
                                    
<include>org.apache.activemq:activemq-mqtt</include>
                                
</includes>
                            
</artifactSet>
                           
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.duxiu.demo.app.ApplicationKt</mainClass>
</transformer>
</transformers>

                        
</configuration>
                    
</execution>
                
</executions>
            
</plugin>


With this configuration, all dependency jars are unpacked and their contents packaged into the final artifact.
For war packaging, the whole site is unpacked and repackaged as well.
During packaging, if a class on the classpath has the same name as a class in a dependency jar, the classpath class replaces the one from the dependency.


<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-assembly-plugin</artifactId>
    <configuration>
        <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <!--<descriptors>
            <descriptor>assembly.xml</descriptor>
        </descriptors>-->
        <!--<finalName>employees-app-${project.version}</finalName>-->
        <archive>
            <manifest>
                <mainClass>com.duxiu.demo.app.ApplicationKt</mainClass>
            </manifest>
        </archive>
    </configuration>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>
This only unpacks and repackages the dependency jars; static files and the like are not included in the package.


<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>appassembler-maven-plugin</artifactId>
    <version>1.10</version>
    <configuration>
        <!-- Generate launch scripts for both Linux and Windows -->
        <platforms>
            <platform>windows</platform>
            <platform>unix</platform>
        </platforms>
        <!-- Root directory -->
        <assembleDirectory>${project.build.directory}/mall</assembleDirectory>
        <!-- Put the packaged jar and the Maven dependency jars into this directory -->
        <repositoryName>lib</repositoryName>
        <!-- Directory for the executable scripts -->
        <binFolder>bin</binFolder>
        <!-- Target directory for configuration files -->
        <configurationDirectory>conf</configurationDirectory>
        <!-- Copy configuration files into the directory above -->
        <copyConfigurationDirectory>true</copyConfigurationDirectory>
        <!-- Where to copy configuration files from (default src/main/config) -->
        <configurationSourceDirectory>src/main/resources</configurationSourceDirectory>
        <!-- Layout of the jars under lib; the default is the ${groupId}/${artifactId} directory layout, "flat" puts the jars directly into lib -->
        <repositoryLayout>flat</repositoryLayout>
        <encoding>UTF-8</encoding>
        <logsDirectory>logs</logsDirectory>
        <tempDirectory>tmp</tempDirectory>
        <programs>
            <program>
                <id>mall</id>
                <!-- Main class -->
                <mainClass>com.duxiu.demo.app.ApplicationKt</mainClass>
                <jvmSettings>
                    <extraArguments>
                        <extraArgument>-server</extraArgument>
                        <extraArgument>-Xmx2G</extraArgument>
                        <extraArgument>-Xms2G</extraArgument>
                    </extraArguments>
                </jvmSettings>
            </program>
        </programs>
    </configuration>
</plugin>

This packages the application and also generates .bat/.sh launch scripts.


<plugin>
    <artifactId>maven-antrun-plugin</artifactId>
    <executions>
        <execution>
            <id>move-main-class</id>
            <phase>compile</phase>
            <configuration>
                <tasks>
                    <move todir="${project.build.directory}/${project.artifactId}-${version}/com/duxiu/demo/app">
                        <fileset dir="${project.build.directory}/classes/com/duxiu/demo/app">
                            <include name="*.class" />
                        </fileset>
                    </move>
                </tasks>
            </configuration>
            <goals>
                <goal>run</goal>
            </goals>
        </execution>
    </executions>
</plugin>

During packaging, this moves certain files inside the package to a specified location.

Embedded JavaEE development with embedded Tomcat: starting Tomcat
SIMONE, 2016-07-18 14:42
Source: https://www.iflym.com/index.php/code/use-embeded-tomcat-to-javaee-start-tomcat.html

Yesterday I looked into embedding Tomcat into the program itself, instead of copying a web project into Tomcat to run it as before. The reason is that when the project is deployed at a customer site, updates have to be done by hand: copy the new code into Tomcat, then run it. With embedded Tomcat the project and Tomcat are decoupled: during an update, a custom launcher performs the automatic update first, and only when the update finishes does it start Tomcat (or another JavaEE container) to run the project.

The end result is a change in how the project runs. Previously Tomcat was the centre: Tomcat started and stopped the project. Now our launcher program is the centre and is responsible for starting and stopping the project, much like today's typical client/server programs with their own startup scripts that initialise the environment, run the updater and other steps, and only then launch the application itself.

This post explains how to start and stop an embedded Tomcat from code. Most articles online use Tomcat 5.x; here I use the latest Tomcat 7, which ships a separate org.apache.tomcat.embed artifact specifically for embedded development. The Maven dependencies are:

<dependency>
            <groupId>org.apache.tomcat.embed</groupId>
            <artifactId>tomcat-embed-core</artifactId>
            <version>7.0.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.tomcat</groupId>
            <artifactId>tomcat-util</artifactId>
            <version>7.0.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.tomcat.embed</groupId>
            <artifactId>tomcat-embed-jasper</artifactId>
            <version>7.0.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.tomcat.embed</groupId>
            <artifactId>tomcat-embed-logging-juli</artifactId>
            <version>7.0.2</version>
        </dependency>

We use the core artifact from the embed group, the jasper artifact for compiling JSPs, the tomcat-util artifact, and the logging-juli artifact for logging. Now for the code:

// set the working directory
String catalina_home = "d:/";
Tomcat tomcat = new Tomcat();
tomcat.setHostname("localhost");
tomcat.setPort(startPort);
// set the base (working) directory; not much use by itself, but Tomcat needs it to write a few things
tomcat.setBaseDir(catalina_home);

The code above uses the Tomcat class as the bootstrap class. Before Tomcat 7 an Embedded class was used for this; since Tomcat 7 that class is deprecated and the new Tomcat class is recommended instead. We then set the host name, the port, and a working directory. The working directory can be any directory; Tomcat simply needs it to record things such as work files, logs (if logging is configured), and temporary files.

// set the appBase directory that holds the applications
tomcat.getHost().setAppBase("e:/");
// Add AprLifecycleListener
StandardServer server = (StandardServer) tomcat.getServer();
AprLifecycleListener listener = new AprLifecycleListener();
server.addLifecycleListener(listener);
// register the shutdown port so Tomcat can be shut down later
tomcat.getServer().setPort(shutdownPort);

This code first sets the appBase where our project code lives; in a normal Tomcat configuration this directory is usually webapps. It then adds a listener that mainly takes care of bootstrapping things such as the native (APR) support and IPv6 configuration (it can be ignored). Finally it registers a shutdown port: sending a message to this port shuts Tomcat down (more on that below).

// load the context
StandardContext standardContext = new StandardContext();
standardContext.setPath("/aa");    // contextPath
standardContext.setDocBase("aa");  // location of the document base directory
standardContext.addLifecycleListener(new Tomcat.DefaultWebXmlListener());
// make sure the context gets marked as configured
standardContext.addLifecycleListener(new Tomcat.FixContextListener());
standardContext.setSessionCookieName("t-session");
tomcat.getHost().addChild(standardContext);

Here we use a separate StandardContext to add a context to the host. Tomcat itself provides tomcat.addWeb() for adding an application, but because we need to set a custom session cookie name, we add the application in a way similar to what tomcat.addWeb() does internally. Note the two listeners above. DefaultWebXmlListener makes Tomcat load the default configuration (for example the JSP servlet and the long list of mime/type mappings), so we don't have to write all of that ourselves for every project; it is the programmatic equivalent of conf/web.xml in a Tomcat installation. FixContextListener marks the context as configured once deployment is done; without it Tomcat reports at startup that the context is not yet configured. With the configuration in place, we start Tomcat:

tomcat.start();
        tomcat.getServer().await();

Start Tomcat and have it wait on the shutdown port. Without the last line the program would simply exit; with await(), Tomcat keeps listening for a shutdown event and the program only ends once one arrives. So to shut down this Tomcat, just send something to the shutdown port:

private static void shutdown() throws Exception {
        Socket socket = new Socket("localhost", shutdownPort);
        OutputStream stream = socket.getOutputStream();
        for(int i = 0;i < shutdown.length();i++)
            stream.write(shutdown.charAt(i));
        stream.flush();
        stream.close();
        socket.close();
    }
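One note to make the snippet above compile: the post never shows the fields it references, so presumably they look something like the following (the values here are assumptions; the command string must match Tomcat's shutdown command, which defaults to "SHUTDOWN"):

private static final int startPort = 8080;          // HTTP port passed to tomcat.setPort()
private static final int shutdownPort = 8005;       // port registered via tomcat.getServer().setPort()
private static final String shutdown = "SHUTDOWN";  // text written to the shutdown port by shutdown()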

That is enough to shut Tomcat down.

Looking at the project code as a whole, running the project is really just building up a basic server.xml (the conf/server.xml of a normal Tomcat installation) in code: configure the run port and the shutdown listen port, configure the host and add a context, then start and listen. Compare this program with the settings in server.xml and the code above becomes easy to follow.



Secure Kafka Java Producer with Kerberos
SIMONE, 2016-07-05 11:41
Full text: http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/07/05/431097.html

Kerberos configuration
SIMONE, 2016-07-05 11:37
Source: https://www.zybuluo.com/xtccc/note/176298

Which configuration files?

After the Kerberos software is installed, several configuration files come into play, for example:
+ /etc/krb5.conf
+ /var/kerberos/krb5kdc/kdc.conf



Notes on the configuration files

/etc/krb5.conf

You can run man krb5.conf to see the full documentation for this file.

First, a template of the file:

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = EXAMPLE.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true

[realms]
 EXAMPLE.COM = {
  kdc = example.com
  admin_server = example.com
 }

[domain_realm]
 .example.com = EXAMPLE.COM
 example.com = EXAMPLE.COM



Notes on a few important settings (a short Java-side note follows this list):
+ [realms].kdc : the name of the host running a KDC for that realm.
+ [realms].admin_server : identifies the host where the administration server is running. Typically this is the Master Kerberos server.
+ [domain_realm] : provides a translation from a hostname to the Kerberos realm name for the service provided by that host.
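As the Java-side note promised above (this is standard JDK behaviour, not something from the original article): a Java Kerberos client reads the same kind of information either from a krb5.conf file or from system properties, so a minimal sketch looks like this:

public class KerberosClientSetup {
    public static void main(String[] args) {
        // Point the JVM at an explicit krb5.conf (otherwise it falls back to the OS defaults).
        System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
        // Alternatively, realm and KDC can be given directly instead of a config file:
        // System.setProperty("java.security.krb5.realm", "EXAMPLE.COM");
        // System.setProperty("java.security.krb5.kdc", "example.com");
        System.setProperty("sun.security.krb5.debug", "true"); // verbose Kerberos logging
    }
}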



Ubuntu Kerberos configuration
SIMONE, 2016-07-05 11:37
Source: http://www.dentisthealthcenter.com/ivanwan/archive/2012/12/19/393221.html
https://help.ubuntu.com/10.04/serverguide/kerberos.html

Kerberos

Kerberos is a network authentication system based on the principal of a trusted third party. The other two parties being the user and the service the user wishes to authenticate to. Not all services and applications can use Kerberos, but for those that can, it brings the network environment one step closer to being Single Sign On (SSO).

This section covers installation and configuration of a Kerberos server, and some example client configurations.

Overview

If you are new to Kerberos there are a few terms that are good to understand before setting up a Kerberos server. Most of the terms will relate to things you may be familiar with in other environments:

  • Principal: any users, computers, and services provided by servers need to be defined as Kerberos Principals.

  • Instances: are used for service principals and special administrative principals.

  • Realms: the unique realm of control provided by the Kerberos installation. Usually the DNS domain converted to uppercase (EXAMPLE.COM).

  • Key Distribution Center: (KDC) consist of three parts, a database of all principals, the authentication server, and the ticket granting server. For each realm there must be at least one KDC.

  • Ticket Granting Ticket: issued by the Authentication Server (AS), the Ticket Granting Ticket (TGT) is encrypted in the user's password which is known only to the user and the KDC.

  • Ticket Granting Server: (TGS) issues service tickets to clients upon request.

  • Tickets: confirm the identity of the two principals. One principal being a user and the other a service requested by the user. Tickets establish an encryption key used for secure communication during the authenticated session.

  • Keytab Files: are files extracted from the KDC principal database and contain the encryption key for a service or host.

To put the pieces together, a Realm has at least one KDC, preferably two for redundancy, which contains a database of Principals. When a user principal logs into a workstation, configured for Kerberos authentication, the KDC issues a Ticket Granting Ticket (TGT). If the user supplied credentials match, the user is authenticated and can then request tickets for Kerberized services from the Ticket Granting Server (TGS). The service tickets allow the user to authenticate to the service without entering another username and password.
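(Not part of the Ubuntu guide.) For a Java application, "logging in as a principal and obtaining a TGT" is usually done through JAAS. A minimal sketch, assuming a jaas.conf with an entry named "KerberosLogin" that uses com.sun.security.auth.module.Krb5LoginModule, might look like:

import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

public class KerberosLogin {
    public static void main(String[] args) throws LoginException {
        // Assumes -Djava.security.auth.login.config=jaas.conf with an entry named "KerberosLogin".
        LoginContext lc = new LoginContext("KerberosLogin",
                new com.sun.security.auth.callback.TextCallbackHandler());
        lc.login(); // performs the AS exchange and stores the TGT in the Subject
        System.out.println("Authenticated principals: " + lc.getSubject().getPrincipals());
        lc.logout();
    }
}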

Kerberos Server

Installation

Before installing the Kerberos server a properly configured DNS server is needed for your domain. Since the Kerberos Realm by convention matches the domain name, this section uses the example.com domain configured in the section called “Primary Master”.

Also, Kerberos is a time sensitive protocol. So if the local system time between a client machine and the server differs by more than five minutes (by default), the workstation will not be able to authenticate. To correct the problem all hosts should have their time synchronized using the Network Time Protocol (NTP). For details on setting up NTP see the section called “Time Synchronisation with NTP”.

The first step in installing a Kerberos Realm is to install the krb5-kdc and krb5-admin-server packages. From a terminal enter:

sudo apt-get install krb5-kdc krb5-admin-server 

You will be asked at the end of the install to supply a name for the Kerberos and Admin servers, which may or may not be the same server, for the realm.

Next, create the new realm with the krb5_newrealm utility:

sudo krb5_newrealm 

Configuration

The questions asked during installation are used to configure the /etc/krb5.conf file. If you need to adjust the Key Distribution Center (KDC) settings simply edit the file and restart the krb5-kdc daemon.

  1. Now that the KDC running an admin user is needed. It is recommended to use a different username from your everyday username. Using the kadmin.local utility in a terminal prompt enter:

    sudo kadmin.local Authenticating as principal root/admin@EXAMPLE.COM with password. kadmin.local: addprinc steve/admin WARNING: no policy specified for steve/admin@EXAMPLE.COM; defaulting to no policy Enter password for principal "steve/admin@EXAMPLE.COM":  Re-enter password for principal "steve/admin@EXAMPLE.COM":  Principal "steve/admin@EXAMPLE.COM" created. kadmin.local: quit 

    In the above example steve is the Principal, /admin is an Instance, and @EXAMPLE.COM signifies the realm. The "every day" Principal would be steve@EXAMPLE.COM, and should have only normal user rights.

    [Note]

    Replace EXAMPLE.COM and steve with your Realm and admin username.

  2. Next, the new admin user needs to have the appropriate Access Control List (ACL) permissions. The permissions are configured in the /etc/krb5kdc/kadm5.acl file:

    steve/admin@EXAMPLE.COM        * 

    This entry grants steve/admin the ability to perform any operation on all principals in the realm.

  3. Now restart the krb5-admin-server for the new ACL to take affect:

    sudo /etc/init.d/krb5-admin-server restart 
  4. The new user principal can be tested using the kinit utility:

    kinit steve/admin steve/admin@EXAMPLE.COM's Password: 

    After entering the password, use the klist utility to view information about the Ticket Granting Ticket (TGT):

    klist Credentials cache: FILE:/tmp/krb5cc_1000         Principal: steve/admin@EXAMPLE.COM    Issued           Expires          Principal Jul 13 17:53:34  Jul 14 03:53:34  krbtgt/EXAMPLE.COM@EXAMPLE.COM 

    You may need to add an entry into the /etc/hosts for the KDC. For example:

    192.168.0.1   kdc01.example.com       kdc01 

    Replacing 192.168.0.1 with the IP address of your KDC.

  5. In order for clients to determine the KDC for the Realm some DNS SRV records are needed. Add the following to /etc/named/db.example.com:

    _kerberos._udp.EXAMPLE.COM.     IN SRV 1  0 88  kdc01.example.com. _kerberos._tcp.EXAMPLE.COM.     IN SRV 1  0 88  kdc01.example.com. _kerberos._udp.EXAMPLE.COM.     IN SRV 10 0 88  kdc02.example.com.  _kerberos._tcp.EXAMPLE.COM.     IN SRV 10 0 88  kdc02.example.com.  _kerberos-adm._tcp.EXAMPLE.COM. IN SRV 1  0 749 kdc01.example.com. _kpasswd._udp.EXAMPLE.COM.      IN SRV 1  0 464 kdc01.example.com. 
    [Note]

    Replace EXAMPLE.COM, kdc01, and kdc02 with your domain name, primary KDC, and secondary KDC.

    See Chapter 7, Domain Name Service (DNS) for detailed instructions on setting up DNS.

Your new Kerberos Realm is now ready to authenticate clients.

Secondary KDC

Once you have one Key Distribution Center (KDC) on your network, it is good practice to have a Secondary KDC in case the primary becomes unavailable.

  1. First, install the packages, and when asked for the Kerberos and Admin server names enter the name of the Primary KDC:

    sudo apt-get install krb5-kdc krb5-admin-server 
  2. Once you have the packages installed, create the Secondary KDC's host principal. From a terminal prompt, enter:

    kadmin -q "addprinc -randkey host/kdc02.example.com" 
    [Note]

    After, issuing any kadmin commands you will be prompted for your username/admin@EXAMPLE.COM principal password.

  3. Extract the keytab file:

    kadmin -q "ktadd -k keytab.kdc02 host/kdc02.example.com" 
  4. There should now be a keytab.kdc02 in the current directory, move the file to /etc/krb5.keytab:

    sudo mv keytab.kdc02 /etc/krb5.keytab 
    [Note]

    If the path to the keytab.kdc02 file is different, adjust accordingly.

    Also, you can list the principals in a Keytab file, which can be useful when troubleshooting, using the klist utility:

    sudo klist -k /etc/krb5.keytab 
  5. Next, there needs to be a kpropd.acl file on each KDC that lists all KDCs for the Realm. For example, on both primary and secondary KDC, create /etc/krb5kdc/kpropd.acl:

    host/kdc01.example.com@EXAMPLE.COM
    host/kdc02.example.com@EXAMPLE.COM
  6. Create an empty database on the Secondary KDC:

    sudo kdb5_util -s create 
  7. Now start the kpropd daemon, which listens for connections from the kprop utility. kprop is used to transfer dump files:

    sudo kpropd -S 
  8. From a terminal on the Primary KDC, create a dump file of the principal database:

    sudo kdb5_util dump /var/lib/krb5kdc/dump 
  9. Extract the Primary KDC's keytab file and copy it to /etc/krb5.keytab:

    kadmin -q "ktadd -k keytab.kdc01 host/kdc01.example.com"
    sudo mv keytab.kdc01 /etc/krb5.keytab
    [Note]

    Make sure there is a host principal for kdc01.example.com before extracting the keytab.

  10. Using the kprop utility push the database to the Secondary KDC:

    sudo kprop -r EXAMPLE.COM -f /var/lib/krb5kdc/dump kdc02.example.com 
    [Note]

    There should be a SUCCEEDED message if the propagation worked. If there is an error message check /var/log/syslog on the secondary KDC for more information.

    You may also want to create a cron job to periodically update the database on the Secondary KDC. For example, the following will push the database every hour:

    # m h  dom mon dow   command
    0 * * * * /usr/sbin/kdb5_util dump /var/lib/krb5kdc/dump && /usr/sbin/kprop -r EXAMPLE.COM -f /var/lib/krb5kdc/dump kdc02.example.com
  11. Back on the Secondary KDC, create a stash file to hold the Kerberos master key:

    sudo kdb5_util stash 
  12. Finally, start the krb5-kdc daemon on the Secondary KDC:

    sudo /etc/init.d/krb5-kdc start 

The Secondary KDC should now be able to issue tickets for the Realm. You can test this by stopping the krb5-kdc daemon on the Primary KDC and then using kinit to request a ticket. If all goes well, you should receive a ticket from the Secondary KDC.
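For example, a minimal failover check (a sketch using commands already shown above; adjust the principal to your realm):

sudo /etc/init.d/krb5-kdc stop      # on the Primary KDC
kinit steve@EXAMPLE.COM             # from a client; the ticket should now come from the Secondary KDC
klist
sudo /etc/init.d/krb5-kdc start     # restart the Primary KDC afterwards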

Kerberos Linux Client

This section covers configuring a Linux system as a Kerberos client. This will allow access to any kerberized services once a user has successfully logged into the system.

Installation

In order to authenticate to a Kerberos Realm, the krb5-user and libpam-krb5 packages are needed, along with a few others that are not strictly necessary but make life easier. To install the packages enter the following in a terminal prompt:

sudo apt-get install krb5-user libpam-krb5 libpam-ccreds auth-client-config 

The auth-client-config package allows simple configuration of PAM for authentication from multiple sources, and libpam-ccreds caches authentication credentials, allowing you to log in even when the Key Distribution Center (KDC) is unavailable. This is also useful for laptops that authenticate using Kerberos while on the corporate network but also need to be usable off the network.

Configuration

To configure the client in a terminal enter:

sudo dpkg-reconfigure krb5-config 

You will then be prompted to enter the name of the Kerberos Realm. Also, if you don't have DNS configured with Kerberos SRV records, the menu will prompt you for the hostname of the Key Distribution Center (KDC) and Realm Administration server.

The dpkg-reconfigure adds entries to the /etc/krb5.conf file for your Realm. You should have entries similar to the following:

[libdefaults]
        default_realm = EXAMPLE.COM
...
[realms]
        EXAMPLE.COM = {
                kdc = 192.168.0.1
                admin_server = 192.168.0.1
        }

You can test the configuration by requesting a ticket using the kinit utility. For example:

kinit steve@EXAMPLE.COM
Password for steve@EXAMPLE.COM:

When a ticket has been granted, the details can be viewed using klist:

klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: steve@EXAMPLE.COM

Valid starting     Expires            Service principal
07/24/08 05:18:56  07/24/08 15:18:56  krbtgt/EXAMPLE.COM@EXAMPLE.COM
        renew until 07/25/08 05:18:57


Kerberos 4 ticket cache: /tmp/tkt1000
klist: You have no tickets cached

Next, use the auth-client-config to configure the libpam-krb5 module to request a ticket during login:

sudo auth-client-config -a -p kerberos_example 

You should now receive a ticket upon successful login authentication.
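As a quick sanity check of the PAM integration (a suggested step, not from the original guide), destroy any cached tickets, open a new login session, and confirm that klist shows a fresh TGT obtained by libpam-krb5:

kdestroy
# log out and back in (or open a new SSH session), then:
klist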

Kerberos Installation and Configuration

http://blog.csdn.net/caizhongda/article/details/7947722

Installation steps:

1. Download krb5-1.9
http://web.mit.edu/kerberos/dist/krb5/1.9/krb5-1.9-signed.tar

2. Unpack
tar -xvf krb5-1.9-signed.tar
This produces krb5-1.9.tar.gz and krb5-1.9.tar.gz.asc.
Continue unpacking: tar zxvf krb5-1.9.tar.gz

3. Build
cd krb5-1.9/src
./configure
make
make install

4. Configure /etc/krb5.conf
This is the most important Kerberos configuration file, and it must live under /etc.
[libdefaults]
      default_realm = 360BUY.COM

[realms]
      360BUY.COM = {
            kdc = m1.360buy.com
            admin_server = m1.360buy.com
            default_domain = 360buy.com
      }

[logging]
      kdc = FILE:/data/logs/krb5/krb5kdc.log
      admin_server = FILE:/data/logs/krb5/kadmin.log
      default = FILE:/data/logs/krb5/krb5lib.log

default_realm in [libdefaults] is the realm used by default when none is given explicitly.
[logging] specifies where the log files go.
[realms] is the most important, and the hardest, concept in Kerberos: a realm is the scope administered by a KDC, and it may or may not match the DNS domain name.

5. Configure /usr/local/var/krb5kdc/kdc.conf
Since no install prefix was chosen above, the default location is /usr/local/var/krb5kdc.

[kdcdefaults]
      kdc_ports = 750,88

[realms]
      360BUY.COM = {
            database_name = /usr/local/var/krb5kdc/principal
            admin_keytab = /usr/local/var/krb5kdc/kadm5.keytab
            acl_file = /usr/local/var/krb5kdc/kadm5.acl
            key_stash_file = /usr/local/var/krb5kdc/.k5.360BUY.COM
            kdc_ports = 750,88
            max_life = 10h 0m 0s
            max_renewable_life = 7d 0h 0m 0s
      }



6. Create a Kerberos database

/usr/local/sbin/kdb5_util create -r 360BUY.COM -s

You will be asked to set a password for the database.
The database file is created at /usr/local/var/krb5kdc/principal.

7. Log in to Kerberos administration (kadmin.local)

/usr/local/sbin/kadmin.local

1) List principals
listprincs

2) Add a principal
addprinc admin/admin@360BUY.COM

3) Delete a principal
delprinc

4) Create a keytab file

ktadd -k /usr/local/var/krb5kdc/kadm5.keytab kadmin/admin kadmin/changepw

kadmin can be used to grant principals additional permissions.
Note that the path of kadm5.keytab must match the path configured in kdc.conf.


8. Restart the krb5kdc and kadmind processes
/usr/local/sbin/kadmind
/usr/local/sbin/krb5kdc

9. Update the /etc/hosts file
Add the corresponding host entries:
192.168.101.201 m1.360buy.com kdc
192.168.101.202 m2.360buy.com client
Also change the hostname of each machine accordingly.


10. Test requesting a ticket on the KDC server

/usr/local/sbin/kadmin.local
kadmin.local: addprinc winston@360BUY.COM
You will be prompted to create a password; then quit.

su winston
$ kinit winston@360BUY.COM
Enter the password you just created when prompted.

$ klist    (view the ticket you have obtained)

11. Install Kerberos on the client
It also needs to be compiled, but only the /etc/krb5.conf configuration file is required.
Its contents are the same as on the server.

12. Test requesting a ticket from the KDC server on the client
su winston
$ kinit winston@360BUY.COM
Enter the password you just created when prompted.


addprinc -randkey hdfs/sl.360buy.com@360BUY.COM
ktadd -norandkey -k hdfs.keytab hdfs/s1.360buy.com host/master.360buy.com
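A quick way to confirm which principals ended up in the generated keytab (a suggested check, not part of the original notes) is klist with its keytab options:

klist -k -t hdfs.keytab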


Apache Kafka Security 101

http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/06/30/431061.html

A Globally Unique ID Generator for Sharded MySQL (分库分表)

http://lztian.com/blog/5921.html

MySQL's auto_increment feature can be borrowed to generate reliable, unique IDs.

The table definition; the key points are auto_increment and the UNIQUE KEY:

CREATE TABLE `Tickets64` (
  `id` bigint(20) unsigned NOT NULL auto_increment,
  `stub` char(1) NOT NULL default '',
  PRIMARY KEY  (`id`),
  UNIQUE KEY `stub` (`stub`)
) ENGINE=MyISAM

When an ID is needed, use the REPLACE INTO syntax to obtain one. Combined with the UNIQUE KEY in the table definition, a single row is enough to satisfy the ID generator:

REPLACE INTO Tickets64 (stub) VALUES ('a');
SELECT LAST_INSERT_ID();

With this approach, MySQL's own machinery guarantees that the ID is unique and increasing, and it works under heavy concurrency. The official documentation describes it like this (https://dev.mysql.com/doc/refman/5.0/en/information-functions.html):

It is multi-user safe because multiple clients can issue the UPDATE statement and
get their own sequence value with the SELECT statement (or mysql_insert_id()),
without affecting or being affected by other clients that generate their own sequence values.

Note that if the client is PHP, mysql_insert_id() cannot be used to fetch the ID; for the reason, see "mysql_insert_id() 在bigint型AI字段遇到的问题": http://kaifage.com/notes/99/mysql-insert-id-issue-with-bigint-ai-field.html

Flickr adopted this scheme: http://code.flickr.net/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/

Related:

http://www.zhihu.com/question/30674667

http://my.oschina.net/u/142836/blog/174465


Modifying static final Constants in Java
java中,final标识的变量是不可修改的,但是通过反射的方式达C改的目的。修改的CZ也很单,在这?http://stackoverflow.com/questions/2474017/using-reflection-to-change-static-final-file-separatorchar-for-unit-testing
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class EverythingIsTrue {
    static void setFinalStatic(Field field, Object newValue) throws Exception {
        field.setAccessible(true);
        Field modifiersField = Field.class.getDeclaredField("modifiers");
        modifiersField.setAccessible(true);
        modifiersField.setInt(field, field.getModifiers() & ~Modifier.FINAL);
        field.set(null, newValue);
    }
    public static void main(String args[]) throws Exception {
        setFinalStatic(Boolean.class.getField("FALSE"), true);
        System.out.format("Everything is %s", false); // "Everything is true"
    }
}
The key points are calling setAccessible(true) and then clearing the final bit from the field's modifiers. Modifiers can be inspected through java.lang.reflect.Modifier; a detailed explanation is here:
http://blog.csdn.net/xiao__gui/article/details/8141216
Modifier's isPublic, isPrivate, isStatic and similar methods tell you whether a given modifier is present:
boolean isStatic = Modifier.isStatic(field.getModifiers());
if(isStatic) {
    System.out.println(field.get(null).toString());
}
Here field is static, so the argument to field.get(null) can be null (or a target class such as A.class); no instance object is needed. Looking at the source of java.lang.reflect.Modifier shows that modifiers are encoded as bit flags. The article above gives an example:

public static corresponds to binary 1001, i.e. the integer 9:

…   native  transient  volatile  synchronized  final  static  protected  private  public
…   0       0          0         0             0      1       0          0        1


The full definitions in the source code are as follows:
public static final int PUBLIC           = 0x00000001;
public static final int PRIVATE          = 0x00000002;
public static final int PROTECTED        = 0x00000004;
public static final int STATIC           = 0x00000008;
public static final int FINAL            = 0x00000010;
public static final int SYNCHRONIZED     = 0x00000020;
public static final int VOLATILE         = 0x00000040;
public static final int TRANSIENT        = 0x00000080;
public static final int NATIVE           = 0x00000100;
public static final int INTERFACE        = 0x00000200;
public static final int ABSTRACT         = 0x00000400;
public static final int STRICT           = 0x00000800;
From these values, the complete order from high bit to low bit is:
strict, abstract, interface, native, transient, volatile, synchronized, final, static, protected, private, public
This explains what field.getModifiers() & ~Modifier.FINAL does: ~Modifier.FINAL sets the bit used by final to 0 and every other bit to 1, so ANDing it with field.getModifiers() simply strips the final flag from the field's modifiers.
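A small worked example of that bit arithmetic (illustrative only; the constants match the definitions listed above):

// public static final  ->  0x01 | 0x08 | 0x10 = 0x19 (decimal 25)
int modifiers = Modifier.PUBLIC | Modifier.STATIC | Modifier.FINAL;

// ~Modifier.FINAL clears only the final bit:
// 0x19 & ~0x10 = 0x09, i.e. public static without final
int withoutFinal = modifiers & ~Modifier.FINAL;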
While experimenting, however, I ran into a problem. A final variable is assigned with field.set(); if field.get() is called before it, in the order shown below,
field.get(null).toString();
...
field.set(null, newValue);
then the assignment to the final variable fails with an error, and setAccessible(true) does not help. The exact cause is unclear; tracing through the JDK source would be needed to find out.

Note that for int, long, boolean, String and other constants of basic types, compiler optimization means that many places using the constant have its original value inlined. For example:
if (index > maxFormatRecordsIndex) {
    index  =  maxFormatRecordsIndex;
}
With maxFormatRecordsIndex declared final, the compiler rewrites this as:
if (index > 100) {
    index = 100;
}
System.out.println(Bean.INT_VALUE);
// at compile time this is optimized into:
System.out.println(100);
So normal reads still see the original value; to obtain the modified value of the final constant you must use field.get(null).
Overall, changing a final constant of a basic type is of limited use; for non-basic-type constants it can be genuinely useful.


MySQL Table Partitioning: Partitioning by Date

http://blog.sina.com.cn/s/blog_888269b20100w7kf.html

MySQL 5.1 has reached beta, and the official site has been publishing articles about it, such as Improving Database Performance with Partitioning. With partitioning, MySQL can store very large volumes of data. Today I came across a more advanced article on the MySQL site about partitioned storage by date. Being able to partition by date is very practical for time-sensitive data. Below are some excerpts from that article.

A wrong way to partition by date

The most intuitive approach is to partition directly on a year-month-day date value:

CODE:
mysql>  create table rms (d date)
    ->  partition by range (d)
    -> (partition p0 values less than ('1995-01-01'),
    ->  partition p1 VALUES LESS THAN ('2010-01-01'));

The example above partitions a table directly on a "Y-m-d" date. Unfortunately, what seems obvious does not work here; you get an error:

ERROR 1064 (42000): VALUES value must be of same type as partition function near '),
partition p1 VALUES LESS THAN ('2010-01-01'))' at line 3

That partitioning attempt fails, and it would be wasteful anyway; a seasoned DBA would partition on an integer value:

CODE:
mysql> CREATE TABLE part_date1
    ->      (  c1 int default NULL,
    ->  c2 varchar(30) default NULL,
    ->  c3 date default NULL) engine=myisam
    ->      partition by range (cast(date_format(c3,'%Y%m%d') as signed))
    -> (PARTITION p0 VALUES LESS THAN (19950101),
    -> PARTITION p1 VALUES LESS THAN (19960101) ,
    -> PARTITION p2 VALUES LESS THAN (19970101) ,
    -> PARTITION p3 VALUES LESS THAN (19980101) ,
    -> PARTITION p4 VALUES LESS THAN (19990101) ,
    -> PARTITION p5 VALUES LESS THAN (20000101) ,
    -> PARTITION p6 VALUES LESS THAN (20010101) ,
    -> PARTITION p7 VALUES LESS THAN (20020101) ,
    -> PARTITION p8 VALUES LESS THAN (20030101) ,
    -> PARTITION p9 VALUES LESS THAN (20040101) ,
    -> PARTITION p10 VALUES LESS THAN (20100101),
    -> PARTITION p11 VALUES LESS THAN MAXVALUE );
Query OK, 0 rows affected (0.01 sec)

 

Done. Now let's analyze it:

CODE:
mysql> explain partitions
    -> select count(*) from part_date1 where
    ->      c3> date '1995-01-01' and c3 <date '1995-12-31'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: part_date1
   partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 8100000
        Extra: Using where
1 row in set (0.00 sec)

 

Annoyingly, MySQL performs a full table scan for the SQL above instead of pruning by our date partitions. The original article explains that the MySQL optimizer does not recognize this style of date partitioning, and it spends a lot of space leading readers down this dead end. Too much.

The correct way to partition by date

The MySQL optimizer supports partition pruning with the following two built-in date functions:

  • TO_DAYS()
  • YEAR()

An example:

CODE:
mysql> CREATE TABLE part_date3
    ->      (  c1 int default NULL,
    ->  c2 varchar(30) default NULL,
    ->  c3 date default NULL) engine=myisam
    ->      partition by range (to_days(c3))
    -> (PARTITION p0 VALUES LESS THAN (to_days('1995-01-01')),
    -> PARTITION p1 VALUES LESS THAN (to_days('1996-01-01')) ,
    -> PARTITION p2 VALUES LESS THAN (to_days('1997-01-01')) ,
    -> PARTITION p3 VALUES LESS THAN (to_days('1998-01-01')) ,
    -> PARTITION p4 VALUES LESS THAN (to_days('1999-01-01')) ,
    -> PARTITION p5 VALUES LESS THAN (to_days('2000-01-01')) ,
    -> PARTITION p6 VALUES LESS THAN (to_days('2001-01-01')) ,
    -> PARTITION p7 VALUES LESS THAN (to_days('2002-01-01')) ,
    -> PARTITION p8 VALUES LESS THAN (to_days('2003-01-01')) ,
    -> PARTITION p9 VALUES LESS THAN (to_days('2004-01-01')) ,
    -> PARTITION p10 VALUES LESS THAN (to_days('2010-01-01')),
    -> PARTITION p11 VALUES LESS THAN MAXVALUE );
Query OK, 0 rows affected (0.00 sec)

 

Partitioning with to_days() succeeds. Let's analyze it:

CODE:
mysql> explain partitions
    -> select count(*) from part_date3 where
    ->      c3> date '1995-01-01' and c3 <date '1995-12-31'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: part_date3
   partitions: p1
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 808431
        Extra: Using where
1 row in set (0.00 sec)

 

As you can see, this time the MySQL optimizer lives up to expectations and queries only partition p1. Does querying this way really bring better efficiency? Below is a comparison of the newly created part_date3 against the badly partitioned part_date1:

CODE:
mysql> select count(*) from part_date3 where
    ->      c3> date '1995-01-01' and c3 <date '1995-12-31';
+----------+
| count(*) |
+----------+
|   805114 |
+----------+
1 row in set (4.11 sec)

mysql> select count(*) from part_date1 where
    ->      c3> date '1995-01-01' and c3 <date '1995-12-31';
+----------+
| count(*) |
+----------+
|   805114 |
+----------+
1 row in set (40.33 sec)

 

As you can see, with the correct partitioning the query takes 4 seconds, while with the wrong partitioning it takes 40 seconds (equivalent to no partitioning at all), an improvement of about 90%. So make sure to use partitioning correctly, and always verify with EXPLAIN afterwards; only then do you get a real performance gain.
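For completeness, here is a sketch of the same table partitioned with the other supported function, YEAR(); this is not from the original article, just the same pattern applied to whole years:

CREATE TABLE part_date4
     (  c1 int default NULL,
        c2 varchar(30) default NULL,
        c3 date default NULL) engine=myisam
     partition by range (year(c3))
(PARTITION p0 VALUES LESS THAN (1995),
 PARTITION p1 VALUES LESS THAN (2000),
 PARTITION p2 VALUES LESS THAN (2005),
 PARTITION p3 VALUES LESS THAN MAXVALUE);

As with to_days(), check the partitions column of EXPLAIN PARTITIONS before relying on it.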


Note:

In MySQL 5.1, a partitioning expression may only use the following functions:
ABS()
CEILING() and FLOOR() (these two can only be used when the partitioning key they are applied to is of an INT type), for example:

mysql> CREATE TABLE t (c FLOAT) PARTITION BY LIST( FLOOR(c) )(
    -> PARTITION p0 VALUES IN (1,3,5),
    -> PARTITION p1 VALUES IN (2,4,6)
    -> );
ERROR 1491 (HY000): The PARTITION function returns the wrong type

mysql> CREATE TABLE t (c int) PARTITION BY LIST( FLOOR(c) )(
    -> PARTITION p0 VALUES IN (1,3,5),
    -> PARTITION p1 VALUES IN (2,4,6)
    -> );
Query OK, 0 rows affected (0.01 sec)

DAY()
DAYOFMONTH()
DAYOFWEEK()
DAYOFYEAR()
DATEDIFF()
EXTRACT()
HOUR()
MICROSECOND()
MINUTE()
MOD()
MONTH()
QUARTER()
SECOND()
TIME_TO_SEC()
TO_DAYS()
WEEKDAY()
YEAR()
YEARWEEK()



Max MQTT connections

http://stackoverflow.com/questions/29358313/max-mqtt-connections?answertab=votes#tab-top


I need to create a server farm that can handle 5+ million connections and 5+ million topics (one per client), and process 300k messages/sec.

I tried to see what various message brokers were capable of, so I am currently using two RHEL EC2 instances (r3.4xlarge) to have plenty of available resources. So you do not need to look it up: each has 16 vCPUs and 122GB RAM. I am nowhere near that limit in usage.

I am unable to pass the 600k connections limit. Since there doesn't seem to be any O/S limitation (plenty of RAM/CPU/etc.) on either the client or the server, what is limiting me?

I have edited /etc/security/limits.conf as follows:

* soft  nofile  20000000
* hard  nofile  20000000

* soft  nproc  20000000
* hard  nproc  20000000

root  soft  nofile 20000000
root  hard  nofile 20000000

I have edited /etc/sysctl.conf as follows:

net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 5242880  5242880 5242880
net.ipv4.tcp_tw_recycle = 1
fs.file-max = 20000000
fs.nr_open = 20000000
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_max_syn_backlog = 10000
net.ipv4.tcp_synack_retries = 3
net.core.somaxconn=65536
net.core.netdev_max_backlog=100000
net.core.optmem_max = 20480000

For Apollo: export APOLLO_ULIMIT=20000000

For ActiveMQ:

ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS -Dorg.apache.activemq.UseDedicatedTaskRunner=false"
ACTIVEMQ_OPTS_MEMORY="-Xms50G -Xmx115G"

I created 20 additional private addresses for eth0 on the client, then assigned them: ip addr add 11.22.33.44/24 dev eth0
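For reference, assigning a block of secondary addresses can be scripted; a rough sketch (the 11.22.33.x addresses are placeholders, like the one above):

for i in $(seq 1 20); do
  ip addr add 11.22.33.$((100 + i))/24 dev eth0
done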

I am FULLY aware of the 65k port limits which is why I did the above.

  • For ActiveMQ I got to: 574309
  • For Apollo I got to: 592891
  • For Rabbit I got to 90k but logging was awful and I couldn't figure out what to do to go higher, although I know it's possible.
  • For Hive I got to the trial limit of 1000. Awaiting a license.
  • IBM wants to trade the cost of my house to use them - nah!
asked Mar 30 '15 at 23:52 by redboy

Can't really tell how to increase the throughput. However, check out kafka.apache.org. Not sure about the MQTT support, but it seems capable of extreme throughput / # clients. – Petter Nordlander Mar 31 '15 at 7:52

Did you try mosquitto? (mosquitto.org) – Aleksey Izmailov Apr 2 '15 at 8:02

Trying Hive, Apollo, Mosquitto, Active, Rabbit – redboy Apr 2 '15 at 21:58

ANSWER: While doing this I realized that I had a misspelling in my client setting within /etc/sysctl.conf file for: net.ipv4.ip_local_port_range

I am now able to connect 956,591 MQTT clients to my Apollo server in 188sec.


More info: Trying to isolate if this is an O/S connection limitation or a Broker, I decided to write a simple Client/Server.

The server:

    Socket client = null;
    server = new ServerSocket(1884);
    while (true) {
        client = server.accept();
        clients.add(client);
    }

The Client:

    while (true) {
        InetAddress clientIPToBindTo = getNextClientVIP();
        Socket client = new Socket(hostname, 1884, clientIPToBindTo, 0);
        clients.add(client);
    }

With 21 IPs, I would expect (65535-1024)*21 = 1354731 to be the boundary. In reality I am able to achieve 1231734.

[root@ip ec2-user]# cat /proc/net/sockstat
sockets: used 1231734
TCP: inuse 5 orphan 0 tw 0 alloc 1231307 mem 2
UDP: inuse 4 mem 1
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0

So the socket/kernel/io stuff is worked out.

I am STILL unable to achieve this using any broker.

Again, just after my client/server test, these are the kernel settings.

Client:

[root@ip ec2-user]# sysctl -p
net.ipv4.ip_local_port_range = 1024     65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 5242880      5242880 15242880
net.ipv4.tcp_tw_recycle = 1
fs.file-max = 20000000
fs.nr_open = 20000000

[root@ip ec2-user]# cat /etc/security/limits.conf
* soft  nofile  2000000
* hard  nofile  2000000
root  soft  nofile 2000000
root  hard  nofile 2000000

Server:

[root@ ec2-user]# sysctl -p
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 5242880      5242880 5242880
net.ipv4.tcp_tw_recycle = 1
fs.file-max = 20000000
fs.nr_open = 20000000
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_max_syn_backlog = 1000000
net.ipv4.tcp_synack_retries = 3
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 1000000
net.core.optmem_max = 20480000


Configuring Kerberos Authentication for HDFS

http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/31/430718.html

Configuring and Using the Spark History Server

http://www.cnblogs.com/luogankun/p/3981645.html

Background: why the Spark History Server exists

Take standalone mode as an example. While a Spark application is running, Spark provides a Web UI listing the application's runtime information, but that Web UI is closed as soon as the application finishes (whether it succeeds or fails). In other words, once a Spark application has finished, its history can no longer be viewed.

The Spark History Server was created for exactly this situation: with the right configuration, event log information is recorded while the application executes, and after the application finishes the Web UI can re-render the pages showing the application's runtime information.

When Spark runs on YARN or Mesos, the history server can still reconstruct the runtime information of an already completed application (provided the application's event logs were recorded).

 

Configuring & using the Spark History Server

Start the Spark History Server with the default configuration:

cd $SPARK_HOME/sbin
start-history-server.sh

It fails with:

starting org.apache.spark.deploy.history.HistoryServer, logging to /home/spark/software/source/compile/deploy_spark/sbin/../logs/spark-spark-org.apache.spark.deploy.history.HistoryServer-1-hadoop000.out
failed to launch org.apache.spark.deploy.history.HistoryServer:
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:44)
        ... 6 more

A log directory has to be specified at startup:

start-history-server.sh hdfs://hadoop000:8020/directory

hdfs://hadoop000:8020/directory can also be set in a configuration file, in which case it does not need to be passed when starting the history server; how to configure that is covered below.

Note: the directory must be created on HDFS in advance, otherwise the history server fails to start.
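For example (a suggested command, using the same path as in this post):

hadoop fs -mkdir hdfs://hadoop000:8020/directory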

Once it is up, the Web UI can be accessed on the default port 18080: http://hadoop000:18080

The application list is empty by default; it only fills up after some applications have run (I ran a few spark-sql tests).

 

Description of the history-server-related configuration parameters

1) spark.history.updateInterval
  Default: 10
  The interval, in seconds, at which the log information is refreshed.

2) spark.history.retainedApplications
  Default: 50
  The number of application histories kept in memory. Beyond this number, older application information is dropped, and the page has to be rebuilt when a dropped application is visited again.

3) spark.history.ui.port
  Default: 18080
  The web port of the History Server.

4) spark.history.kerberos.enabled
  Default: false
  Whether to use Kerberos when accessing the History Server; useful when the persistence layer lives on HDFS in a secured cluster. If set to true, the two properties below must be configured.

5) spark.history.kerberos.principal
  The Kerberos principal name used by the History Server.

6) spark.history.kerberos.keytab
  The location of the Kerberos keytab file used by the History Server.

7) spark.history.ui.acls.enable
  Default: false
  Whether to check ACLs when authorizing users to view application information. If enabled, only the application owner and the users listed in spark.ui.view.acls may view the application information; otherwise no check is performed.

8) spark.eventLog.enabled
  Default: false
  Whether to record Spark events, used to rebuild the Web UI after the application has finished.

9) spark.eventLog.dir
  Default: file:///tmp/spark-events
  The path where the log information is saved; it may be an hdfs:// path or a local file:// path, and it must be created in advance.

10) spark.eventLog.compress
  Default: false
  Whether to compress the recorded Spark events (requires spark.eventLog.enabled to be true); snappy is used by default.

Properties starting with spark.history are configured via SPARK_HISTORY_OPTS in spark-env.sh; properties starting with spark.eventLog are configured in spark-defaults.conf.

 

The configuration I used while testing is as follows:

spark-defaults.conf

spark.eventLog.enabled  true
spark.eventLog.dir      hdfs://hadoop000:8020/directory
spark.eventLog.compress true

spark-env.sh

export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=7777 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://hadoop000:8020/directory"

Parameter descriptions:

spark.history.ui.port=7777 changes the Web UI port to 7777.

spark.history.fs.logDirectory=hdfs://hadoop000:8020/directory means the path no longer has to be given explicitly when running start-history-server.sh.

spark.history.retainedApplications=3 sets the number of application histories kept in memory; beyond this, older application information is dropped.



After adjusting the parameters, start the history server:

start-history-server.sh 

Access the Web UI at: http://hadoop000:7777

 

A few questions that came up while using the Spark History Server:

Question 1: What is the difference between the directories specified by spark.history.fs.logDirectory and spark.eventLog.dir?

Testing shows:

spark.eventLog.dir: all information produced while an application runs is recorded under the path given by this property;

spark.history.fs.logDirectory: the Spark History Server page only displays the information found under this path.

For example, suppose spark.eventLog.dir initially pointed to hdfs://hadoop000:8020/directory and was later changed to hdfs://hadoop000:8020/directory2.

If spark.history.fs.logDirectory points to hdfs://hadoop000:8020/directory, then only the run logs of the applications under that directory are displayed, and vice versa.



Question 2: spark.history.retainedApplications=3 does not seem to take effect?!

The History Server will list all applications. It will just retain a max number of them in memory. That option does not control how many applications are show, it controls how much memory the HS will need.

Note: this parameter is not the number of application records shown on the page, but the number kept in memory; information already in memory is simply read and rendered when the page is visited.

For example, if the parameter is set to 10, then at most 10 applications' log information can be kept in memory; when the 11th is added, the first one is evicted, and when that first application's page is visited again, its log information has to be re-read from the configured path to render the page.

See the official documentation for details: http://spark.apache.org/docs/latest/monitoring.html



Using the spark.yarn.jar Property with Spark on YARN

http://www.cnblogs.com/luogankun/p/4191796.html

While testing spark-sql running on YARN today, I happened to notice a problem in the logs:

spark-sql --master yarn
14/12/29 15:23:17 INFO Client: Requesting a new application from cluster with 1 NodeManagers
14/12/29 15:23:17 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
14/12/29 15:23:17 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
14/12/29 15:23:17 INFO Client: Setting up container launch context for our AM
14/12/29 15:23:17 INFO Client: Preparing resources for our AM container
14/12/29 15:23:17 INFO Client: Uploading resource file:/home/spark/software/source/compile/deploy_spark/assembly/target/scala-2.10/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar -> hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0093/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar
14/12/29 15:23:18 INFO Client: Setting up the launch environment for our AM container

Opening a second spark-sql command line, the logs show the same thing again:

14/12/29 15:24:03 INFO Client: Requesting a new application from cluster with 1 NodeManagers
14/12/29 15:24:03 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
14/12/29 15:24:03 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
14/12/29 15:24:03 INFO Client: Setting up container launch context for our AM
14/12/29 15:24:03 INFO Client: Preparing resources for our AM container
14/12/29 15:24:03 INFO Client: Uploading resource file:/home/spark/software/source/compile/deploy_spark/assembly/target/scala-2.10/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar -> hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0094/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar
14/12/29 15:24:05 INFO Client: Setting up the launch environment for our AM container

Then look at the files on HDFS:

hadoop fs -ls hdfs://hadoop000:8020/user/spark/.sparkStaging/
drwx------   - spark supergroup          0 2014-12-29 15:23 hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0093
drwx------   - spark supergroup          0 2014-12-29 15:24 hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0094

Every application uploads its own copy of spark-assembly-x.x.x-SNAPSHOT-hadoopx.x.x-cdhx.x.x.jar, which hurts HDFS performance and wastes HDFS space.

 

The Spark documentation (http://spark.apache.org/docs/latest/running-on-yarn.html) describes the spark.yarn.jar property, so I put spark-assembly-xxxxx.jar into hdfs://hadoop000:8020/spark_lib/.
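The upload itself is a one-off step; a suggested sketch using standard HDFS commands (the local jar path is the one shown in the logs above):

hadoop fs -mkdir hdfs://hadoop000:8020/spark_lib
hadoop fs -put /home/spark/software/source/compile/deploy_spark/assembly/target/scala-2.10/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar hdfs://hadoop000:8020/spark_lib/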

Add the property to spark-defaults.conf:

spark.yarn.jar hdfs://hadoop000:8020/spark_lib/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar

Start spark-sql --master yarn again and watch the logs:

14/12/29 15:39:02 INFO Client: Requesting a new application from cluster with 1 NodeManagers
14/12/29 15:39:02 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
14/12/29 15:39:02 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
14/12/29 15:39:02 INFO Client: Setting up container launch context for our AM
14/12/29 15:39:02 INFO Client: Preparing resources for our AM container
14/12/29 15:39:02 INFO Client: Source and destination file systems are the same. Not copying hdfs://hadoop000:8020/spark_lib/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar
14/12/29 15:39:02 INFO Client: Setting up the launch environment for our AM container

Check the files on HDFS:

hadoop fs -ls hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0097

The directory for that application no longer contains spark-assembly-xxxxx.jar, saving both the assembly upload step and HDFS space.

 

During testing I ran into an error like the following:

Application application_xxxxxxxxx_yyyy failed 2 times due to AM Container for application_xxxxxxxxx_yyyy 

exited with exitCode: -1000 due to: java.io.FileNotFoundException: File /tmp/hadoop-spark/nm-local-dir/filecache does not exist

Creating a filecache folder under /tmp/hadoop-spark/nm-local-dir fixes this error.



Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)

https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines


I wrote a blog post about how LinkedIn uses Apache Kafka as a central publish-subscribe log for integrating data between applications, stream processing, and Hadoop data ingestion.

To actually make this work, though, this "universal log" has to be a cheap abstraction. If you want to use a system as a central data hub it has to be fast, predictable, and easy to scale so you can dump all your data onto it. My experience has been that systems that are fragile or expensive inevitably develop a wall of protective process to prevent people from using them; a system that scales easily often ends up as a key architectural building block just because using it is the easiest way to get things built.

I've always liked the benchmarks of Cassandra that show it doing a million writes per second on three hundred machines on EC2 and Google Compute Engine. I'm not sure why, maybe it is a Dr. Evil thing, but doing a million of anything per second is fun.

In any case, one of the nice things about a Kafka log is that, as we'll see, it is cheap. A million writes per second isn't a particularly big thing. This is because a log is a much simpler thing than a database or key-value store. Indeed our production clusters take tens of millions of reads and writes per second all day long and they do so on pretty modest hardware.

But let's do some benchmarking and take a look.

Kafka in 30 seconds

To help understand the benchmark, let me give a quick review of what Kafka is and a few details about how it works. Kafka is a distributed messaging system originally built at LinkedIn and now part of the Apache Software Foundation and used by a variety of companies.

The general setup is quite simple. Producers send records to the cluster which holds on to these records and hands them out to consumers:

The key abstraction in Kafka is the topic. Producers publish their records to a topic, and consumers subscribe to one or more topics. A Kafka topic is just a sharded write-ahead log. Producers append records to these logs and consumers subscribe to changes. Each record is a key/value pair. The key is used for assigning the record to a log partition (unless the publisher specifies the partition directly).

Here is a simple example of a single producer and consumer reading and writing from a two-partition topic.

This picture shows a producer process appending to the logs for the two partitions, and a consumer reading from the same logs. Each record in the log has an associated entry number that we call the offset. This offset is used by the consumer to describe its position in each of the logs.

These partitions are spread across a cluster of machines, allowing a topic to hold more data than can fit on any one machine.

Note that unlike most messaging systems the log is always persistent. Messages are immediately written to the filesystem when they are received. Messages are not deleted when they are read but retained with some configurable SLA (say a few days or a week). This allows usage in situations where the consumer of data may need to reload data. It also makes it possible to support space-efficient publish-subscribe as there is a single shared log no matter how many consumers; in traditional messaging systems there is usually a queue per consumer, so adding a consumer doubles your data size. This makes Kafka a good fit for things outside the bounds of normal messaging systems such as acting as a pipeline for offline data systems such as Hadoop. These offline systems may load only at intervals as part of a periodic ETL cycle, or may go down for several hours for maintenance, during which time Kafka is able to buffer even TBs of unconsumed data if needed.

Kafka also replicates its logs over multiple servers for fault-tolerance. One important architectural aspect of our replication implementation, in contrast to other messaging systems, is that replication is not an exotic bolt-on that requires complex configuration, only to be used in very specialized cases. Instead replication is assumed to be the default: we treat un-replicated data as a special case where the replication factor happens to be one.

Producers get an acknowledgement back when they publish a message containing the record's offset. The first record published to a partition is given the offset 0, the second record 1, and so on in an ever-increasing sequence. Consumers consume data from a position specified by an offset, and they save their position in a log by committing periodically: saving this offset in case that consumer instance crashes and another instance needs to resume from its position.

Okay, hopefully that all made sense (if not, you can read a more complete introduction to Kafka here).

This Benchmark

This test is against trunk, as I made some improvements to the performance tests for this benchmark. But nothing too substantial has changed since the last full release, so you should see similar results with 0.8.1. I am also using our newly re-written Java producer, which offers much improved throughput over the previous producer client.

I've followed the basic template of this very nice RabbitMQ benchmark, but I covered scenarios and options that were more relevant to Kafka.

One quick philosophical note on this benchmark. For benchmarks that are going to be publicly reported, I like to follow a style I call "lazy benchmarking". When you work on a system, you generally have the know-how to tune it to perfection for any particular use case. This leads to a kind of benchmarketing where you heavily tune your configuration to your benchmark or worse have a different tuning for each scenario you test. I think the real test of a system is not how it performs when perfectly tuned, but rather how it performs "off the shelf". This is particularly true for systems that run in a multi-tenant setup with dozens or hundreds of use cases where tuning for each use case would be not only impractical but impossible. As a result, I have pretty much stuck with default settings, both for the server and the clients. I will point out areas where I suspect the result could be improved with a little tuning, but I have tried to resist the temptation to do any fiddling myself to improve the results.

I have posted my exact configurations and commands, so it should be possible to replicate results on your own gear if you are interested.

The Setup

For these tests, I had six machines each has the following specs

  • Intel Xeon 2.5 GHz processor with six cores
  • Six 7200 RPM SATA drives
  • 32GB of RAM
  • 1Gb Ethernet

The Kafka cluster is set up on three of the machines. The six drives are directly mounted with no RAID (JBOD style). The remaining three machines I use for Zookeeper and for generating load.

A three machine cluster isn't very big, but since we will only be testing up to a replication factor of three, it is all we need. As should be obvious, we can always add more partitions and spread data onto more machines to scale our cluster horizontally.

This hardware is actually not LinkedIn's normal Kafka hardware. Our Kafka machines are more closely tuned to running Kafka, but are less in the spirit of "off-the-shelf" I was aiming for with these tests. Instead, I borrowed these from one of our Hadoop clusters, which runs on probably the cheapest gear of any of our persistent systems. Hadoop usage patterns are pretty similar to Kafka's, so this is a reasonable thing to do.

Okay, without further ado, the results!

Producer Throughput

These tests will stress the throughput of the producer. No consumers are run during these tests, so all messages are persisted but not read (we'll test cases with both producer and consumer in a bit). Since we have recently rewritten our producer, I am testing this new code.

Single producer thread, no replication

821,557 records/sec
(78.3 MB/sec)

For this first test I create a topic with six partitions and no replication. Then I produce 50 million small (100 byte) records as quickly as possible from a single thread.

The reason for focusing on small records in these tests is that it is the harder case for a messaging system (generally). It is easy to get good throughput in MB/sec if the messages are large, but much harder to get good throughput when the messages are small, as the overhead of processing each message dominates.

Throughout this benchmark, when I am reporting MB/sec, I am reporting just the value size of the record times the request per second, none of the other overhead of the request is included. So the actually network usage is higher than what is reported. For example with a 100 byte message we would also transmit about 22 bytes of overhead per message (for an optional key, size delimiting, a message CRC, the record offset, and attributes flag), as well as some overhead for the request (including the topic, partition, required acknowledgements, etc). This makes it a little harder to see where we hit the limits of the NIC, but this seems a little more reasonable then including our own overhead bytes in throughput numbers. So, in the above result, we are likely saturating the 1 gigabit NIC on the client machine.

One immediate observation is that the raw numbers here are much higher than people expect, especially for a persistent storage system. If you are used to random-access data systems, like a database or key-value store, you will generally expect maximum throughput around 5,000 to 50,000 queries-per-second, as this is close to the speed that a good RPC layer can do remote requests. We exceed this due to two key design principles:

  1. We work hard to ensure we do linear disk I/O. The six cheap disks these servers have gives an aggregate throughput of 822 MB/sec of linear disk I/O. This is actually well beyond what we can make use of with only a 1 gigabit network card. Many messaging systems treat persistence as an expensive add-on that decimates performance and should be used only sparingly, but this is because they are not able to do linear I/O.
  2. At each stage we work on batching together small bits of data into larger network and disk I/O operations. For example, in the new producer we use a "group commit"-like mechanism to ensure that any record sends initiated while another I/O is in progress get grouped together. For more on understanding the importance of batching, check out this presentation by David Patterson on why "Latency Lags Bandwidth".
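As an illustration of those batching knobs in the new Java producer, here is a minimal sketch; the broker address, topic name, and values are placeholders, not the settings used in this benchmark:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BatchingProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-host:9092"); // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("acks", "1");           // leader-only acknowledgement (the async replication case)
        props.put("batch.size", "16384"); // max bytes batched per partition before a send
        props.put("linger.ms", "5");      // wait briefly so batches can fill up

        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
        // 100-byte values, the record size used throughout this benchmark
        for (int i = 0; i < 1000; i++) {
            producer.send(new ProducerRecord<>("test-topic", new byte[100]));
        }
        producer.close();
    }
}

Larger batch.size and linger.ms values trade a little latency for fewer, larger requests, which is exactly the batching effect described above.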

If you are interested in the details you can read a little more about this in our design documents.

Single producer thread, 3x asynchronous replication

786,980 records/sec
(75.1 MB/sec)

This test is exactly the same as the previous one except that now each partition has three replicas (so the total data written to network or disk is three times higher). Each server is doing both writes from the producer for the partitions for which it is a master, as well as fetching and writing data for the partitions for which it is a follower.

Replication in this test is asynchronous. That is, the server acknowledges the write as soon as it has written it to its local log without waiting for the other replicas to also acknowledge it. This means, if the master were to crash, it would likely lose the last few messages that had been written but not yet replicated. This makes the message acknowledgement latency a little better at the cost of some risk in the case of server failure.

The key take away I would like people to have from this is that replication can be fast. The total cluster write capacity is, of course, 3x less with 3x replication (since each write is done three times), but the throughput is still quite good per client. High performance replication comes in large part from the efficiency of our consumer (the replicas are really nothing more than a specialized consumer) which I will discuss in the consumer section.

Single producer thread, 3x synchronous replication

421,823 records/sec
(40.2 MB/sec)

This test is the same as above except that now the master for a partition waits for acknowledgement from the full set of in-sync replicas before acknowledging back to the producer. In this mode, we guarantee that messages will not be lost as long as one in-sync replica remains.

Synchronous replication in Kafka is not fundamentally very different from asynchronous replication. The leader for a partition always tracks the progress of the follower replicas to monitor their liveness, and we never give out messages to consumers until they are fully acknowledged by replicas. With synchronous replication we just wait to respond to the producer request until the followers have replicated it.

This additional latency does seem to affect our throughput. Since the code path on the server is very similar, we could probably ameliorate this impact by tuning the batching to be a bit more aggressive and allowing the client to buffer more outstanding requests. However, in spirit of avoiding special case tuning, I have avoided this.

Three producers, 3x async replication

2,024,032 records/sec
(193.0 MB/sec)

Our single producer process is clearly not stressing our three node cluster. To add a little more load, I'll now repeat the previous async replication test, but now use three producer load generators running on three different machines (running more processes on the same machine won't help as we are saturating the NIC). Then we can look at the aggregate throughput across these three producers to get a better feel for the cluster's aggregate capacity.

Producer Throughput Versus Stored Data

One of the hidden dangers of many messaging systems is that they work well only as long as the data they retain fits in memory. Their throughput falls by an order of magnitude (or more) when data backs up and isn't consumed (and hence needs to be stored on disk). This means things may be running fine as long as your consumers keep up and the queue is empty, but as soon as they lag, the whole messaging layer backs up with unconsumed data. The backup causes data to go to disk which in turns causes performance to drop to a rate that means messaging system can no longer keep up with incoming data and either backs up or falls over. This is pretty terrible, as in many cases the whole purpose of the queue was to handle such a case gracefully.

Since Kafka always persists messages the performance is O(1) with respect to unconsumed data volume.

To test this experimentally, let's run our throughput test over an extended period of time and graph the results as the stored dataset grows:

This graph actually does show some variance in performance, but no impact due to data size: we perform just as well after writing a TB of data, as we do for the first few hundred MBs.

The variance seems to be due to Linux's I/O management facilities that batch data and then flush it periodically. This is something we have tuned for a little better on our production Kafka setup. Some notes on tuning I/O are available here.

Consumer Throughput

Okay now let's turn our attention to consumer throughput.

Note that the replication factor will not affect the outcome of this test as the consumer only reads from one replica regardless of the replication factor. Likewise, the acknowledgement level of the producer also doesn't matter as the consumer only ever reads fully acknowledged messages (even if the producer doesn't wait for full acknowledgement). This is to ensure that any message the consumer sees will always be present after a leadership handoff (if the current leader fails).

Single Consumer

940,521 records/sec
(89.7 MB/sec)

For the first test, we will consume 50 million messages in a single thread from our 6 partition 3x replicated topic.

Kafka's consumer is very efficient. It works by fetching chunks of log directly from the filesystem. It uses the sendfile API to transfer this directly through the operating system without the overhead of copying this data through the application. This test actually starts at the beginning of the log, so it is doing real read I/O. In a production setting, though, the consumer reads almost exclusively out of the OS pagecache, since it is reading data that was just written by some producer (so it is still cached). In fact, if you run I/O stat on a production server you actually see that there are no physical reads at all even though a great deal of data is being consumed.

Making consumers cheap is important for what we want Kafka to do. For one thing, the replicas are themselves consumers, so making the consumer cheap makes replication cheap. In addition, this makes handling out data an inexpensive operation, and hence not something we need to tightly control for scalability reasons.

Three Consumers

2,615,968 records/sec
(249.5 MB/sec)

Let's repeat the same test, but run three parallel consumer processes, each on a different machine, and all consuming the same topic.

As expected, we see near linear scaling (not surprising because consumption in our model is so simple).

Producer and Consumer

795,064 records/sec
(75.8 MB/sec)

The above tests covered just the producer and the consumer running in isolation. Now let's do the natural thing and run them together. Actually, we have technically already been doing this, since our replication works by having the servers themselves act as consumers.

All the same, let's run the test. For this test we'll run one producer and one consumer on a six partition 3x replicated topic that begins empty. The producer is again using async replication. The throughput reported is the consumer throughput (which is, obviously, an upper bound on the producer throughput).

As we would expect, the results we get are basically the same as we saw in the producer only case—the consumer is fairly cheap.

Effect of Message Size

I have mostly shown performance on small 100 byte messages. Smaller messages are the harder problem for a messaging system as they magnify the overhead of the bookkeeping the system does. We can show this by just graphing throughput in both records/second and MB/second as we vary the record size.

So, as we would expect, this graph shows that the raw count of records we can send per second decreases as the records get bigger. But if we look at MB/second, we see that the total byte throughput of real user data increases as messages get bigger:

We can see that with the 10 byte messages we are actually CPU bound by just acquiring the lock and enqueuing the message for sending—we are not able to actually max out the network. However, starting with 100 bytes, we are actually seeing network saturation (though the MB/sec continues to increase as our fixed-size bookkeeping bytes become an increasingly small percentage of the total bytes sent).

End-to-end Latency

2 ms (median)
3 ms (99th percentile)
14 ms (99.9th percentile)

We have talked a lot about throughput, but what is the latency of message delivery? That is, how long does it take a message we send to be delivered to the consumer? For this test, we will create a producer and a consumer and repeatedly time how long it takes the producer to send a message to the Kafka cluster and then have it received by our consumer.

Note that Kafka only gives out messages to consumers when they are acknowledged by the full in-sync set of replicas. So this test will give the same results regardless of whether we use sync or async replication, as that setting only affects the acknowledgement to the producer.

Replicating this test

If you want to try out these benchmarks on your own machines, you can. As I said, I mostly just used our pre-packaged performance testing tools that ship with Kafka and mostly stuck with the default configs both for the server and for the clients. However, you can see more details of the configuration and commands here.
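If scripting the run yourself is more convenient than the bundled tools, the sketch below shows the general shape of such a producer test against the modern Java client; the broker address, topic name and record count are placeholders, and the packaged kafka-producer-perf-test tool remains the more faithful way to reproduce the numbers above.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ProducerThroughputTest {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder broker list
            props.put("acks", "1");                         // leader-only acks, as in the async-replication runs
            props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

            byte[] payload = new byte[100];                 // 100-byte records, matching the tests above
            long records = 50_000_000L;

            try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
                long start = System.currentTimeMillis();
                for (long i = 0; i < records; i++) {
                    producer.send(new ProducerRecord<>("perf-test", payload)); // fire-and-forget send
                }
                producer.flush();                           // wait for in-flight batches to complete
                long elapsed = System.currentTimeMillis() - start;
                System.out.printf("%,d records in %d ms (%.0f records/sec)%n",
                        records, elapsed, records * 1000.0 / elapsed);
            }
        }
    }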




SIMONE 2016-05-26 13:53 发表评论
]]>
Kafka 高性能吞吐揭秘http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/26/430662.htmlSIMONESIMONEThu, 26 May 2016 05:52:00 GMThttp://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/26/430662.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/comments/430662.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/26/430662.html#Feedback0http://www.dentisthealthcenter.com/wangxinsh55/comments/commentRss/430662.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/services/trackbacks/430662.htmlhttp://umeng.baijia.baidu.com/article/227913

This article takes a brief look at Kafka from the performance angle. First, a quick introduction to Kafka's architecture and the terms involved:

1.    Topic: a logical concept for grouping Messages; a single Topic can be spread across multiple Brokers.

2.    Partition: the unit of horizontal scaling and of all parallelism in Kafka; every Topic is split into at least one Partition.

3.    Offset: the sequence number of a message within its Partition; the numbering does not span Partitions.

4.    Consumer: pulls (consumes) Messages from Brokers.

5.    Producer: sends (produces) Messages to Brokers.

6.    Replication: Kafka supports redundant backups of Messages at the Partition level; every Partition can be configured with at least one Replica (a single Replica simply means the Partition itself).

7.    Leader: each replica set elects a unique Leader for the Partition, and all read and write requests are handled by the Leader. The other Replicas pull data updates from the Leader and apply them locally, much like the familiar Binlog replication in MySQL.

8.    Broker: Brokers accept requests from Producers and Consumers and persist Messages to local disk. Each Cluster elects one Broker to act as the Controller, which handles Partition Leader election, coordinates Partition migration, and so on.

9.    ISR (In-Sync Replica): the subset of Replicas that are currently alive and able to "catch up" with the Leader. Because reads and writes always land on the Leader first, Replicas that pull data from the Leader through the sync mechanism generally lag behind it a little (measured both as lag time and as lag in message count); exceeding either threshold gets a Replica kicked out of the ISR. Every Partition has its own independent ISR.

Those are nearly all the terms you will run into while using Kafka, and every one of them is a core concept or component; judged purely by its design, Kafka is quite concise. The rest of this article focuses on Kafka's excellent throughput and walks through the pieces of "black magic" used in its design and implementation.

Broker

Unlike in-memory message queues such as Redis and Memcached, Kafka is designed to write every Message to slow but high-capacity hard disks in exchange for much greater storage capacity. In practice, using disks does not cost Kafka much performance: it plays "by the rules" while also taking a "shortcut".

First, "by the rules" means that Kafka performs only sequential I/O on disk; given the access patterns of a messaging system, this is not a problem. On disk I/O performance, here is a set of test figures published by Kafka (RAID-5, 7200 rpm):

Sequence I/O: 600MB/s

Random I/O: 100KB/s

So by restricting itself to sequential I/O, Kafka avoids the performance penalty that slow disk access would otherwise impose.

Next, let's look at how Kafka takes its "shortcut".

First of all, Kafka relies heavily on the PageCache provided by the underlying operating system. When a write comes in from above, the OS simply writes the data into the PageCache and marks the page as Dirty. When a read occurs, the PageCache is checked first, and only on a page miss does the OS schedule disk I/O before returning the requested data. In effect, the PageCache turns as much free memory as possible into a disk cache, and since reclaiming PageCache is cheap whenever another process asks for memory, every modern OS supports it.

Using the PageCache also means Kafka avoids caching data inside the JVM. The JVM gives us powerful GC, but it also brings problems that do not fit Kafka's design:

·         If the cache is managed on the heap, the JVM's GC threads scan the heap frequently, adding unnecessary overhead; and if the heap is very large, a single Full GC becomes a serious threat to system availability.

·         Every object inside the JVM carries some object overhead (which should never be underestimated), reducing the effective utilization of memory.

·         Every in-process cache is mirrored by the same data in the OS PageCache anyway, so keeping the cache only in the PageCache at least doubles the usable cache space.

·         If Kafka restarts, all in-process caches are lost, while the PageCache managed by the OS remains usable.

The PageCache is only the first step; to push performance further, Kafka also uses the sendfile technique. Before explaining sendfile, here is the traditional network I/O flow, which breaks down into roughly four steps.

1.    The OS reads the data from disk into the PageCache in kernel space.

2.    The user process copies the data from kernel space into user space.

3.    The user process then writes the data to a Socket, and the data flows into the Socket Buffer in kernel space.

4.    The OS copies the data from that Buffer into the NIC's buffer, completing one send.

The whole process involves two context switches and four system calls, with the same data copied back and forth between kernel buffers and user buffers, which is inefficient. Steps 2 and 3 are unnecessary: the copy could be done entirely inside the kernel. That is exactly the problem sendfile solves; with the sendfile optimization the whole I/O path collapses into a single in-kernel transfer, as the original post illustrates with a diagram.
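In Kafka's case the Java implementation reaches sendfile through NIO's FileChannel.transferTo, which on Linux delegates to the sendfile system call when the target is a socket. A minimal sketch of that zero-copy transfer (the file path, host and port are placeholders):

    import java.io.FileInputStream;
    import java.net.InetSocketAddress;
    import java.nio.channels.FileChannel;
    import java.nio.channels.SocketChannel;

    public class ZeroCopySend {
        public static void main(String[] args) throws Exception {
            try (FileChannel file = new FileInputStream("/var/kafka/segment.log").getChannel(); // placeholder log segment
                 SocketChannel socket = SocketChannel.open(new InetSocketAddress("consumer-host", 9000))) {
                long position = 0;
                long remaining = file.size();
                while (remaining > 0) {
                    // transferTo hands the copy to the kernel; no user-space buffer is involved
                    long sent = file.transferTo(position, remaining, socket);
                    position += sent;
                    remaining -= sent;
                }
            }
        }
    }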

From the description above it is easy to see that Kafka's design intent is to do everything it can to complete data exchange in memory, whether externally as a messaging system or internally when talking to the underlying OS. If the Producer and Consumer keep their production and consumption progress reasonably in step, data exchange with essentially zero disk I/O is achievable. That is why I say that Kafka's use of "disk" does not cost it much performance. Below are some metrics I collected in a production environment.

(20 Brokers, 75 Partitions per Broker, 110k msg/s)

At this point the cluster is doing only writes, no reads. The roughly 10 MB/s of Send traffic is generated by Replication between Partitions. Comparing the recv and writ rates shows that disk writes are asynchronous and batched, and the underlying OS likely also reorders them for sequential writing. When Read Requests do come in, there are two cases. The first is that the data exchange completes in memory.

Send traffic rises from an average of 10 MB/s to an average of 60 MB/s, while disk Read stays under 50 KB/s. The PageCache's effect in cutting disk I/O is very clear.

Next comes reading older data that arrived a while ago and has already been evicted from memory and flushed to disk.

The other metrics stay the same, while disk Read jumps to 40+ MB/s. At this point all of the data is being served from the hard disk (for sequential disk reads the OS prefills the PageCache as an optimization). There are still no performance problems at all.

Tips

1.    Kafka itself does not force flushes to disk via the broker-side log.flush.interval.messages and log.flush.interval.ms settings; it takes the view that data reliability should be guaranteed through Replicas, and that forcing flushes to disk hurts overall performance.

2.    Performance can be tuned by adjusting /proc/sys/vm/dirty_background_ratio and /proc/sys/vm/dirty_ratio.

a.    When the dirty-page ratio exceeds the first threshold, pdflush starts flushing the dirty PageCache.

b.    When it exceeds the second threshold, all writes are blocked until the flush completes.

c.    Depending on the workload, it can make sense to lower dirty_background_ratio and raise dirty_ratio.

Partition

Partitions are the foundation that lets Kafka scale horizontally, handle high concurrency, and implement Replication.

On scalability: first, Kafka allows Partitions to be moved freely between Brokers in the cluster, which can be used to even out data skew. Second, Partitions support custom partitioning algorithms, for example routing all messages with the same Key to the same Partition. The Leader can also migrate among the In-Sync Replicas. Because all reads and writes for a given Partition are handled only by its Leader, Kafka tries to spread Leaders evenly across the nodes of the cluster to keep network traffic from concentrating on a few machines.
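As an illustration of the custom-partitioning hook just mentioned, here is a minimal sketch of a key-hash partitioner written against the Partitioner interface of the modern Java client (the class name is invented; the client's default partitioner already behaves essentially like this for keyed messages). It would be enabled through the producer's partitioner.class setting.

    import java.util.Map;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.utils.Utils;

    public class KeyHashPartitioner implements Partitioner {
        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionsForTopic(topic).size();
            if (keyBytes == null) {
                return 0;                       // no key: just pick partition 0 in this sketch
            }
            // same key -> same partition, so related messages stay together and ordered
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }

        @Override
        public void close() {}

        @Override
        public void configure(Map<String, ?> configs) {}
    }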

On concurrency: any Partition can be consumed by only one Consumer within a Consumer Group at a given moment (conversely, one Consumer may consume multiple Partitions at once). Kafka's very lean Offset mechanism minimizes the interaction between Broker and Consumer, so Kafka does not, unlike some other message queues, lose throughput in proportion to the number of downstream Consumers. Moreover, if multiple Consumers happen to consume data that is close together in time, the PageCache hit rate is high, so Kafka supports highly concurrent reads very efficiently; in practice a single machine can basically saturate its NIC.

That said, more Partitions is not always better: the more Partitions there are, the more land on each Broker on average. Consider a Broker going down (network failure, Full GC): the Controller has to re-elect a Leader for every Partition on that Broker. If each election takes around 10 ms and the Broker hosts 500 Partitions, then during the roughly 5 s of elections, reads and writes against those Partitions will throw LeaderNotAvailableException.

Going a step further, if the failed Broker happens to be the cluster's Controller, the first thing to do is appoint another Broker as Controller. The new Controller has to fetch the metadata for every Partition from ZooKeeper, at roughly 3-5 ms per entry; with 10,000 Partitions that alone takes 30-50 s. And don't forget this is only the cost of bringing up a new Controller; on top of it comes the Leader election time described above -_-!!!!!!

In addition, on the Broker side a buffer mechanism is used for both Producers and Consumers. The buffer size is configured globally, but the number of buffers matches the number of Partitions, so too many Partitions drive up the buffer memory consumed for Producers and Consumers.

Tips

1.    Try to pre-allocate the number of Partitions up front; Partitions can be added dynamically later, but doing so risks breaking the existing mapping between Message Keys and Partitions.

2.    Do not use too many Replicas; if conditions allow, place the Partitions of a replica set on different Racks.

3.    Do everything possible to ensure every Broker stop is a Clean Shutdown; otherwise the problem is not just a longer recovery time, but possible data corruption or other very strange issues.

Producer

Kafka's developers say that in version 0.8 the entire Producer was rewritten in Java, reportedly with a large performance gain. I have not compared the two myself, so no numbers here; the further-reading section at the end of this article points to a comparison I consider fairly good, for anyone interested in trying it.

In fact most messaging systems optimize the Producer side in much the same few ways: batch many small pieces into a whole, turn synchronous sends into asynchronous ones, and so on.

Kafka supports MessageSets natively: multiple Messages are automatically packed into a group and sent together, amortizing the RTT of each round trip. While assembling a MessageSet the data can also be reordered, turning bursty random writes into a much smoother linear write pattern.

Another point worth emphasizing is that the Producer supports End-to-End compression. Data is compressed locally before going onto the network; the Broker generally does not decompress it (unless Deep-Iteration is explicitly requested), and it is only decompressed on the client side after the message is consumed.

Of course, users may instead choose to do compression and decompression in their own application layer (after all, Kafka currently supports a limited set of codecs, only GZIP and Snappy), but doing so tends, unexpectedly, to hurt efficiency!!! Kafka's End-to-End compression works best hand in hand with MessageSets, and the do-it-yourself approach severs that link. The reason is simple, a basic rule of compression: "the more repetition in the input, the higher the compression ratio." Independent of what the message bodies contain or how many there are, a larger input usually compresses better.

The MessageSet approach does force a certain compromise on availability, though. Each time data is sent, the Producer treats it as sent as soon as send() returns, yet in most cases the message is still sitting in an in-memory MessageSet and has not reached the network; if the Producer crashes at that point, the data is lost.

To address this, Kafka's 0.8 design borrowed the ack mechanism from networking. If performance matters most and some Message loss is tolerable, you can set request.required.acks=0 to turn acks off and send at full speed. If the sends must be confirmed, set request.required.acks to 1 or -1. What is the difference between 1 and -1? This goes back to the earlier discussion of Replica counts. A value of 1 means the message only needs to be received and acknowledged by the Leader; the other Replicas pull it asynchronously without confirming immediately, preserving reliability without dragging efficiency down too far. A value of -1 means the message must be committed to all Replicas in that Partition's ISR before the ack is returned; the send is safer, but the latency of the whole process grows in proportion to the number of Replicas, so tune it according to your requirements.
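For reference, in the current Java client request.required.acks has become the acks setting (0, 1, or all/-1), and the MessageSet batching and end-to-end compression described above are driven by linger.ms, batch.size and compression.type. A minimal configuration sketch, with placeholder values:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;

    public class ProducerAckConfig {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            props.put("acks", "all");                 // "0" = no ack, "1" = leader only, "all" = full ISR commit
            props.put("linger.ms", 5);                // wait up to 5 ms to fill a batch (the MessageSet idea)
            props.put("batch.size", 64 * 1024);       // per-partition batch buffer, in bytes
            props.put("compression.type", "snappy");  // end-to-end compression; the broker stores it compressed

            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            // ... send records ...
            producer.close();
        }
    }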

Tips

1.    Do not configure too many Producer threads, especially when used for Mirroring or Migration; more threads aggravate message reordering across the target cluster's Partitions (which matters if your application is sensitive to message order).

2.    In version 0.8, request.required.acks defaults to 0 (the same as in 0.7).

Consumer

The Consumer side design is, on the whole, fairly conventional.

·         Through Consumer Groups, both the producer/consumer pattern and queue-style access are supported.

·         The Consumer API comes in High level and Low level flavors. The former depends heavily on ZooKeeper, so it is somewhat slower and less flexible, but hassle-free. The latter does not use the ZooKeeper service and does better in both flexibility and performance, but all exceptions (Leader migration, Offset out of range, Broker failures, and so on) and the maintenance of Offsets have to be handled by your own code.

·         Keep an eye on the 0.9 release, due out soon. The developers have rewritten the Consumer in Java as well, merging the two APIs into one and removing the dependency on ZooKeeper, reportedly with a big performance boost (a minimal sketch of what the new consumer API looks like follows below).
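For readers on current Kafka versions, the rewritten consumer the author refers to ended up looking roughly like the poll loop below; the broker, group and topic names are placeholders, and note that it talks only to the brokers, with no ZooKeeper dependency.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class SimpleConsumeLoop {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // placeholder
            props.put("group.id", "demo-group");              // placeholder
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("demo-topic"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                record.partition(), record.offset(), record.value());
                    }
                }
            }
        }
    }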

Tips

A strong recommendation: use the Low level API. It is more tedious, but at present it is the only API that lets you handle error data yourself, especially Corrupted Data caused by Broker failures or Unclean Shutdowns. Otherwise there is no way to skip the bad data; you can only wait for the "bad messages" to be rotated away on the Broker, and during that time the Replica stays unavailable.

Further Reading

Sendfile: https://www.ibm.com/developerworks/cn/java/j-zerocopy/

So what’s wrong with 1975 programming: https://www.varnish-cache.org/trac/wiki/ArchitectNotes

Benchmarking: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines




SIMONE 2016-05-26 13:52 发表评论
]]>
JAVA实现gif囄~放与剪切功?/title><link>http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/23/430621.html</link><dc:creator>SIMONE</dc:creator><author>SIMONE</author><pubDate>Mon, 23 May 2016 06:40:00 GMT</pubDate><guid>http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/23/430621.html</guid><wfw:comment>http://www.dentisthealthcenter.com/wangxinsh55/comments/430621.html</wfw:comment><comments>http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/23/430621.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.dentisthealthcenter.com/wangxinsh55/comments/commentRss/430621.html</wfw:commentRss><trackback:ping>http://www.dentisthealthcenter.com/wangxinsh55/services/trackbacks/430621.html</trackback:ping><description><![CDATA[<div>http://www.open-open.com/lib/view/open1394859355853.html</div><br /><div>package com.pinker.util;<br />import java.awt.Color;<br />import java.awt.Graphics;<br />import java.awt.Image;<br />import java.awt.image.BufferedImage;<br />import java.io.File;<br />import java.io.IOException;<br />import java.util.Arrays;<br />import java.util.Iterator;<br /><br />import javax.imageio.IIOImage;<br />import javax.imageio.ImageIO;<br />import javax.imageio.ImageReader;<br />import javax.imageio.ImageWriter;<br />import javax.imageio.stream.ImageInputStream;<br />import javax.imageio.stream.ImageOutputStream;<br /> <br />/**<br /> * 囑փ裁剪以及压羃处理工具c?br /> *<br /> * 主要针对动态的GIF格式囄裁剪之后Q只出现一帧动态效果的现象提供解决Ҏ<br /> *<br /> * 提供依赖三方包解x案(针对GIF格式数据特征一一解析Q进行编码解码操作)<br /> * 提供ZJDK Image I/O 的解x?JDK探烦p|)<br /> */<br />public class ImageUtil2 {<br /> <br />    public enum IMAGE_FORMAT{<br />        BMP("bmp"),<br />        JPG("jpg"),<br />        WBMP("wbmp"),<br />        JPEG("jpeg"),<br />        PNG("png"),<br />        GIF("gif");<br />         <br />        private String value;<br />        IMAGE_FORMAT(String value){<br />            this.value = value;<br />        }<br />        public String getValue() {<br />            return value;<br />        }<br />        public void setValue(String value) {<br />            this.value = value;<br />        }<br />    }<br />     <br />     <br />    /**<br />     * 获取囄格式<br />     * @param file   囄文g<br />     * @return    囄格式<br />     */<br />    public static String getImageFormatName(File file)throws IOException{<br />        String formatName = null;<br />         <br />        ImageInputStream iis = ImageIO.createImageInputStream(file);<br />        Iterator<ImageReader> imageReader =  ImageIO.getImageReaders(iis);<br />        if(imageReader.hasNext()){<br />            ImageReader reader = imageReader.next();<br />            formatName = reader.getFormatName();<br />        }<br /> <br />        return formatName;<br />    }<br />     <br />    /*************************  Z三方包解x?nbsp;   *****************************/<br />    /**<br />     * 剪切囄<br />     *<br />     * @param source        待剪切图片\?br />     * @param targetPath    裁剪后保存\径(默认为源路径Q?br />     * @param x                起始横坐?br />     * @param y                起始U坐?br />     * @param width            剪切宽度<br />     * @param height        剪切高度         <br />     *<br />     * @returns            裁剪后保存\径(囄后缀Ҏ囄本ncd生成Q?nbsp;   <br />     * @throws IOException<br />     */<br />    public static String cutImage(String sourcePath , String targetPath , int x , int y , int width , int height) throws IOException{<br />        File file = new File(sourcePath);<br />        if(!file.exists()) {<br />            throw new 
IOException("not found the imageQ? + sourcePath);<br />        }<br />        if(null == targetPath || targetPath.isEmpty()) targetPath = sourcePath;<br />         <br />        String formatName = getImageFormatName(file);<br />        if(null == formatName) return targetPath;<br />        formatName = formatName.toLowerCase();<br />         <br />        // 防止囄后缀与图片本w类型不一致的情况<br />        String pathPrefix = getPathWithoutSuffix(targetPath);<br />        targetPath = pathPrefix + formatName;<br />         <br />        // GIF需要特D处?br />        if(IMAGE_FORMAT.GIF.getValue() == formatName){<br />            GifDecoder decoder = new GifDecoder();  <br />            int status = decoder.read(sourcePath);  <br />            if (status != GifDecoder.STATUS_OK) {  <br />                throw new IOException("read image " + sourcePath + " error!");  <br />            }<br /> <br />            AnimatedGifEncoder encoder = new AnimatedGifEncoder();<br />            encoder.start(targetPath);<br />            encoder.setRepeat(decoder.getLoopCount());  <br />            for (int i = 0; i < decoder.getFrameCount(); i ++) {  <br />                encoder.setDelay(decoder.getDelay(i));  <br />                BufferedImage childImage = decoder.getFrame(i);<br />                BufferedImage image = childImage.getSubimage(x, y, width, height);<br />                encoder.addFrame(image);  <br />            }  <br />            encoder.finish();<br />        }else{<br />            BufferedImage image = ImageIO.read(file);<br />            image = image.getSubimage(x, y, width, height);<br />            ImageIO.write(image, formatName, new File(targetPath));<br />        }<br />        //普通图?br />        BufferedImage image = ImageIO.read(file);<br />        image = image.getSubimage(x, y, width, height);<br />        ImageIO.write(image, formatName, new File(targetPath));<br />        <br />        return targetPath;<br />    }<br />     <br />    /**<br />     * 压羃囄<br />     * @param sourcePath       待压~的囄路径<br />     * @param targetPath    压羃后图片\径(默认为初始\径)<br />     * @param width            压羃宽度<br />     * @param height        压羃高度<br />     *<br />     * @returns                   裁剪后保存\径(囄后缀Ҏ囄本ncd生成Q?nbsp;   <br />     * @throws IOException<br />     */<br />    public static String zoom(String sourcePath , String targetPath, int width , int height) throws IOException{<br />        File file = new File(sourcePath);<br />        if(!file.exists()) {<br />            throw new IOException("not found the image Q? 
+ sourcePath);<br />        }<br />        if(null == targetPath || targetPath.isEmpty()) targetPath = sourcePath;<br />         <br />        String formatName = getImageFormatName(file);<br />        if(null == formatName) return targetPath;<br />        formatName = formatName.toLowerCase();<br />         <br />        // 防止囄后缀与图片本w类型不一致的情况<br />        String pathPrefix = getPathWithoutSuffix(targetPath);<br />        targetPath = pathPrefix + formatName;<br />         <br />        // GIF需要特D处?br />        if(IMAGE_FORMAT.GIF.getValue() == formatName){<br />            GifDecoder decoder = new GifDecoder();  <br />            int status = decoder.read(sourcePath);  <br />            if (status != GifDecoder.STATUS_OK) {  <br />                throw new IOException("read image " + sourcePath + " error!");  <br />            }<br /> <br />            AnimatedGifEncoder encoder = new AnimatedGifEncoder();<br />            encoder.start(targetPath);<br />            encoder.setRepeat(decoder.getLoopCount());  <br />            for (int i = 0; i < decoder.getFrameCount(); i ++) {  <br />                encoder.setDelay(decoder.getDelay(i));  <br />                BufferedImage image = zoom(decoder.getFrame(i), width , height);<br />                encoder.addFrame(image);  <br />            }  <br />            encoder.finish();<br />        }else{<br />            BufferedImage image = ImageIO.read(file);<br />            BufferedImage zoomImage = zoom(image , width , height);<br />            ImageIO.write(zoomImage, formatName, new File(targetPath));<br />        }<br />        BufferedImage image = ImageIO.read(file);<br />        BufferedImage zoomImage = zoom(image , width , height);<br />        ImageIO.write(zoomImage, formatName, new File(targetPath));<br />         <br />        return targetPath;<br />    }<br />     <br />    /*********************** ZJDK 解决Ҏ     ********************************/<br />     <br />    /**<br />     * d囄<br />     * @param file 囄文g<br />     * @return     囄数据<br />     * @throws IOException<br />     */<br />    public static BufferedImage[] readerImage(File file) throws IOException{<br />        BufferedImage sourceImage = ImageIO.read(file);<br />        BufferedImage[] images = null;<br />        ImageInputStream iis = ImageIO.createImageInputStream(file);<br />        Iterator<ImageReader> imageReaders = ImageIO.getImageReaders(iis);<br />        if(imageReaders.hasNext()){<br />            ImageReader reader = imageReaders.next();<br />            reader.setInput(iis);<br />            int imageNumber = reader.getNumImages(true);<br />            images = new BufferedImage[imageNumber];<br />            for (int i = 0; i < imageNumber; i++) {<br />                BufferedImage image = reader.read(i);<br />                if(sourceImage.getWidth() > image.getWidth() || sourceImage.getHeight() > image.getHeight()){<br />                    image = zoom(image, sourceImage.getWidth(), sourceImage.getHeight());<br />                }<br />                images[i] = image;<br />            }<br />            reader.dispose();<br />            iis.close();<br />        }<br />        return images;<br />    }<br />     <br />    /**<br />     * Ҏ要求处理囄<br />     *<br />     * @param images    囄数组<br />     * @param x            横向起始位置<br />     * @param y         U向起始位置<br />     * @param width      宽度    <br />     * @param height    宽度<br />     * @return            处理后的囄数组<br />     * @throws Exception<br />     */<br />    public static 
BufferedImage[] processImage(BufferedImage[] images , int x , int y , int width , int height) throws Exception{<br />        if(null == images){<br />            return images;<br />        }<br />        BufferedImage[] oldImages = images;<br />        images = new BufferedImage[images.length];<br />        for (int i = 0; i < oldImages.length; i++) {<br />            BufferedImage image = oldImages[i];<br />            images[i] = image.getSubimage(x, y, width, height);<br />        }<br />        return images;<br />    }<br />     <br />    /**<br />     * 写入处理后的囄到file<br />     *<br />     * 囄后缀Ҏ囄格式生成<br />     *<br />     * @param images        处理后的囄数据<br />     * @param formatName     囄格式<br />     * @param file            写入文g对象<br />     * @throws Exception<br />     */<br />    public static void writerImage(BufferedImage[] images ,  String formatName , File file) throws Exception{<br />        Iterator<ImageWriter> imageWriters = ImageIO.getImageWritersByFormatName(formatName);<br />        if(imageWriters.hasNext()){<br />            ImageWriter writer = imageWriters.next();<br />            String fileName = file.getName();<br />            int index = fileName.lastIndexOf(".");<br />            if(index > 0){<br />                fileName = fileName.substring(0, index + 1) + formatName;<br />            }<br />            String pathPrefix = getFilePrefixPath(file.getPath());<br />            File outFile = new File(pathPrefix + fileName);<br />            ImageOutputStream ios = ImageIO.createImageOutputStream(outFile);<br />            writer.setOutput(ios);<br />             <br />            if(writer.canWriteSequence()){<br />                writer.prepareWriteSequence(null);<br />                for (int i = 0; i < images.length; i++) {<br />                    BufferedImage childImage = images[i];<br />                    IIOImage image = new IIOImage(childImage, null , null);<br />                    writer.writeToSequence(image, null);<br />                }<br />                writer.endWriteSequence();<br />            }else{<br />                for (int i = 0; i < images.length; i++) {<br />                    writer.write(images[i]);<br />                }<br />            }<br />             <br />            writer.dispose();<br />            ios.close();<br />        }<br />    }<br />     <br />    /**<br />     * 剪切格式囄<br />     *<br />     * ZJDK Image I/O解决Ҏ<br />     *<br />     * @param sourceFile        待剪切图片文件对?br />     * @param destFile                  裁剪后保存文件对?br />     * @param x                    剪切横向起始位置<br />     * @param y                 剪切U向起始位置<br />     * @param width              剪切宽度    <br />     * @param height            剪切宽度<br />     * @throws Exception<br />     */<br />    public static void cutImage(File sourceFile , File destFile, int x , int y , int width , int height) throws Exception{<br />        // d囄信息<br />        BufferedImage[] images = readerImage(sourceFile);<br />        // 处理囄<br />        images = processImage(images, x, y, width, height);<br />        // 获取文g后缀<br />        String formatName = getImageFormatName(sourceFile);<br />        destFile = new File(getPathWithoutSuffix(destFile.getPath()) + formatName);<br /> <br />        // 写入处理后的囄到文?br />        writerImage(images, formatName , destFile);<br />    }<br />     <br />     <br />     <br />    /**<br />     * 获取pȝ支持的图片格?br />     */<br />    public static void getOSSupportsStandardImageFormat(){<br />        String[] readerFormatName = 
ImageIO.getReaderFormatNames();<br />        String[] readerSuffixName = ImageIO.getReaderFileSuffixes();<br />        String[] readerMIMEType = ImageIO.getReaderMIMETypes();<br />        System.out.println("========================= OS supports reader ========================");<br />        System.out.println("OS supports reader format name :  " + Arrays.asList(readerFormatName));<br />        System.out.println("OS supports reader suffix name :  " + Arrays.asList(readerSuffixName));<br />        System.out.println("OS supports reader MIME type :  " + Arrays.asList(readerMIMEType));<br />         <br />        String[] writerFormatName = ImageIO.getWriterFormatNames();<br />        String[] writerSuffixName = ImageIO.getWriterFileSuffixes();<br />        String[] writerMIMEType = ImageIO.getWriterMIMETypes();<br />         <br />        System.out.println("========================= OS supports writer ========================");<br />        System.out.println("OS supports writer format name :  " + Arrays.asList(writerFormatName));<br />        System.out.println("OS supports writer suffix name :  " + Arrays.asList(writerSuffixName));<br />        System.out.println("OS supports writer MIME type :  " + Arrays.asList(writerMIMEType));<br />    }<br />     <br />    /**<br />     * 压羃囄<br />     * @param sourceImage    待压~图?br />     * @param width          压羃囄高度<br />     * @param heigt          压羃囄宽度<br />     */<br />    private static BufferedImage zoom(BufferedImage sourceImage , int width , int height){<br />        BufferedImage zoomImage = new BufferedImage(width, height, sourceImage.getType());<br />        Image image = sourceImage.getScaledInstance(width, height, Image.SCALE_SMOOTH);<br />        Graphics gc = zoomImage.getGraphics();<br />        gc.setColor(Color.WHITE);<br />        gc.drawImage( image , 0, 0, null);<br />        return zoomImage;<br />    }<br />     <br />    /**<br />     * 获取某个文g的前~路径<br />     *<br />     * 不包含文件名的\?br />     *<br />     * @param file   当前文g对象<br />     * @return<br />     * @throws IOException<br />     */<br />    public static String getFilePrefixPath(File file) throws IOException{<br />        String path = null;<br />        if(!file.exists()) {<br />            throw new IOException("not found the file !" 
);<br />        }<br />        String fileName = file.getName();<br />        path = file.getPath().replace(fileName, "");<br />        return path;<br />    }<br />     <br />    /**<br />     * 获取某个文g的前~路径<br />     *<br />     * 不包含文件名的\?br />     *<br />     * @param path   当前文g路径<br />     * @return       不包含文件名的\?br />     * @throws Exception<br />     */<br />    public static String getFilePrefixPath(String path) throws Exception{<br />        if(null == path || path.isEmpty()) throw new Exception("文g路径为空Q?);<br />        int index = path.lastIndexOf(File.separator);<br />        if(index > 0){<br />            path = path.substring(0, index + 1);<br />        }<br />        return path;<br />    }<br />     <br />    /**<br />     * 获取不包含后~的文件\?br />     *<br />     * @param src<br />     * @return<br />     */<br />    public static String getPathWithoutSuffix(String src){<br />        String path = src;<br />        int index = path.lastIndexOf(".");<br />        if(index > 0){<br />            path = path.substring(0, index + 1);<br />        }<br />        return path;<br />    }<br />     <br />    /**<br />     * 获取文g?br />     * @param filePath        文g路径<br />     * @return                文g?br />     * @throws IOException<br />     */<br />    public static String getFileName(String filePath) throws IOException{<br />        File file = new File(filePath);<br />        if(!file.exists()) {<br />            throw new IOException("not found the file !" );<br />        }<br />        return file.getName();<br />    }<br />     <br />    /**<br />     * @param args<br />     * @throws Exception<br />     */<br />    public static void main(String[] args) throws Exception {<br />        // 获取pȝ支持的图片格?br />        //ImageCutterUtil.getOSSupportsStandardImageFormat();<br />         <br />        try {<br />            // 起始坐标Q剪切大?br />            int x = 100;<br />            int y = 75;<br />            int width = 100;<br />            int height = 100;<br />            // 参考图像大?br />            int clientWidth = 300;<br />            int clientHeight = 250;<br />             <br />             <br />            File file = new File("C:\\1.jpg");<br />            BufferedImage image = ImageIO.read(file);<br />            double destWidth = image.getWidth();<br />            double destHeight = image.getHeight();<br />             <br />            if(destWidth < width || destHeight < height)<br />                throw new Exception("源图大小于截取囄大小!");<br />             <br />            double widthRatio = destWidth / clientWidth;<br />            double heightRatio = destHeight / clientHeight;<br />             <br />            x = Double.valueOf(x * widthRatio).intValue();<br />            y = Double.valueOf(y * heightRatio).intValue();<br />            width = Double.valueOf(width * widthRatio).intValue();<br />            height = Double.valueOf(height * heightRatio).intValue();<br />             <br />            System.out.println("裁剪大小  x:" + x + ",y:" + y + ",width:" + width + ",height:" + height);<br /> <br />            /************************ Z三方包解x?*************************/<br />            String formatName = getImageFormatName(file);<br />            String pathSuffix = "." 
+ formatName;<br />            String pathPrefix = getFilePrefixPath(file);<br />            String targetPath = pathPrefix  + System.currentTimeMillis() + pathSuffix;<br />            targetPath = ImageUtil2.cutImage(file.getPath(), targetPath, x , y , width, height);<br />             <br />            String bigTargetPath = pathPrefix  + System.currentTimeMillis() + pathSuffix;<br />            ImageUtil2.zoom(targetPath, bigTargetPath, 100, 100);<br />             <br />            String smallTargetPath = pathPrefix  + System.currentTimeMillis() + pathSuffix;<br />            ImageUtil2.zoom(targetPath, smallTargetPath, 50, 50);<br />             <br />            /************************ ZJDK Image I/O 解决Ҏ(JDK探烦p|) *************************/<br />//                File destFile = new File(targetPath);<br />//                ImageCutterUtil.cutImage(file, destFile, x, y, width, height);<br />        } catch (IOException e) {<br />            e.printStackTrace();<br />        }<br />    }<br />}</div><img src ="http://www.dentisthealthcenter.com/wangxinsh55/aggbug/430621.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.dentisthealthcenter.com/wangxinsh55/" target="_blank">SIMONE</a> 2016-05-23 14:40 <a href="http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/23/430621.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>ZRedis实现分布式锁 http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/12/430470.htmlSIMONESIMONEThu, 12 May 2016 09:52:00 GMThttp://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/12/430470.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/comments/430470.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/archive/2016/05/12/430470.html#Feedback0http://www.dentisthealthcenter.com/wangxinsh55/comments/commentRss/430470.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/services/trackbacks/430470.htmlhttp://chaopeng.me/blog/2014/01/26/redis-lock.html
http://blog.csdn.net/ugg/article/details/41894947

http://www.jeffkit.info/2011/07/1000/

Redis has a family of commands whose names end in NX, short for Not eXists; SETNX, for example, should be read as SET if Not eXists. These commands are very useful, and here we use SETNX to implement a distributed lock.

Implementing a distributed lock with SETNX

SETNX makes it very easy to implement a distributed lock. For example, suppose some client wants to acquire a lock named foo; it issues the following command:

SETNX lock.foo <current Unix time + lock timeout + 1>

  •  If the command returns 1, the client has acquired the lock, with the value of lock.foo set to the expiry time to show that the key is locked; the client can later release the lock with DEL lock.foo.
  •  If it returns 0, the lock is already held by another client; we can return right away, retry, or wait for the lock to expire.

Dealing with deadlock

The locking logic above has a problem: what if a client holding the lock fails or crashes and cannot release it? We can detect this from the timestamp stored as the lock's value: if the current time is already greater than the value of lock.foo, the lock has expired and can be reused.

When that happens, you cannot simply DEL the lock and then SETNX it again, because when several clients detect the timeout they will all try to release the lock, and a race condition can appear. Let's simulate the scenario:

  1.  C0's operation has timed out, but it still holds the lock; C1 and C2 read lock.foo, check the timestamp, and both find it has expired.
  2.  C1 sends DEL lock.foo.
  3.  C1 sends SETNX lock.foo and succeeds.
  4.  C2 sends DEL lock.foo.
  5.  C2 sends SETNX lock.foo and succeeds.

Now both C1 and C2 have the lock. That is a big problem!

Fortunately this can be avoided. Let's look at how a client C3 should behave:

  1. C3 sends SETNX lock.foo to try to acquire the lock; since C0 still holds it, Redis returns 0 to C3.
  2. C3 sends GET lock.foo to check whether the lock has expired; if it has not, C3 waits or retries.
  3. If, on the other hand, it has expired, C3 tries to acquire the lock with:
    GETSET lock.foo <current Unix time + lock timeout + 1>
  4. If the timestamp GETSET returns to C3 is still an expired one, then C3 has acquired the lock as hoped.
  5. If another client, say C4, was one step quicker than C3 and ran the operation above first, then the timestamp C3 gets back is an unexpired value; C3 has not obtained the lock and must wait or retry again. Note that although C3 did not get the lock, it has overwritten the expiry value that C4 set; this tiny error has a negligible effect, however.

Note: to make the distributed-lock algorithm more robust, a client holding the lock should check once more whether its lock has expired before issuing DEL, because the client may have been held up by some slow operation; by the time it finishes, the lock may have expired and been acquired by someone else, in which case it must not release it.

Example pseudocode

Based on the above, here is a piece of fake code describing the full flow of using the distributed lock:

# get lock
lock = 0
while lock != 1:
    timestamp = current Unix time + lock timeout + 1
    lock = SETNX lock.foo timestamp
    if lock == 1 or (now() > (GET lock.foo) and now() > (GETSET lock.foo timestamp)):
        break
    else:
        sleep(10ms)

# do your job
do_job()

# release
if now() < GET lock.foo:
    DEL lock.foo

Indeed, if you want to make this logic reusable, Python users will immediately think of a decorator, and Java users of the usual suspect: AOP plus annotations. Use whatever is comfortable, just don't copy the code around.
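As a rough Java counterpart to the pseudocode above, here is a minimal sketch built on the Jedis client; the key name, timeout and retry interval are arbitrary, and it deliberately mirrors the SETNX + GETSET recipe discussed here rather than newer approaches.

    import redis.clients.jedis.Jedis;

    public class RedisLock {
        private static final String LOCK_KEY = "lock.foo";   // placeholder key
        private static final long LOCK_TIMEOUT_MS = 5000;

        // Try to acquire the lock, retrying until it succeeds; returns the expiry timestamp we wrote.
        public static long lock(Jedis jedis) throws InterruptedException {
            while (true) {
                long expiresAt = System.currentTimeMillis() + LOCK_TIMEOUT_MS + 1;
                if (jedis.setnx(LOCK_KEY, String.valueOf(expiresAt)) == 1) {
                    return expiresAt;                          // nobody held the lock
                }
                String current = jedis.get(LOCK_KEY);
                if (current != null && System.currentTimeMillis() > Long.parseLong(current)) {
                    // the lock looks expired: try to take it over via GETSET
                    String previous = jedis.getSet(LOCK_KEY, String.valueOf(expiresAt));
                    if (previous != null && previous.equals(current)) {
                        return expiresAt;                      // we won the race
                    }
                }
                Thread.sleep(10);                              // back off before retrying
            }
        }

        // Release the lock only if it has not already expired (and thus possibly been re-acquired).
        public static void unlock(Jedis jedis, long expiresAt) {
            String current = jedis.get(LOCK_KEY);
            if (current != null && Long.parseLong(current) == expiresAt
                    && System.currentTimeMillis() < expiresAt) {
                jedis.del(LOCK_KEY);
            }
        }
    }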



背景
?很多互联|品应用中Q有些场景需要加锁处理,比如Q秒杀Q全局递增IDQ楼层生成等{。大部分的解x案是ZDB实现的,Redis为单q程单线E模 式,采用队列模式ƈ发访问变成串行访问,且多客户端对Redis的连接ƈ不存在竞争关pR其ơRedis提供一些命令SETNXQGETSETQ可以方 便实现分布式锁机制?/p>

Redis命o介绍
使用Redis实现分布式锁Q有两个重要函数需要介l?br />
SETNX命oQSET if Not eXistsQ?/span>
语法Q?br />SETNX key value
功能Q?br />当且仅当 key 不存在,?key 的D?value Qƈq回1Q若l定?key 已经存在Q则 SETNX 不做M动作Qƈq回0?/p>

GETSET命o
语法Q?br />GETSET key value
功能Q?br />给?key 的D?value Qƈq回 key 的旧?(old value)Q当 key 存在但不是字W串cdӞq回一个错误,当key不存在时Q返回nil?/p>

GET命o
语法Q?br />GET key
功能Q?br />q回 key 所兌的字W串|如果 key 不存在那么返回特D?nil ?/p>

DEL命o
语法Q?br />DEL key [KEY …]
功能Q?br />删除l定的一个或多个 key ,不存在的 key 会被忽略?/p>

兵贵_,不在多。分布式锁,我们׃靠这四个命o。但在具体实玎ͼq有很多l节Q需要仔l斟酌,因ؓ在分布式q发多进E中QQ何一点出现差错,都会D死锁Qhold住所有进E?/p>

加锁实现

SETNX 可以直接加锁操作Q比如说Ҏ个关键词foo加锁Q客L可以试
SETNX foo.lock <current unix time>

如果q回1Q表C客L已经获取锁,可以往下操作,操作完成后,通过
DEL foo.lock

命o来释N?br />如果q回0Q说明foo已经被其他客L上锁Q如果锁是非堵塞的,可以选择q回调用。如果是堵塞调用调用Q就需要进入以下个重试循环Q直x功获得锁或者重试超时。理x好的,现实是残L。仅仅用SETNX加锁带有竞争条g的,在某些特定的情况会造成死锁错误?/p>

处理死锁

?上面的处理方式中Q如果获取锁的客L端执行时间过长,q程被kill掉,或者因为其他异常崩溃,D无法释放锁,׃造成死锁。所以,需要对加锁要做?效性检。因此,我们在加锁时Q把当前旉戳作为value存入此锁中,通过当前旉戛_Redis中的旉戌行对比,如果过一定差|认ؓ锁已l时 效,防止锁无限期的锁下去Q但是,在大q发情况Q如果同时检锁失效Qƈ单粗暴的删除死锁Q再通过SETNX上锁Q可能会D竞争条g的生,卛_个客 L同时获取锁?/p>

C1获取锁,q崩溃。C2和C3调用SETNX上锁q回0后,获得foo.lock的时间戳Q通过比对旉戻I发现锁超时?br />C2 向foo.lock发送DEL命o?br />C2 向foo.lock发送SETNX获取锁?br />C3 向foo.lock发送DEL命oQ此时C3发送DELӞ其实DEL掉的是C2的锁?br />C3 向foo.lock发送SETNX获取锁?/p>

此时C2和C3都获取了锁,产生竞争条gQ如果在更高q发的情况,可能会有更多客户端获取锁。所以,DEL锁的操作Q不能直接用在锁超时的情况下,q好我们有GETSETҎQ假设我们现在有另外一个客LC4Q看看如何用GETSET方式Q避免这U情况生?/p>

C1获取锁,q崩溃。C2和C3调用SETNX上锁q回0后,调用GET命o获得foo.lock的时间戳T1Q通过比对旉戻I发现锁超时?br />C4 向foo.lock发送GESET命oQ?br />GETSET foo.lock <current unix time>
q得到foo.lock中老的旉戳T2

如果T1=T2Q说明C4获得旉戟?br />如果T1!=T2Q说明C4之前有另外一个客LC5通过调用GETSET方式获取了时间戳QC4未获得锁。只能sleep下,q入下次循环中?/p>

现在唯一的问题是QC4讄foo.lock的新旉戻I是否会对锁生媄响。其实我们可以看到C4和C5执行的时间差值极,q且写入foo.lock中的都是有效旉错,所以对锁ƈ没有影响?br />?了让q个锁更加强壮,获取锁的客户端,应该在调用关键业务时Q再ơ调用GETҎ获取T1Q和写入的T0旉戌行对比,以免锁因其他情况被执行DEL?外解开而不知。以上步骤和情况Q很Ҏ从其他参考资料中看到。客L处理和失败的情况非常复杂Q不仅仅是崩溃这么简单,q可能是客户端因为某些操作被d 了相当长旉Q紧接着 DEL 命o被尝试执?但这旉却在另外的客L手上)。也可能因ؓ处理不当Q导致死锁。还有可能因为sleep讄不合理,DRedis在大q发下被压垮?最为常见的问题q有

GETq回nil时应该走那种逻辑Q?/span>

W一U走时逻辑
C1客户端获取锁Qƈ且处理完后,DEL掉锁Q在DEL锁之前。C2通过SETNX向foo.lock讄旉戳T0 发现有客L获取锁,q入GET操作?br />C2 向foo.lock发送GET命oQ获取返回值T1(nil)?br />C2 通过T0>T1+expireҎQ进入GETSET程?br />C2 调用GETSET向foo.lock发送T0旉戻Iq回foo.lock的原值T2
C2 如果T2=T1相等Q获得锁Q如果T2!=T1Q未获得锁?/p>

W二U情况走循环走setnx逻辑
C1客户端获取锁Qƈ且处理完后,DEL掉锁Q在DEL锁之前。C2通过SETNX向foo.lock讄旉戳T0 发现有客L获取锁,q入GET操作?br />C2 向foo.lock发送GET命oQ获取返回值T1(nil)?br />C2 循环Q进入下一ơSETNX逻辑

?U逻辑貌似都是OKQ但是从逻辑处理上来_W一U情况存在问题。当GETq回nil表示Q锁是被删除的,而不是超Ӟ应该走SETNX逻辑加锁。走W一 U情늚问题是,正常的加锁逻辑应该走SETNXQ而现在当锁被解除后,走的是GETSTQ如果判断条件不当,׃引v死锁Q很悲催Q我在做的时候就到 了,具体怎么到的看下面的问?/p>

GETSETq回nil时应该怎么处理Q?/span>

C1和C2客户端调用GET接口QC1q回T1Q此时C3|络情况更好Q快速进入获取锁Qƈ执行DEL删除锁,C2q回T2(nil)QC1和C2都进入超时处理逻辑?br />C1 向foo.lock发送GETSET命oQ获取返回值T11(nil)?br />C1 比对C1和C11发现两者不同,处理逻辑认ؓ未获取锁?br />C2 向foo.lock发送GETSET命oQ获取返回值T22(C1写入的时间戳)?br />C2 比对C2和C22发现两者不同,处理逻辑认ؓ未获取锁?/p>

?时C1和C2都认为未获取锁,其实C1是已l获取锁了,但是他的处理逻辑没有考虑GETSETq回nil的情况,只是单纯的用GET和GETSET值就?ҎQ至于ؓ什么会出现q种情况Q一U是多客LӞ每个客户端连接Redis的后Q发出的命oq不是连l的Q导致从单客L看到的好像连l的命oQ到 Redis server后,q两条命令之间可能已l插入大量的其他客户端发出的命oQ比如DEL,SETNX{。第二种情况Q多客户端之间时间不同步Q或者不是严?意义的同步?/p>

旉戳的问题

我们看到foo.lock的valuegؓ旉戻I所以要在多客户端情况下Q保证锁有效Q一定要同步各服务器的时_如果各服务器_旉有差异。时间不一致的客户端,在判断锁时Q就会出现偏差,从而生竞争条件?br />锁的时与否Q严g赖时间戳Q时间戳本n也是有精度限Ӟ假如我们的时间精度ؓU,从加锁到执行操作再到解锁Q一般操作肯定都能在一U内完成。这L话,我们上面的CASEQ就很容易出现。所以,最好把旉_ֺ提升到毫U。这L话,可以保证毫秒U别的锁是安全的?/p>

分布式锁的问?/span>

1Q必要的时机制Q获取锁的客L一旦崩溃,一定要有过期机Ӟ否则其他客户端都降无法获取锁Q造成死锁问题?br />2Q分布式锁,多客L的时间戳不能保证严格意义的一致性,所以在某些特定因素下,有可能存在锁串的情况。要适度的机Ӟ可以承受概率的事g产生?br />3Q只对关键处理节点加锁,良好的习惯是Q把相关的资源准备好Q比如连接数据库后,调用加锁机制获取锁,直接q行操作Q然后释放,量减少持有锁的旉?br />4Q在持有锁期间要不要CHECK锁,如果需要严g赖锁的状态,最好在关键步骤中做锁的CHECK查机Ӟ但是Ҏ我们的测试发玎ͼ在大q发Ӟ每一ơCHECK锁操作,都要消耗掉几个毫秒Q而我们的整个持锁处理逻辑才不?0毫秒Q玩客没有选择做锁的检查?br />5Qsleep学问Qؓ了减对Redis的压力,获取锁尝试时Q@环之间一定要做sleep操作。但是sleep旉是多是门学问。需要根据自qRedis的QPSQ加上持锁处理时间等q行合理计算?br />6Q至于ؓ什么不使用Redis的mutiQexpireQwatch{机Ӟ可以查一参考资料,找下原因?/p>

锁测试数?/span>

未用sleep
W一U,锁重试时未做sleep。单ơ请求,加锁Q执行,解锁旉 


可以看到加锁和解锁时间都很快Q当我们使用

ab -n1000 -c100 'http://sandbox6.wanke.etao.com/test/test_sequence.php?tbpm=t'
AB q发100累计1000ơ请求,对这个方法进行压时?


我们会发玎ͼ获取锁的旉变成Q同时持有锁后,执行旉也变成,而delete锁的旉Q将q?0ms旉Qؓ什么会q样Q?br />1Q持有锁后,我们的执行逻辑中包含了再次调用Redis操作Q在大ƈ发情况下QRedis执行明显变慢?br />2Q锁的删除时间变长,从之前的0.2msQ变?.8msQ性能下降q?0倍?br />在这U情况下Q我们压的QPS?9Q最l发现QPS和压总量有关Q当我们q发100d100ơ请求时QQPS得到110多。当我们使用sleep?/p>

使用Sleep?/strong>

单次执行h?br />

我们看到Q和不用sleep机制Ӟ性能相当。当时用相同的压条件进行压~时 

获取锁的旉明显变长Q而锁的释放时间明昑֏短,仅是不采用sleep机制的一半。当然执行时间变成就是因为,我们在执行过E中Q重新创建数据库q接Q导致时间变长的。同时我们可以对比下Redis的命令执行压力情?nbsp;

?图中l高部分是ؓ未采用sleep机制的时的压图Q矮胖部分ؓ采用sleep机制的压图Q通上囄到压力减?0%左右Q当Ӟsleepq种方式q?有个~点QPS下降明显Q在我们的压条件下Q仅?5Qƈ且有部分h出现时情况。不q综合各U情况后Q我们还是决定采用sleep机制Q主要是Z 防止在大q发情况下把Redis压垮Q很不行Q我们之前碰到过Q所以肯定会采用sleep机制?/p>

参考资?/span>

http://www.worlduc.com/FileSystem/18/2518/590664/9f63555e6079482f831c8ab1dcb8c19c.pdf
http://redis.io/commands/setnx
http://www.dentisthealthcenter.com/caojianhua/archive/2013/01/28/394847.html


Background

Redis is a very powerful NoSQL store built around data structures, and it makes it convenient to optimize for your business model. My recent work has been to clean up our existing Java server framework and decouple the network layer from the logic layer so that both can scale out horizontally. Although I use AKKA as the core framework in the logic layer and stay lock-free as much as possible, locks that span JVMs are still unavoidable, so I needed to implement a distributed lock.

The official recipe

The official SETNX page gives an implementation:

  • C4 sends SETNX lock.foo in order to acquire the lock
  • The crashed client C3 still holds it, so Redis will reply with 0 to C4.
  • C4 sends GET lock.foo to check if the lock expired. If it is not, it will sleep for some time and retry from the start.
  • Instead, if the lock is expired because the Unix time at lock.foo is older than the current Unix time, C4 tries to perform: GETSET lock.foo (current Unix timestamp + lock timeout + 1)
  • Because of the GETSET semantic, C4 can check if the old value stored at key is still an expired timestamp. If it is, the lock was acquired.
  • If another client, for instance C5, was faster than C4 and acquired the lock with the GETSET operation, the C4 GETSET operation will return a non expired timestamp. C4 will simply restart from the first step. Note that even if C4 set the key a bit a few seconds in the future this is not a problem.

But with the officially recommended GETSET approach, the side that loses the race can indeed tell that it failed to acquire the lock, yet it has modified the holder's timestamp; the direct consequence is that the lock holder can no longer unlock!!!

A Lua-based implementation

The problem with the official recipe is really that separate Redis commands cannot make the get-check-set sequence atomic, so I decided to bring in redis-lua and implement the get-check-set step as a Lua script.

Locking:

  • script params: lock_key, current_timestamp, lock_timeout
  • setnx lock_key (current_timestamp + lock_timeout). if not success, set lock_key (current_timestamp + lock_timeout) if current_timestamp > value
  • client save current_timestamp(lock_create_timestamp)

Unlocking:

  • script params: lock_key, lock_create_timestamp, lock_timeout
  • delete if lock_create_timestamp + lock_timeout == value

The concrete implementation, in Lua:
---lock
local now = tonumber(ARGV[1])
local timeout = tonumber(ARGV[2])
local to = now + timeout
local locked = redis.call('SETNX', KEYS[1], to)
if (locked == 1) then
    return 0
end
local kt = redis.call('type', KEYS[1]);
if (kt['ok'] ~= 'string') then
    return 2
end
local keyValue = tonumber(redis.call('get', KEYS[1]))
if (now > keyValue) then
    redis.call('set', KEYS[1], to)
    return 0
end
return 1

---unlock
local begin = tonumber(ARGV[1])
local timeout = tonumber(ARGV[2])
local kt = redis.call('type', KEYS[1]);
if (kt['ok'] == 'string') then
    local keyValue = tonumber(redis.call('get', KEYS[1]))
    if ((keyValue - begin) == timeout) then
        redis.call('del', KEYS[1])
        return 0
    end
end
return 1
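A sketch of how these scripts might be driven from Java with Jedis; the script bodies are assumed to be loaded from the text above, and the key and timeout values are placeholders.

    import java.util.Arrays;
    import java.util.Collections;
    import redis.clients.jedis.Jedis;

    public class LuaLockClient {
        public static void main(String[] args) {
            String lockScript = "...";     // the ---lock script above, loaded from a file or constant
            String unlockScript = "...";   // the ---unlock script above

            try (Jedis jedis = new Jedis("localhost", 6379)) {
                long now = System.currentTimeMillis() / 1000;   // the scripts work in seconds
                long timeout = 10;

                Object locked = jedis.eval(lockScript,
                        Collections.singletonList("lock.foo"),
                        Arrays.asList(String.valueOf(now), String.valueOf(timeout)));
                if (Long.valueOf(0).equals(locked)) {
                    try {
                        // ... critical section ...
                    } finally {
                        jedis.eval(unlockScript,
                                Collections.singletonList("lock.foo"),
                                Arrays.asList(String.valueOf(now), String.valueOf(timeout)));
                    }
                }
            }
        }
    }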

Known issues

A Redis-based distributed lock has a single point of failure. Then again, our traffic is nowhere near the level that would bring down the dedicated Redis instance used for locking.



SIMONE 2016-05-12 17:52 发表评论
]]>
Spring 动态注册类http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/03/23/429774.htmlSIMONESIMONEWed, 23 Mar 2016 02:47:00 GMThttp://www.dentisthealthcenter.com/wangxinsh55/archive/2016/03/23/429774.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/comments/429774.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/archive/2016/03/23/429774.html#Feedback0http://www.dentisthealthcenter.com/wangxinsh55/comments/commentRss/429774.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/services/trackbacks/429774.htmlimport com.duxiu.modules.beetlsql.BeetlSQLDao;
import org.beetl.sql.core.SQLManager;
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.support.DefaultListableBeanFactory;
import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationContextAware;
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.core.io.Resource;
import org.springframework.core.io.support.PathMatchingResourcePatternResolver;
import org.springframework.core.io.support.ResourcePatternResolver;

import java.io.IOException;
import java.util.Objects;
import java.util.stream.Stream;

public class DaoFactoryBean implements ApplicationContextAware {
    
    
    @Override
    
public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        
        ConfigurableApplicationContext context 
= (ConfigurableApplicationContext) applicationContext;
        DefaultListableBeanFactory beanFactory 
= (DefaultListableBeanFactory) context.getBeanFactory();
        ResourcePatternResolver rpr 
= new PathMatchingResourcePatternResolver(applicationContext);
        SQLManager sqlManager 
= applicationContext.getBean(SQLManager.class);
        
try {
            Resource[] resources 
= rpr.getResources("classpath:com/duxiu/**/*.class");
            Stream.of(resources).map(f 
-> {
                
try {
                    
return f.getURI().getPath().split("(classes/)|(!/)")[1].replace("/"".").replace(".class""");
                } 
catch (IOException e) {
                    e.printStackTrace();
                    
return null;
                }
            }).filter(Objects::nonNull).forEach(f 
-> {
                
try {
                    Class
<?> aClass = Class.forName(f);
                    
boolean match = Stream.of(aClass.getAnnotations()).anyMatch(c -> c instanceof BeetlSQLDao);
                    
if (match && !beanFactory.containsBean(aClass.getSimpleName())) {
                        System.out.println(sqlManager.getMapper(aClass));
                        
//beanFactory.registerSingleton(aClass.getSimpleName(),sqlManager.getMapper(aClass));
                    }
                } 
catch (ClassNotFoundException e) {
                    e.printStackTrace();
                }
            });
        } 
catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println(applicationContext.getBean(SQLManager.
class));
        
/*if(!beanFactory.containsBean(beanName)){
            BeanDefinitionBuilder beanDefinitionBuilder= BeanDefinitionBuilder.rootBeanDefinition(beanClass);
            beanDefinitionBuilder.addPropertyValue("host", host);
            beanDefinitionBuilder.addPropertyValue("port", port);
            beanDefinitionBuilder.addPropertyValue("database", database);
            beanDefinitionBuilder.setInitMethodName("init");
            beanDefinitionBuilder.setDestroyMethodName("destroy");
            beanFactory.registerBeanDefinition(beanName, beanDefinitionBuilder.getBeanDefinition());
            logger.info("Add {} to bean container.", beanName);
        }
*/
    }
}


SIMONE 2016-03-23 10:47 发表评论
]]>
模块化利? 一文章掌握RequireJS常用知识http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/03/04/429538.htmlSIMONESIMONEFri, 04 Mar 2016 07:27:00 GMThttp://www.dentisthealthcenter.com/wangxinsh55/archive/2016/03/04/429538.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/comments/429538.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/archive/2016/03/04/429538.html#Feedback0http://www.dentisthealthcenter.com/wangxinsh55/comments/commentRss/429538.htmlhttp://www.dentisthealthcenter.com/wangxinsh55/services/trackbacks/429538.htmlhttp://www.cnblogs.com/lyzg/p/4865502.html

?q本文,你可以对模块化开发和AMD规范有一个较直观的认识,q详l地学习RequireJSq个模块化开发工L常见用法。本文采取@序渐q的方式Q从 理论到实践,从RequireJS官方API文中,ȝ出在使用RequireJSq程中最常用的一些用法,q对文中不够清晰具体的内容Q加以例证和 分析Q希望本文的内容对你的能力提升有实质性的帮助?/p>

1. 模块?/h2>

怿每个前端开发h员在刚开始接触js~程Ӟ都写q类g面这样风格的代码Q?/p>

复制代码
<script type="text/javascript">     var a = 1;     var b = 2;     var c = a * a + b * b;      if(c> 1) {         alert('c > 1');     }      function add(a, b) {         return a + b;     }      c = add(a,b); </script>
复制代码
<a href="javascript:;" onclick="click(this);" title="">L?lt;/a>

q些代码的特ҎQ?/p>

  • 到处可见的全局变量
  • 大量的函?/strong>
  • 内嵌在html元素上的各种js调用

当然q些代码本n在实现功能上q没有错误,但是从代码的可重用性,健壮性以及可l护性来_q种~程方式是有问题的,其是在面逻辑较ؓ复杂的应用中Q这些问题会暴露地特别明显:

  • 全局变量极易造成命名冲突
  • 函数式编E非怸利于代码的组l和理
  • 内嵌的js调用很不利于代码的维护,因ؓhtml代码有的时候是十分臃肿和庞大的

所以当q些问题出现的时候,js大牛们就开始寻扑֎解决q些问题的究极办法,于是模块化开发就出现了。正如模块化q个概念的表面意思一P它要求在 ~写代码的时候,按层ơ,按功能,独立的逻辑Q封装成可重用的模块Q对外提供直接明了的调用接口Q内部实现细节完全私有,q且模块之间的内部实现在执行 期间互不q扰Q最l的l果是可以解决前面举例提到的问题。一个简单遵循模块化开发要求编写的例子Q?/p>

复制代码
//module.js var student = function (name) {         return name && {                 getName: function () {                     return name;                 }             };     },     course = function (name) {         return name && {                 getName: function () {                     return name;                 }             }     },     controller = function () {         var data = {};         return {             add: function (stu, cour) {                 var stuName = stu && stu.getName(),                     courName = cour && cour.getName(),                     current,                     _filter = function (e) {                         return e === courName;                     };                  if (!stuName || !courName) return;                  current = data[stuName] = data[stuName] || [];                  if (current.filter(_filter).length === 0) {                     current.push(courName);                 }             },             list: function (stu) {                 var stuName = stu && stu.getName(),                     current = data[stuName];                 current && console.log(current.join(';'));             }         }     };  //main.js  var stu = new student('lyzg'),     c = new controller();  c.add(stu,new course('javascript')); c.add(stu,new course('html')); c.add(stu,new course('css')); c.list(stu);
复制代码

以上代码定义了三个模块分别表C学生,评和控制器Q然后在main.js中调用了controller提供的add和list接口Qؓlyzgq个学生d了三门课E,然后在控制台昄了出来。运行结果如下:

javascript;html;css

通过上例Q可以看出模块化的代码结构和逻辑十分清晰Q代码看h十分优雅Q另外由于逻辑都通过模块拆分Q所以达C解耦的目的Q代码的功能也会比较 健壮。不q上例用的q种模块化开发方式也q不是没有问题,q个问题是它还是把模块引用如studentq些直接dC全局I间下,虽然通过模块减少 了很多全局I间的变量和函数Q但是模块引用本w还是要依赖全局I间Q才能被调用Q当模块较多Q或者有引入W三Ҏ块库Ӟ仍然可能造成命名冲突的问题,所 以这U全局I间下的模块化开发的方式q不是最完美的方式。目前常见的模块化开发方式,全局I间方式是最基本的一U,另外常见的还有遵循AMD规范的开发方 式,遵@CMD规范的开发方式,和ECMAScript 6的开发方式。需要说明的是,CMD和ES6跟本文的核心没有关系Q所以不会在此介l,后面的内容主要介lAMD以及实现了AMD规范?RequireJS?/p>

2. AMD规范

正如上文提到Q实现模块化开发的方式Q另外常见的一U就是遵循AMD规范的实现方式,不过AMD规范q不是具体的实现方式Q而仅仅是模块化开发的一 U解x案,你可以把它理解成模块化开发的一些接口声明,如果你要实现一个遵循该规范的模块化开发工P必d现它预先定义的API。比如它要求在加?模块Ӟ必须使用如下的API调用方式Q?/p>

require([module], callback) 其中Q?[module]Q是一个数l,里面的成员就是要加蝲的模? callbackQ是模块加蝲完成之后的回调函?/span>

所有遵循AMD规范的模块化工具Q都必须按照它的要求d玎ͼ比如RequireJSq个库,是完全遵@AMD规范实现的,所以在利用 RequireJS加蝲或者调用模块时Q如果你事先知道AMD规范的话Q你q道该怎么用RequireJS了。规范的好处在于Q不同的实现却有相同的调 用方式,很容易切换不同的工具使用Q至于具体用哪一个实玎ͼq就跟各个工L各自的优点跟目的特Ҏ关系Q这些都是在目开始选型的时候需要确定的。目 前RequireJS不是唯一实现了AMD规范的库Q像Dojoq种更全面的js库也都有AMD的实现?/p>

最后对AMD全称做一个解释,译ؓQ异步模块定义。异步强调的是,在加载模块以及模块所依赖的其它模块时Q都采用异步加蝲的方式,避免模块加蝲d了网늚渲染q度。相比传l的异步加蝲QAMD工具的异步加载更加简便,而且q能实现按需加蝲Q具体解释在下一部分说明?/p>

3. JavaScript的异步加载和按需加蝲

html中的script标签在加载和执行q程中会d|页的渲染,所以一般要求尽量将script标签攄在body元素的底部,以便加快面昄的速度Q还有一U方式就是通过异步加蝲的方式来加蝲jsQ这样可以避免js文g对html渲染的阻塞?/p>

W?U异步加载的方式是直接利用脚本生成script标签的方式:

复制代码
(function() {     var s = document.createElement('script');     s.type = 'text/javascript';     s.async = true;     s.src = 'http://yourdomain.com/script.js';     var x = document.getElementsByTagName('script')[0];     x.parentNode.insertBefore(s, x); })();
复制代码

q段代码Q放|在script标记内部Q然后该script标记d到body元素的底部即可?/p>

W?U方式是借助script的属性:defer和asyncQdeferq个属性在IE览器和早v的火狐浏览器中支持,async在支?html5的浏览器上都支持Q只要有q两个属性,script׃以异步的方式来加载,所以script在html中的位置׃重要了:

<script defer async="true" type="text/javascript" src="app/foo.js"></script> <script defer async="true" type="text/javascript" src="app/bar.js"></script>
<script defer async="true" type="text/javascript" src="app/main.js"></script>

q种方式下,所有异步js在执行的时候还是按序执行的,不然׃存在依赖问题Q比如如果上例中的main.js依赖foo.js和bar.jsQ?但是main.js先执行的话就会出错了。虽然从来理Zq种方式也算不错了,但是不够好,因ؓ它用h很繁琐,而且q有个问题就是页面需要添加多?script标记以及没有办法完全做到按需加蝲?/p>

JS的按需加蝲分两个层ơ,W一个层ơ是只加载这个页面可能被用到的JSQ第二个层次是在只在用到某个JS的时 候才d载。传l地方式很容易做到第一个层ơ,但是不容易做到第二个层次Q虽然我们可以通过合ƈ和压~工P某个页面所有的JS都添加到一个文件中去, 最大程度减资源请求量Q但是这个JSh到客L以后Q其中有很多内容可能都用不上Q要是有个工兯够做到在需要的时候才d载相关js完解决问?了,比如RequireJS?/p>

4. RequireJS常用用法ȝ

前文多次提及RequireJSQ本部分对它的常用用法详细说明Q它的官方地址是:http://www.requirejs.cn/Q你可以到该地址M载最新版RequireJS文g。RequireJS作ؓ目前使用最q泛的AMD工具Q它的主要优ҎQ?/p>

  • 完全支持模块化开?/li>
  • 能将非AMD规范的模块引入到RequireJS中?/li>
  • 异步加蝲JS
  • 完全按需加蝲依赖模块Q模块文件只需要压~؜淆,不需要合q?/li>
  • 错误调试
  • 插g支持

4.01 如何使用RequireJS

使用方式很简单,只要一个script标记可以在|页中加载RequireJSQ?/p>

<script defer async="true" src="/bower_components/requirejs/require.js"></script>

׃q里用到了defer和asyncq两个异步加载的属性,所以require.js是异步加载的Q你把这个script标记攄在Q何地斚w没有问题?/p>

4.02 如何利用RequireJS加蝲q执行当前网늚逻辑JS

4.01解决的仅仅是RequireJS的用问题,但它仅仅是一个JS库,是一个被当前面的逻辑所利用的工P真正实现|页功能逻辑的是我们?利用RequireJS~写的主JSQ这个主JSQ假设这些代码都攄在main.js文g中)又该如何利用RJ来加载执行呢Q方式如下:

<script data-main="scripts/main.js" defer async="true" src="/bower_components/requirejs/require.js"></script>

Ҏ4.01Q你会发现script标记多了一个data-mainQRJ用这个配|当前页面的主JSQ你要把逻辑都写在这个main.js里面?当RJ自n加蝲执行后,׃再次异步加蝲main.js。这个main.js是当前网|有逻辑的入口,理想情况下,整个|页只需要这一个script?讎ͼ利用RJ加蝲依赖的其它文Ӟ如jquery{?/p>

 

4.03 main.js怎么?/h3>

假设目的目录结构ؓQ?/p>

clipboard_thumb

main.js是跟当前面相关的主JSQapp文g夹存放本目自定义的模块Qlib存放W三方库?/p>

html中按4.02的方式配|RJ。main.js的代码如下:

require(['lib/foo', 'app/bar', 'app/app'], function(foo, bar, app) {     //use foo bar app do sth });

在这DJS中,我们利用RJ提供的requireҎQ加载了三个模块Q然后在q个三个模块都加载成功之后执行页面逻辑。requireҎ??参数Q第一个参数是数组cd的,实际使用Ӟ数组的每个元素都是一个模块的module IDQ第二个参数是一个回调函敎ͼq个函数在第一个参数定义的所有模块都加蝲成功后回调,形参的个数和序分别与第一个参数定义的模块对应Q比如第一个模 块时lib/fooQ那么这个回调函数的W一个参数就是fooq个模块的引用,在回调函C我们使用q些形参来调用各个模块的ҎQ由于回调是在各模块?载之后才调用的,所以这些模块引用肯定都是有效的?/p>

从以上这个简短的代码Q你应该已经知道该如何用RJ了?/p>

4.04 RJ的baseUrl和module ID

在介lRJ如何去解析依赖的那些模块JS的\径时Q必d弄清楚baseUrl和module IDq两个概c?/p>

html中的base元素可以定义当前面内部Mhttph的url前缀部分QRJ的baseUrl跟这个base元素L作用是类似的Q由?RJL动态地h依赖的JS文gQ所以必然涉及到一个JS文g的\径解析问题,RJ默认采用一UbaseUrl + moduleID的解析方式,q个解析方式后箋会D例说明。这个baseUrl非常重要QRJ对它的处理遵循如下规则:

  • 在没有用data-main和config的情况下QbaseUrl默认为当前页面的目录
  • 在有data-main的情况下Qmain.js前面的部分就是baseUrlQ比如上面的scripts/
  • 在有config的情况下QbaseUrl以config配置的ؓ?/li>

上述三种方式Q优先׃到高排列?/p>

data-main的用方式,你已l知道了Qconfig该如何配|,如下所C:

require.config({     baseUrl: 'scripts' });

q个配置必须攄在main.js的最前面。data-main与config配置同时存在的时候,以config为准Q由于RJ的其它配|也是在q个位置配置的,所?.03中的main.js可以Ҏ如下l构Q以便将来的扩展Q?/p>

复制代码
require.config({     baseUrl: 'scripts' });  require(['lib/foo', 'app/bar', 'app/app'], function(foo, bar, app) {     // use foo bar app do sth });
复制代码

 

关于module IDQ就是在requireҎ以及后箋的defineҎ里,用在依赖数组q个参数里,用来标识一个模块的字符丌Ӏ上面代码中的['lib/foo', 'app/bar', 'app/app']是一个依赖数l,其中的每个元素都是一个module ID。值得注意的是Qmodule IDq不一定是该module 相关JS路径的一部分Q有的module ID很短Q但可能路径很长Q这跟RJ的解析规则有兟뀂下一节详l介l?/p>

4.05 RJ的文件解析规?/h3>

RJ默认按baseUrl + module ID的规则,解析文gQƈ且它默认要加载的文g都是jsQ所以你的module ID里面可以不包?js的后~Q这是为啥你看到的module ID都是lib/foo, app/barq种形式了。有三种module IDQ不适用q种规则Q?/p>

假如main.js如下使用Q?/p>

复制代码
require.config({     baseUrl: 'scripts' });  require(['/lib/foo', 'test.js', 'http://cdn.baidu.com/js/jquery'], function(foo, bar, app) {     // use foo bar app do sth });
复制代码

q三个module 都不会根据baseUrl + module ID的规则来解析,而是直接用module ID来解析,{效于下面的代码Q?/p>

<script src="/lib/foo.js"></script> <script src="test.js"></script> <script src="http://cdn.baidu.com/js/jquery.js"></script>

各种module ID解析举例Q?/strong>

?Q项目结构如下:

clipboard4_thumb

main.js如下Q?/p>

复制代码
require.config({     baseUrl: 'scripts' });  require(['lib/foo', 'app/bar', 'app/app'], function(foo, bar, app) {     // use foo bar app do sth });
复制代码

baseUrl为:scripts目录

moduleID为:lib/foo, app/bar, app/app

ҎbaseUrl + moduleIDQ以及自动补后缀.jsQ最l这三个module的js文g路径为:

scripts/lib/foo.js scripts/app/bar.js scripts/app/app.js

?Q项目结构同?Q?/p>

main.js改ؓQ?/p>

复制代码
require.config({     baseUrl: 'scripts/lib',     paths: {      app: '../app'     } });   require(['foo', 'app/bar', 'app/app'], function(foo, bar, app) {     // use foo bar app do sth });
复制代码

q里出现了一个新的配|pathsQ它的作用是针对module ID中特定的部分Q进行{义,如以上代码中对appq个部分Q{义ؓ../appQ这表示一个相对\径,相对位置是baseUrl所指定的目录,由项目结 构可知,../app其实对应的是scirpt/app目录。正因ؓ有这个{义的存在Q所以以上代码中的app/bar才能被正解析,否则q按 baseUrl + moduleID的规则,app/bar不是应该被解析成scripts/lib/app/bar.js吗,但实际ƈ非如此,app/bar被解析成 scripts/app/bar.jsQ其中v关键作用的就是paths的配|。通过q个举例Q可以看出module IDq不一定是js文g路径中的一部分Qpaths的配|对于\径过E的js特别有效Q因为可以简化它的module ID?/p>

另外W一个模块的ID为fooQ同时没有paths的{义,所以根据解析规则,它的文g路径Ӟscripts/lib/foo.js?/p>

paths的配|中只有当模块位于baseUrl所指定的文件夹的同层目录,或者更上层的目录时Q才会用?./q种相对路径?/p>

?Q项目结果同?Qmain.js同例2Q?/p>

q里要说明的问题E微ҎQ不以main.jsZQ而以app.jsZQ且app依赖barQ当然configq是需要在main.js中定义的Q由于这个问题在定义模块的时候更加常见,所以用define来D例,假设app.js模块如下定义Q?/p>

复制代码
define(['./bar'], function(bar) {      return {           doSth: function() {               bar.doSth();           }      } });
复制代码

上面的代码通过define定义了一个模块,q个define函数后面介绍如何定义模块的时候再来介l,q里单了解。这里这U用法的W一个参数跟 require函数一P是一个依赖数l,W二个参数是一个回调,也是在所有依赖加载成功之后调用,q个回调的返回g成ؓq个模块的引用被其它模块所?用?/p>

q里要说的问题还是跟解析规则相关的,如果完全遵守RJ的解析规则,q里的依赖应该配|成app/bar才是正确的,但由于app.js?bar.js位于同一个目录,所以完全可利用./q个同目录的相对标识W来解析jsQ这L话只要app.js已经加蝲成功了,那么d目录下找 bar.jsp定能扑ֈ了。这U配|在定义模块的时候非常有意义Q这样你的模块就不依赖于攄q些模块的文件夹名称了?/p>

4.06 RJ的异步加?/h3>

RJ不管是requireҎq是defineҎ的依赖模块都是异步加载的Q所以下面的代码不一定能解析到正的JS文gQ?/p>

<script data-main="scripts/main" src="scripts/require.js"></script> <script src="scripts/other.js"></script>
//main.js
require.config({ paths: { foo:
'libs/foo-1.1.3' } });
//other.js
require( ['foo'], function( foo ) { //foo is undefined });

׃main.js是异步加载的Q所以other.js会比它先加蝲Q但是RJ的配|存在于main.js里面Q所以在加蝲other.jsM到RJ的配|,在other.js执行的时候解析出来的foo的\径就会变成scripts/foo.jsQ而正\径应该是scripts/libs/foo-1.1.3.js?/p>

管RJ的依赖是异步加蝲的,但是已加载的模块在多ơ依赖的时候,不会再重新加载:

define(['require', 'app/bar', 'app/app'], function(require) {     var bar= require("app/bar");     var app= require("app/app");     //use bar and app do sth });

上面的代码,在callback定义的时候,只用了一个Ş参,q主要是Z减少形参的数量,避免整个回调的签名很ѝ依赖的模块在回调内部可以直接用require(moduleID)的参数得刎ͼ׃在回调执行前Q依赖的模块已经加蝲Q所以此处调用不会再重新加蝲。但是如果此处获取一个ƈ不在依赖数组中出现的module IDQrequire很有可能获取不到该模块引用,因ؓ它可能需要重新加载,如果它没有在其它模块中被加蝲q的话?/p>

4.07 RJ官方推荐的JS文gl织l构

RJQ文件组l尽量扁qI不要多层嵌套Q最理想的是跟项目相关的攑֜一个文件夹Q第三方库放在一个文件夹Q如下所C:

image_thumb

4.08 使用define定义模块

AMD规定的模块定义规范ؓQ?/p>

define(id?, dependencies?, factory);  其中Q?id: 模块标识Q可以省略?dependencies: 所依赖的模块,可以省略?factory: 模块的实玎ͼ或者一个JavaScript对象

关于W一个参敎ͼ本文不会涉及Q因为RJ所有模块都不要使用W一个参敎ͼ如果使用W一个参数定义的模块成ؓ命名模块Q不适用W一个参数的模块成ؓ匿名模块Q命名模块如果更名,所有依赖它的模块都得修改!W二个参数是依赖数组Q跟require一P如果没有q个参数Q那么定义的是一个无依赖的模块;最后一个参数是回调或者是一个简单对象,在模块加载完毕后调用Q当然没有第二个参数Q最后一个参C会调用?/p>

本部分所举例都采用如下项目结构:

clipboard1_thumb

1. 定义单对象模块:

app/bar.js

define({  bar:'I am bar.' });

利用main.js试Q?/p>

复制代码
require.config({     baseUrl: 'scripts/lib',     paths: {         app: '../app'     } });  require(['app/bar'], function(bar) {     console.log(bar);// {bar: 'I am bar.'} });
复制代码

2. 定义无依赖的模块Q?/p>

app/nodec.jsQ?/p>

define(function () {     return {         nodec: "yes, I don't need dependence."     } });

利用main.js试Q?/p>

复制代码
require.config({     baseUrl: 'scripts/lib',     paths: {         app: '../app'     } });  require(['app/nodec'], function(nodec) {     console.log(nodec);// {nodec: yes, I don't need dependence.'} });
复制代码

3. 定义依赖其它模块的模块:

app/dec.jsQ?/p>

define(['jquery'], function($){     //use $ do sth ...     return {        useJq: true     } });

利用main.js试Q?/p>

复制代码
require.config({     baseUrl: 'scripts/lib',     paths: {         app: '../app'     } });  require(['app/dec'], function(dec) {     console.log(dec);//{useJq: true} });
复制代码

4. 循环依赖Q?/p>

当一个模块foo的依赖数l中存在barQbar模块的依赖数l中存在fooQ就会Ş成@环依赖,E微修改下bar.js和foo.js如下?/p>

app/bar.jsQ?/p>

复制代码
define(['foo'],function(foo){  return {   name: 'bar',   hi: function(){    console.log('Hi! ' + foo.name);   }  } });
复制代码

lib/foo.jsQ?/p>

复制代码
define(['app/bar'],function(bar){  return {   name: 'foo',   hi: function(){    console.log('Hi! ' + bar.name);   }  } });
复制代码

利用main.js试Q?/p>

复制代码
require.config({     baseUrl: 'scripts/lib',     paths: {         app: '../app'     } });   require(['app/bar', 'foo'], function(bar, foo) {     bar.hi();     foo.hi(); });
复制代码

q行l果Q?/p>

clipboard3_thumb1

如果改变main.js中require部分的依赖顺序,l果Q?/p>

clipboard5_thumb1

循环依赖D两个依赖的module之间Q始l会有一个在获取另一个的时候,得到undefined。解x法是Q在定义module的时候,如果用到循环依赖的时候,在define内部通过require重新获取。main.js不变Qbar.jsҎQ?/p>

复制代码
define(['require', 'foo'], function(require, foo) {     return {         name: 'bar',         hi: function() {          foo = require('foo');             console.log('Hi! ' + foo.name);         }     } });
复制代码

foo.jsҎQ?/p>

define(['require', 'app/bar'], function(require, bar) {
    return {
        name: 'foo',
        hi: function() {
            bar = require('app/bar');
            console.log('Hi! ' + bar.name);
        }
    };
});

Running again with the code above, the result is:

(screenshot: console output after the fix)

Summary of module definitions: whether a module is defined with a callback or with a plain object, what it exports is a reference, so that reference must be valid. Your callback cannot return undefined, but it is not limited to objects: it can return an array, a function, or even a primitive value. Returning an object simply lets you expose more than one interface through it.
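For instance, a module whose export is a single function. A sketch; the module name app/log is made up:

// app/log.js: the export is a function rather than an object
define(function () {
    return function (msg) {
        console.log('[app] ' + msg);
    };
});

// a consumer
require(['app/log'], function (log) {
    log('module loaded'); // prints "[app] module loaded"
});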

4.09 Built-in RJ Modules

Look at this code again:

define(['require', 'app/bar'], function(require) {
    return {
        name: 'foo',
        hi: function() {
            var bar = require('app/bar');
            console.log('Hi! ' + bar.name);
        }
    };
});

The module ID require in the dependency array refers to a built-in module used to load other modules; you have already seen how it is used, for example in main.js and inside define. The other built-in module is module, which is tied to another RJ configuration option; its usage is covered in part 5.
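Both built-ins can be requested together. A sketch only; module.id is part of the module API, and the on-demand app/bar load mirrors the example above:

define(['require', 'module'], function (require, module) {
    return {
        whoAmI: function () {
            // module.id is the resolved ID of this module
            return module.id;
        },
        loadBarLater: function (cb) {
            // load a dependency on demand instead of listing it up front
            require(['app/bar'], cb);
        }
    };
});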

4.10 Other Useful RJ Features

1. Generating a URL relative to the module

define(["require"], function(require) {     var cssUrl = require.toUrl("./style.css"); });

This feature is useful when you want to load some files dynamically; note that it expects a relative path.
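For example, the resolved URL could be used to append a stylesheet at runtime. A sketch, reusing the ./style.css name from the snippet above:

define(['require'], function (require) {
    return {
        loadCss: function () {
            var link = document.createElement('link');
            link.rel = 'stylesheet';
            // toUrl resolves the path relative to this module, honouring baseUrl/paths
            link.href = require.toUrl('./style.css');
            document.head.appendChild(link);
        }
    };
});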

2. Debugging in the console

require("module/name").callSomeFunction()

Suppose you want to check in the console which methods a module exposes. If the module has already been loaded as a dependency during page load, you can run the code above in the console and experiment with it freely.

5. Summary of Common RequireJS Configuration

Among RJ's configuration options, baseUrl and paths have already come up. The other commonly used options are:

  • shim
  • config
  • enforceDefine
  • urlArgs

5.01 shim

shim configures the dependencies and exports of "browser-global" scripts that do not declare their dependencies with define().

Example 1: Associating a script's global variable with RequireJS via exports

The project structure:

(screenshot: project structure)

main.js:

require.config({
    baseUrl: 'scripts/lib',
    paths: {
        app: '../app'
    },
    shim: {
        underscore: {
            exports: '_'
        }
    }
});

require(['underscore'], function(_) {
    // underscore's API can now be called through _
});

As you can see, RJ adds a shim entry for the underscore module, and its exports property names the global variable that the script exposes, so RJ can manage such modules uniformly.
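shim also accepts an init callback that runs after the script and its deps have loaded; whatever it returns replaces the exports value. A hedged sketch (the noConflict call assumes the library provides one, as underscore does):

require.config({
    shim: {
        underscore: {
            exports: '_',
            init: function () {
                // 'this' is the global object here; the return value becomes the module value
                return this._.noConflict();
            }
        }
    }
});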

Example 2: Declaring a script's dependencies with deps

The project structure:

(screenshot: project structure)

main.js:

require.config({
    baseUrl: 'scripts/lib',
    paths: {
        app: '../app'
    },
    shim: {
        backbone: {
            deps: ['underscore', 'jquery'],
            exports: 'Backbone'
        }
    }
});

require(['backbone'], function(Backbone) {
    // use Backbone's API
});

Because backbone depends on jquery and underscore, its dependencies can be declared with the deps property; backbone will then be loaded only after the other two modules have finished loading.

Example 3: Configuring plugins for libraries such as jQuery

Example code:

requirejs.config({
    shim: {
        'jquery.colorize': {
            deps: ['jquery'],
            exports: 'jQuery.fn.colorize'
        },
        'jquery.scroll': {
            deps: ['jquery'],
            exports: 'jQuery.fn.scroll'
        },
        'backbone.layoutmanager': {
            deps: ['backbone'],
            exports: 'Backbone.LayoutManager'
        }
    }
});

 

5.02 config

You often need to pass configuration information to a module. This is usually application-level information, and some mechanism is needed to hand it down to the modules. In RequireJS this is done with the config option of requirejs.config(). A module that wants to read this information loads the special dependency "module" and calls module.config().

Example: defining config in requirejs.config() for other modules to read

requirejs.config({
    config: {
        'bar': {
            size: 'large'
        },
        'baz': {
            color: 'blue'
        }
    }
});

As you can see, the bar section under config is applied to the module whose ID is bar, and the baz section to the module whose ID is baz. Taking bar.js as the concrete example:

define(['module'], function(module) {
    // will be the value 'large'
    var size = module.config().size;
});

As mentioned earlier, besides require, RJ has one more built-in module, module; this is where it is used: through it you read the contents of config.
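baz.js would read its own section the same way. A sketch mirroring the config above; the paint helper is made up:

// baz.js: sees only the 'baz' section of the config
define(['module'], function (module) {
    var color = module.config().color; // 'blue'
    return {
        paint: function (el) {
            el.style.color = color;
        }
    };
});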

5.03 enforceDefine

If set to true, an error is thrown whenever a script is loaded that is not defined via define() and does not have a checkable shim exports string. This option forces every module RJ depends on or loads to be managed by RJ through define or shim, and it has the extra benefit of helping with error detection.
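A minimal sketch of turning it on; the shim entry assumes the underscore setup from 5.01:

require.config({
    enforceDefine: true, // any script without define() or a shim exports will raise an error
    shim: {
        underscore: {
            exports: '_'
        }
    }
});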

5.04 urlArgs

urlArgs is the extra query string RequireJS appends to the URLs it uses to fetch resources. It is useful as a "cache bust" when the browser or server is not configured correctly. An example cache-bust configuration:

urlArgs: "bust=" + (new Date()).getTime()
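A timestamp defeats caching on every request, so for production a fixed build identifier is the more common choice. A sketch; the version string v=1.2.3 is made up:

require.config({
    // development: bypass the cache on every request
    // urlArgs: 'bust=' + (new Date()).getTime()

    // production: bump a fixed version string on each release instead
    urlArgs: 'v=1.2.3'
});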

6. Error Handling

6.01 Catching load errors

Catching load errors in IE is imperfect:

  • In IE 6-8, script.onerror does not work. There is no way to tell whether loading a script will produce a 404; worse, even on a 404 the onreadystatechange event still fires with state complete.
  • In IE 9+, script.onerror works but has a bug: it does not fire the script.onload event handler after a script executes, so it cannot support the standard approach for anonymous AMD modules. The script.onreadystatechange event is therefore still used, but the onreadystatechange event with state complete fires before the script.onerror function does.

So, to catch load errors in IE, you have to set enforceDefine to true, which in turn requires that every one of your modules either be defined with define or have its reference configured for RJ via shim.

Note: if you set enforceDefine: true and you load your main JS module with data-main="", that main JS module must call define() rather than require() to load the code it needs. The main JS module may still call require/requirejs to set config values, but it must use define() for module loading. For example, the following would now throw an error:

require.config({
    enforceDefine: true,
    baseUrl: 'scripts/lib',
    paths: {
        app: '../app'
    },
    shim: {
        backbone: {
            deps: ['underscore', 'jquery'],
            exports: 'Backbone'
        }
    }
});

require(['backbone'], function(Backbone) {
    console.log(Backbone);
});

Change the last three lines to:

define(['backbone'], function(Backbone) {
    console.log(Backbone);
});

and the error goes away.

6.02 paths fallbacks

requirejs.config({
    // To get timely, correct error triggers in IE, force a define/shim exports check.
    enforceDefine: true,
    paths: {
        jquery: [
            'http://ajax.googleapis.com/ajax/libs/jquery/1.4.4/jquery.min',
            // If the CDN location fails, load from this location
            'lib/jquery'
        ]
    }
});

// Later
require(['jquery'], function ($) {
});

The code above tries the CDN version first; if that fails, it falls back to the local lib/jquery.js.

Note: paths fallbacks only work on exact module ID matches. This differs from regular paths configuration, which can match any prefix of a module ID. Fallbacks are meant for exceptional error recovery, not for routine path lookup, since that would be inefficient in the browser.

6.03 The global requirejs.onError

To catch exceptions that are not caught by a local errback, you can override requirejs.onError():

requirejs.onError = function (err) {
    console.log(err.requireType);
    if (err.requireType === 'timeout') {
        console.log('modules: ' + err.requireModules);
    }
    throw err;
};

(End)



SIMONE, 2016-03-04 15:27
Performance Tuning for Spark Applications
http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/03/02/429506.html
SIMONE (Wed, 02 Mar 2016 06:12:00 GMT)

Spark is an in-memory distributed compute engine known for efficient and stable processing. In real-world application development, however, developers still run into all kinds of problems, a large class of which are performance related. In this article the author draws on his own practice to discuss how to raise application performance as much as possible.

Tuning a distributed compute engine focuses on four main areas: CPU, memory, network overhead, and I/O. The concrete goals are:

  1. Increase CPU utilization.
  2. Avoid OOM (out of memory).
  3. Reduce network overhead.
  4. Reduce I/O operations.

Section 1: Data Skew

Data skew means that one or a few partitions hold an exceptionally large share of the data, which in turn means the computation over those partitions takes a very long time.

If a large amount of data is concentrated in one partition, that partition becomes the bottleneck of the computation. Figure 1 illustrates how a Spark application executes concurrently: the different stages of one application run serially, while the tasks within a stage can run concurrently. The number of tasks is determined by the number of partitions, so if one partition holds far more data than the others, its task takes far longer to finish, the next stage cannot start, and the completion time of the whole job becomes very long.

One way to avoid data skew is to choose an appropriate key, or to define your own partitioner. In Spark, a Block uses a ByteBuffer to store its data, and a ByteBuffer can hold at most 2 GB; if one key carries a huge amount of data, calling cache or persist will run into the SPARK-1476 issue.

The APIs listed below trigger shuffle operations and are the key points where data skew can occur:
1. groupByKey
2. reduceByKey
3. aggregateByKey
4. sortByKey
5. join
6. cogroup
7. cartesian
8. coalesce
9. repartition
10. repartitionAndSortWithinPartitions


Figure 1: Spark task concurrency model

As mentioned above, one way to mitigate skew is to define your own partitioner; the Scala snippet below implements a balanced, grouped sortByKey on top of a custom Partitioner:

import scala.util.Random

import org.apache.spark.Partitioner
import org.apache.spark.rdd.RDD

trait RDDWrapper[T] {
  def rdd: RDD[T]
}

// TODO View bounds are deprecated, should use context bounds
// Might need to change ClassManifest for ClassTag in spark 1.0.0
case class DemoPairRDD[K <% Ordered[K] : ClassManifest, V: ClassManifest](
  rdd: RDD[(K, V)]) extends RDDWrapper[(K, V)] {
  // Here we use a single Long to try to ensure the sort is balanced,
  // but for really large dataset, we may want to consider
  // using a tuple of many Longs or even a GUID
  def sortByKeyGrouped(numPartitions: Int): RDD[(K, V)] =
    rdd.map(kv => ((kv._1, Random.nextLong()), kv._2)).sortByKey()
      .grouped(numPartitions).map(t => (t._1._1, t._2))
}

case class DemoRDD[T: ClassManifest](rdd: RDD[T]) extends RDDWrapper[T] {
  def grouped(size: Int): RDD[T] = {
    // TODO Version where withIndex is cached
    val withIndex = rdd.mapPartitions(_.zipWithIndex)

    val startValues =
      withIndex.mapPartitionsWithIndex((i, iter) =>
        Iterator((i, iter.toIterable.last))).toArray().toList
        .sortBy(_._1).map(_._2._2.toLong).scan(-1L)(_ + _).map(_ + 1L)

    withIndex.mapPartitionsWithIndex((i, iter) => iter.map {
      case (value, index) => (startValues(i) + index.toLong, value)
    })
    .partitionBy(new Partitioner {
      def numPartitions: Int = size
      def getPartition(key: Any): Int =
        (key.asInstanceOf[Long] * numPartitions.toLong / startValues.last).toInt
    })
    .map(_._2)
  }
}

Define the implicit conversions:

object RDDConversions {
  implicit def toDemoRDD[T: ClassManifest](rdd: RDD[T]): DemoRDD[T] =
    new DemoRDD[T](rdd)
  implicit def toDemoPairRDD[K <% Ordered[K] : ClassManifest, V: ClassManifest](
    rdd: RDD[(K, V)]): DemoPairRDD[K, V] = DemoPairRDD(rdd)
  implicit def toRDD[T](rdd: RDDWrapper[T]): RDD[T] = rdd.rdd
}

They can then be used directly in spark-shell:

import RDDConversions._

yourRdd.grouped(5)

Section 2: Reduce Network Communication Overhead

Spark's shuffle is very resource-hungry. A shuffle means the compute nodes must first write their results to disk, and the next stage must read the previous stage's results back in. Writing and reading that data means disk I/O, and compared with in-memory operations disk I/O is very inefficient.

Use iostat to inspect disk I/O usage; frequent disk I/O usually comes with high CPU load.

If the data and the compute node sit on the same machine, network overhead can be avoided; otherwise the corresponding network cost has to be paid as well. Use iftop to check network bandwidth usage and see between which nodes large volumes of data are moving.
Figure 2 illustrates data transfer between Spark nodes: the compute function of a Spark task is sent from the Driver to the Executor over an Akka channel, while shuffle data is transferred through the Netty network layer. Because the parameter spark.akka.frameSize determines the largest message an Akka channel can carry, avoid introducing oversized local variables into a Spark task.


Figure 2: Data transfer between Spark nodes

Section 3: Choose an Appropriate Level of Parallelism

To make a Spark application efficient, raise CPU utilization as much as possible. The level of parallelism should be roughly twice the number of available physical CPU cores. If the parallelism is too low, the CPUs sit idle; if it is too high, the overhead of launching tasks grows, because Spark dispatches every task to a compute node.

To change the parallelism, set the configuration parameter spark.default.parallelism; for Spark SQL, adjust spark.sql.shuffle.partitions instead.

Section 4: Repartition vs. Coalesce

Both repartition and coalesce adjust the number of partitions dynamically, but note that repartition triggers a shuffle while coalesce does not.

Section 5: reduceByKey vs. groupBy

groupBy operations should be avoided as far as possible: first, they can generate a large amount of network traffic; second, they can lead to OOM. Using WordCount to show the difference between reduceByKey and groupBy:

reduceByKey:
    sc.textFile("README.md").map(l => l.split(",")).map(w => (w, 1)).reduceByKey(_ + _)


Figure 3: The shuffle performed by reduceByKey

This shuffle is illustrated in Figure 3 above.

groupByKey:
    sc.textFile("README.md").map(l => l.split(",")).map(w => (w, 1)).groupByKey.map(r => (r._1, r._2.sum))


Figure 4: The shuffle performed by groupByKey

Recommendation: use reduceByKey, aggregateByKey, foldByKey, and combineByKey whenever possible. Suppose we have the RDD below and want the mean value for each key:

val data = sc.parallelize( List((0, 2.), (0, 4.), (1, 0.), (1, 10.), (1, 20.)) )

Approach 1: reduceByKey

data.map(r => (r._1, (r._2, 1)))
    .reduceByKey((a, b) => (a._1 + b._1, a._2 + b._2))
    .map(r => (r._1, r._2._1 / r._2._2))
    .foreach(println)

Approach 2: combineByKey

data.combineByKey(value => (value, 1),
    (x: (Double, Int), value: Double) => (x._1 + value, x._2 + 1),
    (x: (Double, Int), y: (Double, Int)) => (x._1 + y._1, x._2 + y._2))

Section 6: BroadcastHashJoin vs. ShuffleHashJoin

Joins between a large table and a small table come up all the time. To improve efficiency you can use BroadcastHashJoin: broadcast the small table to every Executor in advance, which avoids shuffling the small table and greatly improves performance.

At the core of BroadcastHashJoin is the broadcast function; once the advantages of broadcast are clear, it is much easier to see where BroadcastHashJoin's advantage comes from.

Below is a simple example program using broadcast.

val lst = 1 to 100 toList
val exampleRDD = sc.makeRDD(1 to 20 toSeq, 2)
val broadcastLst = sc.broadcast(lst)
exampleRDD.filter(i => broadcastLst.value.contains(i)).collect.foreach(println)

Section 7: map vs. mapPartitions

Sometimes the computation results have to be written to an external database, which inevitably means opening connections to it. You should let as many elements as possible share one database connection instead of opening a connection per element.
In this situation, mapPartitions and foreachPartition are far more efficient than map.

Section 8: Data Locality

Moving computation is far cheaper than moving data.

Every task in Spark needs its input data, so the location of that input matters a great deal for task performance. Ordered from fastest to slowest data access, the levels are:

1.PROCESS_LOCAL
2.NODE_LOCAL
3.RACK_LOCAL

When executing a task, Spark prefers the fastest way of obtaining the data. If you want tasks to start on as many machines as possible, lower the value of spark.locality.wait; the default is 3s.

Besides HDFS, Spark supports more and more data sources, such as the well-known NoSQL databases Cassandra, HBase, and MongoDB. With the rise of Elasticsearch, combining Spark with Elasticsearch for fast query solutions has also become a worthwhile experiment.

All of these external data sources face the same question: how can Spark read their data quickly? Co-locating the compute nodes with the data nodes as much as possible is the basic way to achieve that; for example, when deploying a Hadoop cluster, the HDFS DataNode and the Spark Worker can share the same machine.

Take Cassandra as an example: if the Spark deployment only partially overlaps with the Cassandra machines, then when reading data from Cassandra, lowering spark.locality.wait lets Spark tasks start on the machines that do not host Cassandra.

For Cassandra, you can deploy a Spark Worker on every machine that runs Cassandra. Note that Cassandra's compaction consumes a lot of CPU, so take that into account when deciding how many CPU cores to assign to each Spark Worker.

The code logic for this part can be found in the source method TaskSetManager::addPendingTask:

private def addPendingTask(index: Int, readding: Boolean = false) {
  // Utility method that adds `index` to a list only if readding=false or it's not already there
  def addTo(list: ArrayBuffer[Int]) {
    if (!readding || !list.contains(index)) {
      list += index
    }
  }

  for (loc <- tasks(index).preferredLocations) {
    loc match {
      case e: ExecutorCacheTaskLocation =>
        addTo(pendingTasksForExecutor.getOrElseUpdate(e.executorId, new ArrayBuffer))
      case e: HDFSCacheTaskLocation => {
        val exe = sched.getExecutorsAliveOnHost(loc.host)
        exe match {
          case Some(set) => {
            for (e <- set) {
              addTo(pendingTasksForExecutor.getOrElseUpdate(e, new ArrayBuffer))
            }
            logInfo(s"Pending task $index has a cached location at ${e.host} " +
              ", where there are executors " + set.mkString(","))
          }
          case None => logDebug(s"Pending task $index has a cached location at ${e.host} " +
              ", but there are no executors alive there.")
        }
      }
      case _ => Unit
    }
    addTo(pendingTasksForHost.getOrElseUpdate(loc.host, new ArrayBuffer))
    for (rack <- sched.getRackForHost(loc.host)) {
      addTo(pendingTasksForRack.getOrElseUpdate(rack, new ArrayBuffer))
    }
  }

  if (tasks(index).preferredLocations == Nil) {
    addTo(pendingTasksWithNoPrefs)
  }

  if (!readding) {
    allPendingTasks += index  // No point scanning this whole list to find the old task there
  }
}

If you want Spark to support a new storage source and plan to develop a corresponding RDD, the location-related piece is the custom getPreferredLocations function. Taking EsRDD from elasticsearch-hadoop as an example, its implementation is as follows.

override def getPreferredLocations(split: Partition): Seq[String] = {
  val esSplit = split.asInstanceOf[EsPartition]
  val ip = esSplit.esPartition.nodeIp
  if (ip != null) Seq(ip) else Nil
}

Section 9: Serialization

A good serialization algorithm improves execution speed and reduces memory usage.

During a shuffle, Spark writes the data to disk, and what it stores has been serialized. Serialization involves two basic considerations: how fast it is, and how large the serialized output is.

Compared with the default javaSerializer, kryoSerializer has a large advantage in both serialization speed and the size of the serialized result, so it is recommended to use KryoSerializer in the application configuration:

spark.serializer  org.apache.spark.serializer.KryoSerializer

By default, cache does not serialize the cached objects; the StorageLevel used is MEMORY_ONLY, which means a fairly large amount of memory is consumed. You can serialize the cached content by passing the storage level to persist:

exampleRDD.persist(MEMORY_ONLY_SER)

Note in particular that persist only caches the data when the job actually runs (it is lazy), whereas unpersist takes effect immediately and the cache is dropped right away.

About the author: the author of "Apache Spark源码剖析" (Apache Spark Source Code Analysis) focuses on real-time search and real-time stream processing over big data, has studied elasticsearch, storm, and drools, and currently works at Ctrip.



SIMONE, 2016-03-02 14:12
Packaging Files from Outside the Project with Play Framework's dist
http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/02/26/429448.html
SIMONE (Fri, 26 Feb 2016 08:46:00 GMT)

Source: http://stackoverflow.com/questions/12231862/how-to-make-play-framework-dist-command-adding-some-files-folders-to-the-final

Play uses sbt-native-packager, which supports the inclusion of arbitrary files by adding them to the mappings:

mappings in Universal ++=
  (baseDirectory.value / "scripts" * "*" get) map
    (x => x -> ("scripts/" + x.getName))
The syntax assumes Play 2.2.x

val jdk8 = new File("D:\\JDK\\JDK8\\jre1_8_0_40")
mappings in Universal ++= (jdk8 ** "*" get) map (x => x -> ("jre8/" + jdk8.relativize(x).getOrElse(x.getName)))


SIMONE, 2016-02-26 16:46
Running Google's Open-Source word2vec on Chinese Data
http://www.dentisthealthcenter.com/wangxinsh55/archive/2016/01/13/429028.html
SIMONE (Wed, 13 Jan 2016 05:49:00 GMT)

Source: http://www.cnblogs.com/hebin/p/3507609.html

I had long heard that word2vec works very well for word-to-word similarity, so I recently ran Google's open-source code myself (https://code.google.com/p/word2vec/).

1. The Corpus

First, prepare the data. I used the full-web news corpus SogouCA recommended by several blog posts, about 2.1 GB.

Download the SogouCA.tar.gz package from the FTP server:
wget ftp://ftp.labs.sogou.com/Data/SogouCA/SogouCA.tar.gz --ftp-user=hebin_hit@foxmail.com --ftp-password=4FqLSYdNcrDXvNDi -r

Unpack the archive:

gzip -d SogouCA.tar.gz
tar -xvf SogouCA.tar

Then merge the generated txt files into SogouCA.txt, keep only the lines containing content, and convert the encoding to obtain the corpus corpus.txt, about 2.7 GB:

cat *.txt > SogouCA.txt
cat SogouCA.txt | iconv -f gbk -t utf-8 -c | grep "<content>" > corpus.txt

2. Word Segmentation

Segment corpus.txt with ANSJ to obtain the segmentation result resultbig.txt, about 3.1 GB.

In the seg_tool directory of the segmentation tool, compile and then run it to produce resultbig.txt, containing ?26221 distinct words and 572,308,385 tokens in total.
The segmentation result:
  
3. Training Word Vectors with the word2vec Tool
nohup ./word2vec -train resultbig.txt -output vectors.bin -cbow 0 -size 200 -window 5 -negative 0 -hs 1 -sample 1e-3 -threads 12 -binary 1 &

vectors.bin is the word-vector file word2vec produces from resultbig.txt; training it on the lab server took ? and a half hours.

4. Analysis
4.1 Computing similar words:
./distance vectors.bin

./distance can be thought of as computing the distance between words: each word is a point in the vector space, and distance measures how far apart two points are in that space.

Here are some examples:

4.2 Latent linguistic regularities

After modifying demo-analogy.sh, I obtained several examples like the following:
The capital of France is Paris and the capital of England is London: vector("法国") - vector("巴黎") + vector("英国") --> vector("伦敦") (France - Paris + England --> London)

4.3 Clustering

Cluster the words of the segmented corpus resultbig.txt and sort them by class:

nohup ./word2vec -train resultbig.txt -output classes.txt -cbow 0 -size 200 -window 5 -negative 0 -hs 1 -sample 1e-3 -threads 12 -classes 500 &
sort classes.txt -k 2 -n > classes_sorted_sogouca.txt

For example:

4.4 Phrase analysis

First derive sogouca_phrase.txt, which contains both words and phrases, from the segmented corpus resultbig.txt, then train vector representations for the words and phrases in that file:

./word2phrase -train resultbig.txt -output sogouca_phrase.txt -threshold 500 -debug 2
./word2vec -train sogouca_phrase.txt -output vectors_sogouca_phrase.bin -cbow 0 -size 300 -window 10 -negative 0 -hs 1 -sample 1e-3 -threads 12 -binary 1

Here are a few similarity-computation examples:

5. Reference links:

1. word2vec (Tool for computing continuous distributed representations of words): https://code.google.com/p/word2vec/

2. Playing with Google's open-source deep-learning project word2vec in Chinese: http://www.cnblogs.com/wowarsenal/p/3293586.html

3. Clustering keywords with word2vec: http://blog.csdn.net/zhaoxinfan/article/details/11069485

6. Papers to read carefully later:

[1] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of Workshop at ICLR, 2013.
[2] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013.
[3] Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic Regularities in Continuous Space Word Representations. In Proceedings of NAACL HLT, 2013.

[4] Collobert R, Weston J, Bottou L, et al. Natural language processing (almost) from scratch[J]. The Journal of Machine Learning Research, 2011, 12: 2493-2537.

 



SIMONE, 2016-01-13 13:49