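I had long wanted to collect Huomao's live stream sources, but on the official site any category you click has only a handful of rooms still broadcasting, which looks even bleaker than Zhanqi, so I dropped the idea. Today I came across a post analyzing how to capture the real stream addresses of this site's live rooms and learned a lot from it.
Huomao Live (火猫直播) is probably only familiar to people who watch Dota 2. Today I will try extracting its stream sources. If anything below is wrong, or if there is a better method, corrections from more experienced readers are very welcome. Thanks.
Huomao PC web live room page: http://www.huomao.com/4166
Huomao mobile web live room page: https://www.huomao.com/mobile/mob_live/4166
Debug the mobile web version in Chrome, since it loads less content. Press F12 in Chrome to open the developer tools, click "Toggle device toolbar", open a live room and watch the Network panel. It is easy to spot the requested media stream addresses, which look like this: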
http://live-js-hls.huomaotv.cn/live/3kmwyA_720/index.m3u8?t=1573929991&r=377152471701&stream=3kmwyA&rid=oubvc2y3v&token=73da9b8fa1bbfb6d5f45abf9e70f19ed&url=http%3A%2F%2Flive-js-hls.huomaotv.cn%2Flive%2F3kmwyA_720%2Findex.m3u8&from=huomaoh5room
http://play-tx-hls.huomaotv.cn/live/3kmwyA_720.m3u8?t=1573930254&r=102528199855&stream=3kmwyA&rid=oubvc2y3v&token=bb79ddbaeab0e90aa6f576ae806dc2bc&url=http%3A%2F%2Fplay-tx-hls.huomaotv.cn%2Flive%2F3kmwyA_720.m3u8&from=huomaoh5room
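First, search the page source for .m3u8 or .flv to see whether the address can be found directly. Sure enough, it is there!
At first I thought that was all, but the source obtained by requesting the live room page directly from code does not contain this player markup, so it is presumably loaded via JS.
So it is back to studying the media stream request. Refresh several times, watch how the parameters in the link change, and break the link down: t is a 10-digit Unix timestamp; stream is a fixed value per live room; rid does not change, and it is not yet clear what it is. The token value changes every time and is clearly a 32-character MD5 digest.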
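The comparison described here (capture the request a few times and see which parameters move) can be sketched in Python. This is only an illustration: the two URLs are the captures quoted earlier in this post, and the helper name split_params is made up for the example.

```python
from urllib.parse import urlsplit, parse_qs

# The two captured stream requests quoted earlier in this post.
CAPTURES = [
    "http://live-js-hls.huomaotv.cn/live/3kmwyA_720/index.m3u8"
    "?t=1573929991&r=377152471701&stream=3kmwyA&rid=oubvc2y3v"
    "&token=73da9b8fa1bbfb6d5f45abf9e70f19ed"
    "&url=http%3A%2F%2Flive-js-hls.huomaotv.cn%2Flive%2F3kmwyA_720%2Findex.m3u8"
    "&from=huomaoh5room",
    "http://play-tx-hls.huomaotv.cn/live/3kmwyA_720.m3u8"
    "?t=1573930254&r=102528199855&stream=3kmwyA&rid=oubvc2y3v"
    "&token=bb79ddbaeab0e90aa6f576ae806dc2bc"
    "&url=http%3A%2F%2Fplay-tx-hls.huomaotv.cn%2Flive%2F3kmwyA_720.m3u8"
    "&from=huomaoh5room",
]


def split_params(url):
    """Return the query string of url as a flat {name: value} dict."""
    return {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}


first, second = (split_params(u) for u in CAPTURES)

# Parameters whose value is identical in both captures vs. those that differ.
constant = sorted(k for k in first if first[k] == second.get(k))
varying = sorted(k for k in first if first[k] != second.get(k))

print("constant:", constant)
print("varying: ", varying)
print("t digits:", len(first["t"]), "token length:", len(first["token"]))
```

For these two captures, from, rid and stream stay the same while r, t, token and url differ. Note that the two captures come from different CDN hosts, which is why url differs as well.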
Trying playback in PotPlayer shows that the play link can be trimmed down to this form:
http://live-js-hls.huomaotv.cn/live/3kmwyA_720/index.m3u8?token=73da9b8fa1bbfb6d5f45abf9e70f19ed
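Trimming a captured link down to this token-only form can be done mechanically. A minimal sketch, assuming (as the PotPlayer test suggests) that token is the only query parameter the server needs; the function name minimal_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode


def minimal_url(full_url):
    """Keep scheme, host and path, and drop every query parameter except token."""
    parts = urlsplit(full_url)
    token = parse_qs(parts.query)["token"][0]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode({"token": token}), "")
    )


captured = (
    "http://live-js-hls.huomaotv.cn/live/3kmwyA_720/index.m3u8"
    "?t=1573929991&r=377152471701&stream=3kmwyA&rid=oubvc2y3v"
    "&token=73da9b8fa1bbfb6d5f45abf9e70f19ed&from=huomaoh5room"
)
print(minimal_url(captured))
```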
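So first search to see whether the token can be found.
The first hit for token does have a value, but it differs from our token, and judging by its file name and contents hm_stat.js looks like analytics code, so ignore it for now.
The second hit, mob_live.js, is loaded after the media stream address, so it can be ignored as well.
The only option is to go through the requests made before the media stream address one by one, starting with the XHR responses.
The response of the fourth request turns out to be exactly the playback addresses we want, with various bitrates and lines. It also contains roomStatus, the current broadcast state: 1 means live, 0 means offline.
All that remains is to simulate the request. Now look at the request method and parameters, again trying several live rooms and refreshing several times. This shows:
POST address: https://www.huomao.com/swf/live_data
Parameters: cdns is fixed at 1; streamtype is fixed at live; VideoIDS differs for every live room;
from is fixed at huomaoh5room; time is the current 10-digit timestamp; token is also a 32-character MD5 digest that changes on every refresh, so it is probably related to the time.
Next, find VideoIDS and token.
One step at a time. Since token is the same kind of value as above, set it aside for now. Searching for VideoIDS finds nothing, and neither do cdns or streamtype. Searching directly for the parameter value huomaoh5room also finds nothing. The only valuable thing found in the source is the variable stream = WI_CGoCcm_OcYxdYBkw. Isn't that exactly the VideoIDS we need? What is certain is that this variable must be used somewhere.
Open it and take a look: it sits inside a block of JS code, and the JS shows signs of obfuscation, so try deobfuscating it first.
Deobfuscated code: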
// Heiniao Blog guihet.com, reposted from 52PoJie
var stream = "D9/WEYPKn7LefjlqLFc";
var t;
var image = "https://static.huomao.com/upload/web/images/channel/2019/40/20191004150142uiaVP7Jp.jpg";
var _0xb483 = ["_decode", "http://www.sojson.com/javascriptobfuscator.html"];
(function(_0xd642x1) {
    _0xd642x1[_0xb483[0]] = _0xb483[1]
})(window);
var __Ox20469 = ["stream", "parse", "roomStatus", "err", "error", "length", "streamList", "default", "list_hls", "type", "HD",
    "<video poster=\"",
    "\" preload width=\"100%\" preload=\"metadata\" webkit-playsinline playsinline=\"true\" x5-playsinline x-webkit-airplay=\"true\" id=\"videolive\" controls name=\"media\" style=\"width: 100%;display: block;\"><source src=\"",
    "url", "\"></video>", "prepend", ".video-box", "init"];
function a() {
    var _0xd16fx2 = {};
    _0xd16fx2[__Ox20469[0]] = stream;
    aes[__Ox20469[17]](_0xd16fx2, t, function(_0xd16fx3) {
        var _0xd16fx4 = JSON[__Ox20469[1]](_0xd16fx3);
        var _0xd16fx5 = _0xd16fx4[__Ox20469[2]];
        if (_0xd16fx4[__Ox20469[3]] || _0xd16fx4[__Ox20469[4]]) {
            return false
        };
        for (var _0xd16fx6 = 0; _0xd16fx6 < _0xd16fx4[__Ox20469[6]][__Ox20469[5]]; _0xd16fx6++) {
            if (_0xd16fx4[__Ox20469[6]][_0xd16fx6][__Ox20469[7]]) {
                var _0xd16fx7 = _0xd16fx4[__Ox20469[6]][_0xd16fx6][__Ox20469[8]];
                for (var _0xd16fx8 = 0; _0xd16fx8 < _0xd16fx7[__Ox20469[5]]; _0xd16fx8++) {
                    if (_0xd16fx7[_0xd16fx8][__Ox20469[9]] == __Ox20469[10]) {
                        $(__Ox20469[16])[__Ox20469[15]](__Ox20469[11] + image + __Ox20469[12] + _0xd16fx7[_0xd16fx8][__Ox20469[13]] + __Ox20469[14]);
                        return
                    }
                }
            }
        }
    })
}
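So it was obfuscated with sojson. It is also clear that this code is what assembles the HTML inside the video-box element. Set a breakpoint to verify, at the final return, and refresh. Sure enough, the variable _0xd16fx3 is the data returned by the POST.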
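Read as plain logic, the callback above walks streamList, looks at entries flagged default, and uses the first list_hls item whose type is HD. The same selection can be sketched in Python. The sample response below is invented purely for illustration (only the field names come from the deobfuscated code; the "SD" type and the example.invalid URLs are made up), and pick_hd_hls is a hypothetical helper name.

```python
import json


def pick_hd_hls(raw_json):
    """Return the HD HLS url from a live_data style response, or None."""
    data = json.loads(raw_json)
    # The JS callback bails out when either error field is set.
    if data.get("err") or data.get("error"):
        return None
    for line in data.get("streamList", []):
        if not line.get("default"):
            continue  # only entries flagged as default are considered
        for item in line.get("list_hls", []):
            if item.get("type") == "HD":
                return item.get("url")
    return None


# Invented sample data; real responses may contain more fields.
SAMPLE = json.dumps({
    "roomStatus": "1",
    "streamList": [
        {"default": 0, "list_hls": [
            {"type": "HD", "url": "http://example.invalid/backup_720.m3u8"}]},
        {"default": 1, "list_hls": [
            {"type": "SD", "url": "http://example.invalid/main_480.m3u8"},
            {"type": "HD", "url": "http://example.invalid/main_720.m3u8"}]},
    ],
})

print(pick_hd_hls(SAMPLE))
```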
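Now move the breakpoint to just before function(_0xd16fx3) and single-step to see the detailed flow. Stepping jumps into sea.js, which at a glance is another large block of obfuscated code.
The deobfuscated code is as follows: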
// 黑鸟博客 guihet.com 转载自52PoJie var __encode = 'sojson.com', _0xb483 = ["_decode", "http://www.sojson.com/javascriptobfuscator.html"]; (function(_0xd642x1) { _0xd642x1[_0xb483[0]] = _0xb483[1] })(window); var __Ox1fa4d = ["length", "AES_ExpandKey: Only key lengths of 16, 24 or 32 bytes allowed!", "slice", "concat", "", "fromCharCode", "hmh5", "initdata", "encode", "POST", "stream", "huomaoh5room", "/CdnVerification/stream_refresh", "parse", "status", "success", "time", "data", "getTime", "url", "ajax", "live", "cid", "province", "country", "/swf/live_data"]; var aes = (function() { var _0x5c69x2; var _0x5c69x3; var _0x5c69x4; function _0x5c69x5() { _0x5c69x2 = new Array(256); for (var _0x5c69x6 = 0; _0x5c69x6 < 256; _0x5c69x6++) { _0x5c69x2[_0x5c69x13[_0x5c69x6]] = _0x5c69x6 }; _0x5c69x3 = new Array(16); for (var _0x5c69x6 = 0; _0x5c69x6 < 16; _0x5c69x6++) { _0x5c69x3[_0x5c69x14[_0x5c69x6]] = _0x5c69x6 }; _0x5c69x4 = new Array(256); for (var _0x5c69x6 = 0; _0x5c69x6 < 128; _0x5c69x6++) { _0x5c69x4[_0x5c69x6] = _0x5c69x6 << 1; _0x5c69x4[128 + _0x5c69x6] = (_0x5c69x6 << 1) ^ 0x1b } } function _0x5c69x7() { delete _0x5c69x2; delete _0x5c69x3; delete _0x5c69x4 } function _0x5c69x8(_0x5c69x9) { var _0x5c69xa = _0x5c69x9[__Ox1fa4d[0]], _0x5c69xb, _0x5c69xc = 1; switch (_0x5c69xa) { case 16: _0x5c69xb = 16 * (10 + 1); break; case 24: _0x5c69xb = 16 * (12 + 1); break; case 32: _0x5c69xb = 16 * (14 + 1); break; default: alert(__Ox1fa4d[1]) }; for (var _0x5c69x6 = _0x5c69xa; _0x5c69x6 < _0x5c69xb; _0x5c69x6 += 4) { var _0x5c69xd = _0x5c69x9[__Ox1fa4d[2]](_0x5c69x6 - 4, _0x5c69x6); if (_0x5c69x6 % _0x5c69xa == 0) { _0x5c69xd = new Array(_0x5c69x13[_0x5c69xd[1]] ^ _0x5c69xc, _0x5c69x13[_0x5c69xd[2]], _0x5c69x13[_0x5c69xd[3]], _0x5c69x13[_0x5c69xd[0]]); if ((_0x5c69xc <<= 1) >= 256) { _0x5c69xc ^= 0x11b } } else { if ((_0x5c69xa > 24) && (_0x5c69x6 % _0x5c69xa == 16)) { _0x5c69xd = new Array(_0x5c69x13[_0x5c69xd[0]], _0x5c69x13[_0x5c69xd[1]], _0x5c69x13[_0x5c69xd[2]], 
_0x5c69x13[_0x5c69xd[3]]) } }; for (var _0x5c69xe = 0; _0x5c69xe < 4; _0x5c69xe++) { _0x5c69x9[_0x5c69x6 + _0x5c69xe] = _0x5c69x9[_0x5c69x6 + _0x5c69xe - _0x5c69xa] ^ _0x5c69xd[_0x5c69xe] } } } function _0x5c69xf(_0x5c69x10, _0x5c69x9) { var _0x5c69x11 = _0x5c69x9[__Ox1fa4d[0]]; _0x5c69x18(_0x5c69x10, _0x5c69x9[__Ox1fa4d[2]](0, 16)); for (var _0x5c69x6 = 16; _0x5c69x6 < _0x5c69x11 - 16; _0x5c69x6 += 16) { _0x5c69x15(_0x5c69x10, _0x5c69x13); _0x5c69x1a(_0x5c69x10, _0x5c69x14); _0x5c69x1d(_0x5c69x10); _0x5c69x18(_0x5c69x10, _0x5c69x9[__Ox1fa4d[2]](_0x5c69x6, _0x5c69x6 + 16)) }; _0x5c69x15(_0x5c69x10, _0x5c69x13); _0x5c69x1a(_0x5c69x10, _0x5c69x14); _0x5c69x18(_0x5c69x10, _0x5c69x9[__Ox1fa4d[2]](_0x5c69x6, _0x5c69x11)) } function _0x5c69x12(_0x5c69x10, _0x5c69x9) { var _0x5c69x11 = _0x5c69x9[__Ox1fa4d[0]]; _0x5c69x18(_0x5c69x10, _0x5c69x9[__Ox1fa4d[2]](_0x5c69x11 - 16, _0x5c69x11)); _0x5c69x1a(_0x5c69x10, _0x5c69x3); _0x5c69x15(_0x5c69x10, _0x5c69x2); for (var _0x5c69x6 = _0x5c69x11 - 32; _0x5c69x6 >= 16; _0x5c69x6 -= 16) { _0x5c69x18(_0x5c69x10, _0x5c69x9[__Ox1fa4d[2]](_0x5c69x6, _0x5c69x6 + 16)); _0x5c69x22(_0x5c69x10); _0x5c69x1a(_0x5c69x10, _0x5c69x3); _0x5c69x15(_0x5c69x10, _0x5c69x2) }; _0x5c69x18(_0x5c69x10, _0x5c69x9[__Ox1fa4d[2]](0, 16)) } var _0x5c69x13 = new Array(99, 124, 119, 123, 242, 107, 111, 197, 48, 1, 103, 43, 254, 215, 171, 118, 202, 130, 201, 125, 250, 89, 71, 240, 173, 212, 162, 175, 156, 164, 114, 192, 183, 253, 147, 38, 54, 63, 247, 204, 52, 165, 229, 241, 113, 216, 49, 21, 4, 199, 35, 195, 24, 150, 5, 154, 7, 18, 128, 226, 235, 39, 178, 117, 9, 131, 44, 26, 27, 110, 90, 160, 82, 59, 214, 179, 41, 227, 47, 132, 83, 209, 0, 237, 32, 252, 177, 91, 106, 203, 190, 57, 74, 76, 88, 207, 208, 239, 170, 251, 67, 77, 51, 133, 69, 249, 2, 127, 80, 60, 159, 168, 81, 163, 64, 143, 146, 157, 56, 245, 188, 182, 218, 33, 16, 255, 243, 210, 205, 12, 19, 236, 95, 151, 68, 23, 196, 167, 126, 61, 100, 93, 25, 115, 96, 129, 79, 220, 34, 42, 144, 136, 70, 238, 184, 
20, 222, 94, 11, 219, 224, 50, 58, 10, 73, 6, 36, 92, 194, 211, 172, 98, 145, 149, 228, 121, 231, 200, 55, 109, 141, 213, 78, 169, 108, 86, 244, 234, 101, 122, 174, 8, 186, 120, 37, 46, 28, 166, 180, 198, 232, 221, 116, 31, 75, 189, 139, 138, 112, 62, 181, 102, 72, 3, 246, 14, 97, 53, 87, 185, 134, 193, 29, 158, 225, 248, 152, 17, 105, 217, 142, 148, 155, 30, 135, 233, 206, 85, 40, 223, 140, 161, 137, 13, 191, 230, 66, 104, 65, 153, 45, 15, 176, 84, 187, 22); var _0x5c69x14 = new Array(0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12, 1, 6, 11); function _0x5c69x15(_0x5c69x16, _0x5c69x17) { for (var _0x5c69x6 = 0; _0x5c69x6 < 16; _0x5c69x6++) { _0x5c69x16[_0x5c69x6] = _0x5c69x17[_0x5c69x16[_0x5c69x6]] } } function _0x5c69x18(_0x5c69x16, _0x5c69x19) { for (var _0x5c69x6 = 0; _0x5c69x6 < 16; _0x5c69x6++) { _0x5c69x16[_0x5c69x6] ^= _0x5c69x19[_0x5c69x6] } } function _0x5c69x1a(_0x5c69x16, _0x5c69x1b) { var _0x5c69x1c = new Array()[__Ox1fa4d[3]](_0x5c69x16); for (var _0x5c69x6 = 0; _0x5c69x6 < 16; _0x5c69x6++) { _0x5c69x16[_0x5c69x6] = _0x5c69x1c[_0x5c69x1b[_0x5c69x6]] } } function _0x5c69x1d(_0x5c69x16) { for (var _0x5c69x6 = 0; _0x5c69x6 < 16; _0x5c69x6 += 4) { var _0x5c69x1e = _0x5c69x16[_0x5c69x6 + 0], _0x5c69x1f = _0x5c69x16[_0x5c69x6 + 1]; var _0x5c69x20 = _0x5c69x16[_0x5c69x6 + 2], _0x5c69x21 = _0x5c69x16[_0x5c69x6 + 3]; var _0x5c69x1c = _0x5c69x1e ^ _0x5c69x1f ^ _0x5c69x20 ^ _0x5c69x21; _0x5c69x16[_0x5c69x6 + 0] ^= _0x5c69x1c ^ _0x5c69x4[_0x5c69x1e ^ _0x5c69x1f]; _0x5c69x16[_0x5c69x6 + 1] ^= _0x5c69x1c ^ _0x5c69x4[_0x5c69x1f ^ _0x5c69x20]; _0x5c69x16[_0x5c69x6 + 2] ^= _0x5c69x1c ^ _0x5c69x4[_0x5c69x20 ^ _0x5c69x21]; _0x5c69x16[_0x5c69x6 + 3] ^= _0x5c69x1c ^ _0x5c69x4[_0x5c69x21 ^ _0x5c69x1e] } } function _0x5c69x22(_0x5c69x16) { for (var _0x5c69x6 = 0; _0x5c69x6 < 16; _0x5c69x6 += 4) { var _0x5c69x1e = _0x5c69x16[_0x5c69x6 + 0], _0x5c69x1f = _0x5c69x16[_0x5c69x6 + 1]; var _0x5c69x20 = _0x5c69x16[_0x5c69x6 + 2], _0x5c69x21 = _0x5c69x16[_0x5c69x6 + 3]; var _0x5c69x1c = 
_0x5c69x1e ^ _0x5c69x1f ^ _0x5c69x20 ^ _0x5c69x21; var _0x5c69x23 = _0x5c69x4[_0x5c69x1c]; var _0x5c69x24 = _0x5c69x4[_0x5c69x4[_0x5c69x23 ^ _0x5c69x1e ^ _0x5c69x20]] ^ _0x5c69x1c; var _0x5c69x25 = _0x5c69x4[_0x5c69x4[_0x5c69x23 ^ _0x5c69x1f ^ _0x5c69x21]] ^ _0x5c69x1c; _0x5c69x16[_0x5c69x6 + 0] ^= _0x5c69x24 ^ _0x5c69x4[_0x5c69x1e ^ _0x5c69x1f]; _0x5c69x16[_0x5c69x6 + 1] ^= _0x5c69x25 ^ _0x5c69x4[_0x5c69x1f ^ _0x5c69x20]; _0x5c69x16[_0x5c69x6 + 2] ^= _0x5c69x24 ^ _0x5c69x4[_0x5c69x20 ^ _0x5c69x21]; _0x5c69x16[_0x5c69x6 + 3] ^= _0x5c69x25 ^ _0x5c69x4[_0x5c69x21 ^ _0x5c69x1e] } } var _0x5c69x26 = new Array(32); var _0x5c69x27 = []; function _0x5c69x28() { var _0x5c69x29 = [232, 48, 164, 196, 9, 249, 30, 45, 33, 85, 62, 235, 143, 78, 88, 15, 57, 48, 69, 50, 52, 51, 69, 66, 49, 65, 70, 55, 51, 54, 56, 53]; _0x5c69x5(); _0x5c69x26[__Ox1fa4d[0]] = 32; for (var _0x5c69x6 = 0; _0x5c69x6 < 32; _0x5c69x6++) { _0x5c69x26[_0x5c69x6] = _0x5c69x6 }; _0x5c69x8(_0x5c69x26); _0x5c69xf(_0x5c69x29, _0x5c69x26); _0x5c69x27 = _0x5c69x29 } var _0x5c69x2a = __Ox1fa4d[4]; function _0x5c69x2b() { if (!_0x5c69x2e) { _0x5c69x2e = true; var _0x5c69x2c = _0x5c69x27; _0x5c69x12(_0x5c69x2c, _0x5c69x26); _0x5c69x12(_0x5c69x2c, _0x5c69x26); var _0x5c69x2d = __Ox1fa4d[4]; for (var _0x5c69x6 = 0; _0x5c69x6 < _0x5c69x2c[__Ox1fa4d[0]]; _0x5c69x6++) { _0x5c69x2d += String[__Ox1fa4d[5]](_0x5c69x2c[_0x5c69x6]) }; _0x5c69x2a = _0x5c69x2d }; _0x5c69x2f(); return _0x5c69x2a } var _0x5c69x2e = false; function _0x5c69x2f() { _0x5c69x7() } _0x5c69x28(); function _0x5c69x30(_0x5c69x31, _0x5c69x32, _0x5c69x33) { if (window[__Ox1fa4d[6]] && window[__Ox1fa4d[6]][__Ox1fa4d[7]]) { _0x5c69x31 = Base64[__Ox1fa4d[8]](_0x5c69x31); $[__Ox1fa4d[20]]({ type: __Ox1fa4d[9], data: { "VideoIDS": window[__Ox1fa4d[6]][__Ox1fa4d[7]][__Ox1fa4d[10]], "url": _0x5c69x31, "time": _0x5c69x32, "from": __Ox1fa4d[11], "token": md5(window[__Ox1fa4d[6]][__Ox1fa4d[7]][__Ox1fa4d[10]] + __Ox1fa4d[11] + _0x5c69x31 + _0x5c69x32 + _0x5c69x2b()) 
}, url: __Ox1fa4d[12], success: function(_0x5c69x34) { var _0x5c69x35; try { _0x5c69x35 = JSON[__Ox1fa4d[13]](_0x5c69x34) } catch (error) {}; if (_0x5c69x35) { if (_0x5c69x35[__Ox1fa4d[14]] == __Ox1fa4d[15]) { _0x5c69x33(_0x5c69x35[__Ox1fa4d[17]][__Ox1fa4d[16]], new Date()[__Ox1fa4d[18]](), _0x5c69x35[__Ox1fa4d[17]][__Ox1fa4d[19]]) } } } }) } } function _0x5c69x36(_0x5c69x37, _0x5c69x32, _0x5c69x33) { var _0x5c69x38 = md5(_0x5c69x37[__Ox1fa4d[10]] + __Ox1fa4d[11] + _0x5c69x32 + _0x5c69x2b()); $[__Ox1fa4d[20]]({ type: __Ox1fa4d[9], data: { "cdns": 1, "streamtype": __Ox1fa4d[21], "cid": _0x5c69x37[__Ox1fa4d[22]], "VideoIDS": _0x5c69x37[__Ox1fa4d[10]], "district": _0x5c69x37[__Ox1fa4d[23]], "country": _0x5c69x37[__Ox1fa4d[24]], "from": __Ox1fa4d[11], "time": _0x5c69x32, "token": _0x5c69x38 }, url: __Ox1fa4d[25], success: function(_0x5c69x34) { _0x5c69x33(_0x5c69x34) } }) } return { "refresh": _0x5c69x30, "init": _0x5c69x36 } })()
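The last function, _0x5c69x36, looks very familiar: it contains the parameters we need to POST. Note the code that generates the token value: "token": _0x5c69x38, where var _0x5c69x38 = md5(_0x5c69x37[__Ox1fa4d[10]] + __Ox1fa4d[11] + _0x5c69x32 + _0x5c69x2b()); Continue with breakpoints in the same way:
It is now obvious: token is the MD5 hash of the concatenation of the VideoIDS value, 'huomaoh5room', the current timestamp, and the return value of _0x5c69x2b(). After debugging many times, the return value of _0x5c69x2b() turns out to be fixed at "6FE26D855E1AEAE090E243EB1AF73685". With that, everything is solved.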
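Two things in this step are easy to check offline. First, the last 16 entries of the byte array inside _0x5c69x28 are plain ASCII codes. As far as I can tell from the deobfuscated code, the AES block routines only loop over the first 16 bytes, so the trailing bytes should pass through unchanged, and they do match the tail of the fixed value. Second, the token formula itself is a one-liner. The sketch below assumes the formula exactly as described in this post; the stream id and timestamp are sample values taken from earlier in the post, and make_token is a hypothetical name.

```python
import hashlib

# Fixed return value of _0x5c69x2b() reported in this post.
KEY = "6FE26D855E1AEAE090E243EB1AF73685"

# Byte array copied from _0x5c69x28 in the deobfuscated sea.js code.
SEED = [232, 48, 164, 196, 9, 249, 30, 45, 33, 85, 62, 235, 143, 78, 88, 15,
        57, 48, 69, 50, 52, 51, 69, 66, 49, 65, 70, 55, 51, 54, 56, 53]

# The second half of the array decodes directly to printable characters.
tail = "".join(chr(b) for b in SEED[16:])
print(tail, tail == KEY[16:])


def make_token(video_ids, timestamp, key=KEY):
    """md5(VideoIDS + 'huomaoh5room' + timestamp + key), as a hex digest."""
    raw = f"{video_ids}huomaoh5room{timestamp}{key}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


# Sample inputs only: the stream value and timestamp seen earlier in the post.
token = make_token("WI_CGoCcm_OcYxdYBkw", 1573929991)
print(token, len(token))
```

The first 16 entries of the array are not printable ASCII, which is consistent with the first half of the key being produced by the AES routines at runtime; that part is not reproduced here.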
Python implementation attached:
import hashlib
import re
import time

import requests

# Heiniao Blog - guihet.com, reposted from 52PoJie

# Fixed return value of _0x5c69x2b() observed while debugging.
SECRET = '6FE26D855E1AEAE090E243EB1AF73685'


def get_time():
    # 10-digit Unix timestamp in seconds, matching the captured request.
    return str(int(time.time()))


def get_videoids(rid):
    room_url = 'https://www.huomao.com/mobile/mob_live/' + str(rid)
    response = requests.get(url=room_url).text
    # The page embeds the stream id as: var stream = "...";
    match = re.search(r'var stream = "([\w\W]+?)";', response)
    return match.group(1) if match else None


def get_token(videoids, ts):
    # token = md5(VideoIDS + 'huomaoh5room' + time + fixed key)
    raw = str(videoids) + 'huomaoh5room' + str(ts) + SECRET
    return hashlib.md5(raw.encode('utf-8')).hexdigest()


def get_real_url(rid):
    videoids = get_videoids(rid)
    if not videoids:
        return 'Live room does not exist'
    ts = get_time()
    post_data = {
        'cdns': 1,
        'streamtype': 'live',
        'VideoIDS': videoids,
        'from': 'huomaoh5room',
        'time': ts,
        'token': get_token(videoids, ts),
    }
    response = requests.post(url='https://www.huomao.com/swf/live_data', data=post_data).json()
    # roomStatus: 1 = live, 0 = offline (compared as a string).
    if str(response.get('roomStatus', 0)) == '1':
        return response.get('streamList')[0].get('list')[0]
    return 'Live room is not broadcasting'


if __name__ == '__main__':
    rid = input('Enter the Huomao live room number:\n')
    real_url = get_real_url(rid)
    print('The source address of this live room is:\n')
    print(real_url)
Analysis of the real video source addresses of Huomao live rooms - Heiniao Blog
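When reposting, it is best to include a link to the original post as well.
Is there a QQ group? I'd like to join one. Why was the previous QQ group disbanded?
Great technical post?!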