While slacking off and browsing V2EX today, one post caught my attention.
The post read:
The video link looks like this after encryption: lxxt6jIID2Byq541xEB6F3u71bYaE5A/A-1dMFS4o9mx8uzpm81KxH25u1E29:Cl7Wg|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_:hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_/hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_\hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_.hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7__hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_AhQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7bhQW5e|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7ChQW5e|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7dhQW5e The site link is here: www.tvsky.tv/Industry/Sh… What kind of encryption is this? Please help.
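Being a helpful young man, of course I had to take a look for the original poster 😳
Opening the site shows a page that loads and plays video through a Flash player, and the parameter passed to the player is encrypted, as the post says.
The parameters passed to the player: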
flvurl=lxxt6jIID2Byq541xEB6F3u71bYaE5A/A-1dMFS4o9mx8uzpm81KxH25u1E29:Cl7Wg|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_:hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_/hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_\hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_.hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7__hQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_AhQ5Ue|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7bhQW5e|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7ChQW5e|lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7dhQW5e&isautoplay=1&adswf=
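As a small aside, the flvurl value can be separated from the other player parameters before any decoding is attempted. The sketch below is only illustrative: the helper name `extract_flvurl` is made up here, the sample is shortened to the first two segments from the post, and the parameters are split by hand rather than with a URL parser so that no percent-decoding touches the ciphertext.

```python
def extract_flvurl(flashvars: str) -> list[str]:
    """Return the '|'-separated encrypted segments of the flvurl parameter."""
    for pair in flashvars.split("&"):
        name, _, value = pair.partition("=")
        if name == "flvurl":
            return value.split("|")
    return []


# Shortened sample: only the first two segments of the real parameter string.
sample = (
    "flvurl=lxxt6jIID2Byq541xEB6F3u71bYaE5A/A-1dMFS4o9mx8uzpm81KxH25u1E29:Cl7Wg|"
    "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_:hQ5Ue"
    "&isautoplay=1&adswf="
)
segments = extract_flvurl(sample)
print(len(segments), segments[0][:10])
```

Each element of `segments` is one encrypted URL, which is the unit the player decodes on its own.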
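Capturing the network traffic reveals a request for a .flv file, which should be the video the player loads.
A global search for fragments of that URL finds nothing, so the player most likely decrypts the incoming flvurl parameter into this URL and then loads the video.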
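So how do we recover the URL decryption routine in a case like this?
First we need to reverse-engineer the Flash player on the page, just as we would reverse the JavaScript when an HTML5 video site uses encrypted parameters.
However, a Flash player is a compiled .swf file, so unlike JavaScript we cannot read the code directly and must decompile it first.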
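Time to bring out JPEXS, which is available on GitHub: https://github.com/jindrapetrik/jpexs-decompiler/releases
After downloading, launch it to bring up the main window.
The default language is English; it can be switched to Chinese under Settings > Change language.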
Next, download the player's .swf file and open it in JPEXS.
The player file's address can be seen in the source page's HTML:
http://www.tvsky.tv/FlvPlay/Playerx.swf
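Before handing the download to a decompiler, it can be worth confirming that it really is a SWF file. A minimal sketch, assuming the standard 8-byte SWF header (a three-letter signature, one version byte, and a little-endian 32-bit length); `swf_info` is a name invented for this example, and the demo bytes are synthetic rather than taken from Playerx.swf.

```python
import struct
import zlib


def swf_info(data: bytes) -> dict:
    """Parse the 8-byte SWF header and report how the body is stored."""
    if len(data) < 8:
        raise ValueError("too short to be a SWF file")
    signature = data[:3]
    if signature not in (b"FWS", b"CWS", b"ZWS"):
        raise ValueError("not a SWF file")
    version = data[3]
    # Bytes 4-7: length of the uncompressed file, little-endian unsigned 32-bit.
    (length,) = struct.unpack("<I", data[4:8])
    body = None
    if signature == b"FWS":      # stored uncompressed
        body = data[8:]
    elif signature == b"CWS":    # zlib-compressed after the header
        body = zlib.decompress(data[8:])
    # ZWS (LZMA) bodies are left undecoded in this sketch.
    return {
        "signature": signature.decode("ascii"),
        "version": version,
        "length": length,
        "body": body,
    }


# Synthetic stand-in for a downloaded file: a CWS header plus a compressed dummy body.
payload = b"\x00" * 16
fake = b"CWS" + bytes([8]) + struct.pack("<I", 8 + len(payload)) + zlib.compress(payload)
info = swf_info(fake)
print(info["signature"], info["version"], info["length"])
```

For a real file, the bytes would come from something like `open("Playerx.swf", "rb").read()`, where the local filename is an assumption.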
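There are two quick ways to locate the code that likely performs the decryption.
Method 1:
After opening the file, find the DoAction script of frame1 under the scripts group.
Clicking it makes the right-hand pane decompile the script and show both the AS source and the P-code (similar to assembly); we only need the AS source.
While loading, the player on the web page displays the text 「正在加載Flv文件」 ("Loading Flv file"), so press Ctrl+F and search for it.
This leads to the init function.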
Method 2:
Open any script, press Ctrl+Shift+F to open the global search, and search for the same 「正在加載Flv文件」 text.
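The same search can also be done outside the GUI once the scripts are on disk. The sketch below assumes the decompiled scripts have been exported as .as text files into some folder; the function name `find_in_scripts` and the temporary stand-in file are made up for illustration and are not the player's real script.

```python
import tempfile
from pathlib import Path


def find_in_scripts(root: str, needle: str) -> list[tuple[str, int]]:
    """Return (file, line number) pairs for every exported script line containing needle."""
    hits = []
    for path in sorted(Path(root).rglob("*.as")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), number))
    return hits


# Demo on a made-up stand-in file, not the real decompiled script.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "frame_1" / "DoAction.as"
    sample.parent.mkdir()
    sample.write_text(
        'function init()\n{\n    tip = "正在加載Flv文件";\n}\n', encoding="utf-8"
    )
    hits = find_in_scripts(tmp, "正在加載Flv文件")
    print([(Path(name).name, number) for name, number in hits])
```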
With the video-loading code located, the init function shows that _loc2_ is the flvurl passed into the player,
so the following part is its decryption:
// init section:
_flvurl = _loc2_.split("|");
var _loc1_ = 0;
while(_loc1_ < _flvurl.length)
{
    _flvurl[_loc1_] = Pass2Str(_flvurl[_loc1_]);
    _loc1_ = _loc1_ + 1;
}

var PwdStr = "AbCdEfGhIjKlMnOpQrStUvWxYzaBcDeFgHiJkLmNoPqRsTuVwXyZ1234509876-_.\\/:";
var PwdStrRan = "12345678987654321";
var _PwdLen = 4;
var _PwdAddLen = 4;

function Pass2Str(Str)
{
    var _loc2_ = "";
    var _loc3_ = "";
    var _loc4_ = 0;
    var _loc1_ = 1;
    while(_loc1_ <= Str.length)
    {
        _loc2_ = Str.substr(_loc1_,1);
        if(_loc1_ % (_PwdLen + 1) != 0)
        {
            _loc3_ = _loc3_ + NumS(_loc2_,_loc4_);
        }
        else
        {
            _loc4_ = parseInt(_loc2_);
        }
        _loc1_ = _loc1_ + 1;
    }
    return _loc3_;
}

function NumS(s, _PwdAddLen1)
{
    var _loc1_ = PwdStr.indexOf(s);
    _loc1_ = _loc1_ - (_PwdAddLen + _PwdAddLen1 - 1);
    if(_loc1_ <= 0)
    {
        return PwdStr.substr(_loc1_ + PwdStr.length,1);
    }
    return PwdStr.substr(_loc1_,1);
}
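Reading the decompiled routine together with the Python port later in this post, each encrypted string appears to be laid out in blocks of five characters: four data characters followed by one digit, with that digit setting the shift for the next four. The sketch below only splits a string into such blocks to make the layout visible. `split_blocks` is a name invented for illustration, and counting positions from 1 is an assumption about how the player's `substr` behaves.

```python
def split_blocks(cipher: str, group: int = 4) -> list[tuple[int, str]]:
    """Split cipher into (key, data) pairs; the first group uses key 0."""
    blocks = []
    key = 0
    step = group + 1
    for start in range(0, len(cipher), step):
        chunk = cipher[start:start + step]
        blocks.append((key, chunk[:group]))
        # The character after each full group is the key digit for the next group.
        if len(chunk) == step:
            key = int(chunk[group]) if chunk[group].isdigit() else 0
    return blocks


print(split_blocks("lxxt6jIID2Byq541xEB"))
# [(0, 'lxxt'), (6, 'jIID'), (2, 'Byq5'), (4, '1xEB')]
```

Seen this way, `_PwdLen` is the group size and the digit between groups is what `Pass2Str` stores in `_loc4_`.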
Next, rewrite the URL-decryption part of the decompiled ActionScript as Python:
# Video URL decryption for http://www.tvsky.tv/Industry/Show/278/33875/
# To make it easy to compare against the AS code, this is a direct "translation" of the
# decompiled AS and does not use more concise Python idioms
_pwd_len = 4
_pwd_add_len = 4
pwd_str = "AbCdEfGhIjKlMnOpQrStUvWxYzaBcDeFgHiJkLmNoPqRsTuVwXyZ1234509876-_.\\/:"


def decode(flv_url: str):
    """
    function init()
    {
        ......
        var _loc2_ = flvurl;
        ......
        _flvurl = _loc2_.split("|");
        var _loc1_ = 0;
        while(_loc1_ < _flvurl.length)
        {
            _flvurl[_loc1_] = Pass2Str(_flvurl[_loc1_]);
            _loc1_ = _loc1_ + 1;
        }
        ......
    }

    :param flv_url: value of the flvurl entry in the Flash parameters
    :return: list of decrypted video URLs
    """
    new_flv_url = flv_url.split("|")
    _loc1_ = 0
    while _loc1_ < len(new_flv_url):
        new_flv_url[_loc1_] = pass2str(new_flv_url[_loc1_])
        _loc1_ += 1
    return new_flv_url


def pass2str(str_: str):
    """
    function Pass2Str(Str)
    {
        var _loc2_ = "";
        var _loc3_ = "";
        var _loc4_ = 0;
        var _loc1_ = 1;
        while(_loc1_ <= Str.length)
        {
            _loc2_ = Str.substr(_loc1_,1);
            if(_loc1_ % (_PwdLen + 1) != 0)
            {
                _loc3_ = _loc3_ + NumS(_loc2_,_loc4_);
            }
            else
            {
                _loc4_ = parseInt(_loc2_);
            }
            _loc1_ = _loc1_ + 1;
        }
        return _loc3_;
    }

    :param str_: encrypted URL string
    :return: decrypted URL string
    """
    _loc1_ = 1
    _loc3_ = ""
    _loc4_ = 0
    while _loc1_ <= len(str_):
        # _loc1_ is treated as a 1-based position here, hence the - 1
        _loc2_ = str_[_loc1_ - 1]
        if _loc1_ % (_pwd_len + 1) != 0:
            _loc3_ = _loc3_ + num_s(_loc2_, _loc4_)
        else:
            # every fifth character is the key digit for the following group
            _loc4_ = int(_loc2_) if _loc2_.isdigit() else 0
        _loc1_ = _loc1_ + 1
    return _loc3_


def num_s(s, _pwd_add_len1):
    """
    function NumS(s, _PwdAddLen1)
    {
        var _loc1_ = PwdStr.indexOf(s);
        _loc1_ = _loc1_ - (_PwdAddLen + _PwdAddLen1 - 1);
        if(_loc1_ <= 0)
        {
            return PwdStr.substr(_loc1_ + PwdStr.length,1);
        }
        return PwdStr.substr(_loc1_,1);
    }
    """
    _loc1_ = pwd_str.index(s)
    _loc1_ = _loc1_ - (_pwd_add_len + _pwd_add_len1 - 1)
    if _loc1_ <= 0:
        # wrap around to the end of the alphabet
        return pwd_str[_loc1_ + len(pwd_str) - 1]
    return pwd_str[_loc1_ - 1]


if __name__ == '__main__':
    # the literal backslash in the fourth segment is escaped as \\
    url_list = decode(
        "lxxt6jIID2Byq541xEB6F3u71bYaE5A/A-1dMFS4o9mx8uzpm81KxH25u1E29:Cl7Wg|"
        "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_:hQ5Ue|"
        "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_/hQ5Ue|"
        "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_\\hQ5Ue|"
        "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_.hQ5Ue|"
        "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7__hQ5Ue|"
        "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7_AhQ5Ue|"
        "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7bhQW5e|"
        "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7ChQW5e|"
        "lxxt4hGGB6F3u763zGD9i0X_4EBDh7CAC.6Irkx6q7oz7TYOL2uErB25u1E7dhQW5e"
    )
    print(url_list)
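Since the port above deliberately mirrors the AS code, it may help to see the same logic condensed. The two branches in num_s both reduce to a single modular index, (index - key - 4) mod 68 for this 68-character alphabet, so each group is effectively a Caesar-style shift of 4 plus its key digit. This is a sketch derived from the code in this post rather than anything taken from the player; `ALPHABET`, `BLOCK`, `BASE_SHIFT` and `decode_segment` are names chosen here.

```python
ALPHABET = "AbCdEfGhIjKlMnOpQrStUvWxYzaBcDeFgHiJkLmNoPqRsTuVwXyZ1234509876-_.\\/:"
BLOCK = 5        # four data characters plus one key digit
BASE_SHIFT = 4   # corresponds to _PwdAddLen in the decompiled code


def decode_segment(cipher: str) -> str:
    """Decode one '|'-separated segment of the flvurl parameter."""
    out = []
    key = 0
    for pos, ch in enumerate(cipher, start=1):
        if pos % BLOCK == 0:
            # Every fifth character sets the key for the following group.
            key = int(ch) if ch.isdigit() else 0
        else:
            index = (ALPHABET.index(ch) - key - BASE_SHIFT) % len(ALPHABET)
            out.append(ALPHABET[index])
    return "".join(out)


print(decode_segment("lxxt6jIID2Byq54"))
# http://tvsky
```

A full parameter could then be decoded with `[decode_segment(part) for part in flvurl.split("|")]`. A character outside the alphabet makes `ALPHABET.index` raise `ValueError`, which this sketch leaves unhandled.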
Run it and check the output.
BOOM!
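If this article helped you, please give it a big thumbs up, thanks~~ You are welcome to follow my Zhihu account loco_z and my Zhihu column 《手把手教你寫爬蟲》. I post crawler-related tips and tricks from time to time, which may give you some ideas.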