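A while ago I wrote a post: http://eip.teamshub.com/t/3245814
At the end of it I mentioned a technique I called a data "beam splitter".
Today I'll explain what it actually is and how to use it.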
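Back then I had a habit of searching QQ for technical groups, especially ones about server technology. Most were full of idle chatter, so I would join, watch for a few days, and leave. A friend recommended me to a tcpcopy group (at the time you needed a referral to get in). After I joined, nobody said a word for six days. Just as I was about to give up, I noticed the group owner's GitHub link and took a look.
It seemed to be an open-source tool similar to a packet sniffer. What caught my interest was why a tool like this had earned several thousand stars.
So I chatted with the group owner and started studying how it was implemented, and eventually realized that it really is something remarkable.
I was lucky enough to have dinner with the group owner and learned much more about what the tool is used for. Many large companies use it, and it has proven effective for testing. I started using it in my own projects, with very good results.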
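So what is a data beam splitter?
It addresses the trust relationship along the "development - testing - operations" line.
Developers keep shipping new iterations. Traditional testing relies on test tools that cover a certain set of scenarios, but it can never cover every scenario. As a result, operations staff are on edge every time an update goes live, afraid that something will break. The cycle repeats, and everyone ends up worn out.
Real production traffic, on the other hand, is extremely varied and covers almost every situation, far beyond what traditional tests can reach.
So, is there a way to feed that huge volume of live traffic, unnoticed, into my newly iterated system, and in doing so test it against production traffic?
The answer is yes. We quietly make a copy of the live traffic (much like splitting a beam of light in optics; think of the data as light) and pass it to my updated server, which doubles as a stress test.
You may wonder what happens to the responses generated by the copied traffic. It is simple: on Linux, use IP route rules to build a "black hole" and let it swallow the unneeded data. Nothing online is affected, while the copied traffic is more than enough to verify whether your new server is stable. You can also use this to debug. Take a bug that can crash the server: since production cannot be stopped, we import the traffic into a test server and verify the bug there. This greatly eases the pressure on testing and operations, and gives programmers more confidence when releasing.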
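Let's look at a diagram to see how tcpcopy does it.
This comes from a real case: testing updates to the online ad recommendation algorithm on a NetEase platform. (The author works at NetEase.)
Because the recommendation algorithm is updated constantly, pushing ads that users do not want would have a large negative impact.
With tcpcopy, live traffic can be imported into the test server for each iteration to obtain the corresponding test results.
tcpcopy is not limited to HTTP; it supports any TCP-based traffic, such as data streams from hardware devices, video streams, and so on. Back then the author and I started building support for arbitrary TCP payload formats, and it worked.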
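Its implementation relies on low-level packet capture: it grabs packets directly off the link and rewrites the IP addressing in each packet (the destination address that pointed to the current server is changed to the test host's IP, and routing is adjusted so the packet is forwarded to the test host. This requires the test host to be on the same network segment as production.)
Production traffic is thus automatically split in two: one copy flows to the normal production environment, the other to the test server.
On the test server we have some work to do as well: cut off the return path. We do not want responses from the test server to interfere with production, so we must build a "black hole" that absorbs everything the test server sends back.
How do we build this black hole? We set a routing rule that sends the responses toward a different IP, where the data is simply discarded. Meanwhile, the upper TCP layer is told that the send "succeeded". This tricks the test server into believing the network is fine, when in fact the responses are dropped at a lower level because the destination IP does not exist. The test server just does not know it.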
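For a concrete example, see (http://www.cnblogs.com/tommyli/p/4239570.html)
The reason I am going over it again is that the steps in that article are in the wrong order. Following its order lets some data flow back into the production environment. The order matters, so I am stressing it here. Don't get burned.
Step one: create the data black hole first, so that data from the test host cannot flow back to production.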
route add -net 192.168.100.0 netmask 255.255.255.0 gw 10.53.132.52
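The data black hole is now in place.
Next, start intercept on 10.53.132.52, the gateway used in the route above:
intercept -i eth1 -F 'tcp and src port 8090' -d
Of course, if your test server is later repurposed for something else, be sure to restore the route; otherwise unsuspecting users will run into trouble sending data. Don't forget.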
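As a rough sketch of setting up and later cleaning up the black-hole route on the test server, the helper below only prints the commands so they can be checked before being run as root. The addresses are the ones used in this article. The variable and function names are made up for this illustration, and `route del` is simply the net-tools counterpart of the `route add` shown earlier.

```shell
#!/bin/sh
# Dry-run sketch: prints route commands instead of executing them.
# Addresses come from the article's example; adjust them for your network.
FAKE_CLIENT_NET="192.168.100.0"    # range the copied requests appear to come from
FAKE_CLIENT_MASK="255.255.255.0"
ASSISTANT_GW="10.53.132.52"        # gateway that swallows the responses

# Print the command that creates the black-hole route.
blackhole_add_cmd() {
    echo "route add -net $FAKE_CLIENT_NET netmask $FAKE_CLIENT_MASK gw $ASSISTANT_GW"
}

# Print the command that removes the route once testing is finished.
blackhole_del_cmd() {
    echo "route del -net $FAKE_CLIENT_NET netmask $FAKE_CLIENT_MASK gw $ASSISTANT_GW"
}

blackhole_add_cmd
blackhole_del_cmd
```

Once a printed line looks right, it can be run as root, and `route -n` can be used before and after to confirm that the entry appeared or disappeared.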
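Now that the black hole is set up, let's see how to configure the online server.
First, install tcpcopy on the online server. I recommend building it from source, because the build is configured and optimized for your environment.
Then run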
tcpcopy -x 8090-10.53.132.55:8090 -s 10.53.132.52 -c 192.168.100.x -n 5
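To make the pieces of that command easier to see, here is a small dry-run sketch that assembles it from named parts and prints it rather than running it. The values are the ones from this article, and the comments paraphrase tcpcopy's own usage text. The variable and function names are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints the tcpcopy command line for review.
SRC_PORT=8090                 # -x: port on the online server to copy from
TARGET="10.53.132.55:8090"    # -x: targetIP:targetPort of the test server
INTERCEPT="10.53.132.52"      # -s: intercept server list
CLIENT_NET="192.168.100.x"    # -c: rewrite client IPs into 192.168.100.0/24

# Any extra options (for example -n or -r) are appended unchanged.
tcpcopy_cmd() {
    base="tcpcopy -x ${SRC_PORT}-${TARGET} -s ${INTERCEPT} -c ${CLIENT_NET}"
    if [ $# -gt 0 ]; then
        echo "$base $*"
    else
        echo "$base"
    fi
}

tcpcopy_cmd -n 5     # prints the command used in this article (5x replication)
tcpcopy_cmd -r 20    # variant: transfer 20 percent of sessions
```

Note that the `-c` range is the same 192.168.100.0/24 network that the black-hole route on the test server covers, which is why responses to the copied requests never reach real clients.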
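Some of the parameters here need explaining.
Going straight to the source is quicker. In fact, tcpcopy supports more than simply forwarding a given port: rules can decide which production traffic is forwarded and which is not, so it truly acts as a "beam splitter". But that is a story for another time.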
printf("tcpcopy " VERSION "\n");
#if (!TC_PCAP_SND)
printf("-x <transfer,> use <transfer,> to specify the IPs and ports of the source and target\n"
" servers. Suppose 'sourceIP' and 'sourcePort' are the IP and port \n"
" number of the source server you want to copy from, 'targetIP' and \n");
printf(" 'targetPort' are the IP and port number of the target server you want\n"
" to send requests to, the format of <transfer,> could be as follows:\n"
" 'sourceIP:sourcePort-targetIP:targetPort,...'. Most of the time,\n");
printf(" sourceIP could be omitted and thus could also be:\n"
" 'sourcePort-targetIP:targetPort,...'. As seen, the IP address and the\n"
" port number are segmented by ':' (colon), the sourcePort and the\n");
printf(" targetIP are segmented by '-', and two 'transfer's are segmented by\n"
" ',' (comma). For example, './tcpcopy -x 80-192.168.0.2:18080' would\n"
" copy requests from port '80' on current server to the target port\n"
" '18080' of the target IP '192.168.0.2'.\n");
#else
printf("-x <transfer,> use <transfer,> to specify the IPs, ports and MAC addresses of\n"
" the source and target. The format of <transfer,> could be as follow:\n");
printf(" 'sourceIP:sourcePort@sourceMac-targetIP:targetPort@targetMac,...'.\n"
" Most of the time, sourceIP could be omitted and thus could\n"
" also be: sourcePort@sourceMac-targetIP:targetPort@targetMac,...'.\n");
printf(" Note that sourceMac is the MAC address of the interface where \n"
" packets are going out and targetMac is the next hop's MAC address.\n");
#endif
printf("-H <ip_addr> change the localhost IP address to the given IP address\n");
printf("-c <ip_addr,> change the client IP to one of IP addresses when sending to the\n"
" target server. For example,\n"
" './tcpcopy -x 8080-192.168.0.2:8080 -c 62.135.200.x' would copy\n"
" requests from port '8080' of current online server to the target port\n"
" '8080' of target server '192.168.0.2' and modify the client IP to be\n"
" one of net 62.135.200.0/24.\n");
#if (TC_OFFLINE)
printf("-i <file> set the pcap file used for tcpcopy to <file> (only valid for the\n"
" offline version of tcpcopy when it is configured to run at\n"
" enable-offline mode).\n");
printf("-a <num> accelerated times for offline replay\n");
printf("-I <num> set the threshold interval for offline replay acceleration\n"
" in millisecond.\n");
#endif
#if (TC_PCAP)
printf("-i <device,> The name of the interface to listen on. This is usually a driver\n"
" name followed by a unit number, for example eth0 for the first\n"
" Ethernet interface.\n");
printf("-F <filter> user filter (same as pcap filter)\n");
printf("-B <num> buffer size for pcap capture in megabytes(default 16M)\n");
printf("-S <snaplen> capture <snaplen> bytes per packet\n");
#endif
#if (TC_PCAP_SND)
printf("-o <device,> The name of the interface to send. This is usually a driver\n"
" name followed by a unit number, for example eth0 for the first\n"
" Ethernet interface.\n");
#endif
printf("-n <num> use <num> to set the replication times when you want to get a \n"
" copied data stream that is several times as large as the online data.\n"
" The maximum value allowed is 1023. As multiple copying is based on \n"
" port number modification, the ports may conflict with each other,\n");
printf(" in particular in intranet applications where there are few source IPs\n"
" and most connections are short. Thus, tcpcopy would perform better \n"
" when less copies are specified. For example, \n"
" './tcpcopy -x 80-192.168.0.2:8080 -n 3' would copy data flows from \n");
printf(" port 80 on the current server, generate data stream that is three\n"
" times as large as the source data, and send these requests to the\n"
" target port 8080 on '192.168.0.2'.\n");
printf("-f <num> use this parameter to control the port number modification process\n"
" and reduce port conflications when multiple tcpcopy instances are\n"
" running. The value of <num> should be different for different tcpcopy\n"
" instances. The maximum value allowed is 1023.\n");
printf("-m <num> set the maximum memory allowed to use for tcpcopy in megabytes, \n"
" to prevent tcpcopy occupying too much memory and influencing the\n"
" online system. When the memory exceeds this limit, tcpcopy would quit\n"
" automatically. The parameter is effective only when the kernel \n");
#if (TC_MILLION_SUPPORT)
printf(" version is 2.6.32 or above. The default value is 4096.\n");
#else
printf(" version is 2.6.32 or above. The default value is 1024.\n");
#endif
printf("-M <num> MTU value sent to backend (default 1500)\n");
printf("-D <num> MSS value sent back(default 1460)\n");
printf("-R <num> set default rtt value\n");
printf("-U <num> set user session pool size in kilobytes(default 1).\n"
" The maximum value allowed is 63.\n");
printf("-C <num> parallel connections between tcpcopy and intercept.\n"
" The maximum value allowed is 11(default 2 connections).\n");
printf("-s <server,> intercept server list\n"
" Format:\n"
" ip_addr1:port1, ip_addr2:port2, ...\n");
printf("-t <num> set the session timeout limit. If tcpcopy does not receive response\n"
" from the target server within the timeout limit, the session would \n"
" be dropped by tcpcopy. When the response from the target server is\n"
" slow or the application protocol is context based, the value should \n"
" be set larger. The default value is 120 seconds.\n");
printf("-k <num> set the session keepalive timeout limit.\n");
printf("-l <file> save the log information in <file>\n"
"-r <num> set the percentage of sessions transfered (integer range:1~100)\n"
"-p <num> set the target server listening port. The default value is 36524.\n");