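Elegantly Finding Website Source Code

0x0 Preface

If you can get hold of a website's source code during a penetration test, you are effectively working with a god's-eye view. Plenty of people have described the idea of using search engines to find sites running the same system and then bulk-scanning them for backups, but nobody seems to have shared the concrete process. Here I have written up the ideas I tried while developing the directory-scanning module of my distributed scanner, along with some other ways of finding source code, in the hope that readers get something new out of it.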

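0x1 Search Techniques

0x1.1 Code Hosting Platforms

GitHub (international) and Gitee (China) are third-party code hosting platforms. With a few search techniques we can find a lot of leaked sensitive information on them, including the source code of some applications.

I rarely use Gitee, so I only mention it in passing. Below I focus on how to use GitHub:

For me, the biggest benefit of learning this search syntax is that when a query returns a large amount of data, I can filter out the junk based on specific characteristics.

GitHub search page: github.com/search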

(1) Quick cheat sheet

Basic queries:

Searching repositories:

Searching code:

Searching users:

(2) Personal dork queries

    filename:config.php dbpasswd
    filename:.bashrc password
    shodan_api_key language:python
    path:sites databases password
    "baidu.com" ssh language:yaml
    filename:file.php admin in:path
    org:companyname "AWS_ACCESS_KEY_ID:"

(3) Searching for a specific keyword

Wrap the keyword in double quotes, e.g. "qq.com"
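As an illustration of combining a quoted keyword with the dorks listed earlier, the following minimal Python sketch builds GitHub code-search URLs that can be opened in a browser. The helper name `build_github_search_urls` and the sample dork list are hypothetical and not part of any tool mentioned in this article.

```python
from urllib.parse import quote_plus

# A few of the dorks from the list above; extend as needed.
SAMPLE_DORKS = (
    "filename:config.php dbpasswd",
    "filename:.bashrc password",
    "ssh language:yaml",
)


def build_github_search_urls(keyword, dorks=SAMPLE_DORKS):
    """Return GitHub code-search URLs combining a quoted keyword with each dork."""
    urls = []
    for dork in dorks:
        query = f'"{keyword}" {dork}'  # the keyword is wrapped in double quotes
        urls.append(f"https://github.com/search?q={quote_plus(query)}&type=code")
    return urls


if __name__ == "__main__":
    for url in build_github_search_urls("qq.com"):
        print(url)
```

Keeping the dorks in a plain sequence makes it easy to add or remove entries when a query returns too much noise.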

(4) GitDorker can be used with custom dorks to automate the queries.

    git clone https://github.com/obheda12/GitDorker.git
    cd GitDorker
    docker build -t gitdorker .
    docker run -it gitdorker
    docker run -it -v $(pwd)/tf:/tf gitdorker -tf tf/TOKENSFILE -q tesla.com -d dorks/DORKFILE -o tesla
    docker run -it -v $(pwd)/tf:/tf xshuden/gitdorker -tf tf/TOKENSFILE -q tesla.com -d dorks/DORKFILE -o tesla

Usage without installation:

python3 GitDorker.py -tf ./TF/TOKENSFILE -q ximalaya.com -d ./Dorks/alldorksv3 -o xmly

References:

github.com/techgaun/gi…

infosecwriteups.com/github-dork…

0x1.2 Search Engines

Google:

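    XX源碼 (source code)
    XX完整包 (full package)
    xx安裝程序 (installer)
    xx備份 (backup)
    xx代碼 (code)
    xx開源 (open source)
    xx源程序 (source program)
    xx框架 (framework)
    xx ext:rar | ext:tar.gz | ext:zip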

0x1.3 Cloud Drive Search

www.feifeipan.com/

www.dalipan.com/

www.chaonengsou.com/ This site aggregates several search engines and is fairly comprehensive.

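0x2 An Indirect Approach

If the methods in 0x1 still fail to turn up the source code, the target system is probably a niche or commercial product that has not circulated widely on the internet, which is why searching finds nothing.

At this point we can take an indirect approach: look for backup files and source code archives under the target site's web root and download them. If that still finds nothing, look for other sites running the same system and scan their directories for backup files and source archives, obtaining the system's source code that way.

We should not be giants in thought but dwarfs in action. So how do we carry out this process efficiently? It can be broken down into the following steps.

0x2.1 Extracting Features

For features, focus on the homepage, i.e. the page shown when visiting the domain directly, because the homepage is the page search-engine crawlers are most likely to index. Next in priority are the features of other distinctive pages reachable from the homepage.

(1) Logo features

Request favicon.ico and compute its hash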
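The article does not say which hash is used. The sketch below assumes the convention commonly used for favicon searches on platforms such as Shodan and FOFA: base64-encode the icon bytes (with line breaks) and take the 32-bit MurmurHash3 of the result. MurmurHash3 is implemented in pure Python here so the example has no third-party dependency; the function names are illustrative.

```python
import base64


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Pure-Python MurmurHash3 (x86, 32-bit); returns a signed 32-bit integer."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF
    h = seed & mask
    n_blocks = len(data) // 4
    # Body: mix each 4-byte little-endian block into the state.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask
        k = (k * c2) & mask
        h ^= k
        h = ((h << 13) | (h >> 19)) & mask
        h = (h * 5 + 0xE6546B64) & mask
    # Tail: the remaining 1-3 bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask
        k = (k * c2) & mask
        h ^= k
    # Finalization: fold in the length and avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h - 0x100000000 if h & 0x80000000 else h


def favicon_hash(icon_bytes: bytes) -> int:
    """Hash favicon bytes: base64 with line breaks, then MurmurHash3."""
    return murmur3_32(base64.encodebytes(icon_bytes))
```

To use it, download `favicon.ico` from the target with any HTTP client and pass the raw response body to `favicon_hash`.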

(2) Keyword features

Site title, copyright notice, JavaScript keywords, HTML source structure, and HTTP response header features.
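A rough sketch of pulling a few of these features out of a homepage response, using only the Python standard library. The selection of features (title, external script URLs, a copyright line, the `Server` header) and the names `extract_features` and `_FeatureParser` are assumptions made for illustration; a real fingerprint would likely collect more.

```python
import re
from html.parser import HTMLParser


class _FeatureParser(HTMLParser):
    """Collects the <title> text and external script URLs from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.scripts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.scripts.append(src)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_features(html, headers):
    """Return a dict of simple fingerprint features for one homepage response."""
    parser = _FeatureParser()
    parser.feed(html)
    # Grab a short run of text following a copyright marker, up to the next tag.
    copyright_match = re.search(r"(?:©|&copy;|Copyright)[^<\n]{0,80}", html, re.I)
    return {
        "title": parser.title.strip(),
        "scripts": parser.scripts,
        "copyright": copyright_match.group(0).strip() if copyright_match else "",
        "server": headers.get("Server", ""),
    }
```

The resulting dictionary can then be turned into search queries on whichever asset platform is being used.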

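0x2.2 Asset Collection

For asset collection, besides driving my own scripts that integrate the three platforms fofa, shodan and zoomeye, I also really like one tool because it is feature-rich and runs fairly stably: fofaviewer.

Download: github.com/wgpsec/fofa…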
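To show how the features from 0x2.1 might feed into asset collection, here is a small sketch that assembles a FOFA-style query string. It assumes FOFA's `icon_hash="..."` and `title="..."` filters joined with `||`, and that API clients submit the query base64-encoded (the `qbase64` parameter); check the platform's current documentation before relying on either. No request is sent, and the function names are hypothetical.

```python
import base64


def build_fofa_query(icon_hash=None, title=None):
    """Combine the collected features into one OR-joined FOFA query string."""
    clauses = []
    if icon_hash is not None:
        clauses.append(f'icon_hash="{icon_hash}"')
    if title:
        clauses.append(f'title="{title}"')
    return " || ".join(clauses)


def encode_query(query):
    """Base64-encode a query string (assumed form for an API's qbase64 field)."""
    return base64.b64encode(query.encode("utf-8")).decode("ascii")
```

Joining with `||` widens the result set; switching to `&&` would narrow it to hosts matching every feature.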

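0x2.3 Simple Fuzzing

Once the assets are collected, I like to start by using httpx for some simple path probing:

cat targets.txt | deduplicate | httpx -path '/wwwroot.zip' -status-code

This acts as a simple first-pass filter that reduces the number of requests nuclei has to make.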
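The `deduplicate` step in the pipeline above is not explained in the article. As one possible interpretation, the sketch below normalizes each line to `scheme://host[:port]` and keeps the first occurrence. Defaulting bare hosts to `http` is an assumption of this sketch.

```python
from urllib.parse import urlsplit


def normalize_target(raw):
    """Reduce a host or URL to scheme://host[:port], or None if it is empty."""
    raw = raw.strip()
    if not raw:
        return None
    if "://" not in raw:
        raw = "http://" + raw  # assumption: bare hosts default to http
    parts = urlsplit(raw)
    if not parts.netloc:
        return None
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}"


def deduplicate(lines):
    """Yield each normalized target once, preserving first-seen order."""
    seen = set()
    for line in lines:
        target = normalize_target(line)
        if target and target not in seen:
            seen.add(target)
            yield target
```

Results merged from several platforms often differ only in trailing paths or letter case, so normalizing before probing avoids sending the same request twice.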

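0x2.4 Writing a nuclei Template

From the official documentation on writing templates (the Guide) we learn the following:

Step 1 of writing a template: template information

Create a file named back-up-files.yaml with the content shown at the end of this step.

Reference: nuclei.projectdiscovery.io/templating-…

id is required, must not contain spaces, and is usually the same as the file name.

The info block is dynamic: besides name, author, description, severity and tags, you can add other key: value pairs. tags are used to select templates when running nuclei; follow similar templates when choosing them.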

    id: back-up-files

    info:
      name: Find Resource Code Of Target Template
      author: xq17
      severity: medium
      tags: exposure,backup

Step 2 of writing a template: sending requests

Reference: nuclei.projectdiscovery.io/templating-…

1.HTTP Requests start with a request block which specifies the start of the requests for the template.

2.Request method can be GET, POST, PUT, DELETE, etc depending on the needs.

3.Redirection conditions can be specified per each template. By default, redirects are not followed. However, if desired, they can be enabled with redirects: true in request details.

4.The next part of the requests is the path of the request path. Dynamic variables can be placed in the path to modify its behavior on runtime.

Variables start with {{ and end with }} and are case-sensitive.

{{BaseURL}} - This will replace on runtime in the request by the original URL as specified in the target file.

{{Hostname}} - Hostname variable is replaced by the hostname of the target on runtime.

5.Headers can also be specified to be sent along with the requests. Headers are placed in form of key/value pairs. An example header configuration looks like this:
    # headers contains the headers for the request
    headers:
      # Custom user-agent header
      User-Agent: Some-Random-User-Agent
      # Custom request origin
      Origin: https://google.com
6.Body specifies a body to be sent along with the request. (Needed when sending POST requests.)

7.To maintain cookie based browser like session between multiple requests, you can simply use cookie-reuse: true in your template, Useful in cases where you want to maintain session between series of request to complete the exploit chain and to perform authenticated scans. (Session reuse: used to chain an attack together, e.g. log in first and then attack.)
    # cookie-reuse accepts boolean input and false as default
    cookie-reuse: true
8.Request condition allows to check for condition between multiple requests for writing complex checks and exploits involving multiple HTTP request to complete the exploit chain.

with DSL matcher, it can be utilized by adding req-condition: true and numbers as suffix with respective attributes, status_code_1, status_code_3, and body_2 for example. (Used for writing complex attack chains.)
    req-condition: true
    matchers:
      - type: dsl
        dsl:
          - "status_code_1 == 404 && status_code_2 == 200 && contains((body_2), 'secret_string')"
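To make the numbered-suffix idea concrete, the following Python sketch evaluates the same condition as the DSL example over a list of responses. It only illustrates the logic and is not how nuclei evaluates DSL expressions; the `Response` class and `check_chain` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status_code: int
    body: str


def check_chain(responses):
    """Mirror of the DSL example: first request 404, second 200 with a marker."""
    if len(responses) < 2:
        return False
    first, second = responses[0], responses[1]
    return (
        first.status_code == 404              # status_code_1 == 404
        and second.status_code == 200         # status_code_2 == 200
        and "secret_string" in second.body    # contains(body_2, 'secret_string')
    )
```

The suffix `_1`, `_2` in the DSL corresponds to the position of each response in the list.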
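…There are many more advanced features, such as raw HTTP and race conditions, but they are not needed here. With documentation, knowing enough for the task is sufficient.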

    requests:
      - method: GET
        path:
          - "{{BaseURL}}/wwwroot.zip"
          - "{{BaseURL}}/www.zip"
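The effect of the `{{BaseURL}}` and `{{Hostname}}` variables on these paths can be pictured with a small substitution function. This is a simplified illustration rather than nuclei's actual implementation: `render_path` is a hypothetical helper, and stripping a trailing slash from the target is a choice made for this sketch.

```python
from urllib.parse import urlsplit


def render_path(template, target):
    """Expand {{BaseURL}} and {{Hostname}} in a template path for one target."""
    variables = {
        "BaseURL": target.rstrip("/"),
        "Hostname": urlsplit(target).netloc,
    }
    rendered = template
    for name, value in variables.items():
        # Plain string replacement, so variable names are case-sensitive.
        rendered = rendered.replace("{{" + name + "}}", value)
    return rendered


if __name__ == "__main__":
    for path in ("{{BaseURL}}/wwwroot.zip", "{{BaseURL}}/www.zip"):
        print(render_path(path, "http://127.0.0.1:9091"))
```

A misspelled or differently cased variable is left untouched by this sketch, which matches the documentation's note that variable names are case-sensitive.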

Step 3 of writing a template: checking the response

Reference: nuclei.projectdiscovery.io/templating-…

Multiple matchers can be specified in a request. There are basically 6 types of matchers:

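status (status code), size (response size), word (string), regex (regular expression match), binary (binary content)

There is also dsl, which allows highly customized validation of the response and can perform operations on the response content.

Available helper functions: nuclei.projectdiscovery.io/templating-…

For words and regexes, multiple match conditions on the response can be combined with AND or OR.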

Multiple words and regexes can be specified in a single matcher and can be configured with different conditions like AND and OR

You can choose which part of the response to match. The default is body, and other parts such as header are also supported.

Multiple parts of the response can also be matched for the request, default matched part is body if not defined.

Negated conditions are supported, which is where reasoning by exclusion becomes useful.

All types of matchers also support negative conditions, mostly useful when you look for a match with an exclusions. This can be used by adding negative: true in the matchers block.

Multiple matchers are supported

Multiple matchers can be used in a single template to fingerprint multiple conditions with a single request.

matchers-condition is supported

While using multiple matchers the default condition is to follow OR operation in between all the matchers, AND operation can be used to make sure return the result if all matchers returns true.

Based on the documentation above, we can write the following checks.

    matchers-condition: and
    matchers:
      - type: binary
        binary:
          - "504B0304"  # zip
        part: body
      - type: dsl
        dsl:
          - "len(body)>0"
      - type: status
        status:
          - 200
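The combined effect of these three matchers under `matchers-condition: and` can be expressed as a single predicate. The Python sketch below treats the binary matcher as a check that the bytes `50 4B 03 04` occur in the body, which is an assumption about its semantics; `looks_like_zip_backup` is an illustrative name.

```python
ZIP_MAGIC = bytes.fromhex("504B0304")  # "PK\x03\x04", the zip local file header


def looks_like_zip_backup(status_code, body):
    """Return True only if all three checks pass (matchers-condition: and)."""
    checks = [
        ZIP_MAGIC in body,   # binary matcher on the body
        len(body) > 0,       # dsl matcher: len(body)>0
        status_code == 200,  # status matcher
    ]
    return all(checks)
```

Requiring the magic bytes as well as status 200 filters out sites that answer every path with a 200 status and an HTML error page.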

Step 4 of writing a template: putting the parts together

Joining the snippets above in order gives the following:

    id: back-up-files

    info:
      name: Find Resource Code Of Target Template
      author: xq17
      severity: medium
      tags: exposure,backup

    requests:
      - method: GET
        path:
          - "{{BaseURL}}/wwwroot.zip"
          - "{{BaseURL}}/www.zip"
        matchers-condition: and
        matchers:
          - type: binary
            binary:
              - "504B0304"  # zip
            part: body
          - type: dsl
            dsl:
              - "len(body)>0"
          - type: status
            status:
              - 200

0x2.5 Testing the Template

Start a local target for debugging:

python3 -m http.server 9091
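For the template to match anything, the directory being served needs a zip file at one of the probed paths. The article does not show how the test file was created; one simple way, sketched here with the standard `zipfile` module and a hypothetical helper name, is:

```python
import zipfile
from pathlib import Path


def make_test_backup(directory, name="wwwroot.zip"):
    """Write a tiny zip archive into `directory` and return its path."""
    path = Path(directory) / name
    with zipfile.ZipFile(path, "w") as archive:
        # A single placeholder file is enough to produce a valid archive.
        archive.writestr("index.php", "<?php echo 'hello'; ?>")
    return path
```

Calling `make_test_backup(".")` in the directory served by `http.server` produces a file whose first bytes are `50 4B 03 04`, the signature the binary matcher looks for.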

Then debug:

echo 'http://127.0.0.1:9091' | nuclei -t back-up-files.yaml -debug -timeout 2 -stats -proxy-url http://127.0.0.1:8080/

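Requests sent:

As you can see, once the template is loaded, nuclei can quickly fuzz out a site's backup files.

0x3 Summary

This first article mainly introduces some ideas and a simple approach to writing nuclei templates, to help beginners get started quickly. The second article covers how to enhance this template: extending the list of paths to scan, checking the response more precisely, and so on (I recommend reading the nuclei-template documentation on your own first, as it will make the material easier to follow). The third article applies the knowledge from the first two, plus the enhanced template, to a real hunt for a website's source code.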