A General-Purpose Lightweight Parser for Binary Protocol Formats

Date: 2019-11-21



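When working with communication protocols, you often run into private (proprietary) protocols whose messages are in a binary format that cannot be read by eye. Because the protocol is private, even the renowned Wireshark cannot decode its contents.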

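Faced with this, developers generally have two options: first, analyze the message by hand against the protocol specification (assuming the content is neither encrypted nor compressed); second, write a program that parses the protocol messages (born of programmer laziness ^_^).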

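Clearly, the second route is the mainstream one. The most common approach today is to develop a Wireshark plugin for the protocol, written in C, Lua, and so on. Of course, once the plugin is finished, Wireshark still has to be running to invoke it, which makes the workflow relatively heavyweight.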

Is there a better way? Naturally, Python comes to mind: use a script to parse the binary messages. There is plenty of material on this online.

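At this point the problem seems solved. On closer thought, though, there is still room for improvement. Typically, parsing private protocols means one parsing script per protocol. When only a handful of protocols need parsing, this ad hoc approach works, and the scripts can even be thrown away after use. But what if there are many protocols, say dozens? Then the case-by-case approach no longer looks very smart.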

So, can we develop one general-purpose script for parsing binary protocols, such that, in theory, a message in any binary format can be decoded?

This article presents a parsing script written in Python that attempts to answer that question.

We walk through an example to show how the script is used.

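Suppose there is an application-layer protocol FOO, carried over UDP port 12345, whose content records information about users downloading files from, or uploading files to, an FTP server.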

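Capturing one exchange of the protocol yields the file Foo.pcap, with the following content:

[Figure: contents of Foo.pcap]

Below is the FOO protocol specification:

[Figure: FOO protocol specification]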

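Now prepare a template file, packet.template. The basic content of this file is already provided; we only need to add a protocol template named FOO to it, with the following content.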

{FOO}
# 字段名                 類型          長度     描述                取值約束
Version                 UINT8         1       版本號              (1)
MagicField              CHAR[]        3       表示協議的魔術字串    FOO
MessageLength           UINT16        2       消息長度
MessageSerialNO         UINT32        4       消息序列號
MessageType             UINT16        2       消息類型            (1-配置消息|2-操做消息)
Time                    TIME          4       當前時間             日誌產生時間，用UTC時間表示
UserNameLength          UINT8         1       用戶名長度           N
UserName                CHAR[]        N       用戶名
OperationType           UINT8         1       操做類型            (1-上傳|2-下載)
SourceAddressType       UINT8         1       源地址類型           {4-4/IPV4 | 16-16/IPV6}
SourceAddress           IPV4|IPV6     U       源地址
DestinationAddressType  UINT8         1       目的地址類型         {4-4/IPV4 | 16-16/IPV6}
DestinationAddress      IPV4|IPV6     U       目的地址
SourcePort              UINT16        2       源端口
DestinationPort         UINT16        2       目的端口
FileNameLength          UINT8         1       文件名長度           N
FileName                CHAR[]        N       文件名

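Comparing this with the specification figure above, the fields of the protocol template correspond one-to-one with the fields of the specification, with minor local adjustments to a few of them.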

Pay particular attention to the Length (長度) and Constraint (取值約束) columns. For how to fill them in, see the notes in the template file.
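To make the five-column layout concrete, here is a minimal sketch of how one template row could be split into its columns. It is written for Python 3 (the article's own script targets Python 2), and the function name `parse_template_line` is hypothetical. It mirrors the idea of splitting on TAB and padding missing columns, but it does not handle `{Name}` section headers, which the script treats separately.

```python
def parse_template_line(line):
    """Split one template row into (name, type, length, description, constraint)."""
    # Columns are TAB-separated; repeated TABs used for alignment yield empty strings.
    fields = [f.strip() for f in line.rstrip("\n").split("\t") if f.strip()]
    if not fields or fields[0].startswith("#"):
        return None  # blank line or comment line
    fields = fields[:5]                   # ignore anything beyond five columns
    fields += [""] * (5 - len(fields))    # pad missing trailing columns
    return tuple(fields)


row = parse_template_line("UserNameLength\t\tUINT8\t1\tUser name length\tN\n")
print(row)
```

Field names that contain spaces, such as `Total Length` in the IPv4 template, survive intact because only TAB characters act as separators.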

Once everything is ready, run the following on the command line:

D:\>c:\Python27\python.exe packet_parse_python2.py packet.template FOO.pcap

Part of the parsed output is shown below.

IPv4
  版本 -- 4
  包頭長度 -- 5
  Differentiated Services Codepoint -- 0
  Explicit Congestion Notification -- 0
  總長度 -- 84
  標識 -- 66
  標記 -- 0
  分段偏移 -- 0
  生存期 -- 64
  協議 -- 17 UDP
  報文頭校驗碼 -- 0x6284
  源地址 -- 192.168.203.1
  目的地址 -- 192.168.203.128

UDP
  源端口號 -- 1122
  目標端口號 -- 12345 FOO
  數據報長度 -- 64
  校驗值 -- 56346

FOO
  版本號 -- 1
  表示協議的魔術字串 -- FOO
  消息長度 -- 298
  消息序列號 -- 287454020
  消息類型 -- 2 操做消息
  當前時間 -- 2006-11-25 00:09:04
  用戶名長度 -- 8
  用戶名 -- TestUSER
  操做類型 -- 2 下載
  源地址類型 -- 4 4/IPV4
  源地址 -- 192.168.0.1
  目的地址類型 -- 4 4/IPV4
  目的地址 -- 192.168.0.2
  源端口 -- 4660
  目的端口 -- 22136
  文件名長度 -- 15
  文件名 -- packet_template

…… (partially omitted) ……

56 c0 00 08 08 00 45 00 - 00 54 00 42 00 00 40 11
            |     |  |    |     |     |     |  |
            |     |  |    |     |     |     |  協議 -- 17
            |     |  |    |     |     |     生存期 -- 64
            |     |  |    |     |     標記 -- 0 | 分段偏移 -- 0
            |     |  |    |     標識 -- 66
            |     |  |    總長度 -- 84
            |     |  Differentiated Services Codepoint -- 0 | Explicit Congestion Notification -- 0
            |     版本 -- 4 | 包頭長度 -- 5
            上層協議 -- 0x0800

62 84 c0 a8 cb 01 c0 a8 - cb 80 04 62 30 39 00 40
|     |           |             |     |     |
|     |           |             |     |     數據報長度 -- 64
|     |           |             |     目標端口號 -- 12345
|     |           |             源端口號 -- 1122
|     |           目的地址 -- 192.168.203.128
|     源地址 -- 192.168.203.1
報文頭校驗碼 -- 0x6284

dc 1a 01 46 4f 4f 01 2a - 11 22 33 44 00 02 45 67
|     |  |        |       |           |     |
|     |  |        |       |           |     當前時間 -- 2006-11-25 00:09:04
|     |  |        |       |           消息類型 -- 2
|     |  |        |       消息序列號 -- 287454020
|     |  |        消息長度 -- 298
|     |  表示協議的魔術字串 -- FOO
|     版本號 -- 1
校驗值 -- 56346

89 a0 08 54 65 73 74 55 - 53 45 52 02 04 c0 a8 00
      |  |                         |  |  |
      |  |                         |  |  源地址 -- 192.168.0.1
      |  |                         |  源地址類型 -- 4
      |  |                         操做類型 -- 2
      |  用戶名 -- TestUSER
      用戶名長度 -- 8

01 04 c0 a8 00 02 12 34 - 56 78 0f 70 61 63 6b 65
   |  |           |       |     |  |
   |  |           |       |     |  文件名 -- packet_template
   |  |           |       |     文件名長度 -- 15
   |  |           |       目的端口 -- 22136
   |  |           源端口 -- 4660
   |  目的地址 -- 192.168.0.2
   目的地址類型 -- 4

74 5f 74 65 6d 70 6c 61 - 74 65
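As a cross-check of the annotated dump, the FOO payload can also be decoded by hand with the standard `struct` module. The sketch below is Python 3 and independent of the article's script; the payload bytes are copied from the dump above, and only the IPv4 case of the source address is handled.

```python
import socket
import struct
from datetime import datetime, timezone

# FOO payload copied from the annotated dump (everything after the UDP checksum dc 1a).
payload = bytes.fromhex(
    "01 46 4f 4f 01 2a 11 22 33 44 00 02 45 67 89 a0 "
    "08 54 65 73 74 55 53 45 52 02 04 c0 a8 00 01 04 "
    "c0 a8 00 02 12 34 56 78 0f 70 61 63 6b 65 74 5f "
    "74 65 6d 70 6c 61 74 65"
)

# Fixed 16-byte header in network byte order:
# Version(1) Magic(3) MessageLength(2) SerialNO(4) MessageType(2) Time(4)
version, magic, msg_len, serial, msg_type, ts = struct.unpack_from("!B3sHIHI", payload, 0)
when = datetime.fromtimestamp(ts, timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# Length-prefixed user name, then operation type, address type and source address.
offset = 16
name_len = payload[offset]
user = payload[offset + 1:offset + 1 + name_len].decode("ascii")
offset += 1 + name_len
op_type, addr_type = payload[offset], payload[offset + 1]
src = socket.inet_ntoa(payload[offset + 2:offset + 6])  # valid only because addr_type == 4

print(version, magic, msg_len, serial, msg_type, when)
print(user, op_type, addr_type, src)
```

The decoded values agree with the FOO section of the output: version 1, message length 298, serial number 287454020, time 2006-11-25 00:09:04, user TestUSER, source address 192.168.0.1.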

Design idea: the code stays unchanged and the data drives it. In principle, only the protocol templates need to be extended.
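The data-driven idea can be illustrated with a much smaller sketch than the real script. In the Python 3 example below, the field list `TEMPLATE` is a hypothetical, trimmed-down layout (not the FOO protocol), and `parse` is an illustrative name. The point is only that the loop never changes, while supporting a new protocol means supplying a new table. The length `"N"` follows the article's convention: take the length from the value of the previous field.

```python
import struct

# (field name, struct code or "CHAR[]", length in bytes or "N")
TEMPLATE = [
    ("Version",        "B",      1),
    ("MessageLength",  "H",      2),
    ("UserNameLength", "B",      1),
    ("UserName",       "CHAR[]", "N"),
]


def parse(data, template):
    """Walk the template and decode each field from data in order."""
    result, offset, last_value = {}, 0, None
    for name, ftype, length in template:
        if length == "N":              # length is carried by the previous field
            length = last_value
        chunk = data[offset:offset + length]
        if len(chunk) != length:
            raise ValueError("truncated at field %s" % name)
        if ftype == "CHAR[]":
            value = chunk.decode("ascii")
        else:
            value = struct.unpack("!" + ftype, chunk)[0]  # network byte order
        result[name] = value
        last_value = value
        offset += length
    return result


sample = bytes([1, 0x01, 0x2A, 4]) + b"user"
fields = parse(sample, TEMPLATE)
print(fields)
```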

For the application-layer protocol being parsed, nesting fields in TLV form, as in ASN.1 syntax, is not yet supported.

Some bugs remain and need to be fixed; feedback is welcome.

Finally, the attachments:

The parsing script (Python 2.x), packet_parse_python2.py

# coding=gbk
# Requires bitstring-2.1.1 and win_inet_pton-1.0.1
# first
#   cd <Python2x directory>\Lib\bitstring-2.1.1
#   python setup.py install
# second
#   cd <Python2x directory>\Lib\win_inet_pton-1.0.1
#   python setup.py install

from bitstring import BitStream
import sys
import re
import time
import win_inet_pton
import socket
import struct

# One template row: field name, type, length, description, constraint
class TemplateEntry(object):
    def __init__(self, field):
        self.fieldName = field[0]
        self.fieldType = field[1]
        self.fieldLen  = field[2]
        self.fieldChDesc = field[3]
        self.fieldRestri = field[4]

class ParseTemplate(object):
    def __init__(self, templateName):
        self.name = templateName
        self.arrayTemplate = []

args = sys.argv[1:]
debug = 0
if args and args[0] == '-d': # optional switch: enable debug output
    debug = 1
    args = args[1:]

if len(args) < 2: # checked after removing -d, so "-d template" alone cannot raise IndexError
    print "Usage:", sys.argv[0], '[-d] template_file data_file'
    print "Note : -d is optional and turns on debug output"
    sys.exit(1)

matchedTuples = []
currentBitPosition = 0
lastReadBits = 0

templateFile = args[0]
dataFile = args[1]

template_list = []

print 'Parsing template file', templateFile, '...'
try:
    template_file = open(templateFile, 'r')
except IOError:
    print "Failed to open file", templateFile
    exit()

all_lines = template_file.readlines()
template_file.close()

for each_line in all_lines:
    if not re.match('#', each_line):
        try:
            each_line = each_line.strip('\n') # remove the trailing newline
            # print '\n', r'trim \n -> ['+each_line+']'

            if re.match('{', each_line):
                match = re.search(r'{(.*)}', each_line)
                TemplateName =  match.group(1)
                myTemplate = ParseTemplate(TemplateName)
                if debug:
                    print '結構名:', myTemplate.name
                template_list.append(myTemplate)
            else:
                field_split = each_line.split("\t")
                # split on \t: 'A\t\t\tB\t\tC\t\tD\tE' gives
                #              'A', '', '', 'B', '', 'C', '', 'D', 'E'
                # print r'split \t ->' , field_split

                while '' in field_split: # drop the empty strings produced by repeated tabs
                    field_split.remove('')
                if len(field_split) == 0:
                    # print 'empty list'
                    continue
                # print r"remove '' ->" , field_split
                while len(field_split) < 5: # pad to five columns; '<' rather than '!=' so a row with extra columns cannot loop forever
                    field_split.append('')
                curEntry = TemplateEntry(field_split)
                myTemplate.arrayTemplate.append(curEntry)
        except ValueError:
            pass

print '\nOpening data file', dataFile, '...'
try:
    data_file = open(dataFile, 'rb')
except IOError:
    print "Failed to open file", dataFile
    exit()

whole_content = data_file.read()
# print whole_content

if debug:
    # show 16 bytes per line: xx xx xx xx xx xx xx xx -- xx xx xx xx xx xx xx xx
    lines = [whole_content[i:i+8] for i in range(0, len(whole_content), 8)]
    line_count = 0
    for line in lines:
        if line_count %2 == 1:
            print '-', ' '.join("{0:02x}".format(ord(x)) for x in line)
        else:
            print ' '.join("{0:02x}".format(ord(x)) for x in line),
        line_count += 1
print

b = BitStream(bytes=whole_content)

if debug:
    for template in template_list:
        print 'template:', template.name
        for element in template.arrayTemplate:
            print "  <%s> <%s> <%s> <%s> <%s>" % (element.fieldName, element.fieldType, element.fieldLen, element.fieldChDesc, element.fieldRestri)


def matchTuples(index, parseTuples):
    tempResult = ''
    for parseTuple in parseTuples:
        if index == parseTuple[0]: # several fields may share the same parseIndex
            if len(tempResult) == 0:
                tempResult = parseTuple[1]
            else:
                tempResult = tempResult + ' | ' + parseTuple[1]
    if len(tempResult) == 0:
        return None
    else:
        return tempResult # parseString

def displayParseLine(line, displayTuples):
    displayOneLineList = []
    count = 0
    start = 16*(line - 1)
    end = 16*line
    for i in range(start, end, 1): # collect the positions in this line that need a vertical bar
        parseString = matchTuples(i, displayTuples)
        if parseString:
            count += 1
            displayOneLineList.append((i, parseString))

    displayOneLineList.append((0, ''))

    # XX XX XX XX XX XX XX XX - XX XX XX XX XX XX XX XX -- one line of hex characters
    # 0  3  6  9  12 15 18 21   26 29 32 35 38 41 44 47 -- verticalIndexRange
    # 0  1  2  3  4  5  6  7    8  9  10 11 12 13 14 15 -- x
    verticalIndexRange = range(0,22,3) + range(26,48,3)

    lastWidth = 0 # width of the previous output line
    for verticalCount in range(count,0,-1): # (count, count-1 ... 1)
        i = 0
        j = 0       # number of characters written on the current line
        for verticalIndex in range(49): # 48 is the largest column index in a line
            if verticalIndex in verticalIndexRange:
                x = verticalIndexRange.index(verticalIndex)
                if matchTuples(start+x, displayTuples):
                    sys.stdout.write('|')
                    i += 1
                    j += 1
                    if i >= verticalCount: break
                else:
                    sys.stdout.write(' ') # print ' ',
                    j += 1
            else:
                sys.stdout.write(' ') # print ' ',
                j += 1
        if lastWidth > j:
            sys.stdout.write(' '*(lastWidth-j-1))
            print displayOneLineList[-(count-verticalCount+1)][1]
        elif lastWidth == 0: # first iteration is a special case
            print
        lastWidth = j

    sys.stdout.write(' '*(lastWidth-1))
    print displayOneLineList[0][1]

def displayParse(text, displayTuples):
    # show 16 bytes per line: xx xx xx xx xx xx xx xx -- xx xx xx xx xx xx xx xx
    lines = [text[i:i+8] for i in range(0, len(text), 8)]
    line_count = 0
    for line in lines:
        if line_count%2 == 1:
            print '-', ' '.join("{0:02x}".format(ord(x)) for x in line)
        else:
            print
            print ' '.join("{0:02x}".format(ord(x)) for x in line),
        line_count += 1
        if line_count%2 == 0:
            displayParseLine(line_count/2, displayTuples)

# Choose the bitstring read() argument from the field type and length
def valueFromTypeAndLen(b, fieldType, fieldLen):
    '''
    Input formats (type, length)
        UINT8                    1
        UINT16          2
        UINT16LE        2
        UINT32          4
        HEX[]           4
        IPV4            4
        CHAR[]          10
        IPV6            16
        CHAR[]          N
        IPV4|IPV6       U
        {XXXX}          1
    '''
    global currentBitPosition, lastReadBits
    data = ''
    length = int(fieldLen);
    if length == 0:
        return data

    try:
        # fieldLen is honoured only by the variable-length types (HEX[], MAC[], CHAR[], BIT[])
        if fieldType == "UINT8":
            data = b.read('uint:8')
            currentBitPosition += 8
            lastReadBits = 8
        elif fieldType == "UINT16":
            data = b.read('uint:16')
            currentBitPosition += 16
            lastReadBits = 16
        elif fieldType == "UINT16LE":
            data = b.read('uintle:16')
            currentBitPosition += 16
            lastReadBits = 16
        elif fieldType == "UINT16BE":
            data = b.read('uintbe:16')
            currentBitPosition += 16
            lastReadBits = 16
        elif fieldType == "UINT32":
            data = b.read('uint:32')
            currentBitPosition += 32
            lastReadBits = 32
        elif fieldType == "UINT32LE":
            data = b.read('uintle:32')
            currentBitPosition += 32
            lastReadBits = 32
        elif fieldType == "UINT32BE":
            data = b.read('uintbe:32')
            currentBitPosition += 32
            lastReadBits = 32
        elif fieldType == "UINT64":
            data = b.read('uint:64')
            currentBitPosition += 64
            lastReadBits = 64
        elif fieldType == "TIME":
            data = b.read('uint:32')
            data = time.strftime("%Y-%m-%d %X", time.gmtime(int(data))) # format as UTC time
            currentBitPosition += 32
            lastReadBits = 32
        elif fieldType == "TIME_LE":
            data = b.read('uintle:32')
            data = time.strftime("%Y-%m-%d %X", time.gmtime(int(data))) # format as UTC time
            currentBitPosition += 32
            lastReadBits = 32
        elif fieldType == "HEX[]":
            length = 8*length
            formatString = 'hex:' + str(length)
            data = b.read(formatString)
            currentBitPosition += length
            lastReadBits = length
        elif fieldType == "MAC[]":
            length = 8*length
            formatString = 'hex:' + str(length)
            data = b.read(formatString)
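            if data.startswith('0x'): # remove only a literal '0x' prefix; lstrip('0x') would also eat leading zeros of the address
                data = data[2:]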
            data = [data[i:i+2] for i in range(0, len(data), 2)]
            data = ':'.join(data)
            currentBitPosition += length
            lastReadBits = length
        elif fieldType == "IPV4":
            data = b.read('uint:32')
            data = socket.inet_ntoa(struct.pack('!L', data))
            currentBitPosition += 32
            lastReadBits = 32
        elif fieldType == "IPV6":
            formatString = 'bytes:16'
            data = b.read(formatString)
            data = socket.inet_ntop(socket.AF_INET6, data)
            currentBitPosition += 128
            lastReadBits = 128
        elif fieldType == "CHAR[]":
            formatString = 'bytes:' + str(length)
            data = b.read(formatString)
            data = data.strip('\0')
            currentBitPosition += 8*length
            lastReadBits = 8*length
        elif fieldType == "BIT[]":
            data = b.read(length).uint
            currentBitPosition += length
            lastReadBits = length
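    except Exception: # e.g. a read past the end of the data; a bare except would also swallow KeyboardInterrupt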
        print "Reading ERROR"
        displayParse(whole_content, matchedTuples)
        exit()
    else:
        pass
    return data

# Convert the argument to an integer and return it
# Only string and integer arguments are handled for now
def value2int(value):
    if type(value) == type(''): # string
        if value.startswith('0x') or value.startswith('0X'):
            return int(value, 16)
        else:
            return int(value)
    return value # already an integer: return it unchanged

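# Check whether index is among the numeric ids in searchString; behaviour depends on mode
# mode searchString                                      index
# 1    content of (): (0xd4c3b2a1) or (4-IPV4|16-IPV6)   must be 0xd4c3b2a1, or 4 or 16; otherwise an error
# 2    content of []: [0-english | 1-中文]                may be 0, 1 or anything else
# 3    content of {}: {4-4/IPV4 | 16-16/IPV6}            must be 4 or 16, and '-' must be followed by content; otherwise an error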
def indexMatchList(index, searchString, mode):
    origin_index = index
    index = value2int(index)
    array = searchString.split('|')
    for entry in array:
        match_tuples = re.search(r'\s*([^-]+)-([^-]*)\s*', entry) # extract an id-name pair
        match_digtal = re.search(r'\s*((0x|0X)?[\da-fA-F]+)\s*', entry) # extract a single number from (1|2|3)
        if match_tuples != None: # id-<null> 或 id-xxx
            id = match_tuples.group(1)
            id = id.strip()
            id = value2int(id)
            name = match_tuples.group(2)
            name = name.strip()
            if index == id:
                if len(name) == 0 and mode == 3:
                    print 'NOT Allowed -- only type no union struct'
                    exit()
                else:
                    return name
        elif match_digtal != None: # only id
            id = match_digtal.group(1)
            id = id.strip()
            id = value2int(id)
            if index == id:
                if mode == 3:
                    print 'NOT Allowed -- only type no union struct'
                    exit()
                else:
                    return True
    if mode == 1 or mode == 3: # 沒找到 id
        print '<%s> NOT in <%s>' % (origin_index, searchString)
        exit()
    else:
        return '<<other>>'

# Look up the name for a numeric id in < 1-xxx | 2-yyy | 3-zzz | ... >
# e.g. 1 maps to xxx, 2 maps to yyy
def indexToString(index, searchString):
    index = value2int(index)
    array = searchString.split('|')
    for entry in array:
        match_tuples = re.search(r'\s*([^-]+)-([^-]+)\s*', entry) # extract an <id,name> pair
        if match_tuples != None:
            id = match_tuples.group(1)
            name = match_tuples.group(2)
            name = name.strip()
            id = id.strip()
            id = value2int(id)
            if index == id:
                return name
    return None

# Find a template by name
def findTemplate(templateName, templateList):
    for template in templateList:
        if templateName == template.name:
            return template
    return None

# Parse the message
def parseBinary(bitStream, templateName, templateList):
    global matchedTuples
    curTemplate = findTemplate(templateName, templateList)
    if curTemplate == None:
        print "\nTemplate not found:", templateName
        exit()
    print templateName
    lastAction = 0
    isUnion = 0
    lastLength = None
    for element in curTemplate.arrayTemplate:
        (fieldName, fieldType, fieldLen, fieldChDesc, fieldRestri) = (element.fieldName, element.fieldType, element.fieldLen, element.fieldChDesc, element.fieldRestri)
        if debug:
            print "<%s> <%s> <%s> <%s> <%s>" % (fieldName, fieldType, fieldLen, fieldChDesc, fieldRestri)

        if fieldLen == 'U' and '|' in fieldType:
            # Resolve (length, type) == ('U', 'XXX|YYY|ZZZ') using the preceding field
            if isUnion: # was a union marker set by the previous field?
                (fieldLen, fieldType) = unionFormat.split('/')
                fieldLen = fieldLen.strip()
                fieldType = fieldType.strip()
                isUnion = 0
        elif fieldLen == 'N':
            if lastLength != None: # lastLength may legitimately have been set to 0 by the previous field
                fieldLen = lastLength
                lastLength = None

        if fieldType.startswith('{'): # struct field
            match = re.search(r'{(.*)}', fieldType) # extract the struct name from fieldType
            curTemplateName = match.group(1)
            if fieldLen == 'X':
                while 1:
                    print '\n================================================================================'
                    parseBinary(bitStream, curTemplateName, templateList)
            else:
                parseBinary(bitStream, curTemplateName, templateList)

        else: # ordinary field
            fieldValue = valueFromTypeAndLen(bitStream, fieldType, fieldLen)
            print ' ', fieldChDesc, '--', fieldValue,
            parseIndex = (currentBitPosition - lastReadBits)/8
            parseString = fieldChDesc + ' -- ' + str(fieldValue)

            # Inspect the constraint column
            match_angle_bracket = re.search(r'<(.*)>', fieldRestri) # <>: the value selects the template that follows
            match_parenthese = re.search(r'\((.*)\)', fieldRestri) # (): the value must be one of the listed values
            match_square_bracket = re.search(r'\[(.*)\]', fieldRestri) # []: the value may or may not be one of the listed values
            match_brace = re.search(r'{(.*)}', fieldRestri) # {}: the value determines the type of the next field

            if fieldRestri == 'N':
                lastLength = int(fieldValue)
                print

            elif match_angle_bracket != None: # <1-Ethernet | 20-IEEE_802_11> or <1-TCP|17-UDP>
                curRestriction = match_angle_bracket.group(1)
                name = indexToString(fieldValue, curRestriction)
                if name != None:
                    lastAction = 1
                    lastTemplate = name
                    print name
                else:
                    print 'NOT found template: value = <%s>, searchstring = <%s>' % (fieldValue, curRestriction)
                    exit()

            elif match_parenthese != None: # (0xd4c3b2a1) or (4-IPV4 | 16-IPV6)
                curRestriction = match_parenthese.group(1)
                ret = indexMatchList(fieldValue, curRestriction, 1)
                if ret != True:
                    print ret
                else:
                    print

            elif match_square_bracket != None: # [0-english | 1-中文]
                curRestriction = match_square_bracket.group(1)
                ret = indexMatchList(fieldValue, curRestriction, 2)
                if ret != True:
                    print ret
                else:
                    print

            elif match_brace != None: # {4-4/IPV4 | 16-16/IPV6}
                curRestriction = match_brace.group(1)
                ret = indexMatchList(fieldValue, curRestriction, 3)
                print ret
                isUnion = 1
                unionFormat = ret
            else:
                print

            matchedTuples.append((parseIndex, parseString))

    if lastAction:
        print
        if debug:
            print 'searching template <%s> in the END' % lastTemplate
        parseBinary(bitStream, lastTemplate, templateList)

parseBinary(b, "pcap_file_header", template_list)
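The script relies on bitstring for its `BIT[]` type, which reads fields narrower than one byte. For readers without that library, the same extraction can be sketched with plain shifts and masks. The Python 3 example below is an illustration, not part of the script; it decodes the first 12 bytes of the IPv4 header taken from the annotated dump earlier in the article.

```python
import struct

# First 12 bytes of the IPv4 header from the annotated dump.
header = bytes.fromhex("45 00 00 54 00 42 00 00 40 11 62 84")

(ver_ihl, dscp_ecn, total_len, ident,
 flags_frag, ttl, proto, checksum) = struct.unpack("!BBHHHBBH", header)

version = ver_ihl >> 4               # upper 4 bits
ihl = ver_ihl & 0x0F                 # lower 4 bits, in 32-bit words
dscp = dscp_ecn >> 2                 # upper 6 bits
ecn = dscp_ecn & 0x03                # lower 2 bits
flags = flags_frag >> 13             # upper 3 bits
frag_offset = flags_frag & 0x1FFF    # lower 13 bits

print(version, ihl, dscp, ecn, total_len, ident, flags, frag_offset, ttl, proto, hex(checksum))
```

The values match the IPv4 section of the output: version 4, header length 5, total length 84, identification 66, TTL 64, protocol 17 and header checksum 0x6284.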


The template file, packet.template

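# Field name    Type    Length    Description    Constraint
#
# Format: the five columns field name (字段名), type (類型), length (長度), description (描述), constraint (取值約束) are separated by TAB characters
#
# Length column:
# number  fixed length
# N       determined by the previous field: the length of this field is the value of the previous field
# S       this field is a struct, named by the type column of the same row
# U       the type of this field is determined by the previous field (union)
# X       this field is an array of structs with an unbounded number of elements (see {EthernetS})
#
# Constraint column:
# [] the parsed value may or may not be one of the listed values; if it is, the value has a defined meaning, otherwise it is unspecified
#     e.g. [1-ftp | 2-http | 3-dns] means 1 is FTP, 2 is HTTP, 3 is DNS; any other number is reported as other
#
# () the parsed value must be one of the listed values
#     e.g. (4-IPV4 | 16-IPV6) means the value can only be 4 or 16
#
# <> same as (), and in addition the name after each number is a template, e.g. <0x0800-IPv4> and <1-TCP|17-UDP>
#     the <> symbols were chosen because they resemble an HTML link, <a href>
#
# {} the type of the next field is determined by the value of this field, similar to a type/union structure in C
#     e.g. {4-4/IPV4 | 16-16/IPV6} means: if this field is 4, the next field is an IPV4 address; if it is 16, an IPV6 address
#     the slash / separates length/type
#
# N  the length of the next field is given by this value; the two fields form a Len/Value pair
#
# <include general.template> -- consider moving common protocol descriptions (e.g. IP/TCP/UDP) into a separate template file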
{pcap_file_header}
# 字段名      類型                                      長度    描述              取值約束
pcap_magic    HEX[]                                     4       pcap文件標識      {0xd4c3b2a1-S/{header_info_little} | 0xa1b2c3d4-S/{header_info_big}}
header_info   {header_info_little}|{header_info_big}    U       pcap文件頭信息    pcap_magic字段決定後續字段是大端、仍是小端

{header_info_big}
# 字段名            類型                  長度    描述                  取值約束
version_major       UINT16BE              2       主版本號              #define PCAP_VERSION_MAJOR 2
version_minor       UINT16BE              2       次版本號              #define PCAP_VERSION_MINOR 4
thiszone            UINT32BE              4       時區
sigfigs             UINT32BE              4       精確時間戳
snaplen             UINT32BE              4       抓包最大長度
linktype            UINT32BE              4       鏈路類型              <1-EthernetS | 20-IEEE_802_11S>

{header_info_little}
# 字段名            類型                  長度    描述                  取值約束
version_major       UINT16LE              2       主版本號              #define PCAP_VERSION_MAJOR 2
version_minor       UINT16LE              2       次版本號              #define PCAP_VERSION_MINOR 4
thiszone            UINT32LE              4       時區
sigfigs             UINT32LE              4       精確時間戳
snaplen             UINT32LE              4       抓包最大長度
linktype            UINT32LE              4       鏈路類型              <1-EthernetS | 20-IEEE_802_11S>

{EthernetS}
# pcap-specific structure: the number of captured packets is not known in advance, so the file is read in an endless loop
# 字段名            類型                  長度    描述                  取值約束
packet_header       {Ethernet}            X       數據結構數組          X表示無限個元素

{pcap_packet_header}
# 字段名            類型                  長度    描述                  取值約束
tv_sec              TIME_LE               4       抓包時間(看成小端序)  1970/1/1零點開始以來的秒數
tv_usec             UINT32LE              4       微秒數(看成小端序)    當前秒以後的微秒數
caplen              UINT32LE              4       抓包長度(看成小端序)
len                 UINT32LE              4       實際長度(看成小端序)

{Ethernet}
# 字段名            類型                  長度    描述                  取值約束
packet_header       {pcap_packet_header}  S       數據包頭              通用包頭
ether_dhost         MAC[]                 6       目的MAC地址
ether_shost         MAC[]                 6       源MAC地址
ether_type          HEX[]                 2       上層協議              <0x0800-IPv4>

{IPv4}
# 字段名            類型          長度    描述                                取值約束
Version             BIT[]         4       版本                                (4) 表示當前爲 IPv4 版本
IHL                 BIT[]         4       包頭長度                            以32位爲單位，最小值爲5
DSCP                BIT[]         6       Differentiated Services Codepoint
ECN                 BIT[]         2       Explicit Congestion Notification
Total Length        UINT16        2       總長度                              IP數據包的總長度，包括包頭和後跟的數據
Identification      UINT16        2       標識                                識別報文，用於分段重組
Flags               BIT[]         3       標記                                分爲 保留|DF|MF 共3位
Fragment Offset     BIT[]         13      分段偏移                            當前段在整個數據報中的偏移--以64位爲單位
Time To Live        UINT8         1       生存期                              報文在internet中的生存時間。若是爲零，則丟棄
Protocol            UINT8         1       協議                                <1-TCP|17-UDP>
Header Checksum     HEX[]         2       報文頭校驗碼
Source IP           IPV4          4       源地址
Destination IP      IPV4          4       目的地址

{TCP}

{UDP}
# 字段名                類型          長度    描述                取值約束
Source port             UINT16        2       源端口號
Destination port        UINT16        2       目標端口號          <12345-FOO>
Length                  UINT16        2       數據報長度
Checksum                UINT16        2       校驗值

{FOO}
# 字段名                類型          長度    描述                取值約束
Version                 UINT8         1       版本號              (1)
MagicField              CHAR[]        3       表示協議的魔術字串  FOO
MessageLength           UINT16        2       消息長度
MessageSerialNO         UINT32        4       消息序列號
MessageType             UINT16        2       消息類型            (1-配置消息|2-操做消息)
Time                    TIME          4       當前時間            日誌產生時間，用UTC時間表示
UserNameLength          UINT8         1       用戶名長度          N
UserName                CHAR[]        N       用戶名
OperationType           UINT8         1       操做類型            (1-上傳|2-下載)
SourceAddressType       UINT8         1       源地址類型          {4-4/IPV4 | 16-16/IPV6}
SourceAddress           IPV4|IPV6     U       源地址
DestinationAddressType  UINT8         1       目的地址類型        {4-4/IPV4 | 16-16/IPV6}
DestinationAddress      IPV4|IPV6     U       目的地址
SourcePort              UINT16        2       源端口
DestinationPort         UINT16        2       目的端口
FileNameLength          UINT8         1       文件名長度          N
FileName                CHAR[]        N       文件名
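The bracket conventions of the Constraint column can be summarised with a small lookup sketch. The Python 3 function below, `parse_constraint`, is hypothetical and simpler than the script's own handling: it only looks at the first character of the column, whereas the script searches for each bracket pair anywhere in the text. It returns the bracket kind and a mapping from numeric id to name.

```python
CLOSERS = {"(": ")", "[": "]", "<": ">", "{": "}"}


def parse_constraint(text):
    """Return (opening bracket, {id: name}) for e.g. '{4-4/IPV4 | 16-16/IPV6}'."""
    text = text.strip()
    if not text or text[0] not in CLOSERS:
        return None, {}                      # free-text remark, not a constraint
    end = text.rfind(CLOSERS[text[0]])       # last closer, so nested {..} stay inside
    if end == -1:
        return None, {}
    table = {}
    for entry in text[1:end].split("|"):
        ident, _, name = entry.strip().partition("-")
        table[int(ident.strip(), 0)] = name.strip()   # base 0 accepts 0x... and decimal
    return text[0], table


kind, table = parse_constraint("{4-4/IPV4 | 16-16/IPV6}")
length, field_type = table[16].split("/")    # {} entries carry length/type
print(kind, table, length, field_type)
```

With the `{}` form, the looked-up name splits into the length and type of the next field, which is how a `U` row such as `SourceAddress` gets resolved.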
