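Background:
I recently needed to parse some tree-structured debug trace text. To make it easier to read, I wanted to convert it into a.b.c form.
Each level of nesting between a parent node and its child is marked by one Tab, and a leaf node is a line that starts with ptr (ignoring the leading Tabs).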
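Core idea:
First find a leaf node, then walk backwards through the preceding lines to find each parent in turn (a parent has one Tab fewer than the current node). A "}" means the tree has ended.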
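The backward walk described above can be sketched as follows. This is only a minimal illustration, not the program from this post: `path_for_leaf`, `depth`, `value` and the shortened `sample` trace are hypothetical names and data, and the sketch assumes the walk can stop as soon as the top-level node (zero Tabs) has been found.

```python
def path_for_leaf(lines, leaf_index):
    """Build the dotted path for the leaf at lines[leaf_index] by walking upwards."""

    def depth(line):
        # Number of leading Tab characters = nesting level of the line.
        return len(line) - len(line.lstrip("\t"))

    def value(line):
        # Text inside the first [...] pair on the line.
        return line[line.index("[") + 1:line.index("]")]

    parts = [value(lines[leaf_index])]
    wanted = depth(lines[leaf_index]) - 1  # depth of the next parent to look for
    i = leaf_index - 1
    # Assumption: stop once the zero-Tab parent was found or the input is exhausted.
    while i >= 0 and wanted >= 0:
        line = lines[i]
        if depth(line) == wanted and line.lstrip("\t").startswith("name"):
            parts.append(value(line))
            wanted -= 1
        i -= 1
    parts.reverse()
    return ".".join(parts)


# Hypothetical, shortened trace used only for this illustration.
sample = [
    "name: [1] {",
    "\tname:[11] {",
    "\t\tname: [111] {",
    "\t\t\tptr -- [1111]--[value0]",
    "\t\t}",
    "\t\tname: [112] {",
    "\t\t\tptr -- [1121]--[value1]",
    "\t\t}",
    "\t}",
    "}",
]
print(path_for_leaf(sample, 3))  # 1.11.111.1111
print(path_for_leaf(sample, 6))  # 1.11.112.1121
```

Because each parent is matched by its exact Tab count, sibling blocks that were already closed (such as [111] when resolving the second leaf) are skipped automatically.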
To simulate a debug trace, create a text file named 1.txt.
Its content is as follows:
service[hi]
name: [1] {
	name:[11] {
		name: [111] {
			ptr -- [1111]--[value0]
			ptr -- [1112]--[value1]
		}
		name: [112] {
			name: [1121] {
				ptr -- [111211]--[value2]
			}
		}
	}
	name:[12] {
		ptr -- [121]--[value3]
	}
	name:[13] {
		ptr -- [131]--[value4]
	}
}
service[Jeff]
name: [1] {
	name:[11] {
		name: [111] {
			ptr -- [1111]--[value0]
			ptr -- [1112]--[value1]
		}
		name: [112] {
			name: [1121] {
				ptr -- [111211]--[value2]
			}
		}
	}
	name:[12] {
		ptr -- [121]--[value3]
	}
	name:[13] {
		ptr -- [131]--[value4]
	}
}
The parsing program is as follows:
1. common.py
'''
Created on 2012-5-26
author: Jeff
'''

def getValue(string, key1, key2):
    """
    get the value between key1 and key2 in string
    """
    index1 = string.find(key1)
    index2 = string.find(key2)
    value = string[index1 + 1 :index2]
    return value

def getFiledNum(string, key, begin):
    """
    get the number of key in string from begin position
    """
    keyNum = 0
    start = begin
    while True:
        index = string.find(key, start)
        if index == -1:
            break
        keyNum = keyNum + 1
        start = index + 1
    return keyNum
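To show what the two helpers return for a typical trace line, here is a small self-contained sketch. `get_value` and `count_key` are hypothetical stand-ins written for this illustration; they are meant to mirror getValue and getFiledNum, not replace them.

```python
def get_value(string, key1, key2):
    """Stand-in for getValue: text between the first key1 and the first key2."""
    return string[string.find(key1) + 1:string.find(key2)]


def count_key(string, key, begin=0):
    """Stand-in for getFiledNum: number of occurrences of key from position begin."""
    return string.count(key, begin)


line = "\t\t\tptr -- [1111]--[value0]\n"
print(get_value(line, "[", "]"))  # 1111
print(count_key(line, "\t"))      # 3
```

Note that only the first bracket pair is used, so for a leaf line the node id (1111) is returned and the trailing [value0] is ignored. The Tab count covers the whole line from `begin`, which equals the nesting depth only as long as Tabs appear nowhere but at the start of the line.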
2. main.py
'''
Created on 2012-5-26
author: Jeff
'''
import common
import linecache

fileName = "1.txt"
fileNameWrite = "result.txt"
leafNode = "ptr"
curLine = 0
nextLine = 0

f = open(fileName, 'r')
fw = open(fileNameWrite, 'w')

# read line
while True:
    data = f.readline()
    if not data:
        break
    curLine = curLine + 1

    # service header: copy it to the output unchanged
    if data.startswith("service"):
        index = data.find('\n')
        print(data[0:index])
        fw.write(data[0:index] + '\n')
        continue

    # find the leafNode
    if data.find(leafNode) != -1:
        nextLine = curLine + 1
        # print("data is %s, current line is %d, next line is %d." % (data, curLine, nextLine))

        # value of leaf node
        value = common.getValue(data, '[', ']')
        string = value
        # print("value of leaf node is %s" % value)

        # get the number of tab
        tabNum = common.getFiledNum(data, '\t', 0)
        # print("Tab number is %d" % tabNum)

        # i for read previous line
        # j for create prefix
        i = curLine - 1
        j = tabNum - 1
        while True:
            prefix = '\t' * j + 'name'
            # get previous line (linecache returns '' for a line number below 1)
            preline = linecache.getline(fileName, i)
            # print("previous line is %s" % preline)
            # stop at a line starting with "{", or once we run past the first line
            if not preline or preline.startswith("{"):
                break
            if preline.startswith(prefix):
                # print("this line starts with prefix")
                value = common.getValue(preline, '[', ']')
                string = value + "." + string
                i = i - 1
                j = j - 1
            else:
                i = i - 1
        print(string)
        fw.write(string + '\n')

fw.close()
f.close()
Parsed output, result.txt:
service[hi]
1.11.111.1111
1.11.111.1112
1.11.111.1121.111211
1.12.121
1.13.131
service[jeff]
1.1.11.111.1111
1.1.11.111.1112
1.1.11.111.1121.111211
1.1.12.121
1.1.13.131
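Optimizations:
1. Replace the string concatenation with the form all = '%s%s%s%s' % (str0, str1, str2, str3), or with ''.join.
2. Collect the content to be written in a list, and write it all at once at the end with f.writelines(list).
3. Algorithm optimization: see my blog post 《Python 解析樹狀結構文件(算法優化)》 ("Parsing tree-structured files in Python (algorithm optimization)").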
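The three points above can be combined in one short sketch. This is an assumption about one possible approach, not the algorithm from the referenced post: it reads the trace once, keeps the current chain of parents on a stack instead of re-reading earlier lines, builds each path with '.'.join, and collects the output in a list for a single writelines call. `parse_trace` and the shortened `sample` are hypothetical.

```python
import sys


def parse_trace(lines):
    """Return output lines: service headers as-is, one dotted path per leaf."""
    out = []    # everything to be written, emitted in one go at the end
    stack = []  # stack[d] holds the value of the open "name" node at depth d
    for raw in lines:
        line = raw.rstrip("\n")
        body = line.lstrip("\t")
        depth = len(line) - len(body)  # number of leading Tabs
        if body.startswith("service"):
            stack = []
            out.append(body + "\n")
        elif body.startswith("name"):
            # Forget nodes at this depth or deeper, then record this node.
            del stack[depth:]
            stack.append(body[body.index("[") + 1:body.index("]")])
        elif body.startswith("ptr"):
            leaf = body[body.index("[") + 1:body.index("]")]
            out.append(".".join(stack[:depth] + [leaf]) + "\n")
        # Lines holding only "{" or "}" carry no information here and are skipped.
    return out


# Hypothetical, shortened trace used only for this illustration.
sample = [
    "service[hi]\n",
    "name: [1] {\n",
    "\tname:[11] {\n",
    "\t\tptr -- [111]--[value0]\n",
    "\t}\n",
    "\tname:[12] {\n",
    "\t\tptr -- [121]--[value1]\n",
    "\t}\n",
    "}\n",
]
result = parse_trace(sample)
# Any file object works the same way, e.g. one returned by open("result.txt", "w").
sys.stdout.writelines(result)
```

Since the nesting depth comes from the Tab count alone, the sketch does not depend on whether the opening brace shares a line with the name or sits on a line of its own.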