The XML format is as follows:
<?xml version="1.0" encoding="gb2312" ?>
<plugin>
  <args identity="s" size="120" extend="">
    <hsmdb size="120">
      <!--期貨內存表 begin-->
      <table name="fuexchfare" logic_name="usersvr" type="1" note="1整表,0部分">
        <statement note="期貨交易所費用表">
          <select>*</select>
          <from>fuexchfare</from>
        </statement>
        <indexs note="idx_bkfuexchfare">
          <index name="futu_code" type="2"/>
          <index name="futucode_type" type="2"/>
          <index name="futu_exch_type" type="2"/>
          <index name="underlying_code" type="2"/>
          <index name="fopt_type" type="2"/>
          <index name="hedge_type" type="2"/>
          <index name="entrust_bs" type="2"/>
          <index name="futufare_type" type="2"/>
        </indexs>
      </table>
    </hsmdb>
    <syncs note="緩存同步配置">
      <target as_group_name="AS_ALL" first="0" last="1"/>
      <target as_group_name="AS_QUERY" first="0" last="1"/>
      <target as_group_name="AS_TRADE" first="0" last="1"/>
      <target as_group_name="AS_TRANS" first="0" last="0"/>
      <target as_group_name="AS_SETT_CALC" first="0" last="0"/>
    </syncs>
  </args>
</plugin>
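The goal is to parse out the Chinese name and the English name of each in-memory table. In the sample above, the English name is the text of the <from> element (fuexchfare) and the Chinese name is the note attribute of its <statement> element.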
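First comes the encoding problem. Python 2 handles Chinese encodings very poorly. The workaround used here: open the file in UE (UltraEdit), change encoding="gb2312" to encoding="utf-8" in the XML declaration, then use Save As and choose UTF-8 as the file encoding. With that done, the parsing code can be written.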
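The manual re-save can also be done in code. The following is only a minimal sketch, written for Python 3 rather than the Python 2 used in this post. The function name `parse_gb_xml` and the `source_encoding` default are made up for illustration. It assumes the file really is GB2312/GBK encoded and decodes with `gbk`, which is a superset of `gb2312`.

```python
import re
import xml.etree.ElementTree as ET


def parse_gb_xml(raw_bytes, source_encoding="gbk"):
    """Decode GB-encoded XML bytes, relabel them as UTF-8, and parse.

    Hypothetical helper: the name and default encoding are assumptions.
    """
    text = raw_bytes.decode(source_encoding)
    # Rewrite only the encoding in the XML declaration so that it matches
    # the UTF-8 bytes handed to the parser below.
    text = re.sub(
        r'(<\?xml[^>]*encoding=)["\'][^"\']*["\']',
        r'\1"utf-8"',
        text,
        count=1,
    )
    return ET.fromstring(text.encode("utf-8"))
```

For a file on disk, read it in binary mode (`open(path, "rb").read()`) and pass the bytes in. The original file stays untouched, so no edited copy has to be kept around.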
The parsing code is as follows:
# -*- coding: utf-8 -*-
# ***************************************************************
# The XML was originally encoded as GBK; re-saving it as UTF-8
# with UE fixes the garbled characters.
# ***************************************************************
import sys
import xml.etree.ElementTree as ET

# Python 2 only: make UTF-8 the default codec so the Chinese text prints
reload(sys)
sys.setdefaultencoding('utf-8')

tree = ET.parse('F:/Python/hsmdb_as2.xml')
root = tree.getroot()  # <plugin>
for child in root:  # <args>
    for child1 in child:  # <hsmdb>, <syncs>
        if child1.tag == 'hsmdb':
            for child2 in child1:
                if child2.tag == 'table':
                    for child3 in child2:
                        if child3.tag == 'statement':
                            for child4 in child3:
                                if child4.tag == 'from':
                                    # English name, Chinese name, logic name
                                    print child4.text,
                                    print child3.attrib['note'],
                                    print child2.attrib['logic_name']
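The nested loops can be flattened with the path support built into ElementTree. This is only a sketch of the same lookup, written for Python 3. The function name `list_tables` and the inline sample XML are assumptions for illustration, and it assumes the root element is <plugin> as in the sample file.

```python
import xml.etree.ElementTree as ET


def list_tables(root):
    """Return (english_name, chinese_name, logic_name) for each table.

    Hypothetical helper; `root` is assumed to be the <plugin> element.
    """
    results = []
    # Same path the nested loops walk: plugin/args/hsmdb/table
    for table in root.findall("args/hsmdb/table"):
        for statement in table.findall("statement"):
            for from_node in statement.findall("from"):
                results.append((
                    from_node.text,           # English table name
                    statement.get("note"),    # Chinese table name
                    table.get("logic_name"),  # logic name
                ))
    return results


# Trimmed-down sample in the same shape as the file shown earlier
sample = (
    '<plugin><args><hsmdb>'
    '<table name="fuexchfare" logic_name="usersvr">'
    '<statement note="期貨交易所費用表"><from>fuexchfare</from></statement>'
    '</table>'
    '</hsmdb></args></plugin>'
)
print(list_tables(ET.fromstring(sample)))
```

Returning tuples instead of printing makes it easy to reuse the result, for example to write the names out to another file.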
That completes the code.