Parsing XML

Date: 2019-11-11

Tags: parsing, xml. Category: XML


The XML format is as follows:

<?xml version="1.0" encoding="gb2312" ?>
<plugin>
    <args identity="s" size="120" extend="">
        <hsmdb size="120">
            <!--期貨內存表 begin-->
            <table name="fuexchfare" logic_name="usersvr" type="1" note="1整表，0部分">
                <statement note="期貨交易所費用表">
                    <select>*</select>
                    <from>fuexchfare</from>
                </statement>
                <indexs note="idx_bkfuexchfare">
                    <index name="futu_code" type="2"/>
                    <index name="futucode_type" type="2"/>
                    <index name="futu_exch_type" type="2"/>
                    <index name="underlying_code" type="2"/>
                    <index name="fopt_type" type="2"/>
                    <index name="hedge_type" type="2"/>
                    <index name="entrust_bs" type="2"/>
                    <index name="futufare_type" type="2"/>
                </indexs>
            </table>
        </hsmdb>
        <syncs note="緩存同步配置">
            <target as_group_name="AS_ALL" first="0" last="1"/>
            <target as_group_name="AS_QUERY" first="0" last="1"/>
            <target as_group_name="AS_TRADE" first="0" last="1"/>
            <target as_group_name="AS_TRANS" first="0" last="0"/>
            <target as_group_name="AS_SETT_CALC" first="0" last="0"/>
        </syncs>
    </args>
</plugin>

The goal is to extract the Chinese name and the English name of each in-memory table.

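The first problem is encoding. Python 2 handles Chinese encodings very poorly. The workaround used here: open the file in UE (the UltraEdit editor), change encoding="gb2312" to encoding="utf-8", then use Save As and choose UTF-8 as the file encoding. After that, the file can be parsed in code.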
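
The same conversion can also be done in code instead of by hand in an editor. The following is a minimal sketch, assuming Python 3 and assuming the file really is GB2312-encoded; the helper name gb2312_xml_to_utf8 and the sample data are made up for illustration and are not part of the original script.

```python
import re
import xml.etree.ElementTree as ET


def gb2312_xml_to_utf8(raw):
    """Return GB2312-encoded XML bytes re-encoded as UTF-8.

    The encoding named in the XML declaration is rewritten to match.
    (Hypothetical helper, not part of the original script.)
    """
    text = raw.decode("gb2312")
    # Touch only the encoding="..." part of the leading <?xml ... ?> declaration.
    text = re.sub(
        r'(<\?xml[^>]*?encoding=")[^"]*(")',
        r"\g<1>utf-8\g<2>",
        text,
        count=1,
    )
    return text.encode("utf-8")


# Tiny in-memory sample (illustrative data, not the real file).
sample = (
    '<?xml version="1.0" encoding="gb2312" ?>'
    '<statement note="期货交易所费用表"/>'
).encode("gb2312")

converted = gb2312_xml_to_utf8(sample)
note = ET.fromstring(converted).get("note")
```

For a real file, the bytes would be read in binary mode, passed through the helper, and written back out in binary mode, which mirrors the manual "edit the declaration, then Save As UTF-8" step.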

The parsing code is as follows:

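# -*- coding: utf-8 -*-
# ***************************************************************
# Python 2 script. The XML file was originally encoded as gb2312;
# re-saving it as UTF-8 in UE fixes the garbled Chinese text.
# ***************************************************************
import sys
import xml.etree.ElementTree as ET

# Python 2 only: make implicit str/unicode conversions use UTF-8 so the
# Chinese attribute values can be printed without a UnicodeEncodeError.
reload(sys)
sys.setdefaultencoding('utf-8')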

tree = ET.parse('F:/Python/hsmdb_as2.xml')
root = tree.getroot()
for child in root:
	for child1 in child:
		if child1.tag == 'hsmdb':
			for child2 in child1:
				if child2.tag == 'table':
					for child3 in child2:
						if child3.tag == 'statement':
							for child4 in child3:
								if child4.tag == 'from':
									print child4.text,
									print (child3.attrib)['note'],
									print (child2.attrib)['logic_name']
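
The nested loops above walk plugin -> args -> hsmdb -> table -> statement -> from one level at a time. As a rough alternative, the same traversal can be written with ElementTree's path syntax. This is only a sketch, assuming Python 3; the function name table_names and the inline SAMPLE string are illustrative, not taken from the original script.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the structure shown earlier (illustrative sample only).
SAMPLE = """
<plugin>
    <args identity="s" size="120" extend="">
        <hsmdb size="120">
            <table name="fuexchfare" logic_name="usersvr" type="1">
                <statement note="期货交易所费用表">
                    <select>*</select>
                    <from>fuexchfare</from>
                </statement>
            </table>
        </hsmdb>
    </args>
</plugin>
"""


def table_names(root):
    """Yield (english_name, chinese_name, logic_name) for every table.

    Mirrors the nested loops: any child of the root, then hsmdb, then table,
    then each statement and each from element inside it.
    """
    for table in root.iterfind("*/hsmdb/table"):
        for statement in table.iterfind("statement"):
            for source in statement.iterfind("from"):
                yield source.text, statement.get("note"), table.get("logic_name")


rows = list(table_names(ET.fromstring(SAMPLE)))
for english, chinese, logic in rows:
    print(english, chinese, logic)
```

Returning tuples from a generator rather than printing inside the loop makes it easier to reuse the result, for example to write it to a file or compare it against another configuration.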

That completes the parsing.
