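A while back, when I needed to parse XML, I searched through quite a few tutorials online, but in the end none of them got me to what I needed.
So I had no choice but to read the source myself, and in sax's __init__.py I found this piece of code: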
    def parse(source, handler, errorHandler=ErrorHandler()):
        parser = make_parser()
        parser.setContentHandler(handler)
        parser.setErrorHandler(errorHandler)
        parser.parse(source)

As you can see, running an XML parse needs at least two arguments: source, the path to the source file, and handler, an instantiated handler object.
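To make those two arguments concrete, here is a minimal sketch that is not from the original post. It assumes a made-up handler class, `TagCounter`, and a made-up helper, `count_tags`, and it uses `xml.sax.parseString` (which takes the same kind of handler as `parse`, but reads from bytes in memory) so that no file is needed.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler for illustration: counts how often each tag opens."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.counts = {}

    def startElement(self, name, attrs):
        # The parser calls this for every opening tag and supplies name/attrs.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(xml_bytes):
    """Parse XML given as bytes and return a {tag name: count} dict."""
    handler = TagCounter()                   # the instantiated handler object
    xml.sax.parseString(xml_bytes, handler)  # source first, handler second
    return handler.counts


if __name__ == "__main__":
    sample = b"<root><item/><item/><note>hi</note></root>"  # invented sample data
    print(count_tags(sample))
```

The parser drives everything: the code never calls `startElement` itself, it only hands over the source and the handler.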
Let's work through an example. (To be clear up front: I found this example online; I did not write it myself.)
    <bookstore>
        <book category="CHILDREN">
            <title>Harry Potter</title>
            <author>J K. Rowling</author>
            <year>2005</year>
            <price>29.99</price>
        </book>
        <book category="WEB">
            <title>Learning XML</title>
            <author>Erik T. Ray</author>
            <year>2003</year>
            <price>39.95</price>
        </book>
    </bookstore>
The role of each step is explained in turn below:
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # @Time    : 2018/5/30 22:43
    # @Author  : Adong_Chen
    from xml import sax


    class TestHandler(sax.ContentHandler):  # our own handler class, inheriting from sax.ContentHandler
        def __init__(self):
            # Both the parent class and the subclass need initializing (assigning variables and so on)
            sax.ContentHandler.__init__(self)
            self._content = ""
            self._tag = ""

        def startElement(self, name, attrs):
            # Runs when an opening <tag> is met; the parser supplies name and attrs
            # (this is really an override of the parent method)
            self._tag = name
            self._content = ""  # start collecting this element's text afresh
            if name == "bookstore":
                print("=========BOOKSTORE=========")
            if self._tag == "book":
                print("BOOK: " + attrs["category"])
                print("--------------------------")

        def endElement(self, name):
            # Runs when a closing </tag> is met; the parser supplies name (override)
            if name == "bookstore":
                print("=========BOOKSTORE=========")
            elif name == "title":
                print("Title: " + self._content)
            elif name == "author":
                print("Author: " + self._content)
            elif name == "year":
                print("Year: " + self._content)
            elif name == "price":
                print("Price: " + self._content)

        def characters(self, content):
            # Receives the text inside a tag; the parser may deliver it in several chunks, so append
            self._content += content


    if __name__ == "__main__":
        handler = TestHandler()            # instantiate the custom class
        sax.parse("Test2.xml", handler)    # parse the XML file
Running it produces the following output:
    =========BOOKSTORE=========
    BOOK: CHILDREN
    --------------------------
    Title: Harry Potter
    Author: J K. Rowling
    Year: 2005
    Price: 29.99
    BOOK: WEB
    --------------------------
    Title: Learning XML
    Author: Erik T. Ray
    Year: 2003
    Price: 39.95
    =========BOOKSTORE=========
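Printing from inside the callbacks is fine for a demo, but often you want the data back as Python objects. The following is a rough sketch of one way that could look, not part of the original example: `BookCollector` and `collect_books` are made-up names, and the handler gathers each book into a dict instead of printing it. It joins the text chunks passed to `characters`, because a SAX parser is allowed to deliver one piece of text in several calls.

```python
import xml.sax


class BookCollector(xml.sax.ContentHandler):
    """Hypothetical variant: stores each <book> as a dict instead of printing."""

    FIELDS = ("title", "author", "year", "price")

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.books = []        # finished books, in document order
        self._current = None   # the book being built, or None outside <book>
        self._chunks = []      # text pieces of the element being read

    def startElement(self, name, attrs):
        self._chunks = []      # forget text that belonged to earlier elements
        if name == "book":
            self._current = {"category": attrs.get("category")}

    def characters(self, content):
        self._chunks.append(content)

    def endElement(self, name):
        if self._current is None:
            return
        if name in self.FIELDS:
            self._current[name] = "".join(self._chunks).strip()
        elif name == "book":
            self.books.append(self._current)
            self._current = None


def collect_books(xml_bytes):
    """Parse bookstore XML given as bytes and return a list of book dicts."""
    handler = BookCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.books


if __name__ == "__main__":
    with open("Test2.xml", "rb") as f:  # the file name used in the example above
        for book in collect_books(f.read()):
            print(book)
```

All values come back as strings; converting `year` or `price` to numbers is left to the caller.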