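Everyone knows what XML is and what its format looks like. Knowing how to use XML in depth is another matter. XML is a self-describing document format; its job is to carry content, and it has nothing to do with presentation. How to extract the data in an XML document and present it sensibly is therefore the main part of XML programming. This article surveys XML's features in breadth.
XML has a large body of official documents, specs, and tutorials. But they are too specialized and formal, hard to read, long on text and short on examples, and scattered across a wide range of topics. So there is room for a short article that describes the technologies around XML in plain language and from a high level, with a few small examples. That is why I wrote this article; writing it was also a way to summarize what I have learned.
The code used in this article is organized as shown in the figure below:
First, here is the XML example this article uses. All the later code is based on it.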
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="bookStore.xsl"?>
<!DOCTYPE bookStore PUBLIC "bookStore.dtd" "bookStore.dtd">
<bookStore name="java" xmlns="http://joey.org/bookStore" xmlns:audlt="http://japan.org/book/audlt" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="bookStore.xsd">
  <keeper>
    <name>Joey</name>
  </keeper>
  <books>
    <book id="1">
      <title>XML</title>
      <author>Steve</author>
    </book>
    <book id="2">
      <title>JAXP</title>
      <author>Bill</author>
    </book>
    <book id="3" audlt:color="yellow">
      <audlt:age> >18 </audlt:age>
      <title>Love</title>
      <author>teacher</author>
    </book>
  </books>
</bookStore>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root-element SYSTEM "filename">
<?xml-stylesheet type="text/css" href="cd_catalog.css"?> or <?xml-stylesheet type="text/xsl" href="simple.xsl"?>
<note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd"> ... </note>
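The encoding declaration names the character encoding used to store the XML. It tells the parser which encoding to use when decoding the document. For international use, choose UTF-8. For plain English text, UTF-8 needs only one byte per character, so the XML will not grow too large.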
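The size claim is easy to check. The sketch below is only an illustration; the class name Utf8SizeDemo and the sample strings are made up for this article. It counts the bytes a string occupies once it is encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the article's project.
class Utf8SizeDemo {

    /** Returns the number of bytes the text occupies when encoded as UTF-8. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // ASCII-only markup: one byte per character.
        System.out.println(utf8Length("<title>XML</title>"));
        // A single CJK character (U+66F8) takes three bytes.
        System.out.println(utf8Length("\u66f8"));
    }
}
```

So English-only markup costs one byte per character, while CJK text costs more per character in UTF-8.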
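Namespace syntax has a declaration part, either a default namespace xmlns="<URL>" or a prefixed namespace xmlns:prefix="http://<namespace specification URL>", and a usage part, <prefix:tag> or <tag prefix:attr="">.
Namespaces solve two problems.
Namespaces are also used from Java and JavaScript. Note the following points: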
document.getElementsByTagNameNS("http://japan.org/book/audlt", "age"); document.getElementsByTagName("audlt:age");
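The same two lookups can be tried from Java. This is a minimal sketch, assuming the JDK's default DOM parser; the class name NamespaceLookupDemo and the trimmed-down XML string are made up for illustration. It also shows that DocumentBuilderFactory is not namespace-aware unless asked, in which case the namespace-based lookup finds nothing.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the XML is a cut-down version of the article's example.
class NamespaceLookupDemo {

    static final String XML =
        "<bookStore xmlns=\"http://joey.org/bookStore\""
      + " xmlns:audlt=\"http://japan.org/book/audlt\">"
      + "<book id=\"3\"><audlt:age>18</audlt:age></book></bookStore>";

    /** Parses the XML string, with or without namespace awareness. */
    static Document parse(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(namespaceAware); // the factory default is false
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    /** Lookup by namespace URI + local name. */
    static int countByNamespace(Document doc) {
        return doc.getElementsByTagNameNS("http://japan.org/book/audlt", "age").getLength();
    }

    /** Lookup by qualified name, prefix included. */
    static int countByQualifiedName(Document doc) {
        return doc.getElementsByTagName("audlt:age").getLength();
    }

    public static void main(String[] args) throws Exception {
        Document aware = parse(XML, true);
        System.out.println(countByNamespace(aware) + " " + countByQualifiedName(aware));
    }
}
```

The qualified-name lookup depends on the prefix the document happens to use, so the namespace-based lookup is usually the more robust choice.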
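An XML document is validated against a DTD or an XSD. These are the two schema languages for XML. XSD is newer than DTD and more capable.
The XML in this article declares a reference to a DTD, so an XML parser will load the DTD automatically and validate the XML against it. Two preconditions must hold for the parser: first, validation mode is turned on; second, the parser knows where to load the DTD from. The XML parser can be JavaScript, Java, or a browser. The load location is determined from the PUBLIC ID or the SYSTEM ID. Consider the following declaration: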
<!DOCTYPE bookStore SYSTEM "bookStore.dtd">
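The declaration above has no PUBLIC ID, only a SYSTEM ID. The SYSTEM ID resolves to the XML document's current path + "/bookStore.dtd", so the system ID is a path relative to the XML document.
Declaring a PUBLIC ID: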
<!DOCTYPE bookStore PUBLIC "bookStore.dtd" "bookStore.dtd">
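Here the PUBLIC ID is also "bookStore.dtd". The parser then tries to load the DTD file from these two IDs and throws an exception if it cannot be loaded. In Java, we can customize where the DTD is found by implementing the EntityResolver interface. See the Java part for details.
The DTD used in this article is: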
<!ELEMENT bookStore (keeper, books)>
<!ATTLIST bookStore name CDATA #REQUIRED>
<!ELEMENT keeper (name)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT books (book)>
<!ELEMENT book (title, author)>
<!ATTLIST book id ID #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
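Validating XML with XSD only requires an XSD definition file and turning on the parser's XSD validation. How to do this is shown in the Java code later. The XSD used in this article is: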
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="bookStore" type="bookStoreType" />
  <xsd:complexType name="bookStoreType">
    <xsd:sequence>
      <xsd:element name="keeper" type="keeperType"></xsd:element>
      <xsd:element name="books" type="booksType"></xsd:element>
    </xsd:sequence>
    <xsd:attribute name="name" type="xsd:string"></xsd:attribute>
  </xsd:complexType>
  <xsd:complexType name="keeperType">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"></xsd:element>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="booksType">
    <xsd:sequence>
      <xsd:element name="book" type="bookType"></xsd:element>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="bookType">
    <xsd:sequence>
      <xsd:element name="title" type="xsd:string"></xsd:element>
      <xsd:element name="author" type="xsd:string"></xsd:element>
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:int"></xsd:attribute>
  </xsd:complexType>
</xsd:schema>
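Besides enabling validation on a parser, the JDK also offers a standalone javax.xml.validation API. The sketch below is one possible way to check a document against an XSD, not the approach used by this article's own classes. The class name XsdValidationDemo and the tiny single-element schema are assumptions made to keep the example self-contained.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo class with a cut-down schema for a single <book> element.
class XsdValidationDemo {

    static final String XSD =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xsd:element name=\"book\"><xsd:complexType><xsd:sequence>"
      + "<xsd:element name=\"title\" type=\"xsd:string\"/>"
      + "<xsd:element name=\"author\" type=\"xsd:string\"/>"
      + "</xsd:sequence><xsd:attribute name=\"id\" type=\"xsd:int\"/>"
      + "</xsd:complexType></xsd:element></xsd:schema>";

    /** Returns true when the XML string is valid against the schema above. */
    static boolean isValid(String xml) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<book id=\"1\"><title>XML</title><author>Steve</author></book>"));
        System.out.println(isValid("<book id=\"x\"><title>XML</title><author>Steve</author></book>"));
    }
}
```

A Schema object is thread-safe and can be compiled once and reused; a Validator is not, so create one per validation or per thread.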
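As the following snippet shows, a stylesheet can transform XML into other formats such as HTML or plain text. The stylesheet can be CSS or XSL.
<?xml-stylesheet type="text/xsl" href="bookStore.xsl"?>
Mainstream browsers all support this kind of transformation. Besides the automatic transformation, we can also control the transformation from code: with Java on the server side, or with JavaScript in the front end. Both are shown in code later. When writing XSL, pay close attention to namespaces: the XPath expressions must match the namespaces of the document. The XSL I use is: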
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:b="http://joey.org/bookStore" xmlns:a="http://japan.org/book/audlt">
  <xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"></xsl:output>
  <xsl:template match="/">
    <html>
      <body>
        <h2>Book Store&lt;&lt;<xsl:value-of select="/b:bookStore/@name"></xsl:value-of>&gt;&gt;</h2>
        <div>
          There are <xsl:value-of select="count(/b:bookStore/b:books/b:book)"></xsl:value-of> books.
        </div>
        <div>
          Keeper of this store is <xsl:value-of select="/b:bookStore/b:keeper/b:name"></xsl:value-of>
        </div>
        <xsl:for-each select="/b:bookStore/b:books/b:book">
          <div>
            Book:
            <span>title=<xsl:value-of select="b:title"></xsl:value-of></span>;
            <span>author=<xsl:value-of select="b:author"></xsl:value-of></span>
            <xsl:if test="@a:color">
              <span style="color:yellow">H Book, require age<xsl:value-of select="a:age"></xsl:value-of></span>
            </xsl:if>
          </div>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
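JavaScript support for XML differs between IE and Firefox/Chrome. IE uses ActiveXObject to create an XML instance, while Firefox, Chrome, and the other mainstream browsers follow the W3C specification. The resulting XML document can be manipulated through its DOM methods, or with the help of frameworks such as dojo or jQuery.
The following example uses JavaScript to apply an XSLT transformation to XML and produce HTML.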
// Parse an XML string into a DOM document.
function createXMLDoc(xmlStr) {
  var xmlDoc;
  if (window.DOMParser) { // Firefox, Chrome
    var parser = new DOMParser();
    xmlDoc = parser.parseFromString(xmlStr, "text/xml");
  } else if (window.ActiveXObject) { // Internet Explorer
    xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.async = false;
    xmlDoc.loadXML(xmlStr);
  }
  return xmlDoc;
}

// Apply the XSL document to the XML document.
function transform(xmlDoc, xslDoc) {
  if (window.XSLTProcessor) { // Chrome, Firefox
    var xslp = new XSLTProcessor();
    xslp.importStylesheet(xslDoc);
    return xslp.transformToFragment(xmlDoc, document);
  } else if (window.ActiveXObject) { // IE
    return xmlDoc.transformNode(xslDoc);
  }
}

var xmlStr = [
  '<bookStore name="java" xmlns="http://joey.org/bookStore" xmlns:audlt="http://japan.org/book/audlt" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="bookStore.xsd">',
  '<keeper><name>Joey</name></keeper>',
  '<books>',
  '<book id="1"> <title>XML</title><author>Steve</author></book>',
  '<book id="2"><title>JAXP</title> <author>Bill</author></book>',
  '<book id="3" audlt:color="yellow"><audlt:age> >18 </audlt:age> <title>Love</title><author>teacher</author></book>',
  '</books></bookStore>'
].join('');

var xslStr = [
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:b="http://joey.org/bookStore" xmlns:a="http://japan.org/book/audlt">',
  '<xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes" />',
  '<xsl:template match="/">',
  '<html>',
  '<body>',
  '<h2>Book Store&lt;&lt;<xsl:value-of select="/b:bookStore/@name"/>&gt;&gt;</h2>',
  '<div>There are <xsl:value-of select="count(/b:bookStore/b:books/b:book)"/> books.</div>',
  '<div>Keeper of this store is <xsl:value-of select="/b:bookStore/b:keeper/b:name"/></div>',
  '<xsl:for-each select="/b:bookStore/b:books/b:book">',
  '<div>Book: ',
  '<span>title=<xsl:value-of select="b:title"/></span>;<span>author=<xsl:value-of select="b:author"/></span>',
  '<xsl:if test="@a:color">',
  '<span style="color:yellow">H Book, require age<xsl:value-of select="a:age"/></span>',
  '</xsl:if>',
  '</div>',
  '</xsl:for-each>',
  '</body>',
  '</html>',
  '</xsl:template>',
  '</xsl:stylesheet>'
].join('');

var xmlDoc = createXMLDoc(xmlStr);
var xslDoc = createXMLDoc(xslStr);
var dom = transform(xmlDoc, xslDoc);
console.log(dom.childNodes[0].outerHTML);
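Java's XML support is called JAXP (Java API for XML Processing). JAXP became a standard part of J2SE 1.4; since then, the JRE has shipped with its own XML processing libraries. JAXP also allows third-party XML parsers, each with its own strengths and weaknesses, and users can choose for themselves. But every parser must implement the interfaces JAXP defines. To master JAXP you need to know the following, all of which is described later.
I will not describe in prose how each interface and class is used; the JAXP libraries are introduced one by one through code and comments below. Before describing SAX, StAX, and DOM, a high-level comparison is worthwhile. What are the strengths and weaknesses of each parsing method, and how should you choose among them?
First, XML can be parsed with SAX, StAX, or DOM, and XML can be generated with StAX or DOM. XPath is a tool for querying a DOM. XSLT is a tool for transforming XML. See the figure below:
In terms of data structure, XML parsing falls into two categories: streaming and tree. Streaming covers SAX and StAX; tree means DOM. SAX and StAX both parse the XML sequentially and produce read events, and we get the content we want by handling those events. DOM loads the whole document into memory at once as a tree.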
Streaming VS DOM
Pull VS Push
| SAX | StAX | DOM |
API Type | Push, Streaming | Pull, Streaming | Tree, In memory |
Support XPath? | No | No | Yes |
Read XML | Yes | Yes | Yes |
Write XML | No | Yes | Yes |
CRUD | No | No | Yes |
Parsing Validation (DTD, XSD) | Yes | Optional (not supported by the JDK embedded parser) | Yes |
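The SAX architecture is illustrated by a diagram borrowed from Oracle's website.
SAXParser calls XMLReader. If you use SAXParser, you pass it a DefaultHandler, which implements the four handler interfaces in the diagram above. You can also use XMLReader directly and call its parse method, but you must set each handler before parsing. SAXParser follows an event-driven design: as it reads the bytes of the XML, it passes events to the handlers for processing.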
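The SAXParser plus DefaultHandler route can be sketched as follows. This is a minimal illustration, not one of this article's own classes: the name SaxTitleCollector and the sample XML are made up, and the handler overrides only the three callbacks it needs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects the text of every <title> element.
class SaxTitleCollector extends DefaultHandler {

    private final List<String> titles = new ArrayList<>();
    private StringBuilder text; // non-null only while inside a <title>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("title".equals(localName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks, so append.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(localName)) {
            titles.add(text.toString());
            text = null;
        }
    }

    /** Pushes the XML through a SAXParser and returns the collected titles. */
    static List<String> collectTitles(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // needed for localName to be populated
        SAXParser parser = spf.newSAXParser();
        SaxTitleCollector handler = new SaxTitleCollector();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collectTitles(
            "<books><book><title>XML</title></book><book><title>JAXP</title></book></books>"));
    }
}
```

The handler has to keep its own state (here the text field) between callbacks, which is the usual price of a push API.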
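The actual reading is done by XMLReader, and all events are produced by XMLReader. So to make a non-XML file look like XML, you only need to write your own XMLReader that emits fake events while reading the non-XML file; the handlers will then process the file as if it were XML. This mechanism is used in XSLT.
On simulating XML
SAX can make reading a non-XML file look like reading an XML file by constructing XML read events. It only requires a custom XMLReader.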
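ContentHandler handles the read events for the various kinds of XML data. The events include startDocument/endDocument, startElement/endElement, characters, ignorableWhitespace, and processingInstruction.
ErrorHandler handles the warnings and errors raised during XML parsing. It has three methods: warning(), error(), and fatalError(). warning and error deal with XML validation (DTD or XSD) problems. Such problems do not affect parsing, so you can swallow the resulting exception instead of rethrowing it, and parsing will not be interrupted. fatalError signals a structural (well-formedness) error in the XML. It cannot be suppressed: even if my handler does not throw, the parser will throw an exception itself.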
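The difference between recoverable validation errors and fatal errors can be seen with a small handler. This is a sketch under stated assumptions: the class name CollectingErrorHandler and the three sample documents (each with an inline DTD, so no external file is needed) are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical handler: records warnings and validation errors, rethrows fatal errors.
class CollectingErrorHandler implements ErrorHandler {

    static final String DTD =
        "<!DOCTYPE keeper [<!ELEMENT keeper (name)><!ELEMENT name (#PCDATA)>]>";
    static final String VALID = DTD + "<keeper><name>Joey</name></keeper>";
    static final String INVALID = DTD + "<keeper><nick>Joey</nick></keeper>";
    static final String MALFORMED = DTD + "<keeper><name>Joey</keeper>";

    final List<String> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        problems.add("warning: " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        // Validation error: record it and let parsing continue.
        problems.add("error: " + e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        // Well-formedness error: parsing cannot continue.
        throw e;
    }

    /** Parses with DTD validation on and returns the number of recorded problems. */
    static int countProblems(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setValidating(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        CollectingErrorHandler handler = new CollectingErrorHandler();
        reader.setErrorHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.problems.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countProblems(VALID));
        System.out.println(countProblems(INVALID));
    }
}
```

Parsing INVALID completes and leaves entries in problems, while parsing MALFORMED ends with a SAXException.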
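A DTD can contain ENTITY and NOTATION declarations, both of which are user-defined. The XML parser cannot interpret user-defined NOTATION declarations or unparsed ENTITY declarations, so it hands them to DTDHandler. DTDHandler has only two methods: notationDecl and unparsedEntityDecl. We implement them to check that our NOTATION declarations are correct.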
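A minimal sketch of a DTDHandler follows, assuming the JDK's default SAX parser. The class name DeclCollector, the notation gif, and the entity logo are made up for illustration. The handler only records what the parser reports; what to do with a notation is left to the application.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.DTDHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical handler that records NOTATION and unparsed ENTITY declarations.
class DeclCollector implements DTDHandler {

    // Inline DTD with one notation and one unparsed entity that refers to it.
    static final String SAMPLE =
        "<!DOCTYPE doc [<!ELEMENT doc EMPTY>"
      + "<!NOTATION gif SYSTEM \"image/gif\">"
      + "<!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>]><doc/>";

    final List<String> notations = new ArrayList<>();
    final List<String> unparsedEntities = new ArrayList<>();

    @Override
    public void notationDecl(String name, String publicId, String systemId) {
        notations.add(name);
    }

    @Override
    public void unparsedEntityDecl(String name, String publicId, String systemId,
                                   String notationName) {
        unparsedEntities.add(name + " -> " + notationName);
    }

    /** Parses the XML and returns the collector with everything the parser reported. */
    static DeclCollector collect(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        DeclCollector collector = new DeclCollector();
        reader.setDTDHandler(collector);
        reader.parse(new InputSource(new StringReader(xml)));
        return collector;
    }

    public static void main(String[] args) throws Exception {
        DeclCollector c = collect(SAMPLE);
        System.out.println(c.notations + " " + c.unparsedEntities);
    }
}
```

The parser never loads the unparsed entity itself (logo.gif here); it only reports the declaration.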
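The XML validation section mentioned locating the DTD. EntityResolver helps with that. It has a single method, resolveEntity(publicId, systemId), which the parser calls whenever it needs an external file. We can do some preprocessing in this method. The code is as follows: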
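import java.io.IOException;
import java.io.InputStream;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class MyEntityResolver implements EntityResolver {
    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
        // Serve the DTD from the classpath when the PUBLIC ID matches.
        if ("bookStore.dtd".equals(publicId)) {
            InputStream in = this.getClass().getResourceAsStream("/jaxp/resources/bookStore.dtd");
            InputSource is = new InputSource(in);
            return is;
        }
        // Returning null lets the parser resolve the system ID in its default way.
        return null;
    }
}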
Note how validation mode is turned on in the code. There are two ways to enable XSD.
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

public class MySAX {
    private SAXParser parser;

    public static void main(String[] args) throws Exception {
        new MySAX();
    }

    public MySAX() throws ParserConfigurationException, SAXException, IOException {
        // Use the "javax.xml.parsers.SAXParserFactory" system property to specify a parser:
        //   java -Djavax.xml.parsers.SAXParserFactory=yourFactoryHere [...]
        // If the property is not specified, the J2SE default parser is used.
        // The default is "com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl".
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);

        // Option 1: XSD as defined by JAXP 1.3 (Java 1.5).
        //SchemaFactory sf = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
        //spf.setSchema(sf.newSchema(this.getClass().getResource("/jaxp/resources/bookStore.xsd")));

        // Option 2: the old way defined by JAXP 1.2 (set on the parser after it is created).
        // parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage","http://www.w3.org/2001/XMLSchema");
        // parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaSource", new File("schema.xsd"));

        // XSD disabled, use DTD validation.
        spf.setValidating(true);
        this.parser = spf.newSAXParser();

        // You can parse XML with SAXParser directly, or use XMLReader.
        // SAXParser wraps and uses XMLReader internally.
        // XMLReader is used here.
        //this.parser.parse(InputStream, DefaultHandler);
        XMLReader reader = this.parser.getXMLReader();
        reader.setContentHandler(new MyContentHandler());
        reader.setDTDHandler(new MyDTDHandler());
        reader.setErrorHandler(new MyErrorHandler());
        reader.setEntityResolver(new MyEntityResolver());

        InputStream in = this.getClass().getResourceAsStream("/jaxp/resources/bookStore.xml");
        InputSource is = new InputSource(in);
        is.setEncoding("UTF-8");
        reader.parse(is);
    }
}
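The architecture of DOM parsing is illustrated by a diagram borrowed from Oracle.
For Java, the tree-style XML APIs include DOM, JDOM, and DOM4J. Some people think JDOM and DOM4J are just alternative implementations of DOM; that is wrong.
Once you have the DOM data model, you can find elements with the DOM traversal methods, or use XPath to query for specific elements. The key point to watch with XPath is the NamespaceContext. A DOM code example follows.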
import java.util.Iterator;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class MyDOM {
    public static void main(String[] args) throws Exception {
        new MyDOM();
    }

    public MyDOM() throws Exception {
        // Use the "javax.xml.parsers.DocumentBuilderFactory" system property to specify a parser:
        //   java -Djavax.xml.parsers.DocumentBuilderFactory=yourFactoryHere [...]
        // If the property is not specified, the J2SE default parser is used.
        // The default is "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl".
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setIgnoringComments(false);
        dbf.setNamespaceAware(true);
        dbf.setIgnoringElementContentWhitespace(true);

        // Option 1: XSD as defined by JAXP 1.3 (Java 1.5).
        // SchemaFactory sf = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
        // dbf.setSchema(sf.newSchema(this.getClass().getResource("/jaxp/resources/bookStore.xsd")));

        // Option 2: the old way defined by JAXP 1.2.
        // dbf.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage","http://www.w3.org/2001/XMLSchema");
        // dbf.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaSource", new File("schema.xsd"));
        // dbf.setSchema(schema);

        // XSD disabled, use DTD validation.
        dbf.setValidating(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        db.setErrorHandler(new MyErrorHandler());
        db.setEntityResolver(new MyEntityResolver());
        Document document = db.parse(this.getClass().getResourceAsStream("/jaxp/resources/bookStore.xml"));

        // Operate on the Document according to the DOM model.
        NodeList list = document.getElementsByTagNameNS("http://joey.org/bookStore", "book");
        System.out.println(list.item(2).getAttributes().item(0).getLocalName());
        // Note that if you don't specify the namespace, you need to use the qualified name.
        System.out.println(document.getElementsByTagName("audlt:age").item(0).getTextContent());

        // Use XPath to query the XML.
        XPathFactory xpf = XPathFactory.newInstance();
        XPath xp = xpf.newXPath();
        // A namespace context is needed so the prefixes in the expressions can be resolved.
        NamespaceContext nc = new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if (prefix.equals("b")) return "http://joey.org/bookStore";
                if (prefix.equals("a")) return "http://japan.org/book/audlt";
                return null;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                if (namespaceURI.equals("http://joey.org/bookStore")) return "b";
                if (namespaceURI.equals("http://japan.org/book/audlt")) return "a";
                return null;
            }

            @Override
            public Iterator getPrefixes(String namespaceURI) {
                return null;
            }
        };
        xp.setNamespaceContext(nc);
        System.out.println(xp.evaluate("/b:bookStore/@name", document));
        System.out.println(xp.evaluate("/b:bookStore/b:books/b:book[@id=3]/@a:color", document));
    }
}
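Compared with SAX, StAX code is simpler, and StAX can also write XML. However, the StAX specification does not make validation during parsing mandatory, so the StAX parser bundled with the JDK does not support parsing validation.
StAX has two APIs: the Cursor API (XMLStreamReader, XMLStreamWriter) and the Iterator API (XMLEventReader, XMLEventWriter). The Cursor API reads or writes like a cursor: we have to keep calling the XML writer and XML reader to read or write each piece of the XML, which tangles the application logic with the XML parsing layer and gets messy. The Iterator API separates the logic layer from the parsing layer by wrapping everything in events: all data is encapsulated in an event, and the two layers interact only through event objects, which gives loose coupling. This is my understanding: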
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.namespace.NamespaceContext;
import javax.xml.namespace.QName;
import javax.xml.stream.*;
import javax.xml.stream.events.XMLEvent;

public class MyStAX {
    public static void main(String[] args) throws Exception {
        coursorAPIReadWrite();
        eventAPIReadWrite();
    }

    // Use the cursor API to read and write XML.
    public static void coursorAPIReadWrite() throws Exception {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        // Set properties for validation, namespace...
        // But the JDK embedded StAX parser does not support validation.
        //xif.setProperty(XMLInputFactory.IS_VALIDATING, true);
        xif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, true);
        // Handle the external entity (the DTD).
        xif.setXMLResolver(new XMLResolver() {
            public Object resolveEntity(String publicID, String systemID, String baseURI, String namespace) throws XMLStreamException {
                // Null-safe comparison; load the resource through this class rather than Class.class.
                if ("bookStore.dtd".equals(publicID)) {
                    return MyStAX.class.getResourceAsStream("/jaxp/resources/bookStore.dtd");
                }
                return null;
            }
        });
        XMLOutputFactory xof = XMLOutputFactory.newInstance();
        // Namespace repairing can introduce bugs. Use it carefully.
        // xof.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES, true);
        InputStream sourceIn = MyStAX.class.getResourceAsStream("/jaxp/resources/bookStore.xml");
        OutputStream targetOut = System.out; //new FileOutputStream(new File("target.xml"));
        XMLStreamReader reader = xif.createXMLStreamReader(sourceIn);
        XMLStreamWriter writer = xof.createXMLStreamWriter(targetOut, reader.getEncoding());
        writer.writeStartDocument(reader.getEncoding(), reader.getVersion());
        while (reader.hasNext()) {
            int event = reader.next();
            switch (event) {
            case XMLStreamConstants.DTD:
                out(reader.getText());
                writer.writeCharacters("\n");
                writer.writeDTD(reader.getText());
                writer.writeCharacters("\n");
                break;
            case XMLStreamConstants.PROCESSING_INSTRUCTION:
                out(reader.getPITarget());
                writer.writeCharacters("\n");
                writer.writeProcessingInstruction(reader.getPITarget(), reader.getPIData());
                break;
            case XMLStreamConstants.START_ELEMENT:
                out(reader.getName());
                NamespaceContext nc = reader.getNamespaceContext();
                writer.setNamespaceContext(reader.getNamespaceContext());
                writer.setDefaultNamespace(nc.getNamespaceURI(""));
                writer.writeStartElement(reader.getPrefix(), reader.getLocalName(), reader.getNamespaceURI());
                for (int i = 0; i < reader.getAttributeCount(); i++) {
                    QName qname = reader.getAttributeName(i);
                    String name = qname.getLocalPart();
                    if (qname.getPrefix() != null && !qname.getPrefix().equals("")) {
                        //name = qname.getPrefix()+":"+name;
                    }
                    writer.writeAttribute(name, reader.getAttributeValue(i));
                }
                for (int i = 0; i < reader.getNamespaceCount(); i++) {
                    writer.writeNamespace(reader.getNamespacePrefix(i), reader.getNamespaceURI(i));
                }
                break;
            case XMLStreamConstants.ATTRIBUTE:
                out(reader.getText());
                break;
            case XMLStreamConstants.SPACE:
                out("SPACE");
                writer.writeCharacters("\n");
                break;
            case XMLStreamConstants.CHARACTERS:
                out(reader.getText());
                writer.writeCharacters(reader.getText());
                break;
            case XMLStreamConstants.END_ELEMENT:
                out(reader.getName());
                writer.writeEndElement();
                break;
            case XMLStreamConstants.END_DOCUMENT:
                writer.writeEndDocument();
                break;
            default:
                out("other");
                break;
            }
        }
        writer.close();
        reader.close();
    }

    // Use the iterator (event) API to read and write XML.
    public static void eventAPIReadWrite() throws Exception {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        xif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, true);
        // Handle the external entity (the DTD).
        xif.setXMLResolver(new XMLResolver() {
            public Object resolveEntity(String publicID, String systemID, String baseURI, String namespace) throws XMLStreamException {
                if ("bookStore.dtd".equals(publicID)) {
                    return MyStAX.class.getResourceAsStream("/jaxp/resources/bookStore.dtd");
                }
                return null;
            }
        });
        XMLOutputFactory xof = XMLOutputFactory.newInstance();
        InputStream sourceIn = MyStAX.class.getResourceAsStream("/jaxp/resources/bookStore.xml");
        OutputStream targetOut = System.out;
        XMLEventReader reader = xif.createXMLEventReader(sourceIn);
        XMLEventWriter writer = xof.createXMLEventWriter(targetOut);
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            out(event.getEventType());
            writer.add(event); // each event carries all of its data, so it can be forwarded as-is
        }
        reader.close();
        writer.close();
    }

    public static void out(Object o) {
        System.out.println(o);
    }
}
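So far we have looked at SAX, DOM, and StAX, all of which are ways to parse XML. SAX is only suitable for reading. DOM is the in-memory representation of the XML. StAX can parse and can also write out to the file system.
If you need to write a DOM from memory out to an XML file, or to transform an XML file into HTML or any other format, you need JAXP's XSLT support. The transformations it covers include:
The XSLT API comprises four packages:
From the above, JAXP can perform 4*4=16 kinds of transformation: (sax, sax), (sax, dom), (sax, stream), ...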
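One of those combinations, (dom, stream), is what writes an in-memory DOM out as XML text. The sketch below uses a Transformer created without a stylesheet, which simply copies the source to the result. The class name DomToStringDemo and the small keeper document are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo: DOMSource -> StreamResult with an identity transformer.
class DomToStringDemo {

    /** Builds <keeper><name>Joey</name></keeper> in memory. */
    static Document buildSample() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element keeper = doc.createElement("keeper");
        Element name = doc.createElement("name");
        name.setTextContent("Joey");
        keeper.appendChild(name);
        doc.appendChild(keeper);
        return doc;
    }

    /** Serializes the DOM to a string; no stylesheet means the output is a plain copy. */
    static String serialize(Document doc) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(buildSample()));
    }
}
```

Swapping StreamResult(out) for a StreamResult wrapping a File or OutputStream writes the same text to disk instead.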
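Going a step further, by combining the SAXSource -> DOMResult conversion with SAX's ability to simulate reading XML, XSLT can turn a non-XML file into a DOM. The code below includes this example, along with another one that transforms XML into HTML according to an XSL stylesheet.
Note that handling a DTD in XSLT takes a trick:
In the xml2html transformation, StreamSource would be the simplest to write, so why use SAXSource? Because the XML to be transformed references a DTD, and StreamSource cannot resolve that external reference here, which makes the Transformer throw a TransformerException saying the DTD file cannot be found. In this case we have to use a SAXSource and give it an XMLReader that can resolve the external DTD reference. With that, the transformation finally succeeds.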
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.*;
import org.xml.sax.helpers.AttributesImpl;

public class MyXSLT {
    TransformerFactory tff;

    public static void main(String[] args) throws Exception {
        MyXSLT xslt = new MyXSLT();
        xslt.xml2html();
        xslt.str2xml();
    }

    public MyXSLT() {
        tff = TransformerFactory.newInstance();
    }

    public void xml2html() throws Exception {
        Transformer tr = tff.newTransformer(new SAXSource(new InputSource(this.getClass().getResourceAsStream("/jaxp/resources/bookStore.xsl"))));
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser parser = spf.newSAXParser();
        // Give the reader a resolver so the external DTD reference can be found.
        parser.getXMLReader().setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
                if ("bookStore.dtd".equals(publicId)) {
                    InputStream in = this.getClass().getResourceAsStream("/jaxp/resources/bookStore.dtd");
                    InputSource is = new InputSource(in);
                    return is;
                }
                return null;
            }
        });
        Source source = new SAXSource(parser.getXMLReader(), new InputSource(this.getClass().getResourceAsStream("/jaxp/resources/bookStore.xml")));
        Result target = new StreamResult(System.out);
        tr.transform(source, target);
    }

    // "[joey,bill,cat]" will be transformed to
    // <test><name>joey</name><name>bill</name><name>cat</name></test>
    public void str2xml() throws Exception {
        final String[] names = new String[]{"joey", "bill", "cat"};
        Transformer tr = tff.newTransformer();
        // A fake XMLReader: it reads no XML at all and just emits SAX events for the array.
        Source source = new SAXSource(new XMLReader() {
            private ContentHandler handler;

            @Override
            public void parse(InputSource input) throws IOException, SAXException {
                // Pass an empty attribute list rather than null.
                Attributes noAttrs = new AttributesImpl();
                handler.startDocument();
                handler.startElement("", "test", "test", noAttrs);
                for (int i = 0; i < names.length; i++) {
                    handler.startElement("", "name", "name", noAttrs);
                    handler.characters(names[i].toCharArray(), 0, names[i].length());
                    handler.endElement("", "name", "name");
                }
                handler.endElement("", "test", "test");
                handler.endDocument();
            }

            @Override
            public void parse(String systemId) throws IOException, SAXException { }

            @Override
            public boolean getFeature(String name) throws SAXNotRecognizedException, SAXNotSupportedException { return false; }

            @Override
            public void setFeature(String name, boolean value) throws SAXNotRecognizedException, SAXNotSupportedException { }

            @Override
            public Object getProperty(String name) throws SAXNotRecognizedException, SAXNotSupportedException { return null; }

            @Override
            public void setProperty(String name, Object value) throws SAXNotRecognizedException, SAXNotSupportedException { }

            @Override
            public void setEntityResolver(EntityResolver resolver) { }

            @Override
            public EntityResolver getEntityResolver() { return null; }

            @Override
            public void setDTDHandler(DTDHandler handler) { }

            @Override
            public DTDHandler getDTDHandler() { return null; }

            @Override
            public void setContentHandler(ContentHandler handler) { this.handler = handler; }

            @Override
            public ContentHandler getContentHandler() { return handler; }

            @Override
            public void setErrorHandler(ErrorHandler handler) { }

            @Override
            public ErrorHandler getErrorHandler() { return null; }
        }, new InputSource());
        Result target = new StreamResult(System.out);
        tr.transform(source, target);
    }
}