Apache Commons Digester, Part 1 (Basics and Core API)

Date: 2019-11-13


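Preface

In many applications that need to process XML data, it is very useful to handle XML documents in an "event-driven" way: for example, triggering an "object creation" event (or a method call on an existing object) whenever a particular XML element is recognized.
Developers familiar with SAX (Simple API for XML) will recognize that Digester offers a higher-level, more developer-friendly interface on top of SAX events. It hides most of the detail of navigating the XML element hierarchy, so developers can concentrate on the processing to be performed.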

Basic steps for using Digester

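1. Create an instance of org.apache.commons.digester3.Digester. As a side note, once a parse has completed, and provided the same Digester object is not used from multiple threads, a previously created Digester instance can safely be reused. Reusing Digester instances is not particularly recommended, though; it is better to use a separate Digester instance for each XML parse.
2. Set configuration properties on the Digester instance. These properties change how Digester parses; the configurable properties are described later.
3. Optionally, push some initial objects onto the Digester object stack.
4. Register rules for every element matching pattern in the input XML document that should trigger rule processing. Any number of rules may be registered for a given pattern. If a pattern has more than one rule, the begin and body event methods run in registration order, while the end event methods run in reverse order.
5. Finally, call digester.parse(), passing a reference to the XML document; several kinds of input source are supported. Note that this method throws IOException or SAXException, and exceptions raised while rules are being processed, such as NoSuchMethodException or IllegalAccessException, may surface as well.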

With the basic steps in mind, here is a simple example. The XML file to be parsed is shown below:

<foo name="The Parent">
    <bar id="123" title="The First Child" />
    <bar id="456" title="The Second Child" />
    <bar id="789" title="The Second Child" />
</foo>

First, create two Java beans that correspond to the elements in the XML:

The Foo class

package apache.commons.digester3.example.pojo;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * @author http://www.cnblogs.com/chenpi/
 * @version 2017-06-03
 */
public class Foo
{
    private String name;
    private List<Bar> barList = new ArrayList<Bar>();

    public void addBar(Bar bar)
    {
        barList.add(bar);
    }

    public Bar findBar(int id)
    {
        for (Bar bar : barList)
        {
            if (bar.getId() == id)
            {
                return bar;
            }
        }
        return null;
    }

    public Iterator<Bar> getBars()
    {
        return barList.iterator();
    }

    /**
     * @return the name
     */
    public String getName()
    {
        return name;
    }

    /**
     * @param name the name to set
     */
    public void setName(String name)
    {
        this.name = name;
    }

    /**
     * @return the barList
     */
    public List<Bar> getBarList()
    {
        return barList;
    }

    /**
     * @param barList the barList to set
     */
    public void setBarList(List<Bar> barList)
    {
        this.barList = barList;
    }

}


The Bar class

package apache.commons.digester3.example.pojo;

/**
 * @author    http://www.cnblogs.com/chenpi/
 * @version   2017-06-03
 */
public class Bar
{

    private int id;
    private String title;

    /**
     * @return the id
     */
    public int getId()
    {
        return id;
    }

    /**
     * @param id the id to set
     */
    public void setId(int id)
    {
        this.id = id;
    }

    /**
     * @return the title
     */
    public String getTitle()
    {
        return title;
    }

    /**
     * @param title the title to set
     */
    public void setTitle(String title)
    {
        this.title = title;
    }

}


Parsing the XML with Digester:

package apache.commons.digester3.example.simpletest;

import java.io.IOException;

import org.apache.commons.digester3.Digester;
import org.xml.sax.SAXException;

import apache.commons.digester3.example.pojo.Bar;
import apache.commons.digester3.example.pojo.Foo;

/**
 * 
 * @author http://www.cnblogs.com/chenpi/
 * @version 2017-06-03
 */

public class Main
{

    public static void main(String[] args)
    {

        try
        {
            // 1. Create the Digester instance
            Digester digester = new Digester();

            // 2. Configure properties
            digester.setValidating(false);

            // 3. Push objects onto the object stack (optional, unused here)
            //digester.push(new Foo());

            // 4. Register patterns and rules
            digester.addObjectCreate("foo", "apache.commons.digester3.example.pojo.Foo");
            digester.addSetProperties("foo");
            digester.addObjectCreate("foo/bar", "apache.commons.digester3.example.pojo.Bar");
            digester.addSetProperties("foo/bar");
            digester.addSetNext("foo/bar", "addBar", "apache.commons.digester3.example.pojo.Bar");

            // 5. Parse
            Foo foo = digester.parse(Main.class.getClassLoader().getResourceAsStream("example.xml"));

            // 6. Print the result
            System.out.println(foo.getName());
            for (Bar bar : foo.getBarList())
            {
                System.out.println(bar.getId() + "," + bar.getTitle());
            }

        }
        catch (IOException e)
        {

            e.printStackTrace();
        }
        catch (SAXException e)
        {

            e.printStackTrace();
        }
    }
}

Output:

The Parent
123,The First Child
456,The Second Child
789,The Second Child

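Note that the code above relies on automatic type conversion; for example, the id attribute is converted from a string to an int. All such conversions are performed by ConvertUtils from the commons-beanutils package.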

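Digester configuration properties

An org.apache.commons.digester3.Digester instance has a number of member properties that can be set to customize parsing.

For these settings to take effect, they must be changed before the parse method is called.

Some of the configurable properties are listed below: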

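Property	Description
classLoader	Specifies the class loader used by `ObjectCreateRule` and `FactoryCreateRule`. If it is not set, the thread context class loader is used when `useContextClassLoader` is true; otherwise the class loader that loaded the `Digester` class is used.
errorHandler	Specifies a SAX `ErrorHandler` that is notified when parse errors occur.
namespaceAware	A boolean that makes `Digester` recognize XML namespaces while parsing.
xincludeAware	A boolean that makes `Digester` recognize XInclude syntax while parsing. This setting only takes effect when namespaceAware is true.
ruleNamespaceURI	Usually used together with namespaceAware. When namespaceAware is true and ruleNamespaceURI is set, rules registered afterwards match only elements in that namespace.
rules	Sets the `Rules` implementation that matches registered rules against the current element nesting, which allows custom matching behavior to be plugged in.
useContextClassLoader	If true, the thread context class loader is used; otherwise the class loader that loaded the `Digester` class is used. Note: this setting is ignored if the classLoader property is set.
validating	A boolean that controls whether `Digester` validates the document against its DTD.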

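In addition, the register method lets Digester use a local copy of a DTD when it encounters a DOCTYPE declaration, instead of fetching it over the network, as shown below:

        // A bare path is not a valid URL (new URL(...) would throw MalformedURLException),
        // so the DTD is looked up on the classpath; this assumes it is bundled as a resource.
        URL url = Main.class.getResource("/org/apache/struts/resources/struts-config_1_0.dtd");
        digester.register("-//Apache Software Foundation//DTD Struts Configuration 1.0//EN", url.toString());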

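The Digester object stack

One of the core techniques Digester uses is to build a tree of Java objects dynamically. An important supporting data structure in that process is the object stack.

Take the following XML as an example: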

<foo name="The Parent">
    <bar id="123" title="The First Child" />
    <bar id="456" title="The Second Child" />
    <bar id="789" title="The Second Child" />
</foo>

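During parsing:

A Foo object is created first and pushed onto the object stack, and its name property is set. Next, a Bar object is created and pushed onto the stack, its properties are set, it is added to the Foo object's barList collection, and it is then popped off the stack.

The same happens for the remaining elements: a start tag creates an object and pushes it onto the stack, and the matching end tag pops it. Before an object is popped, it is associated with the object directly below it on the stack.

Once the XML has been parsed, the root object (the first one pushed) references every object created during the parse.

Digester exposes the following object stack operations:

clear() - clears the object stack.
peek() - returns a reference to the top object without removing it.
pop() - removes and returns the top object.
push() - pushes an object onto the stack.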
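
The stack discipline described above can be sketched with plain SAX and a java.util.ArrayDeque. This is only an illustration of the idea, not Digester's implementation: the class ObjectStackSketch and its Node type are hypothetical stand-ins for Foo and Bar, chosen so that the sketch runs on the JDK alone.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ObjectStackSketch {

    /** Hypothetical stand-in for Foo and Bar: an element name plus child nodes. */
    public static class Node {
        public final String element;
        public final List<Node> children = new ArrayList<>();

        Node(String element) {
            this.element = element;
        }
    }

    /** Parses the XML and returns the root node, using a stack as the text describes. */
    public static Node parse(String xml) {
        final Deque<Node> stack = new ArrayDeque<>();
        final Node[] root = new Node[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Start tag: create an object and push it onto the stack.
                stack.push(new Node(qName));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // End tag: pop the finished object and attach it to the new top of the stack.
                // (The text attaches before popping; the resulting tree is the same.)
                Node finished = stack.pop();
                if (stack.isEmpty()) {
                    root[0] = finished; // the last object popped is the root
                } else {
                    stack.peek().children.add(finished);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return root[0];
    }

    public static void main(String[] args) {
        Node foo = parse("<foo name=\"The Parent\"><bar id=\"123\"/><bar id=\"456\"/></foo>");
        System.out.println(foo.element + " has " + foo.children.size() + " children");
    }
}
```

With Digester itself, the push, attach, and pop steps are carried out by rules such as ObjectCreateRule and SetNextRule rather than by hand-written handler code.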

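Digester element matching patterns

A key feature of Digester is that it tracks the XML hierarchy automatically; the programmer only needs to decide what to do when a particular element is matched.

In the example below, a, a/b, and a/b/c are matching patterns that correspond to elements at specific positions in the XML: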

  <a>         -- Matches pattern "a"
    <b>       -- Matches pattern "a/b"
      <c/>    -- Matches pattern "a/b/c"
      <c/>    -- Matches pattern "a/b/c"
    </b>
    <b>       -- Matches pattern "a/b"
      <c/>    -- Matches pattern "a/b/c"
      <c/>    -- Matches pattern "a/b/c"
      <c/>    -- Matches pattern "a/b/c"
    </b>
  </a>
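
One way to picture where these patterns come from is to track the chain of currently open elements during a SAX parse and join their names with "/". The sketch below does this with the JDK alone. It illustrates the idea and is not Digester's matching code; PatternPathSketch and matchedPatterns are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class PatternPathSketch {

    /** Returns, in document order, the slash-separated path of every element start tag. */
    public static List<String> matchedPatterns(String xml) {
        final Deque<String> open = new ArrayDeque<>();
        final List<String> patterns = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                open.addLast(qName);                    // descend one level
                patterns.add(String.join("/", open));   // e.g. "a/b/c"
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.removeLast();                      // climb back up
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return patterns;
    }

    public static void main(String[] args) {
        // prints [a, a/b, a/b/c, a/b/c, a/b, a/b/c]
        System.out.println(matchedPatterns("<a><b><c/><c/></b><b><c/></b></a>"));
    }
}
```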

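Digester rule processing

When a pattern matches, rule processing is triggered. The processing logic is encapsulated by the org.apache.commons.digester3.Rule class, which defines the following methods:

begin() - called when the start tag of a matching XML element is encountered.
body() - called when the body of a matching XML element is encountered.
end() - called when the end tag of a matching XML element is encountered.
finish() - called once parsing is complete, to clean up temporary data and the like.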
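
The firing order for several rules registered on the same pattern (begin and body in registration order, end in reverse) can be sketched as follows. This is a simplified model under stated assumptions, not Digester code: LoggingRule is a hypothetical class with argument-free callbacks, and all callbacks are fired back to back, as they would be for an element without children.

```java
import java.util.ArrayList;
import java.util.List;

public class RuleOrderSketch {

    /** Hypothetical, simplified rule that only records which callback ran. */
    public static class LoggingRule {
        private final String name;
        private final List<String> log;

        public LoggingRule(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        void begin() { log.add(name + ".begin"); }
        void body()  { log.add(name + ".body"); }
        void end()   { log.add(name + ".end"); }
    }

    /** Fires the callbacks for one matched element in the order described in the text. */
    public static void fire(List<LoggingRule> rules) {
        for (LoggingRule rule : rules) {
            rule.begin();                       // registration order
        }
        for (LoggingRule rule : rules) {
            rule.body();                        // registration order
        }
        for (int i = rules.size() - 1; i >= 0; i--) {
            rules.get(i).end();                 // reverse order
        }
    }

    /** Registers three rules, named after the ones used in the first example, and fires them. */
    public static List<String> demo() {
        List<String> log = new ArrayList<>();
        List<LoggingRule> rules = new ArrayList<>();
        rules.add(new LoggingRule("create", log));
        rules.add(new LoggingRule("setProperties", log));
        rules.add(new LoggingRule("setNext", log));
        fire(rules);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The reverse order for end() is what lets a rule registered last, such as the set-next rule, still find the object on the stack before the rule that created it pops it.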

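Digester ships with a number of ready-made Rule implementations that can be used directly; see the API documentation for details.

The following shows the SetNextRule implementation in use (two equivalent forms):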

            Rule rule = new SetNextRule("addBar",Bar.class);
            digester.addRule("foo/bar", rule );

            //digester.addSetNext("foo/bar", "addBar", Bar.class.getName());

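Digester logging

Logging is a key part of debugging and troubleshooting. Digester logs in great detail, and the log output can be enabled as follows.

log4j is used as the logging implementation here.

First, add the following dependency to pom.xml: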

        <!-- https://mvnrepository.com/artifact/log4j/log4j -->
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>1.2.17</version>
        </dependency>

Then put a log4j.properties configuration file on the resources path:

### set log levels ###
log4j.rootLogger = debug, stdout

### Output to the console ###
log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target = System.out
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern = %-d{yyyy-MM-dd HH:mm:ss}  [ %t:%r ] - [ %p ]  %m%n

Run the program, and the DEBUG log output now appears:

2017-06-04 18:26:33 [ main:51 ] - [ DEBUG ]    Fire body() for SetPropertiesRule[aliases={}, ignoreMissingProperty=true]
2017-06-04 18:26:33 [ main:51 ] - [ DEBUG ]    Popping body text ''
2017-06-04 18:26:33 [ main:51 ] - [ DEBUG ]    Fire end() for SetPropertiesRule[aliases={}, ignoreMissingProperty=true]
2017-06-04 18:26:33 [ main:52 ] - [ DEBUG ]    Fire end() for ObjectCreateRule[className=apache.commons.digester3.example.pojo.Foo, attributeName=null]
2017-06-04 18:26:33 [ main:52 ] - [ DEBUG ] [ObjectCreateRule]{foo} Pop 'apache.commons.digester3.example.pojo.Foo'
2017-06-04 18:26:33 [ main:52 ] - [ DEBUG ] endDocument()
The Parent
123,The First Child
456,The Second Child
789,The Second Child

Digester examples

A simple Digester example was shown earlier; this section presents a few more.

Parsing XML element body text

The XML document to be parsed is shown below:

<web-app>
    <servlet>
        <servlet-name>action</servlet-name>
        <servlet-class>org.apache.struts.action.ActionServlet</servlet-class>
            <init-param>
                <param-name>application</param-name>
                <param-value>org.apache.struts.example.ApplicationResources</param-value>
            </init-param>
            <init-param>
                <param-name>config</param-name>
                <param-value>/WEB-INF/struts-config.xml</param-value>
            </init-param>
    </servlet>
</web-app>

First, define a ServletBean to hold the XML data above:

/*
 * File Name: ServletBean.java
 * Description: 
 * Author: http://www.cnblogs.com/chenpi/
 * Create Date: 2017-06-04
 */
package apache.commons.digester3.example.pojo;

import java.util.HashMap;
import java.util.Map;

/**
 * 
 * @author    http://www.cnblogs.com/chenpi/
 * @version   2017-06-04
 */

public class ServletBean
{

    private String servletName;
    private String servletClass;
    private Map<String, String> initParams = new HashMap<String, String>();
    
    
    public void addInitParam(String paramName, String paramValue){
        initParams.put(paramName, paramValue);
    }
    /**
     * @return the servletName
     */
    public String getServletName()
    {
        return servletName;
    }
    /**
     * @param servletName the servletName to set
     */
    public void setServletName(String servletName)
    {
        this.servletName = servletName;
    }
    /**
     * @return the servletClass
     */
    public String getServletClass()
    {
        return servletClass;
    }
    /**
     * @param servletClass the servletClass to set
     */
    public void setServletClass(String servletClass)
    {
        this.servletClass = servletClass;
    }
    /**
     * @return the initParams
     */
    public Map<String, String> getInitParams()
    {
        return initParams;
    }
    /**
     * @param initParams the initParams to set
     */
    public void setInitParams(Map<String, String> initParams)
    {
        this.initParams = initParams;
    }
}

Write the rules that parse the XML:

/*
 * File Name: WebMain.java
 * Description: 
 * Author: http://www.cnblogs.com/chenpi/
 * Create Date: 2017-06-04
 */
package apache.commons.digester3.example.simpletest;

import java.io.IOException;

import org.apache.commons.digester3.Digester;
import org.xml.sax.SAXException;

import apache.commons.digester3.example.pojo.ServletBean;

/**
 * 
 * @author http://www.cnblogs.com/chenpi/
 * @version 2017-06-04
 */

public class WebMain
{

    public static void main(String[] args)
    {
        try
        {
            // 1. Create the Digester instance
            Digester digester = new Digester();

            // 2. Configure properties
            digester.setValidating(false);

            // 3. Push objects onto the object stack (not needed in this example)

            // 4. Register patterns and rules
            digester.addObjectCreate("web-app/servlet", "apache.commons.digester3.example.pojo.ServletBean");
            // A parameter count of 0 passes the element's body text as the single argument
            digester.addCallMethod("web-app/servlet/servlet-name", "setServletName", 0);
            digester.addCallMethod("web-app/servlet/servlet-class", "setServletClass", 0);
            // Two parameters, filled from the param-name and param-value child elements
            digester.addCallMethod("web-app/servlet/init-param", "addInitParam", 2);
            digester.addCallParam("web-app/servlet/init-param/param-name", 0);
            digester.addCallParam("web-app/servlet/init-param/param-value", 1);

            // 5. Parse
            ServletBean servletBean = digester
                .parse(WebMain.class.getClassLoader().getResourceAsStream("web.xml"));

            // 6. Print the result
            System.out.println(servletBean.getServletName());
            System.out.println(servletBean.getServletClass());
            for(String key : servletBean.getInitParams().keySet()){
                System.out.println(key + ": " + servletBean.getInitParams().get(key));
            }

        }
        catch (IOException e)
        {

            e.printStackTrace();
        }
        catch (SAXException e)
        {

            e.printStackTrace();
        }

    }
}

Output:

action
org.apache.struts.action.ActionServlet
application: org.apache.struts.example.ApplicationResources
config: /WEB-INF/struts-config.xml

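Parsing XML namespaces

For XML that does not use namespaces, Digester's default processing is sufficient. When an XML document does use namespaces, however, it is sometimes desirable to apply different rules to elements in different namespaces.
Digester does not provide full namespace support, but what it offers is enough for most tasks. Enabling namespace support takes only the following steps:
1. Tell Digester to enable namespace-aware parsing by setting the following property: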

 digester.setNamespaceAware( true );

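2. Declare the namespace that the subsequently registered rules belong to. Note that no prefix is specified here; authors of XML documents are free to use any prefix they like.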

digester.setRuleNamespaceURI( "http://www.mycompany.com/MyNamespace" );

3. Add the rules for that namespace, typically by calling methods such as addObjectCreate() or addSetProperties(). Note that the patterns are written without a prefix:

    digester.addObjectCreate( "foo/bar", "com.mycompany.MyFoo" );
    digester.addSetProperties( "foo/bar");

4. Repeat steps 2 and 3 for elements in other namespaces.
The following example can be parsed using the steps above:

<m:foo
   xmlns:m="http://www.mycompany.com/MyNamespace"
   xmlns:y="http://www.yourcompany.com/YourNamespace">
  <m:bar name="My Name" value="My Value"/>
  <y:bar id="123" product="Product Description"/>
</m:foo>

Because the namespace given to Digester is http://www.mycompany.com/MyNamespace, only the first bar element in the XML above is processed.
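
The reason patterns can omit the prefix is that a namespace-aware SAX parser reports each element by namespace URI and local name, whatever prefix the document author chose. The sketch below shows this with the JDK alone. It is not Digester code, and NamespaceFilterSketch and localNamesIn are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceFilterSketch {

    /** Returns the local names of all elements that belong to the given namespace URI. */
    public static List<String> localNamesIn(final String namespaceUri, String xml) {
        final List<String> hits = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // The prefix is irrelevant here; only the namespace URI decides.
                if (namespaceUri.equals(uri)) {
                    hits.add(localName);
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return hits;
    }

    public static void main(String[] args) {
        String xml = "<m:foo xmlns:m=\"http://www.mycompany.com/MyNamespace\""
            + " xmlns:y=\"http://www.yourcompany.com/YourNamespace\">"
            + "<m:bar name=\"My Name\"/><y:bar id=\"123\"/></m:foo>";
        // prints [foo, bar]: the y:bar element is in a different namespace
        System.out.println(localNamesIn("http://www.mycompany.com/MyNamespace", xml));
    }
}
```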

A complete example follows (the XML and the corresponding parsing code):

<m:foo xmlns:m="http://www.mycompany.com/MyNamespace" 
       xmlns:y="http://www.yourcompany.com/YourNamespace" name="The Parent">
    <m:bar id="123" title="The First Child" />
    <y:bar id="456" title="The Second Child" />
    <m:bar id="789" title="The Second Child" />
</m:foo>

/*
 * File Name: ExampleNSMain.java
 * Description: 
 * Author: http://www.cnblogs.com/chenpi/
 * Create Date: 2017-06-03
 */
package apache.commons.digester3.example.simpletest;

import java.io.IOException;

import org.apache.commons.digester3.Digester;
import org.xml.sax.SAXException;

import apache.commons.digester3.example.pojo.Bar;
import apache.commons.digester3.example.pojo.Foo;

/**
 * 
 * @author http://www.cnblogs.com/chenpi/
 * @version 2017-06-03
 */

public class ExampleNSMain
{

    public static void main(String[] args)
    {

        try
        {
            Digester digester = new Digester();

            digester.setValidating(false);
            digester.setNamespaceAware(true);
            digester.setRuleNamespaceURI("http://www.mycompany.com/MyNamespace");

            digester.addObjectCreate("foo", Foo.class);
            digester.addSetProperties("foo");
            digester.addObjectCreate("foo/bar", Bar.class);
            digester.addSetProperties("foo/bar");

            digester.addSetNext("foo/bar", "addBar", Bar.class.getName());

            Foo foo = digester
                .parse(ExampleNSMain.class.getClassLoader().getResourceAsStream("example_ns.xml"));

            System.out.println(foo.getName());
            for (Bar bar : foo.getBarList())
            {
                System.out.println(bar.getId() + "," + bar.getTitle());
            }

        }
        catch (IOException e)
        {

            e.printStackTrace();
        }
        catch (SAXException e)
        {

            e.printStackTrace();
        }

    }
}

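Using namespace prefixes in patterns

Namespace-scoped rules are very useful when the rule set for one namespace is independent of the rule set for another. When the rule logic has to deal with elements from different namespaces together, though, patterns that carry namespace prefixes are a better strategy.
This is simple: set the namespaceAware property to false and put the namespace prefix in front of the element names in the pattern.
For example, with namespaceAware set to false, the pattern 'm:bar' matches only bar elements written with the prefix m.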
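
Prefixed patterns work because a parser that is not namespace-aware reports the qualified name exactly as written in the document, prefix included. A minimal JDK-only sketch of that behavior follows. It is not Digester code, and PrefixedNameSketch and qualifiedNames are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class PrefixedNameSketch {

    /** Returns the qualified name of every element, as seen by a non-namespace-aware parser. */
    public static List<String> qualifiedNames(String xml) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName); // includes the prefix, e.g. "m:bar"
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(false);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<m:foo xmlns:m=\"http://www.mycompany.com/MyNamespace\""
            + " xmlns:y=\"http://www.yourcompany.com/YourNamespace\">"
            + "<m:bar id=\"123\"/><y:bar id=\"456\"/></m:foo>";
        // prints [m:foo, m:bar, y:bar]
        System.out.println(qualifiedNames(xml));
    }
}
```

The trade-off is that such patterns depend on the prefix the document author happened to choose, not on the namespace URI.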

A complete demo follows:

/*
 * File Name: ExampleNS2Main.java
 * Description: 
 * Author: http://www.cnblogs.com/chenpi/
 * Create Date: 2017-06-03
 */
package apache.commons.digester3.example.simpletest;

import java.io.IOException;

import org.apache.commons.digester3.Digester;
import org.xml.sax.SAXException;

import apache.commons.digester3.example.pojo.Bar;
import apache.commons.digester3.example.pojo.Foo;

/**
 * 
 * @author http://www.cnblogs.com/chenpi/
 * @version 2017-06-03
 */

public class ExampleNS2Main
{

    public static void main(String[] args)
    {

        try
        {
            Digester digester = new Digester();

            digester.setValidating(false);
            digester.setNamespaceAware(false);
            //digester.setRuleNamespaceURI("http://www.mycompany.com/MyNamespace");

            digester.addObjectCreate("m:foo", Foo.class);
            digester.addSetProperties("m:foo");
            digester.addObjectCreate("m:foo/m:bar", Bar.class);
            digester.addSetProperties("m:foo/m:bar");
            digester.addSetNext("m:foo/m:bar", "addBar", Bar.class.getName());

            digester.addObjectCreate("m:foo/y:bar", Bar.class);
            digester.addSetProperties("m:foo/y:bar");
            digester.addSetNext("m:foo/y:bar", "addBar", Bar.class.getName());

            Foo foo = digester
                .parse(ExampleNS2Main.class.getClassLoader().getResourceAsStream("example_ns.xml"));

            System.out.println(foo.getName());
            for (Bar bar : foo.getBarList())
            {
                System.out.println(bar.getId() + "," + bar.getTitle());
            }

        }
        catch (IOException e)
        {

            e.printStackTrace();
        }
        catch (SAXException e)
        {

            e.printStackTrace();
        }

    }
}

Troubleshooting

Digester is built on SAX. A digestion can throw two types of exception:

java.io.IOException
org.xml.sax.SAXException

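The first is rarely thrown and is well understood. The second is the one encountered most often: it is thrown when the SAX parse cannot complete, so familiarity with SAX error handling helps a great deal in diagnosing a SAXException.
When a SAX parser runs into a problem with the XML (sometimes a little after the problem actually occurs), it throws a SAXParseException. This is a subclass of SAXException that carries extra information about what went wrong and where. Catching this exception tells us that the problem lies in the XML rather than in Digester or in our rules. Catching it and logging the details is usually very helpful for diagnosis.
A SAXException generally wraps another exception. In other words, when Digester hits an exception, it first wraps it in a SAXException and then rethrows that SAXException. Catching the SAXException and examining the wrapped exception therefore helps track down the error.
Example error: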

org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 44; Element type "y:bar" must be followed by either attribute specifications, ">" or "/>".
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1239)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:648)
    at org.apache.commons.digester3.Digester.parse(Digester.java:1642)
    at org.apache.commons.digester3.Digester.parse(Digester.java:1701)
    at apache.commons.digester3.example.simpletest.ExampleNS2Main.main(ExampleNS2Main.java:48)