How to Handle CDATA Gracefully with JAXB

 

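Preface

  JAXB is a powerful tool for mapping between XML and Java objects. It does, however, have a few pitfalls when handling CDATA blocks. Drawing on material found online, this article presents one approach and looks at whether it can produce CDATA output gracefully.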

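The Conventional Approach

  The most common approach found online combines XmlAdapter with CharacterEscapeHandler (a Sun internal API).
  First, define a CDataAdapter class to perform the type conversion.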

public class CDataAdapter extends XmlAdapter<String, String> {

    @Override
    public String unmarshal(String v) throws Exception {
        return v;
    }

    @Override
    public String marshal(String v) throws Exception {
        return new StringBuilder("<![CDATA[").append(v).append("]]>").toString();
    }

}

  The adapter is applied to a field through the XmlJavaTypeAdapter annotation, as in the following class:

@XmlRootElement(name="root")
public static class TNode {
        
     @XmlJavaTypeAdapter(value=CDataAdapter.class)
     @XmlElement(name="text", required = true)
     private String text;
        
}

  When the object is converted to XML text with a Marshaller, however, the result is:

<root>
    <text>&lt;![CDATA[李雷愛韓梅梅]]&gt;</text>
</root>

  This differs from what we expected. What we actually want is:

<root>
    <text><![CDATA[李雷愛韓梅梅]]></text>
</root>

  The root cause is that JAXB escapes the characters '<' and '>' by default. To solve this, CharacterEscapeHandler makes its grand entrance.

import com.sun.xml.internal.bind.marshaller.CharacterEscapeHandler;

marshaller.setProperty(
    "com.sun.xml.internal.bind.marshaller.CharacterEscapeHandler",
    new CharacterEscapeHandler() {
        @Override
        public void escape(char[] ch, int start, int length, boolean isAttVal, Writer writer) 
                throws IOException {
            writer.write(ch, start, length);
        }
    }
);

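  In testing, this solves the problem, although the handler above writes every character through unchanged, so escaping is switched off for all text and not only for the CDATA-wrapped fields. A slightly awkward issue then follows: when compiling and packaging with Maven, the following error appears: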

[ERROR] Compilation failure
[ERROR] package com.sun.xml.internal.bind.marshaller does not exist

  In Java development, calling internal APIs directly (those whose packages start with com.sun) is generally discouraged.

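An Improved Approach

  The blog posts consulted broadly share the same idea: wrap the XMLStreamWriter. More precisely, override the writeCharacters method so that, when it receives text enclosed in CDATA markers (<![CDATA[ ... ]]>), it calls writeCData instead. The following code illustrates the idea: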

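public class CDataXMLStreamWriter implements XMLStreamWriter {

    // The real writer that every call is forwarded to
    private final XMLStreamWriter delegate;

    public CDataXMLStreamWriter(XMLStreamWriter delegate) {
        this.delegate = delegate;
    }

    // *) Override writeCharacters: when the text is wrapped in CDATA markers, call writeCData instead
    @Override
    public void writeCharacters(String text) throws XMLStreamException {
        if ( text != null && text.startsWith("<![CDATA[") && text.endsWith("]]>") ) {
            delegate.writeCData(text.substring(9, text.length() - 3));
        } else {
            // Forward to the delegate; calling this.writeCharacters here would recurse forever
            delegate.writeCharacters(text);
        }
    }
    // *) For demonstration only; the remaining XMLStreamWriter methods (omitted) would forward to delegate
}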
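
  To make the difference between the two calls concrete, here is a minimal sketch that uses only the StAX API bundled with the JDK, without JAXB. The class name CDataVsCharactersDemo and the helper render are hypothetical names chosen for illustration. It writes the same element once through writeCharacters and once through writeCData.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class; not part of the original article.
final class CDataVsCharactersDemo {

    // Writes <text>value</text>, either as escaped character data or as a CDATA section.
    static String render(String value, boolean asCData) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("text");
            if (asCData) {
                w.writeCData(value);       // emitted inside <![CDATA[ ... ]]>
            } else {
                w.writeCharacters(value);  // markup characters such as '<' are escaped
            }
            w.writeEndElement();
            w.flush();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The markers are escaped, just like the unwanted JAXB output shown earlier.
        System.out.println(render("<![CDATA[x]]>", false));
        // The raw value goes out as a real CDATA section.
        System.out.println(render("x", true));
    }
}
```

  The first call shows why wrapping the value in markers inside the adapter is not enough on its own: the writer treats the markers as ordinary text and escapes them.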

  In practice, rather than implementing the whole XMLStreamWriter interface, the proxy pattern is used. Here a dynamic proxy does the job.

private static class CDataHandler implements InvocationHandler {
    // *) Intercept only the writeCharacters(String) method
    private static Method gWriteCharactersMethod = null;
    static {
        try {
            gWriteCharactersMethod = XMLStreamWriter.class
                    .getDeclaredMethod("writeCharacters", String.class);
        } catch (NoSuchMethodException e) {
            e.printStackTrace();
        }
    }

    private XMLStreamWriter writer;

    public CDataHandler(XMLStreamWriter writer) {
        this.writer = writer;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if ( gWriteCharactersMethod.equals(method) ) {
            String text = (String)args[0];
            // *) When CDATA markers are found, call writeCData instead
            if ( text != null && text.startsWith("<![CDATA[") && text.endsWith("]]>") ) {
                writer.writeCData(text.substring(9, text.length() - 3));
                return null;
            }
        }
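        try {
            return method.invoke(writer, args);
        } catch (java.lang.reflect.InvocationTargetException e) {
            // Rethrow the writer's own exception rather than the reflection wrapper
            throw e.getCause();
        }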
    }

}
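
  The handler can be tried out without JAXB by driving the proxied writer by hand. The following is a minimal sketch under a few assumptions: the class name CDataProxyDemo and the helpers wrap and render are hypothetical, the handler is written as a lambda, and the proxy is created against the XMLStreamWriter interface alone rather than against every interface of the concrete writer class.

```java
import java.io.StringWriter;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class; not part of the original article.
final class CDataProxyDemo {
    private static final String OPEN = "<![CDATA[";
    private static final String CLOSE = "]]>";

    // Returns a proxy that reroutes marker-wrapped text to writeCData.
    static XMLStreamWriter wrap(XMLStreamWriter target) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("writeCharacters")
                    && args != null && args.length == 1 && args[0] instanceof String) {
                String text = (String) args[0];
                if (text.startsWith(OPEN) && text.endsWith(CLOSE)) {
                    // Strip the markers and let the writer emit a real CDATA section.
                    target.writeCData(
                            text.substring(OPEN.length(), text.length() - CLOSE.length()));
                    return null;
                }
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the writer's own exception
            }
        };
        return (XMLStreamWriter) Proxy.newProxyInstance(
                CDataProxyDemo.class.getClassLoader(),
                new Class<?>[] {XMLStreamWriter.class},
                handler);
    }

    // Writes <root><text>value</text></root> through the proxied writer.
    static String render(String value) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = wrap(XMLOutputFactory.newInstance().createXMLStreamWriter(out));
            w.writeStartElement("root");
            w.writeStartElement("text");
            w.writeCharacters(value);
            w.writeEndElement();
            w.writeEndElement();
            w.flush();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("<![CDATA[a < b]]>")); // CDATA section
        System.out.println(render("a < b"));             // ordinary escaped text
    }
}
```

  Text without the markers still goes through the normal writeCharacters path, so it stays escaped. Only values that the adapter wrapped are written as CDATA.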

  The Marshaller code snippet looks like this:

public static <T> String mapToXmlWithCData(T obj) {

    try {

        StringWriter writer = new StringWriter();
        XMLStreamWriter streamWriter = XMLOutputFactory.newInstance()
                .createXMLStreamWriter(writer);
        // *) Use a dynamic proxy to adjust the behavior of streamWriter
        XMLStreamWriter cdataStreamWriter = (XMLStreamWriter) Proxy.newProxyInstance(
                streamWriter.getClass().getClassLoader(),
                streamWriter.getClass().getInterfaces(),
                new CDataHandler(streamWriter)
        );

        JAXBContext jc = JAXBContext.newInstance(obj.getClass());
        Marshaller marshaller = jc.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
        marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");

        marshaller.marshal(obj, cdataStreamWriter);
        return writer.toString();

    } catch (JAXBException e) {
        e.printStackTrace();
    } catch (XMLStreamException e) {
        e.printStackTrace();
    }
    return null;

}

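  In testing, this solves the CDATA problem: it works and it avoids the Sun API. One small flaw remains, namely indentation. The JAXB_FORMATTED_OUTPUT setting does not take effect when marshalling to an XMLStreamWriter, so this code cannot control the layout of the output.

Fixing the Indentation

  This requires the Transformer class. The idea is to reformat the final XML text.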

// *) Reformat (indent) the XML text
public static String indentFormat(String xml) {
    try {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

        StringWriter formattedStringWriter = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(formattedStringWriter));
        return formattedStringWriter.toString();
    } catch (TransformerException e) {
        e.printStackTrace();
    }
    return null;
}
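
  As a standalone illustration of the reformatting step, here is a minimal sketch using only the JDK's javax.xml.transform API. The class name IndentDemo and the cdataElements parameter are hypothetical. It also sets OutputKeys.CDATA_SECTION_ELEMENTS, which the article's code does not do. That is an added assumption: it asks the serializer explicitly to write the named elements as CDATA rather than relying on the sections surviving the transform.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of the original article.
final class IndentDemo {

    // Re-serializes xml with indentation; cdataElements is a
    // whitespace-separated list of element names to emit as CDATA.
    static String indent(String xml, String cdataElements) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.INDENT, "yes");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, cdataElements);
            try {
                // Implementation-specific key; other transformer implementations may reject it.
                t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
            } catch (IllegalArgumentException ignored) {
                // Fall back to the implementation's default indent width.
            }
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><key>k</key><text><![CDATA[a < b]]></text></root>";
        System.out.println(indent(xml, "text"));
    }
}
```

  The exact whitespace placed around a CDATA section can differ between JDK versions, so it is worth checking the output on the runtime you deploy to.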

  

The Complete Solution

  Here is all of the code above in one place:

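import java.io.StringReader;
import java.io.StringWriter;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.annotation.adapters.XmlAdapter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;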

// *) XmlAdapter class: annotate a field with it to add the CDATA markers automatically
public static class CDataAdapter extends XmlAdapter<String, String> {
    @Override
    public String unmarshal(String v) throws Exception {
        return v;
    }

    @Override
    public String marshal(String v) throws Exception {
        return new StringBuilder("<![CDATA[").append(v).append("]]>")
                .toString();
    }
}

// *) Dynamic proxy
private static class CDataHandler implements InvocationHandler {

    private static Method gWriteCharactersMethod = null;
    static {
        try {
            gWriteCharactersMethod = XMLStreamWriter.class
                    .getDeclaredMethod("writeCharacters", String.class);
        } catch (NoSuchMethodException e) {
            e.printStackTrace();
        }
    }

    private XMLStreamWriter writer;

    public CDataHandler(XMLStreamWriter writer) {
        this.writer = writer;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if ( gWriteCharactersMethod.equals(method) ) {
            String text = (String)args[0];
            if ( text != null && text.startsWith("<![CDATA[") && text.endsWith("]]>") ) {
                writer.writeCData(text.substring(9, text.length() - 3));
                return null;
            }
        }
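        try {
            return method.invoke(writer, args);
        } catch (java.lang.reflect.InvocationTargetException e) {
            // Rethrow the writer's own exception rather than the reflection wrapper
            throw e.getCause();
        }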
    }

}

// *) Generate the XML
public static <T> String mapToXmlWithCData(T obj, boolean formatted) {

    try {

        StringWriter writer = new StringWriter();
        XMLStreamWriter streamWriter = XMLOutputFactory.newInstance()
                .createXMLStreamWriter(writer);
        // *) Use a dynamic proxy to adjust the behavior of streamWriter
        XMLStreamWriter cdataStreamWriter = (XMLStreamWriter) Proxy.newProxyInstance(
                streamWriter.getClass().getClassLoader(),
                streamWriter.getClass().getInterfaces(),
                new CDataHandler(streamWriter)
        );

        JAXBContext jc = JAXBContext.newInstance(obj.getClass());
        Marshaller marshaller = jc.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");

        marshaller.marshal(obj, cdataStreamWriter);
        // *) Apply indentation if requested
        if ( formatted ) {
            return indentFormat(writer.toString());
        } else {
            return writer.toString();
        }

    } catch (JAXBException e) {
        e.printStackTrace();
    } catch (XMLStreamException e) {
        e.printStackTrace();
    }
    return null;

}

// *) Indent the XML text
public static String indentFormat(String xml) {
    try {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer();
        // *) Turn on indentation
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // *) Omit the XML declaration header (it is prepended manually below)
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

        StringWriter formattedStringWriter = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(formattedStringWriter));

        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + formattedStringWriter.toString();
    } catch (TransformerException e) {
        e.printStackTrace();
    }
    return null;
}

  Now write a concrete test case:

@NoArgsConstructor
@AllArgsConstructor
@XmlRootElement(name="root")
public static class TNode {
    @XmlElement(name="key", required = true)
    private String key;

    @XmlJavaTypeAdapter(value=CDataAdapter.class)
    @XmlElement(name="text", required = true)
    private String text;
}

public static void main(String[] args) {
    TNode node = new TNode("key", "李雷愛韓梅梅");
    String xml = mapToXmlWithCData(node, true);
    System.out.println(xml);
}

  The test output is:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <key>key</key>
    <text><![CDATA[李雷愛韓梅梅]]></text>
</root>

 

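Summary

  In summary, the improved approach sidesteps the compile-time restriction on the Sun API while still meeting the original functional requirement. That deserves a little pat on the back, ^_^.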
