Parsing XML in PHP and converting it into a nested array

Date: 2020-01-23

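1) xml_parser_create([string $encoding]): creates a new XML parser and returns a handle that the other XML functions can use (a resource before PHP 8.0, an XMLParser object since then).

The $encoding parameter:

In PHP 4, it specifies the character encoding of the XML input to be parsed.

In PHP 5, the encoding of the input XML is detected automatically, and $encoding only specifies the encoding of the data output after parsing.

By default, the output encoding is the same as the input encoding.

From PHP 5.0.2 onward the default output encoding is UTF-8; in earlier versions it is ISO-8859-1.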
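
As a minimal sketch of the point above, the snippet below creates a parser with an explicit output encoding and reads that setting back with xml_parser_get_option(). The variable names are only illustrative.

```php
<?php
// Create a parser whose output (target) encoding is UTF-8.
$parser = xml_parser_create('UTF-8');

// Read the target encoding back from the parser.
$target = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);

echo $target, PHP_EOL;
```

Passing the encoding explicitly avoids relying on the version-dependent default described above.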

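2) bool xml_parser_set_option(resource $parser, int $option, mixed $value): sets an option on the given XML parser.

parser: the XML parser whose option is to be set

option: the name of the option to set

value: the new value of the option

Returns true on success and false on failure.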

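Options (name, data type, description):

XML_OPTION_CASE_FOLDING (int): controls whether case folding is applied to tag names in this parser. Enabled by default. 0 leaves names as written, 1 converts them to upper case. It only affects how the names are output.

XML_OPTION_SKIP_TAGSTART (int): the number of characters to skip at the start of each tag name.

XML_OPTION_SKIP_WHITE (int): whether to skip values that consist entirely of whitespace.

XML_OPTION_TARGET_ENCODING (string): the target (output) encoding used by this parser. It defaults to the encoding given to xml_parser_create(); the supported values are ISO-8859-1, US-ASCII and UTF-8.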
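
The effect of XML_OPTION_CASE_FOLDING can be seen in a short sketch like the following. The sample document and variable names are made up for illustration.

```php
<?php
$parser = xml_parser_create();

// Keep tag names as written instead of folding them to upper case.
$ok = xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
// Drop values that contain only whitespace.
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);

xml_parse_into_struct($parser, '<Root><Item>a</Item></Root>', $values);

// Collect the tag name of every entry the parser produced.
$tags = array_column($values, 'tag');
echo implode(',', $tags), PHP_EOL; // Root,Item,Root
```

With the default setting (case folding enabled) the same document would yield the tag names in upper case.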

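3) int xml_parse_into_struct(resource $parser, string $data, array &$values [, array &$index]): parses XML data into two parallel arrays. For each tag name, the $index array holds the positions of the corresponding entries in $values. The function produces a flat, single-level array; unlike a DOM tree, it has no nesting.

Returns 0 on failure and 1 on success.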

4) Example:

Source file:

    <?xml version="1.0" encoding="utf-8"?>
    <newdata>
        <version a="xxx">aaaa</version>
        <sample>0</sample>
        <all>https</all>
    </newdata>

Contents of $values:

    Array
    (
        [0] => Array
            (
                // tag name
                [tag] => newdata
                // node type: "open" is the start tag of an element that has children;
                // "close" is the end tag matching an "open"; "complete" is an element without children
                [type] => open
                // nesting level
                [level] => 1
            )
        [1] => Array
            (
                [tag] => version
                [type] => complete
                [level] => 2
                // attributes of the node
                [attributes] => Array
                    (
                        [a] => xxx
                    )
                // value of the node
                [value] => aaaa
            )
        [2] => Array
            (
                [tag] => sample
                [type] => complete
                [level] => 2
                [value] => 0
            )
        [3] => Array
            (
                [tag] => all
                [type] => complete
                [level] => 2
                [value] => https
            )
        [4] => Array
            (
                [tag] => newdata
                [type] => close
                [level] => 1
            )
    )

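Contents of $index:

    Array
    (
        // node name
        [newdata] => Array
            (
                [0] => 0 // position of the node's opening entry in $values
                [1] => 4 // position of the node's closing entry in $values
            )
        [version] => Array
            (
                [0] => 1
            )
        [sample] => Array
            (
                [0] => 2
            )
        [all] => Array
            (
                [0] => 3
            )
    )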
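
To make the relationship between the two arrays concrete, here is a small sketch that parses a document matching the results above and uses $index to find an entry in $values. The XML string is a reconstruction chosen to match the printed output, so treat it as an assumption.

```php
<?php
// Assumed source document, chosen to match the $values and $index shown above.
$xml = '<?xml version="1.0" encoding="utf-8"?>'
     . '<newdata><version a="xxx">aaaa</version><sample>0</sample><all>https</all></newdata>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
$ok = xml_parse_into_struct($parser, $xml, $values, $index);

// $index maps a tag name to the positions of its entries in $values.
$pos = $index['version'][0];
$version = $values[$pos];

echo $version['value'], ' ', $version['attributes']['a'], PHP_EOL; // aaaa xxx
```

For an element with children, such as newdata, $index holds two positions: the "open" entry and the matching "close" entry.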

5) A function that converts XML into an array:

    /**
     * Convert an XML string into a nested array.
     *
     * @param string $contents       XML source
     * @param string $encoding       target (output) encoding
     * @param int    $get_attributes 1 to keep attributes, 0 to ignore them
     * @param string $priority       'tag' stores attributes beside the tag as "<tag>_attr";
     *                               any other value stores them under 'attr' and the text under 'value'
     * @return array
     */
    public static function xml2Array($contents = NULL, $encoding = 'UTF-8', $get_attributes = 1, $priority = 'tag') {
        if (!$contents) {
            return array();
        }
        if (!function_exists('xml_parser_create')) {
            return array();
        }

        // Create the XML parser
        $parser = xml_parser_create('');
        xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $encoding);
        // Output tag names as written instead of converting them to upper case
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
        // Skip values that consist only of whitespace
        xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
        // $xml_values (and the optional $index) are passed by reference and filled by the parser
        xml_parse_into_struct($parser, trim($contents), $xml_values/*, $index*/);
        // Free the parser (this call has no effect since PHP 8.0)
        xml_parser_free($parser);

        if (!$xml_values) {
            return array();
        }

        $xml_array = array();
        // $parent[$level - 1] remembers the container that holds each open tag
        $parent = array();
        // Reference to the structure currently being filled
        $current = & $xml_array;
        // Number of times each tag has been seen at a given level
        $repeated_tag_index = array();

        foreach ($xml_values as $data) {
            // Remove the previous attributes and value so each iteration starts clean
            unset($attributes, $value);
            // Expand the entry into local variables: $tag, $type, $level, $attributes, $value
            extract($data);
            // Result for the current tag
            $result = array();
            // Attributes of the current tag
            $attributes_data = array();

            // The tag has a value
            if (isset($value)) {
                if ($priority == 'tag') {
                    $result = trim($value);
                } else {
                    $result['value'] = trim($value);
                }
            }

            // The tag has attributes and they are not ignored
            if ($get_attributes && isset($attributes)) {
                foreach ($attributes as $attr => $val) {
                    if ($priority == 'tag') { // keep them in a separate attribute array
                        $attributes_data[$attr] = $val;
                    } else { // put everything into $result
                        $result['attr'][$attr] = $val;
                    }
                }
            }

            // Handle the relationship between nodes
            if ($type == "open") { // a tag that has child nodes
                // $parent[$level - 1] points at the container of this compound tag
                $parent[$level - 1] = & $current;
                if (!is_array($current) || (!in_array($tag, array_keys($current)))) { // first compound tag with this name
                    // When $priority is not 'tag' the result looks like:
                    //   [tag] => Array([value] => aaaa, [attr] => Array([a] => xxx))
                    $current[$tag] = $result;
                    if ($attributes_data) {
                        // When $priority is 'tag' the attributes are kept separately:
                        //   [tag] => xxxx, [tag_attr] => Array([a] => xxx)
                        $current[$tag . '_attr'] = $attributes_data;
                    }
                    // Record how many times this tag occurs at this level
                    $repeated_tag_index[$tag . '_' . $level] = 1;
                    // Point at the compound tag so its children are added to it
                    $current = & $current[$tag];
                } else {
                    if (isset($current[$tag][0])) { // third or later sibling compound tag
                        $current[$tag][$repeated_tag_index[$tag . '_' . $level]] = $result;
                        $repeated_tag_index[$tag . '_' . $level]++;
                    } else { // second sibling compound tag
                        // Wrap the associative array in an indexed array
                        $current[$tag] = array(
                            $current[$tag],
                            $result
                        );
                        $repeated_tag_index[$tag . '_' . $level] = 2;
                        // Only the attributes of the first repeated tag are recorded here; this may be a bug, take care!
                        // To tell the attributes of each child tag apart, set $priority to something other than 'tag'
                        if (isset($current[$tag . '_attr'])) {
                            $current[$tag]['0_attr'] = $current[$tag . '_attr'];
                            unset($current[$tag . '_attr']);
                        }
                    }
                    // Index of the last repeated tag
                    $last_item_index = $repeated_tag_index[$tag . '_' . $level] - 1;
                    // Point at it so its children are added to it
                    $current = & $current[$tag][$last_item_index];
                }
            } elseif ($type == "complete") {
                if (!isset($current[$tag])) { // first complete tag with this name
                    $current[$tag] = $result;
                    $repeated_tag_index[$tag . '_' . $level] = 1;
                    if ($priority == 'tag' && $attributes_data) {
                        $current[$tag . '_attr'] = $attributes_data;
                    }
                } else {
                    // Checking isset($current[$tag][0]) alone is not enough,
                    // because it may index the first character of a string
                    if (isset($current[$tag][0]) && is_array($current[$tag])) { // third or later sibling tag
                        $current[$tag][$repeated_tag_index[$tag . '_' . $level]] = $result;
                        // Keep the attributes of the child tag
                        if ($get_attributes && $priority == 'tag' && $attributes_data) {
                            $current[$tag][$repeated_tag_index[$tag . '_' . $level] . '_attr'] = $attributes_data;
                        }
                        $repeated_tag_index[$tag . '_' . $level]++;
                    } else { // second sibling tag
                        $current[$tag] = array(
                            $current[$tag],
                            $result
                        );
                        $repeated_tag_index[$tag . '_' . $level] = 1;
                        if ($priority == 'tag' && $get_attributes) {
                            if (isset($current[$tag . '_attr'])) {
                                $current[$tag]['0_attr'] = $current[$tag . '_attr'];
                                unset($current[$tag . '_attr']);
                            }
                            if ($attributes_data) {
                                $current[$tag][$repeated_tag_index[$tag . '_' . $level] . '_attr'] = $attributes_data;
                            }
                        }
                        // Indexes 0 and 1 are now taken
                        $repeated_tag_index[$tag . '_' . $level]++;
                    }
                }
            } elseif ($type == 'close') {
                // A close tag has the same level as its open tag, so after the children
                // have been processed it lets us point back to the parent's container
                $current = & $parent[$level - 1];
            }
        }

        return $xml_array;
    }
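
The open / complete / close handling can also be illustrated with a much smaller recursive sketch. This is not the function above: buildChildren is a hypothetical helper that ignores attributes and the $priority option, and it only shows how the flat list folds into nested arrays and how repeated sibling tags become an indexed list.

```php
<?php
// Hypothetical helper: consume entries of $values starting at $i and
// return the children found until the matching "close" entry.
function buildChildren(array $values, int &$i): array
{
    $children = [];
    $seen = []; // how many times each tag occurred among these siblings
    for ($n = count($values); $i < $n; $i++) {
        $node = $values[$i];
        if ($node['type'] === 'close') {
            break; // end of the parent element; the caller steps past it
        }
        if ($node['type'] === 'cdata') {
            continue; // loose text between child elements is ignored here
        }
        if ($node['type'] === 'open') {
            $i++; // descend: the recursion stops at the matching close entry
            $item = buildChildren($values, $i);
        } else { // complete
            $item = $node['value'] ?? '';
        }
        $tag = $node['tag'];
        $seen[$tag] = ($seen[$tag] ?? 0) + 1;
        if ($seen[$tag] === 1) {
            $children[$tag] = $item;                    // first occurrence
        } elseif ($seen[$tag] === 2) {
            $children[$tag] = [$children[$tag], $item]; // wrap in a list
        } else {
            $children[$tag][] = $item;                  // third and later
        }
    }
    return $children;
}

$xml = '<list><item>a</item><item>b</item><item>c</item><name>x</name></list>';
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $xml, $values);

$i = 0;
$nested = buildChildren($values, $i);
print_r($nested);
```

The per-sibling counter plays the same role as $repeated_tag_index in the full function: the second occurrence of a tag wraps the existing value in an indexed array, and later occurrences are appended to it.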

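6) About CDATA in XML:

All text in an XML document is normally parsed by the parser. Only the text inside a CDATA section is left unparsed.

A CDATA section starts with "<![CDATA[" and ends with "]]>":

    <script>
    <![CDATA[
    function matchwo(a,b)
    {
        if (a < b && a < 0) then
        {
            return 1
        }
        else
        {
            return 0
        }
    }
    ]]>
    </script>

In the example above, the parser does not interpret anything between the CDATA delimiters as markup; the text is passed through as it is.

Notes on CDATA:
CDATA sections cannot contain other CDATA sections (they cannot be nested). If a CDATA section contains the string "]]>" or "<![CDATA[", it will most likely cause an error.

Also make sure there are no spaces or line breaks inside the "]]>" string.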
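
As a rough sketch of how this looks from PHP, the snippet below parses an element whose content is a CDATA section containing "<" and "&&", characters that would otherwise have to be escaped. The sample string is invented for illustration.

```php
<?php
// The "<" and "&&" are legal here only because they sit inside CDATA.
$xml = '<script><![CDATA[if (a < b && a < 0) return 1;]]></script>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
$ok = xml_parse_into_struct($parser, $xml, $values);

// The CDATA content arrives as the plain text value of the element.
$text = $values[0]['value'];
echo $text, PHP_EOL;
```

The same text written without the CDATA wrapper would not be well-formed XML, so parsing it would fail.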

Reference: http://www.cnblogs.com/chenqingwei/archive/2010/04/21/1717237.html
