Using XPath to get a tag itself together with its content

Date: 2019-12-09


Normally, once we have located a tag with XPath, we use /text() or //text() to get the text between its opening and closing tags.
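As a rough illustration of the difference between the two, the sketch below runs both expressions against a small made-up snippet. The markup and the id `box` are assumptions for demonstration only and do not come from the original page.

```python
from lxml import etree

# Hypothetical markup, used only to show the two expressions side by side
html = "<div id='box'>Hello <b>bold</b> world</div>"
tree = etree.HTML(html)

# /text() selects only the text nodes that are direct children of the div
direct = tree.xpath('//div[@id="box"]/text()')

# //text() selects every text node beneath the div, including nested tags
nested = tree.xpath('//div[@id="box"]//text()')

print(direct)  # ['Hello ', ' world']
print(nested)  # ['Hello ', 'bold', ' world']
```

In both cases the result is a list of strings; the tags themselves are not part of the output.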

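In some special cases, however, we also need the tag itself together with its text. This can be done as follows:

The file is HTML, and the tag structure looks like this: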

<table id='lh'>
  <tr>
    <td>Row value 1</td>
    <td>Row value 2</td>
  </tr>
</table>

The code is as follows:

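from lxml import etree
import requests
from lxml.html import tostring

url = "https://www.baidu.com/"
# Assumed placeholder: the original code uses `headers` without defining it
headers = {"User-Agent": "Mozilla/5.0"}

ret = requests.get(url, headers=headers)
code = ret.apparent_encoding  # detect the encoding of the response for this url
ret.encoding = code
html = ret.text               # the HTML content, i.e. the tags shown in the example

tree = etree.HTML(html)
result = tree.xpath('//*[@id="lh"]')[0]

# With an explicit byte encoding, tostring() returns bytes, so decode them
print('Result:', tostring(result, encoding=code).decode(code))

Note: tostring() outputs the tag located by XPath (including the tag itself) together with all the tags beneath it. Remember to call decode() on the result to decode it.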
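To try the same idea without a network request, here is a minimal sketch that parses a table shaped like the sample directly from a string. It assumes the id `lh` that the XPath expression looks for, and it passes `encoding="unicode"` so that tostring() returns a str and no decode() step is needed. That is a variation on the approach above, not the original author's code.

```python
from lxml import etree
from lxml.html import tostring

# Inline copy of the sample structure (id assumed to be "lh")
sample = """
<table id='lh'>
  <tr>
    <td>Row value 1</td>
    <td>Row value 2</td>
  </tr>
</table>
"""

tree = etree.HTML(sample)
table = tree.xpath('//*[@id="lh"]')[0]

# encoding="unicode" makes tostring() return a str instead of bytes
markup = tostring(table, encoding="unicode")
print(markup)
```

The printed string contains the table tag itself along with its rows and cells, whereas //text() would return only the two cell values and the surrounding whitespace.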
