BeautifulSoup

Date: 2019-11-25


Installation:
(Ubuntu) sudo apt-get install python-bs4
or
pip install beautifulsoup4
or
easy_install beautifulsoup4


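Some basic usage (Python 2):
from bs4 import BeautifulSoup
import re, urllib2

url = "http://data.eastmoney.com/cjsj/cpi.html"
data = urllib2.urlopen(url)  # file-like object holding the raw page bytes
# from_encoding tells BeautifulSoup how to decode those bytes
soup = BeautifulSoup(data, "html.parser", from_encoding="utf8")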
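
To try the calls in this post without fetching the live page, a small inline document can be parsed instead. The sketch below is written for Python 3 (the snippets in this post are Python 2). `SAMPLE_HTML` and its cell values are made up for illustration and are not the real page markup; `html.parser` is the parser that ships with Python.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; not the real site markup.
SAMPLE_HTML = """
<html>
  <head><title>CPI data</title></head>
  <body>
    <table>
      <tr class="firstTr"><th>Month</th><th>Value</th></tr>
      <tr class="secondTr"><td>month-1</td><td>value-1</td></tr>
    </table>
  </body>
</html>
"""

# Parse the string with the parser bundled with Python.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

title_text = soup.title.string                       # text of <title>
first_cell = soup.find("tr", class_="secondTr").td   # first <td> in that row

print(title_text)
print(first_cell.string)
```

Because the input is already a text string, no `from_encoding` argument is needed here; that argument only matters when raw bytes are passed in.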


++++++ Tag attributes and methods: string, strings, get_text(), contents, get. NavigableString objects also have the string and strings attributes ++++++
Get the head:
soup.head
for i in soup.head.strings:
    print i

居民消費價格指數（CPI） _ 數據中心 _ 東方財富網
var swf_line = "http://g1.dfcfw.com/g1/201012/20101214085507.swf";
var swf_pie = "http://g1.dfcfw.com/g1/201104/20110412125826.swf";
var swf_column = " 

Get the title (all three calls print the same text):
print soup.title.get_text()
print soup.title.string
print soup.title.contents[0]
居民消費價格指數（CPI） _ 數據中心 _ 東方財富網
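
The three calls agree for the title because it holds a single piece of text. On a tag with mixed children they differ, which the following sketch tries to show. It is written for Python 3 and uses a made-up one-line document.

```python
from bs4 import BeautifulSoup

# Made-up fragment: a <p> holding text, a nested <b>, then more text.
soup = BeautifulSoup("<p>Hello <b>big</b> world</p>", "html.parser")
p = soup.p

single = p.b.string       # <b> has exactly one text child, so this is "big"
mixed = p.string          # <p> has several children, so .string is None
text = p.get_text()       # all text beneath <p>, joined together
parts = list(p.strings)   # each text piece as a separate item
children = p.contents     # direct children: text, the <b> tag, text

print(single, mixed, text, parts, len(children))
```

In short, `.string` is convenient only when a tag wraps one piece of text; `get_text()` and `.strings` are the safer choices for nested markup.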

Get the first element whose class attribute is 'secondTr':
soup.find(class_='secondTr')  # class is a reserved word in Python, so the keyword argument is spelled class_

Get every tr tag whose class attribute ends with Tr:
print soup.find_all('tr', class_=re.compile('Tr$'))
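
The two class lookups above can be tried on a small table. This is a Python 3 sketch; the table and its class names `firstTr` and `footer` are invented for illustration, and only `secondTr` comes from the post.

```python
import re
from bs4 import BeautifulSoup

# Made-up table with three rows; only two class names end in "Tr".
html = """
<table>
  <tr class="firstTr"><td>a</td></tr>
  <tr class="secondTr"><td>b</td></tr>
  <tr class="footer"><td>c</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Exact class match: find() returns the first matching element (or None).
second_row = soup.find(class_="secondTr")

# Regex class match: find_all() returns every <tr> whose class ends in "Tr".
rows = soup.find_all("tr", class_=re.compile("Tr$"))
row_classes = [row["class"] for row in rows]  # class values come back as lists

print(second_row.td.string)
print(row_classes)
```

Note that `row["class"]` is a list, since an HTML element may carry several classes at once.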

Get all hyperlinks:
print soup.find_all('a')
Get all hrefs starting with 
for i in  soup.find_all('a',href=re.compile('^:
    print i['href']  # or: i.get('href')
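
As a rough illustration of filtering links by an href pattern, here is a Python 3 sketch on a made-up fragment. The prefix `^http` is an assumption chosen for the example, not the pattern used in this post, and the URLs are placeholders.

```python
import re
from bs4 import BeautifulSoup

# Made-up links: one absolute, one relative, one without any href.
html = """
<a href="http://example.com/a">absolute</a>
<a href="/relative/b">relative</a>
<a name="anchor-only">no href</a>
"""
soup = BeautifulSoup(html, "html.parser")

PREFIX = re.compile(r"^http")  # hypothetical prefix, for illustration only

# Only <a> tags whose href matches the pattern are returned,
# so indexing with ["href"] is safe here.
matching = [a["href"] for a in soup.find_all("a", href=PREFIX)]

# Over all <a> tags, .get() returns None where href is missing,
# whereas ["href"] would raise KeyError.
all_hrefs = [a.get("href") for a in soup.find_all("a")]

print(matching)
print(all_hrefs)
```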
    
    
    
    
A NavigableString is easy to convert to unicode, and it behaves almost the same as a plain string.
Example:
unicode_string = unicode(tag.string)
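
In Python 3 there is no `unicode` type, and the equivalent conversion uses `str`. The sketch below assumes Python 3 and a made-up one-tag document.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

soup = BeautifulSoup("<b>bold text</b>", "html.parser")
tag = soup.b

nav = tag.string   # a NavigableString, still linked to the parse tree
plain = str(nav)   # a plain str with no link back to the tree

print(isinstance(nav, NavigableString))  # the tree-aware string type
print(nav.parent is tag)                 # it knows which tag contains it
print(type(plain) is str)                # the copy is an ordinary string
```

Converting to a plain string is worthwhile when the text will outlive the soup, since a NavigableString keeps a reference to the whole parse tree.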
