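When you run into a programming problem, the first thing to do is simplify it. Once it is reduced to the simplest possible problem, write the simplest code that solves it, and pay only the smallest possible testing cost.
A simple piece of HTML source: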
1<!--The loneliest number--> <a>2<!--Can be as bad as one--><b>3
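Extract the comments from the HTML above:
from bs4 import BeautifulSoup, Comment

# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup("""1<!--The loneliest number--> <a>2<!--Can be as bad as one--><b>3""", "html.parser")
# Comment is a NavigableString subclass, so filter the strings by type.
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
for comment in comments:
    print(comment)
Output:
The loneliest number
Can be as bad as one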
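The extraction step can also be wrapped in a small reusable function. The following is only a minimal sketch, assuming Python 3, bs4 4.4 or later, and the built-in `html.parser`; the name `extract_comments` is hypothetical and not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup, Comment


def extract_comments(html):
    """Return the text of every HTML comment in `html` as plain strings.

    `extract_comments` is a hypothetical helper name used for illustration.
    """
    # Assumption: the built-in "html.parser" is good enough for this markup.
    soup = BeautifulSoup(html, "html.parser")
    comments = soup.find_all(string=lambda text: isinstance(text, Comment))
    # Comment is a str subclass; convert to plain str and trim whitespace.
    return [str(comment).strip() for comment in comments]


sample = "1<!--The loneliest number--> <a>2<!--Can be as bad as one--><b>3"
for line in extract_comments(sample):
    print(line)
```

Returning plain strings rather than `Comment` objects keeps the caller independent of the parse tree.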
Remove the comments from the HTML above:
from bs4 import BeautifulSoup, Comment

soup = BeautifulSoup("""1<!--The loneliest number--> <a>2<!--Can be as bad as one--><b>3""", "html.parser")
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
# extract() detaches each comment node from the parse tree.
for comment in comments:
    comment.extract()
print(soup)
Output:
1 <a>2<b>3</b></a>
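In the spirit of keeping the test as cheap as possible, the removal step can be checked with a single assertion. This is a minimal sketch, assuming Python 3, bs4 4.4 or later, and the built-in `html.parser`; `strip_comments` is a hypothetical helper name, and a different parser (for example lxml) may serialize the same input differently.

```python
from bs4 import BeautifulSoup, Comment


def strip_comments(html):
    """Return `html` re-serialized with all HTML comments removed.

    `strip_comments` is a hypothetical helper name used for illustration.
    """
    # Assumption: "html.parser" is used; other parsers may add <html>/<body> wrappers.
    soup = BeautifulSoup(html, "html.parser")
    for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
        comment.extract()  # detach the comment node from the tree
    return str(soup)


# The simplest possible test: one input, one expected output.
sample = "1<!--The loneliest number--> <a>2<!--Can be as bad as one--><b>3"
assert strip_comments(sample) == "1 <a>2<b>3</b></a>"
```

Note that the parser also closes the unclosed `<a>` and `<b>` tags when it serializes the tree, which is why the result is not simply the input minus the comments.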
Reference:
1. How to find the comment tag <!--…--> with BeautifulSoup?