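Approach:
1. Inspect the web page and locate the img tags.
2. Use the requests and BeautifulSoup (bs4) libraries to extract the img tags from the page.
3. Once the img tags are collected, pull the src attribute out of each one; that is the address the picture can be downloaded from.
4. Download each picture with urllib.urlretrieve and save it into a folder (the preparation for this step is to get the current path and create a new folder under it).
5. If there are several pictures, repeat steps 3-4 for each one.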
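The folder preparation mentioned in step 4 can be sketched on its own. This is only a minimal sketch, not the code used in this post: the helper names prepare_folder and numbered_path are made up for illustration, and a temporary directory stands in for the current path so that nothing is left behind after running it.

```python
import os
import tempfile


def prepare_folder(base_dir, folder_name="pictures"):
    """Create base_dir/folder_name if it is missing and return its path."""
    target = os.path.join(base_dir, folder_name)
    os.makedirs(target, exist_ok=True)  # no error if the folder already exists
    return target


def numbered_path(folder, index, ext=".jpg"):
    """Build a path such as <folder>/1.jpg without hand-written separators."""
    return os.path.join(folder, "%d%s" % (index, ext))


if __name__ == "__main__":
    # Hypothetical stand-in for os.getcwd(): a throwaway temporary directory.
    with tempfile.TemporaryDirectory() as base:
        folder = prepare_folder(base)
        print(numbered_path(folder, 1))
```

Letting os.path.join supply the separator means the same code works on Windows and Linux, and there is no need to append a slash to the folder name by hand.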
I have not written many crawlers, so it took some debugging of my own before I finally got this one working.
Here is the code:
# -*- coding: utf-8 -*-
# Python 2 version
import os
import urllib

import requests
from bs4 import BeautifulSoup

if __name__ == '__main__':
    url = 'http://www.qiushibaike.com/'
    res = requests.get(url)
    res.encoding = 'utf-8'
    soup = BeautifulSoup(res.text, 'html.parser')
    imgs = soup.find_all("img")

    # Create a "pictures" folder under the current working directory.
    new_path = os.path.join(os.getcwd(), 'pictures')
    if not os.path.isdir(new_path):
        os.mkdir(new_path)

    x = 1
    try:
        if not imgs:
            print "No img tags found."
        for img in imgs:
            link = img.get('src')
            # Skip tags without a src and links that are not absolute http(s) URLs.
            if link and link.startswith('http'):
                print "Downloading picture %s" % x
                # os.path.join supplies the path separator between folder and file name.
                urllib.urlretrieve(link, os.path.join(new_path, '%s.jpg' % x))
                x += 1
    except Exception, e:
        print e
    finally:
        print "Done!"
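Here is the result:
The Python 3 version differs only slightly: the download function now lives in the urllib.request module, so urllib.request has to be imported before urlretrieve can be used.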
#!/usr/bin/python3
# -*- coding: utf-8 -*-

import os
import urllib.request  # Python 3: urlretrieve lives in urllib.request

import requests
from bs4 import BeautifulSoup

if __name__ == '__main__':
    url = 'http://www.qiushibaike.com/'
    res = requests.get(url)
    res.encoding = 'utf-8'
    print(res)
    soup = BeautifulSoup(res.text, 'html.parser')
    # imgs = soup.find_all('img', attrs={'class': 'item_img'})
    imgs = soup.find_all('img')

    # Pictures go into their own folder; os.path.join supplies the separator.
    new_path = os.path.join(os.getcwd(), 'pictures')
    print(new_path)
    if not os.path.isdir(new_path):
        os.mkdir(new_path)

    x = 1
    try:
        if not imgs:
            print("No img tags found.")
        print(len(imgs))
        for img in imgs:
            link = img.get('src')
            if not link:
                continue  # img tag without a src attribute
            if link.startswith('//'):
                # Protocol-relative link: prepend the scheme.
                link = 'http:' + link
            print("Downloading picture %s" % x)
            urllib.request.urlretrieve(link, os.path.join(new_path, '%s.jpg' % x))
            x += 1
    except Exception as e:
        # Report the error instead of silently ignoring it.
        print(e)
    finally:
        print("Done!")
The result is the same, so I will not post another screenshot.
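One detail in the Python 3 script is worth a closer look: the src values on this page did not start with a scheme, which is why 'http:' gets prepended. A more general way to handle this, offered only as a sketch under assumptions, is urljoin from the standard library. The helper name absolute_image_url and the example hosts pic.example.com and example.com are made up for illustration and are not from the original script.

```python
from urllib.parse import urljoin


def absolute_image_url(page_url, src):
    """Return an absolute http(s) URL for an <img> src value, or None to skip it."""
    if not src:
        return None  # <img> tag without a src attribute
    # urljoin resolves protocol-relative ("//host/a.jpg") and relative ("/a.jpg") links.
    full = urljoin(page_url, src.strip())
    if full.startswith(("http://", "https://")):
        return full
    return None  # data: URIs and other schemes are skipped in this sketch


if __name__ == "__main__":
    page = "http://www.qiushibaike.com/"
    # Hypothetical src values of the kinds a page may contain.
    samples = ["//pic.example.com/a.jpg", "/static/b.png",
               "http://example.com/c.jpg", None]
    for src in samples:
        print(src, "->", absolute_image_url(page, src))
```

With a helper like this, the download loop can skip any img whose result is None and pass everything else straight to urlretrieve.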
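Summary:
At first my approach was not clear, and I was not very familiar with how to save the pictures.
But after thinking it through, once the approach was clear and the direction was settled, the rest was easy. When I did not know how to use a function, I could simply look it up on Baidu, which is very convenient.
In short, always have an approach in mind before writing a program. Working out the approach while writing does not work well and easily leads to rework.
Still, I got it written in the end, haha.
Everyone is welcome to learn along with me and point out my mistakes.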
----------------------
If you repost this, please credit the source. Thanks!