[Crawler] A crawler for the sources of "evil" GIFs

Date: 2019-11-29


This is a sinful crawler.

It crawls the GIFs listed at http://www.27gif.net/gifcc and saves each one with its "mystery code" as the file name.

------------------------------------------------------------------------------------------------------

import requests
from bs4 import BeautifulSoup


# Crawl listing pages 1 through 13
for page in range(1, 14):
    # Request the listing page and collect the container of every GIF post into a list
    star_url = 'http://www.27gif.net/gifcc/page/%s/' % page
    star_html = requests.get(star_url).text
    star_soup = BeautifulSoup(star_html, 'lxml')
    gif_list = star_soup.find_all('div', class_='wow fadeInUp')
    # Iterate over every post in the list
    for gif_html in gif_list:
        # Use the img tag's alt attribute (the part after the full-width colon) as the file name
        try:
            gif_name = gif_html.find('img')['alt'].split('：')[1]
        except TypeError:
            # No img tag in this block, skip it
            continue
        except IndexError:
            # No colon in the alt text, use the whole text
            gif_name = gif_html.find('img')['alt']
        # Pull the real GIF URL out of the thumbnail link in the src attribute
        try:
            gif_url = gif_html.find('img')['src'].split('src=')[1].split('&w=')[0]
        except (TypeError, IndexError):
            continue
        # Request the GIF URL and save the file
        gif_content = requests.get(gif_url).content
        with open(gif_name + '.gif', 'wb') as f:
            f.write(gif_content)
        print(gif_name + '  OK!')
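
The two string operations inside the loop (taking the name from `alt` and the real GIF address from `src`) can also be pulled out into small functions that are easy to try on their own. This is only a sketch: the function names `extract_name` and `extract_gif_url` are made up here, and it assumes the thumbnail link carries the real address in a `src=` query parameter, which is what the `split('src=')` in the script suggests. It uses the standard library's `urllib.parse` instead of manual splitting.

```python
from urllib.parse import urlsplit, parse_qs


def extract_name(alt):
    """Return the part after the first full-width colon, or the whole alt text.

    Hypothetical helper; mirrors alt.split('：')[1] with a fallback.
    """
    parts = alt.split('：')
    return parts[1] if len(parts) > 1 else alt


def extract_gif_url(src):
    """Return the value of the 'src' query parameter, or None if it is absent.

    Hypothetical helper; assumes a thumbnail link of the form
    '...?src=<real gif url>&w=...'.
    """
    values = parse_qs(urlsplit(src).query).get('src')
    return values[0] if values else None


if __name__ == '__main__':
    # example.com addresses are placeholders, not real links from the site
    print(extract_name('source：ABC-123'))
    print(extract_gif_url('http://example.com/thumb.php?src=http://example.com/a.gif&w=200'))
```

A link without a `src=` parameter gives `None`, so the caller can skip that post without relying on an exception.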

When the script finishes, the GIFs are saved in the current folder.
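
Because the file name comes straight from the page's `alt` text, it may contain characters such as `/` or `:` that a file system rejects. A possible way to guard against that is sketched below. `safe_filename` and `save_gif` are hypothetical helpers that are not part of the script above, and the set of characters replaced is an assumption based on what Windows disallows.

```python
import re
from pathlib import Path


def safe_filename(name, fallback='unnamed'):
    """Replace characters that are commonly invalid in file names with '_'."""
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', '_', name).strip(' .')
    # Fall back to a fixed name if nothing usable is left
    return cleaned or fallback


def save_gif(content, name, folder='.'):
    """Write the downloaded bytes to <folder>/<safe name>.gif and return the path."""
    target = Path(folder) / (safe_filename(name) + '.gif')
    target.write_bytes(content)
    return target
```

In the script, `save_gif(gif_content, gif_name)` could stand in for the `with open(...)` block. Passing a different `folder` keeps the downloads out of the current folder.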

Have tissues ready before use, and remember to drink some Nutri-Express afterwards.