Study summary for 2015.12.30

Date: 2019-11-08

Tags: 2015.12.30, study summary


---------2015.12.30----------------

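What I studied: the targeted data crawler lesson in the Python course on 極客學院

What I achieved: used a data crawler to pull the comics off a comic site I like (the code is at the end of this post). Quite a sense of achievement~~~~

Installed the third-party library Requests

Study notes: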

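When installing a Python third-party library hits a wall

Pick this download site (it has almost every third-party library)

http://www.lfd.uci.edu/~gohlke/pythonlibs/

Take the .whl file, change its extension to .zip, and unzip it. Then copy the extracted folder with the shortest name into the Lib folder of the Python installation.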
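A .whl file is an ordinary zip archive, which is why the rename trick works. The same step can be sketched in Python without renaming anything. This is only a rough sketch: `extract_wheel`, the fake `demo` wheel, and the temporary folders are all made up for illustration, and in most setups running `pip install` on the downloaded .whl file is the more usual route.

```python
import os
import tempfile
import zipfile


def extract_wheel(whl_path, target_dir):
    """Unpack a .whl archive into target_dir and return its top-level names."""
    with zipfile.ZipFile(whl_path) as archive:
        archive.extractall(target_dir)
        names = archive.namelist()
    # Top-level entries: the package folder plus the *.dist-info folder
    return sorted({name.split('/')[0] for name in names})


# Build a tiny fake wheel on the fly so the sketch runs anywhere
# (the package name "demo" and its contents are hypothetical)
work_dir = tempfile.mkdtemp()
whl_path = os.path.join(work_dir, 'demo-0.1-py3-none-any.whl')
with zipfile.ZipFile(whl_path, 'w') as archive:
    archive.writestr('demo/__init__.py', 'VALUE = 1\n')
    archive.writestr('demo-0.1.dist-info/METADATA', 'Name: demo\n')

# Extract into a stand-in for the Lib folder
top_level = extract_wheel(whl_path, os.path.join(work_dir, 'Lib'))
print(top_level)
```

The folder with the shortest name (`demo` here) is the importable package; the longer `.dist-info` folder only holds metadata.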

The main thing is to remember three keywords

search, findall, sub

For regular expressions, these are the ones used most often
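To make the three functions concrete, here is a minimal sketch. The sample `html` string is made up for the example and is not the page I actually crawled.

```python
import re

# Hypothetical sample markup, made up for illustration
html = '<title>Demo</title><a href="/a">one</a><a href="/b">two</a>'

# search: finds the first match only and returns a Match object (or None)
title = re.search('<title>(.*?)</title>', html).group(1)

# findall: finds every match and returns a list of the captured strings
links = re.findall('href="(.*?)"', html)

# sub: replaces every match with the given string
no_tags = re.sub('<.*?>', ' ', html)

print(title)
print(links)
print(no_tags)
```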

----[1]-----

# for each in pics_url: # printing inside a loop puts each item on its own line

# print(each) # printed on its own, without the loop, there are no line breaks

----[2]-----

# # text = re.findall('">(.*?)</a></li>', html, re.S) # re.S makes . match newlines as well; use it with care
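The effect of `re.S` is easiest to see side by side. Below is a small sketch on a made-up snippet (not the real page) where the link text spans two lines.

```python
import re

# Hypothetical snippet where the link text contains a newline
html = '<li><a href="/x">first\nsecond</a></li>'

# Without re.S, "." stops at the newline, so nothing matches
without_flag = re.findall('">(.*?)</a></li>', html)

# With re.S, "." also matches the newline, so the whole text is captured
with_flag = re.findall('">(.*?)</a></li>', html, re.S)

print(without_flag)
print(with_flag)
```

I think this is also why it needs care: once `.` can cross line breaks, a pattern may run over several lines and capture more than intended.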

----[3]-----

# # links = re.findall('href="(.*?)"', html, re.S)

# # #print(links) # printing this on its own raised an error

# # SyntaxError: Non-ASCII character '\xe7' in file D:/python_test/hello_word on line 50, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

# # Fix: add a # -*- coding: utf8 -*- declaration at the top of the file

--------------------

While coding I ran into the questions below. I will keep them in mind as I study further. Of course, if anyone can help answer them, even better.

--<1>---

# # title = re.search('<title>(.*?)</title>', html, re.S).group(1)

# Why do I need group? Without it, is the result just a memory address?
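As far as I can tell, `re.search` returns a Match object rather than a string, so printing it directly shows the object's own representation (in Python 2 that includes a memory address) instead of the matched text. `group` is what pulls the text out. A minimal sketch with a made-up `html` string:

```python
import re

html = '<title>Demo page</title>'  # hypothetical sample

match = re.search('<title>(.*?)</title>', html, re.S)

print(match)        # the Match object itself, not the text
whole = match.group(0)   # the entire match, tags included
inner = match.group(1)   # only what the first pair of parentheses captured

print(whole)
print(inner)
```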

--<2>---

# text = re.findall('<ul>(.*?)</ul>', html, re.S)[0] # Why is the [0] required?
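My reading is that `re.findall` always returns a list, even when there is only one match, so `[0]` is needed to take the first string out of that list. A small sketch on made-up markup:

```python
import re

html = '<ul><li>a</li></ul><ul><li>b</li></ul>'  # hypothetical sample

blocks = re.findall('<ul>(.*?)</ul>', html, re.S)  # a list of every match
first = blocks[0]                                  # the first match, as a string

# When nothing matches, the list is empty and [0] would raise IndexError
missing = re.findall('<ol>(.*?)</ol>', html, re.S)

print(blocks)
print(first)
print(missing)
```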

--<3>---

# print('下载漫画中：', each) # The Chinese text shows up as \xe4\xb8\x8b\xe8\xbd\xbd\xe6\xbc\xab\xe7\x94\xbb\xe4\xb8\xad\xef\xbc\x9a ??? At first it also picked up the header image
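A likely explanation, assuming this ran on Python 2: there `print` is a statement, so `print(a, b)` prints a tuple, and a tuple displays the repr of each item, which turns the UTF-8 bytes of the Chinese text into `\x` escapes. The sketch below runs on Python 3 and reproduces the same escapes with explicit bytes; `'some-url'` is a placeholder.

```python
message = '下载漫画中：'
raw = message.encode('utf-8')  # the bytes a Python 2 str literal would hold

escaped = repr(raw)            # the escaped form seen in the question
as_tuple = repr((raw, 'some-url'))  # a tuple shows the repr of each item
decoded = raw.decode('utf-8')  # decoding gives the readable text back

print(escaped)
print(as_tuple)
```

On Python 2, writing `print '...', each` without the parentheses (or printing one value at a time) should avoid the tuple display.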

Attached is my code for scraping the comics

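# -*- coding: utf8 -*-
import re
import requests

# Read the saved page source
with open('shuhui.txt', 'r') as g:
    htmls = g.read()

# Collect the URL of every image on the page
pics_url = re.findall('<img src="(.*?)" alt="', htmls)

# Download each image and save it as 1.jpg, 2.jpg, ...
# (the pics folder must already exist)
i = 1
for each in pics_url:
    print('now is downloading', each)
    pics = requests.get(each)
    with open('pics\\' + str(i) + '.jpg', 'wb') as fb:
        fb.write(pics.content)
    i += 1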
