Modifying the urllib2 source to set a custom User-Agent once and for all

I use the urllib2 library a lot, and almost every time I have to set the User-Agent to a value that mimics a browser.

 

It suddenly occurred to me: could I just edit the source code directly and set the User-Agent value there?

 

A Google search led me to https://docs.python.org/2/library/urllib2.html

which explains:

headers should be a dictionary, and will be treated as if add_header() was called with each key and value as arguments. This is often used to "spoof" the User-Agent header, which is used by a browser to identify itself – some HTTP servers only allow requests coming from common browsers as opposed to scripts. For example, Mozilla Firefox may identify itself as "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11", while urllib2's default user agent string is "Python-urllib/2.6" (on Python 2.6).
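
For comparison, the per-request approach described in that passage might look like the sketch below. The URL is a placeholder, and the import fallback is an assumption added so the snippet also runs on Python 3, where urllib2 became urllib.request. No request is actually sent.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent (assumption for portability)

# Placeholder values for illustration only.
URL = "http://example.com/"
BROWSER_UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36")

# Each key/value pair in `headers` is handled as if add_header() had been called.
req = urllib2.Request(URL, headers={"User-Agent": BROWSER_UA})

# Header names are stored capitalized, so the lookup key is "User-agent".
print(req.get_header("User-agent"))

# urllib2.urlopen(req) would then send the request with this header.
```

Having to repeat this in every script is the chore that motivates changing the default instead.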

So the User-Agent has a default value, and that value depends on the Python version.

 

Locate urllib2.py
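
One way to find the file is to ask the module itself. A small sketch (on Python 3 the corresponding file is urllib/request.py, imported here under the old name as an assumption for portability):

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent

path = urllib2.__file__
# __file__ may point at the compiled .pyc/.pyo; the editable source sits next to it.
if path.endswith((".pyc", ".pyo")):
    path = path[:-1]
print(path)
```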

 

Then search for Python-urllib/ directly in vim

I found it on line 310. The default is

client_version = "Python-urllib/%s" % __version__

Here __version__ is the Python version number, defined on line 120. I simply ignored it when making the change.
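
To see what that line evaluates to on a given interpreter, the default string can be rebuilt from the module's own __version__. This is only an illustrative sketch; the Python 3 import fallback is an assumption, not part of the original setup.

```python
import sys

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent

# Module-level version string, e.g. "2.7" on Python 2.7.
print(urllib2.__version__)

# Same formatting as the client_version line in the library source.
default_ua = "Python-urllib/%s" % urllib2.__version__
print(default_ua)
```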

 

After the change:

client_version = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36'

Simple, right?

Let's test it
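
One way to check the result without sending a request is to read the default header list of a freshly built opener. This is a minimal sketch using only the standard library (the Python 3 import fallback is an assumption). An unmodified install prints Python-urllib/ followed by the version, while a patched urllib2.py should print the Mozilla/5.0 string.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent

# A new opener starts with addheaders = [('User-agent', client_version)].
opener = urllib2.build_opener()
headers = dict(opener.addheaders)

# This is the User-Agent sent when a request does not set its own.
print(headers["User-agent"])
```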
