Setting the crawler's User-agent
urllib: the default User-agent is reported as Python-urllib/2.7, and the attempt to modify it has no effect.
>>> import urllib
>>> import urllib2
>>> url = "http://127.0.0.1"
>>> request = urllib2.Request(url)
>>> print request.get_header('User-agent')
Python-urllib/2.7
>>> # Assigning to add_headers only creates an attribute; it does not call add_header()
>>> request.add_headers = ('User-agent','Mozilla/5.0')
>>> print request.get_header('User-agent')
Python-urllib/2.7
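For comparison, here is a minimal sketch in Python 3, where urllib and urllib2 were merged into urllib.request. The choice of Python 3 is an assumption, since the session above is Python 2.7, and the variable names are illustrative. It shows where the Python-urllib/<version> default is stored and one way to replace it for every request sent through an opener.

```python
import urllib.request

# build_opener() returns an OpenerDirector; no network access happens here.
opener = urllib.request.build_opener()

# The opener keeps its default headers as a list of (name, value) pairs.
default_agent = dict(opener.addheaders)["User-agent"]
print(default_agent)  # Python-urllib/<major>.<minor> of the running interpreter

# Replacing the list changes the User-agent for requests opened via this opener.
opener.addheaders = [("User-agent", "Mozilla/5.0")]
custom_agent = dict(opener.addheaders)["User-agent"]
print(custom_agent)
```

Requests would then be sent with opener.open(url) rather than the module-level urlopen(), so that the replaced header list is the one in effect.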
urllib2: the default User-agent is empty (None), and it can be modified.
>>> import urllib2
>>> url = "http://127.0.0.1"
>>> request = urllib2.Request(url)
>>> print request.get_header('User-agent')
None
>>> request.add_header('User-agent','Mozilla/5.0')
>>> print request.get_header('User-agent')
Mozilla/5.0
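The same steps can be sketched for Python 3, assuming urllib.request stands in for urllib2 (the module it was merged into). The names before, after and request_with_headers are illustrative. Building a Request does not open a connection, so the placeholder address is never contacted.

```python
import urllib.request

url = "http://127.0.0.1"  # same placeholder address as in the session above

request = urllib.request.Request(url)

# A fresh Request carries no User-agent header of its own.
before = request.get_header("User-agent")

# add_header() stores the header on this Request object.
request.add_header("User-agent", "Mozilla/5.0")
after = request.get_header("User-agent")

# Alternatively, pass the headers when the Request is created.
request_with_headers = urllib.request.Request(
    url, headers={"User-agent": "Mozilla/5.0"}
)

print(before, after, request_with_headers.get_header("User-agent"))
```

One detail to watch: add_header() stores the name capitalized as User-agent, while get_header() looks the name up exactly as given, so querying with the spelling User-Agent returns None even after the header has been set.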
Reference: http://www.cnblogs.com/semmin/archive/2012/05/29/2523983.html