[No.003-1] Scraping NetEase odds data and importing it into a MySQL database

Date 2021-04-11

Tags python web app url spa .net code blog queue; Category web crawler


#encoding:utf-8
import urllib2
from bs4 import BeautifulSoup

website = "http://caipiao.163.com/order/jczq-hunhe/#from=leftnav"
page = urllib2.urlopen(website)
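# name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(page, "html.parser")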


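'''
Get the set of match numbers and scores, for example.
Score code table:
11 maps to 1:1
70 maps to 勝其餘 (win, other score)
77 maps to 平其餘 (draw, other score)
07 maps to 負其餘 (loss, other score)
So match number and score code combine as 017-10, 017-20, 017-21
'''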
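
To make the code table concrete, here is a minimal sketch of decoding a two-digit score code into its label. It is written for Python 3 rather than the Python 2 of this script, and `decode_score_code` and `SPECIAL_CODES` are hypothetical names that do not appear in the original. The rule that the first digit is the home score and the second the away score is an assumption inferred from `11` mapping to `1:1`.

```python
# Special codes copied from the table above; everything else is assumed
# to read as <home digit>:<away digit>.
SPECIAL_CODES = {
    "70": "勝其餘",  # win, other score
    "77": "平其餘",  # draw, other score
    "07": "負其餘",  # loss, other score
}


def decode_score_code(code):
    """Turn a two-digit score code such as '21' into a label such as '2:1'."""
    if len(code) != 2 or not code.isdigit():
        raise ValueError("expected a two-digit score code, got %r" % (code,))
    if code in SPECIAL_CODES:
        return SPECIAL_CODES[code]
    return "%s:%s" % (code[0], code[1])
```

The special codes are checked first, since `70`, `77` and `07` would otherwise be read as ordinary scores.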
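# Match numbers: screening
screening = []
for item in soup.findAll("span", {"class": "co1"}):
    # keep the bare match number; the newline is added only when writing the file,
    # otherwise it would end up inside every match number + score key built later
    screening.append(item.i.string)

sc = open('sc.txt', 'w')
sc.writelines([number + '\n' for number in screening])
sc.close()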

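# Score labels: bifen (home wins, draws, home losses)
# unicode literals, so they can be joined with the unicode match numbers under Python 2
bifen = [u"1:0",u"2:0",u"2:1",u"3:0",u"3:1",u"3:2",u"4:0",u"4:1",u"4:2",u"5:0",u"5:1",u"5:2",u"勝其餘",
         u"0:0",u"1:1",u"2:2",u"3:3",u"平其餘",
         u"0:1",u"0:2",u"1:2",u"0:3",u"1:3",u"2:3",u"0:4",u"1:4",u"2:4",u"0:5",u"1:5",u"2:5",u"負其餘"]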

# Match number + score label: ccbf
ccbf = []
for item_jtip in screening:
    for item_bifen in bifen:
        ccbf.append(item_jtip+item_bifen)

# Then iterate over ccbf
for item in ccbf:
    print item

# The result set looks like this (a 3-digit match number, then the home score, a colon, and the away score):
0281:1
0282:2
0283:3
028平其餘
0280:1
0280:2
0281:2
0280:3
0281:3
0282:3
0280:4
0281:4
0282:4
0280:5
0281:5
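
Since each key is a fixed-width match number followed by a score label, it can be split back into its parts. The following is a rough Python 3 sketch, not part of the original script; `split_result_key` and `parse_score` are hypothetical helper names, and the fixed width of three digits is taken from the description above.

```python
def split_result_key(key):
    """Split a key such as '0281:1' into ('028', '1:1')."""
    match_no, label = key[:3], key[3:]
    if len(match_no) != 3 or not match_no.isdigit() or not label:
        raise ValueError("unexpected key format: %r" % (key,))
    return match_no, label


def parse_score(label):
    """Return (home, away) as ints, or None for labels that are not exact scores."""
    home, sep, away = label.partition(":")
    if sep and home.isdigit() and away.isdigit():
        return int(home), int(away)
    return None
```

Labels such as 平其餘 carry no exact score, so `parse_score` returns `None` for them instead of guessing.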

#----------------
'''
Python 2 is used here, so str has to be converted to unicode.
For details, see:
http://blog.csdn.net/mindmb/article/details/7898528
'''
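
The problem this note refers to is presumably the following: under Python 2, BeautifulSoup returns unicode objects, while a plain literal such as "平其餘" is a byte str, and joining the two triggers an implicit ASCII decode that fails on non-ASCII bytes. The sketch below shows the same idea in Python 3 terms (bytes versus str), which is a swapped-in setting and not the Python 2 code of this script; `to_text` is a hypothetical helper, and UTF-8 is an assumption based on the encoding declared at the top of the script.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Simulate a byte-string label, as a plain Python 2 str literal would be.
raw_label = "平其餘".encode("utf-8")
match_no = "028"  # text, like the strings BeautifulSoup returns

# Decode before joining, so both operands are text.
combined = match_no + to_text(raw_label)
print(combined)
```

Decoding once at the boundary, right where the bytes enter the program, keeps every later concatenation safe.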

# Build the score-odds data
# Collect the score odds into the list bfpl
bfpl = []
for item in soup.findAll("td",{"gametype":"bf"}):
    bfpl.append(item.find("div").string+'\n')

# Write them to the file bf.txt
bf = open('bf.txt','w')
bf.writelines(bfpl)
bf.close()

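# Combine into a dict keyed by match number + score label
bfdata = dict(zip(ccbf, bfpl))
#--------------------
# Error!!!
# The number of items in bfpl does not match the number in ccbf. Start over with a single list that collects the match number and the score odds together.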
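
One way to approach that rewrite is to pair odds with labels one match at a time and check the counts before zipping, because `dict(zip(...))` silently stops at the shorter input. This is only a sketch in Python 3: `pair_odds` is a hypothetical name, the `(match_no, odds_list)` row shape is an assumption about how the page would be walked, and the odds values in the example are made up for illustration.

```python
def pair_odds(rows, labels):
    """Build {match_no + label: odds} from per-match rows.

    rows is an iterable of (match_no, odds_list) pairs, one per match.
    Matches whose odds count differs from len(labels) are reported
    separately instead of being zipped into misaligned keys.
    """
    paired = {}
    skipped = []
    for match_no, odds in rows:
        if len(odds) != len(labels):
            skipped.append((match_no, len(odds)))
            continue
        for label, value in zip(labels, odds):
            paired[match_no + label] = value
    return paired, skipped


# Shortened label list and invented odds, for illustration only.
labels = ["1:0", "0:0", "0:1"]
rows = [
    ("028", ["6.50", "9.00", "7.25"]),
    ("029", ["5.80", "8.50"]),  # one value missing
]
paired, skipped = pair_odds(rows, labels)
```

Here `skipped` names the matches whose counts are off, so the mismatch can be inspected on the page rather than discovered later as wrong odds under the wrong score.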