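Most of the tutorials I have seen online rely on OCR. The strength of that approach is that it is general: it works for any quiz app. Its drawbacks are just as clear: it is slow, and third-party OCR services limit the number of calls. The data interface described in this tutorial makes the data easy to obtain and fast to fetch, but the interface changes over time and has to be kept up to date.
百萬英雄 (Million Hero) is a quiz app that has become very popular recently. Players who answer all 12 questions correctly split the final prize pool. The prizes are decent; I have played a few times, but only ever won small amounts, a few yuan each time. For easy questions whose answers can be found directly on Baidu, having a tool that offers a reference answer in real time is very convenient. So let's get to it!
In the deployed version, the back end on the server does the processing and the front end displays the result; the latency I measured is 3 s.
Why build it this way? Because then other people can use it through a browser as well. Fun is better shared!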
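GitHub repository: https://github.com/Jack-Cherish/python-spider
I assume you already know how to capture packets; I covered it in detail in my mobile app packet-capture tutorial. If you do not, please read that first: http://blog.csdn.net/c406495762/article/details/76850843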
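While a quiz is running, packet capture reveals the interface that delivers the questions.
The captured request shows the parameters involved. The number after heartbeat increases as more rounds are held. Others, such as iid and device_id, are per-user information. The last parameter, rticket, is a timestamp that can be emulated with time.time().
Update, 2018-1-17: according to a friend, the only parameters in the URL that actually matter are heartbeat and rticket; the user information can be left out.
Note: the interface returns data only while a quiz is being broadcast live. When no broadcast is running, you get nothing usable, only garbled text.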
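As a rough sketch of how such a polling URL could be assembled: only the names heartbeat and rticket come from the description above. The host and path are placeholders, the helper name is made up, and passing heartbeat as a query parameter is an assumption; the real endpoint has to be captured from your own device.

```python
import time
from urllib.parse import urlencode

# Placeholder only (assumption): substitute the endpoint you captured yourself.
BASE_URL = "https://example.invalid/captured/path"


def build_api_url(heartbeat, now=None):
    """Return a polling URL carrying a heartbeat counter and a millisecond rticket."""
    if now is None:
        now = time.time()
    params = {
        "heartbeat": heartbeat,      # assumed to be a query parameter; grows with each round
        "rticket": int(now * 1000),  # millisecond timestamp derived from time.time()
    }
    return BASE_URL + "?" + urlencode(params)


print(build_api_url(heartbeat=1))
```

Rebuilding the URL on every poll keeps rticket fresh, which seems safer than computing it once at start-up.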
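Fetch the data through this interface, parse it, then search for the question on Baidu Zhidao (百度知道): simple and efficient. With that idea in place, we can start writing code.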
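The parsing step can be illustrated in isolation. The sketch below assumes, as the script in this post does, that the question text sits between a `*` marker and the first question mark in the response. The function name and the sample strings in the usage lines are made up for illustration.

```python
def extract_question(raw):
    """Return the text between the '*' marker and the first question mark, or None."""
    if "*" not in raw:
        return None  # no question is live right now
    start = raw.index("*")
    # Accept a full-width or a half-width question mark after the marker.
    ends = [raw.find(mark, start) for mark in ("？", "?")]
    ends = [pos for pos in ends if pos != -1]
    if not ends:
        return None
    # Skip the marker and the single character that follows it.
    return raw[start + 2:min(ends)]


sample = "junk*\x0c以下哪个是水果？\x12B苹果B土豆B香蕉P"
print(extract_question(sample))
```

Returning None for "no question yet" lets the caller simply sleep and poll again.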
```python
# -*- coding: utf-8 -*-
"""
This code was written in a hurry. I meant to refactor it and add proper
comments before publishing, but I have been busy, so please tidy it up
yourself. The style is rough; bear with me.
Author: Jack Cui
Website: https://cuijiahua.com
Note: this software is for learning and exchange only. Do not use it commercially!
"""
import os
import re
import time
import urllib.parse
from collections import Counter

import requests
from bs4 import BeautifulSoup  # parses with 'lxml', so lxml must be installed


class BaiWan():
    def __init__(self):
        # Baidu Zhidao search endpoint
        self.baidu = 'http://zhidao.baidu.com/search?'
        # 百萬英雄 endpoint. Everyone's URL is different because it embeds phone
        # information, so it is not published here; capture your own.
        # Questions are welcome at: https://cuijiahua.com/liuyan.html
        self.api = 'https://api-spe-ttl.ixigua.com/xxxxxxx={}'.format(int(time.time()*1000))

    # Poll the API, parse each new question, and search for its answer
    def get_question(self):
        if not os.path.exists('question.txt'):
            with open('question.txt', 'w', encoding='utf-8') as fw:
                fw.write('Million Hero has not posted a question yet, please wait!')
        while True:
            req = requests.get(self.api, verify=False)
            req.encoding = 'utf-8'
            html = req.text
            print(html)
            if '*' not in html:
                # No question is live yet: wait and poll again
                time.sleep(1)
                continue
            question_start = html.index('*')
            # The question ends at the first question mark (full-width, else half-width)
            question_end = html.find('？', question_start)
            if question_end == -1:
                question_end = html.find('?', question_start)
            if question_end == -1:
                time.sleep(1)
                continue
            question = html[question_start:question_end][2:]
            with open('question.txt', 'r', encoding='utf-8') as fr:
                text = fr.readline()
            if text == question:
                # Same question as last time: wait for the next one
                time.sleep(1)
                continue
            print(question)
            with open('question.txt', 'w', encoding='utf-8') as f:
                f.write(question)
            # Keep only the characters that can belong to an option
            temp = re.findall(r'[\u4e00-\u9fa5a-zA-Z0-9\+\-\*/]', html[question_end+1:])
            print(temp)
            # Options are delimited by 'B' and closed by a 'P' near the end
            b_index = []
            for index, each in enumerate(temp):
                if each == 'B':
                    b_index.append(index)
                elif each == 'P' and (len(temp) - index) <= 3:
                    b_index.append(index)
                    break
            if len(b_index) == 4:
                a = ''.join(temp[b_index[0] + 1:b_index[1]])
                b = ''.join(temp[b_index[1] + 1:b_index[2]])
                c = ''.join(temp[b_index[2] + 1:b_index[3]])
                alternative_answers = [a, b, c]
                # For "which of the following" questions, search the options as well
                for word in ('下列', '以下'):
                    if word in question:
                        question = ' '.join([a, b, c, question.replace(word, '')])
                        break
            else:
                alternative_answers = []
            # Search for the answer using the question and the options
            self.search(question, alternative_answers)
            time.sleep(1)

    def search(self, question, alternative_answers):
        print(question)
        print(alternative_answers)
        infos = {"word": question}
        # Call the Baidu endpoint; the query declares ie=gbk, so encode as GBK
        url = self.baidu + 'lm=0&rn=10&pn=0&fr=search&ie=gbk&' + urllib.parse.urlencode(infos, encoding='gbk', errors='ignore')
        print(url)
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.86 Safari/537.36',
        }
        sess = requests.Session()
        req = sess.get(url=url, headers=headers, verify=False)
        req.encoding = 'gbk'
        bf = BeautifulSoup(req.text, 'lxml')
        answers = bf.find_all('dd', class_='dd answer')
        for answer in answers:
            print(answer.text)
        # Recommended answer
        recommend = ''
        if alternative_answers:
            # Each search result votes for the first option it mentions
            statistics = Counter()
            print('\n')
            for answer in answers:
                for each_answer in alternative_answers:
                    if each_answer in answer.text:
                        statistics[each_answer] += 1
                        print(each_answer, end=' ')
                        print('\n')
                        break
            # Negation words, in simplified Chinese to match the quiz feed
            errors = ['没有', '不是', '不对', '不正确', '错误', '不包括', '不包含', '不在', '错']
            if any(word in question for word in errors):
                # Negated question: recommend the first option no result mentioned
                for each_answer in alternative_answers:
                    if each_answer not in statistics:
                        recommend = each_answer
                        break
            elif statistics:
                recommend = statistics.most_common(1)[0][0]
            if recommend:
                print('Recommended answer:', recommend)
        # Write the result to file.txt, which the Node.js back end reads
        with open('file.txt', 'w', encoding='utf-8') as f:
            f.write('Question: ' + question)
            f.write('\n')
            f.write('*' * 50)
            f.write('\n')
            if alternative_answers:
                f.write('Options: ')
                for each_answer in alternative_answers:
                    f.write(each_answer)
                    f.write(' ')
                f.write('\n')
                f.write('*' * 50)
                f.write('\n')
            f.write('Reference answers:\n')
            for answer in answers:
                f.write(answer.text)
                f.write('\n')
            f.write('*' * 50)
            f.write('\n')
            if recommend:
                f.write('Use your own judgment for the final answer!\t')
                f.write('Recommended answer: ' + recommend)


if __name__ == '__main__':
    bw = BaiWan()
    bw.get_question()
```
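One detail of `search()` that is easy to miss is the encoding. The query string declares `ie=gbk`, so the question has to be percent-encoded in a GBK-family encoding rather than UTF-8, which is why the listing passes an `encoding` argument to `urlencode`. A minimal sketch, using a hypothetical helper name and the same fixed query parameters as the listing:

```python
from urllib.parse import urlencode

ZHIDAO_SEARCH = "http://zhidao.baidu.com/search?"


def build_search_url(question):
    """Build a Baidu Zhidao search URL with the question encoded as GBK bytes."""
    # errors="ignore" drops characters GBK cannot represent instead of raising.
    query = urlencode({"word": question}, encoding="gbk", errors="ignore")
    return ZHIDAO_SEARCH + "lm=0&rn=10&pn=0&fr=search&ie=gbk&" + query


print(build_search_url("中文"))
```

Dropping unencodable characters is a design choice made here for robustness; a stricter version could raise instead and let the caller decide.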
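That is all there is to fetching the data and looking up answers. It is simple, and the code is rather messy; experienced readers are welcome to rework it along these lines.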
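The recommendation step is the part most worth isolating. Below is a sketch of the same idea as a pure function: each search snippet votes for the first option it mentions, and for negated questions the pick is an option that received no votes. The function name and the sample data in the usage lines are made up, and the negation words are written in simplified Chinese on the assumption that the quiz feed uses it.

```python
from collections import Counter

NEGATION_WORDS = ["没有", "不是", "不对", "不正确", "错误", "不包括", "不包含", "不在", "错"]


def recommend(question, options, snippets):
    """Pick an option by counting which options the search snippets mention."""
    votes = Counter()
    for snippet in snippets:
        for option in options:
            if option in snippet:
                votes[option] += 1
                break  # one vote per snippet: the first matching option wins
    if any(word in question for word in NEGATION_WORDS):
        # Negated question: prefer the first option that no snippet mentioned.
        for option in options:
            if option not in votes:  # membership test against the counted keys
                return option
        return ""
    if votes:
        return votes.most_common(1)[0][0]
    return ""


print(recommend("中国的首都是哪里", ["上海", "北京", "广州"],
                ["首都是北京", "北京是首都", "上海很大"]))
```

Keeping this logic free of network and file access makes it easy to try against hand-written snippets before a live quiz.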
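I had never done back-end or front-end work before, so I spent a day learning as I went. The JavaScript was picked up the same way: code found through Baidu and debugged until it worked. There are probably plenty of crude or incorrect usages, so please bear with me. Anyone interested is welcome to improve it.
These are some of the articles I read at the time:
Node.js and Socket.IO communication basics: 菜鳥學習nodejs--Socket.IO即時通信 (Learning Node.js as a beginner: Socket.IO instant messaging)
Reading a txt file line by line in Node.js: Line-Reader
Scheduled tasks in Node.js: Node-Schedule
Back end, app.js: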
```javascript
var http = require('http');
var fs = require('fs');
var schedule = require('node-schedule');
var lineReader = require('line-reader');

var MAX_LINES = 25;   // index.html has 25 <p> slots: line1 .. line25
var message = {};     // latest snapshot of file.txt, keyed line1 .. line25

// Serve index.html for every request
var server = http.createServer(function (req, res) {
    fs.readFile('./index.html', function (error, data) {
        if (error) {
            res.writeHead(500, {'Content-Type': 'text/plain'});
            res.end('Cannot read index.html');
            return;
        }
        res.writeHead(200, {'Content-Type': 'text/html'});
        res.end(data, 'utf-8');
    });
}).listen(80);
console.log('Server running!');

// Read file.txt line by line into a fresh snapshot, then swap it in.
// eachLine is asynchronous, so the padding happens inside the callback.
function messageGet() {
    if (!fs.existsSync('file.txt')) {
        return;   // the Python script has not written anything yet
    }
    var fresh = {};
    var count = 0;
    lineReader.eachLine('file.txt', function (line, last) {
        count++;
        fresh['line' + count] = line;
        if (last || count >= MAX_LINES) {
            // Mark unused slots with 'f'; the page renders them as empty
            for (var i = count + 1; i <= MAX_LINES; i++) {
                fresh['line' + i] = 'f';
            }
            message = fresh;
            return false;   // stop reading
        }
    });
}

// Refresh the snapshot every second (valid seconds are 0..59)
var rule = new schedule.RecurrenceRule();
var times = [];
for (var i = 0; i < 60; i++) {
    times.push(i);
}
rule.second = times;
schedule.scheduleJob(rule, function () {
    messageGet();
});

// Push the current snapshot to every client that connects
var io = require('socket.io').listen(server);
io.sockets.on('connection', function (socket) {
    socket.emit('users', message);
    socket.broadcast.emit('users', message);
    socket.on('disconnect', function () {
        console.log('User disconnected');
    });
});
```
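The hand-off between the Python script and the page is just a text file: each of the first 25 lines of `file.txt` becomes `line1` through `line25`, and unused slots carry the sentinel `'f'`, which the page renders as empty. The same mapping, sketched in Python purely for illustration (the function name is hypothetical; the back end in this post does this in Node.js):

```python
def build_snapshot(lines, slots=25, empty="f"):
    """Map up to `slots` text lines onto the keys line1..lineN, padding with `empty`."""
    snapshot = {}
    for i in range(1, slots + 1):
        snapshot["line%d" % i] = lines[i - 1] if i <= len(lines) else empty
    return snapshot


snapshot = build_snapshot(["Question: ...", "Options: ..."])
print(snapshot["line1"], snapshot["line3"])
```

A fixed number of slots keeps the page trivial, at the cost of silently truncating anything past line 25.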
Front end, index.html:
```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta http-equiv="refresh" content="2">
<title>Jack Cui Quiz Assistant</title>
</head>
<body>
<h1>Million Hero Quiz Assistant</h1>
<p id="line1"></p>
<p id="line2"></p>
<p id="line3"></p>
<p id="line4"></p>
<p id="line5"></p>
<p id="line6"></p>
<p id="line7"></p>
<p id="line8"></p>
<p id="line9"></p>
<p id="line10"></p>
<p id="line11"></p>
<p id="line12"></p>
<p id="line13"></p>
<p id="line14"></p>
<p id="line15"></p>
<p id="line16"></p>
<p id="line17"></p>
<p id="line18"></p>
<p id="line19"></p>
<p id="line20"></p>
<p id="line21"></p>
<p id="line22"></p>
<p id="line23"></p>
<p id="line24"></p>
<p id="line25"></p>
<script src="https://222.222.124.77:9001/jquery.min.js"></script>
<script src="/socket.io/socket.io.js"></script>
<script>
    // Replace with the address of your own server
    var socket = io.connect('http://YOUR_IP:PORT');
    socket.on('users', function (data) {
        // Fill line1 .. line25; 'f' marks an unused slot
        for (var i = 1; i <= 25; i++) {
            var value = data['line' + i];
            var el = document.getElementById('line' + i);
            // textContent, not innerHTML: the text comes from scraped web pages
            el.textContent = (value === 'f' || value === undefined) ? '' : value;
        }
    });
</script>
</body>
</html>
```
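Deploy all of this to a server.
Once it is deployed, start the Node.js service with:

```shell
node app.js
```

Then run the Python 3 script:

```shell
python3 baiwan.py
```

If everything is set up correctly, the 百萬英雄 quiz assistant is now up and running!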