Using Python to win the prize money in 《百萬英雄》 and 《衝頂大會》

Date: 2019-11-16



A programmer's approach to 百萬英雄-style quiz games

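Quiz apps of this kind have become popular recently, and my colleague wangtonghe has contributed his Python code to the open-source community. The article below describes his approach; I only tidied it up and published it on the Juejin (掘金) community to share with you.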

Background

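The article 《程序員如何玩轉《衝頂大會》？》 was a big inspiration, but its approach has several weaknesses: it depends on a paid OCR API and then opens Baidu to search for the answer. By the time the page has loaded and we have found the answer, the chance to respond has already passed. I had been studying the helper script for the WeChat game 跳一跳 (Jump Jump) a few days earlier, and the same technique turned out to be useful here.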

Initial approach

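The idea is straightforward: pull a screenshot of the quiz screen from the phone, convert it to text with a Python OCR library, and search Baidu for that text, then pick the words that occur most often in the results. After several attempts, this finds the answer to most questions that are easy to search for.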
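The "most frequent word" matching step can be sketched roughly as follows. This is a minimal illustration rather than the author's code: the helper name `rank_options` and the sample strings are hypothetical, and it assumes the answer options are already available as text alongside the search results.

```python
from collections import Counter


def rank_options(options, result_text):
    """Count how often each answer option occurs in the search-result text.

    Returns (option, count) pairs, most frequent first.
    """
    counts = Counter({option: result_text.count(option) for option in options})
    return counts.most_common()


# Hypothetical search-result snippet and answer options, for illustration only.
snippet = "一畝約等於666.67平方米。常說的一畝地就是666.67平方米左右。"
options = ["666.67", "1000", "100"]
ranking = rank_options(options, snippet)
print(ranking[0])  # ('666.67', 2)
```

Plain substring counting is crude: a short option that is contained in a longer one is counted inside it as well, so in practice the ranking is a hint rather than a guaranteed answer.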

Trial run

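For now the process is manual: each time a question appears, you run the script by hand and it prints the answer. Some questions (for example, how many strokes a character has) cannot be answered by searching, so the success rate is not 100%. Even so, it is usually enough to reach the final round, and with a revive card a win (a "chicken dinner") is all but assured. A screenshot of one such win: [image not preserved]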

Tech stack

The implementation language is Python, and it uses the following libraries:

PIL
pytesseract (image-to-text recognition)
BeautifulSoup (HTML page parsing)

The text recognition engine (Tesseract) must be installed separately; see 《Python人工智能之圖片識別，Python3一行代碼實現圖片文字識別》 and 《mac上文字識別 Tesseract-OCR for mac》.
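Before running the script, it can help to confirm that the engine and the Simplified Chinese language data are actually installed. The sketch below is one possible check, not part of the original code: `tesseract_has_language` is a hypothetical helper, and it assumes the engine is exposed as a `tesseract` executable on the PATH.

```python
import shutil
import subprocess


def tesseract_has_language(lang, binary="tesseract"):
    """Return True if the Tesseract CLI is on PATH and lists `lang`."""
    if shutil.which(binary) is None:
        return False  # engine not installed, or not on PATH
    # `tesseract --list-langs` prints a header line, then one language per line.
    proc = subprocess.run(
        [binary, "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge both streams into stdout
        universal_newlines=True,
    )
    return lang in (line.strip() for line in proc.stdout.splitlines())


print("chi_sim available:", tesseract_has_language("chi_sim"))
```

If this prints `False`, the `lang='chi_sim'` argument used in the main code will not work until the language data is installed.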

The main code is as follows:

import os
from PIL import Image
import pytesseract
from urllib.request import urlopen
import urllib.request
from bs4 import BeautifulSoup

DEFAULT_WIDTH = 720
DEFAULT_HEIGHT = 1280


def main():
    # Crop coordinates of the question area on a 720x1280 screen
    left_top_x = 30
    left_top_y = 200
    right_bottom_x = 680
    right_bottom_y = 380

    # 1. Take a screenshot on the phone and pull it to the computer
    os.system('adb shell screencap -p /sdcard/answer.png')
    os.system('adb pull /sdcard/answer.png answer.png')

    # 2. Crop the question area and run OCR on it
    image = Image.open('answer.png')
    crop_img = image.crop((left_top_x, left_top_y, right_bottom_x, right_bottom_y))
    crop_img.save('crop.png')
    text = pytesseract.image_to_string(crop_img, lang='chi_sim')
    print(text)

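    # 3. Search Baidu Zhidao for the question
    text = text[2:]  # drop the question number (assumes it takes the first two characters)
    # text = '一畝地大約是多少平米'  # sample question for testing without OCR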
    wd = urllib.request.quote(text)
    url = 'https://zhidao.baidu.com/search?ct=17&pn=0&tn=ikaslist&rn=10&fr=wwwt&word={}'.format(
        wd)
    print(url)
    result = urlopen(url)
    body = BeautifulSoup(result.read(), 'html5lib')
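    # Look up each container first: find() returns None when a class is
    # missing, and chaining .find() on None raises AttributeError.
    header = body.find(class_='list-header')
    good_result_div = header.find('dd') if header is not None else None
    inner = body.find(class_='list-inner')
    second_result_div = inner.find(class_='list') if inner is not None else None
    if good_result_div is not None:
        good_result = good_result_div.get_text()
        print(good_result.strip())

    if second_result_div is not None:
        first_dl = second_result_div.find('dl')
        second_dd = first_dl.find('dd') if first_dl is not None else None
        if second_dd is not None:
            print(second_dd.get_text().strip())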


if __name__ == '__main__':
    main()
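The crop coordinates above are hard-coded for a 720x1280 screen, and `DEFAULT_WIDTH` and `DEFAULT_HEIGHT` are defined but never used. One possible way to support other resolutions is to scale the box proportionally, as sketched below. `scale_box` is a hypothetical helper, not part of the original script, and it assumes the app's layout scales linearly with screen size, which may not hold on every device or aspect ratio.

```python
DEFAULT_WIDTH = 720
DEFAULT_HEIGHT = 1280


def scale_box(box, width, height):
    """Scale a (left, top, right, bottom) box defined for 720x1280 to width x height."""
    left, top, right, bottom = box
    x_ratio = width / DEFAULT_WIDTH
    y_ratio = height / DEFAULT_HEIGHT
    return (
        round(left * x_ratio),
        round(top * y_ratio),
        round(right * x_ratio),
        round(bottom * y_ratio),
    )


# The question area from main(), rescaled for a 1080x1920 screenshot.
print(scale_box((30, 200, 680, 380), 1080, 1920))  # (45, 300, 1020, 570)
```

Inside `main()`, the actual screenshot size is available as `image.size` (width, height) once the file has been opened with `Image.open`, so the scaled box could be passed straight to `image.crop`.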
