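Deep Learning: Building a Game-Playing AI, Starting with Gomoku

It has been a long time since I last wrote a blog post. How long? About 8 years??? I have recently picked writing back up... I've been tinkering with AI lately, so here is something AI-related for my teammates to read.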

 

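After all these years of machine learning, from classification to clustering, from naive Bayes to SVM, from neural networks to deep learning, I have used it countless times in all sorts of mysterious projects, yet everything I did still felt too far removed from everyday life. With the recent release of AlphaGo Zero, deep learning is hot again, and my teammates could not hold back their excitement: they want to build a game AI. All right then, let's start with Gomoku (five in a row), which has simple rules and suits all ages.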

 

Okay, enough chit-chat. On to lesson one: implementing Gomoku.

 

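Teammates: (ten thousand words of complaining omitted here) You promised deep learning, so why are we suddenly writing a Gomoku program? Dude, you're not playing by the script...

Me: A craftsman who wants to do good work must first sharpen his tools. If we want a Gomoku AI and do not even have a board, what AI are we talking about? A hammer of an AI!

Lao Luo: You called?

……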

 

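Gomoku comes in two variants: with forbidden moves and without. We will first implement the plain version without forbidden moves as our example, because this choice does not affect building an AI. As a side note, without forbidden moves Black (who moves first) has a forced win. As this became known through tournaments and research, people started looking for ways to limit Black's first-move advantage. Hence the forbidden-move rules, under which Black may not play a double three, a double four, or an overline. But as the study of tournament results continued, it turned out that even with these restrictions Black still wins from certain openings: among the direct openings Kagetsu (花月), Sangetsu (山月), Ungetsu (雲月), Keigetsu (溪月), and Kansei (寒星), and among the indirect openings Meigetsu (名月), Hogetsu (浦月), Kosei (恆星), Kyogetsu (峽月), and Rangetsu (嵐月), are all wins for Black. Japanese players then proposed the ideas of swapping sides and offering alternative moves, which later developed into the rules used in international competition: a swap option after the third move, and two candidate fifth moves. These prevent the player holding Black from playing a sure-win opening or a sure-win fifth move. So the conclusion is that under informal rules, or without forbidden moves, Black does have a forced win.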

 

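(1) Implementing the Gomoku game logic

We use Python here, because the machine learning libraries we use later are also in Python, which keeps things convenient.

The UI and the logic must be decoupled. That is beyond question, and since we will train an AI later, the separation is mandatory. So we start by implementing the Gomoku logic.

Gomoku is played on a 15*15 board, and each intersection (or cell) is in one of 3 states: empty, black, or white. So first create a file, consts.py,

with the following definitions: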

from enum import Enum

N = 15

class ChessboardState(Enum):
    EMPTY = 0
    BLACK = 1
    WHITE = 2

 

 

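For the board state, we start with a 15*15 two-dimensional array, chessMap, and create a class in gobang.py.

currentI, currentJ, and currentState hold the coordinates and color of the most recent move. Add a get and a set function and the basic skeleton is done. The code is as follows: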

from enum import Enum
from consts import *

class GoBang(object):
    def __init__(self):
        self.__chessMap = [[ChessboardState.EMPTY for j in range(N)] for i in range(N)]
        self.__currentI = -1
        self.__currentJ = -1
        self.__currentState = ChessboardState.EMPTY

    def get_chessMap(self):
        return self.__chessMap

    def get_chessboard_state(self, i, j):
        return self.__chessMap[i][j]

    def set_chessboard_state(self, i, j, state):
        self.__chessMap[i][j] = state
        self.__currentI = i
        self.__currentJ = j
        self.__currentState = state

 

 

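This way the UI can call the get function to read each cell's state and decide whether to draw a stone and which kind. Each time a stone is played, the state of the corresponding cell in the board map is set by its coordinates.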
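To make the "UI reads state through the getter" idea concrete, here is a minimal text-mode sketch. It is not the renderer from this series: `render_text` and `SYMBOLS` are hypothetical names invented for illustration, and a plain 2D list stands in for what `get_chessMap()` returns.

```python
from enum import Enum

N = 15  # board size, as in consts.py


class ChessboardState(Enum):
    EMPTY = 0
    BLACK = 1
    WHITE = 2


# Hypothetical display symbols; any mapping would do.
SYMBOLS = {
    ChessboardState.EMPTY: ".",
    ChessboardState.BLACK: "X",
    ChessboardState.WHITE: "O",
}


def render_text(chess_map):
    """Return the board as N lines of N symbols, one line per row."""
    return "\n".join("".join(SYMBOLS[cell] for cell in row) for row in chess_map)


# Stand-in for GoBang.get_chessMap(): an empty board with two stones on it.
board = [[ChessboardState.EMPTY for _ in range(N)] for _ in range(N)]
board[7][7] = ChessboardState.BLACK
board[7][8] = ChessboardState.WHITE

text = render_text(board)
print(text)
```

Because the function only reads cell states, the same board object could later be handed to a graphical renderer or to an AI without touching the game logic.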

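So for basic display and play, the logic above is enough. What next? After each move sets the state of a cell, we need to check whether that move wins the game. That takes two more functions. The idea is to look from the current position in 8 directions (east, south, west, north, southeast, southwest, northwest, northeast), that is, along 4 axes, and see whether there are at least 5 consecutive stones of the same color. Suppose the latest stone is in the center of the board; the positions to check form the 米-shaped pattern (one horizontal line, one vertical line, and two diagonals) shown in the figure below.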

 

 

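So how do we write the code? The dumbest way is to translate the description literally. Take the horizontal axis: count how many consecutive same-colored stones lie to the left of the current position, then how many lie to the right. Left plus right plus the current stone is the run length on the horizontal axis, and if it is at least 5, that player wins.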

    def have_five(self, current_i, current_j):
        # Run length on the horizontal axis, starting with the stone just played
        hcount = 1

        temp = ChessboardState.EMPTY

        # Horizontal, leftward: from (current_j - 1) down to 0
        for j in range(current_j - 1, -1, -1):
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        # Horizontal, rightward: from (current_j + 1) up to N - 1
        for j in range(current_j + 1, N):
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        # Horizontal result
        if hcount >= 5:
            return True
        return False
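The same "count left, count right, add them up" idea can be seen in isolation on a single row. The sketch below is only an illustration: `run_length` is a hypothetical helper, and plain characters stand in for the `ChessboardState` values.

```python
def run_length(row, j):
    """Length of the same-valued run in `row` that passes through index j."""
    color = row[j]
    count = 1  # the stone at j itself
    # Walk left from j - 1 down to 0 until the color changes.
    for k in range(j - 1, -1, -1):
        if row[k] != color:
            break
        count += 1
    # Walk right from j + 1 up to the end of the row.
    for k in range(j + 1, len(row)):
        if row[k] != color:
            break
        count += 1
    return count


# "B" = black, "W" = white, "." = empty (characters used only for this demo)
row = list("..WBBBBB.W.....")
length = run_length(row, 5)  # index 5 sits inside the black run at 3..7
print(length, length >= 5)
```

The caller is assumed to pass the index of a stone that was just played; calling it on an empty cell would simply measure a run of empty cells.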

 

 

And so on: next check the vertical axis, then the left diagonal, then the right diagonal. The have_five function now looks like this:

    def have_five(self, current_i, current_j):
        # Run lengths on the four axes: horizontal, vertical, left diagonal, right diagonal
        hcount = 1
        vcount = 1
        lbhcount = 1
        rbhcount = 1

        temp = ChessboardState.EMPTY

        # Horizontal, leftward: from (current_j - 1) down to 0
        for j in range(current_j - 1, -1, -1):
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        # Horizontal, rightward: from (current_j + 1) up to N - 1
        for j in range(current_j + 1, N):
            temp = self.__chessMap[current_i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            hcount = hcount + 1
        # Horizontal result
        if hcount >= 5:
            return True

        # Vertical, upward: from (current_i - 1) down to 0
        for i in range(current_i - 1, -1, -1):
            temp = self.__chessMap[i][current_j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            vcount = vcount + 1
        # Vertical, downward: from (current_i + 1) up to N - 1
        for i in range(current_i + 1, N):
            temp = self.__chessMap[i][current_j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            vcount = vcount + 1
        # Vertical result
        if vcount >= 5:
            return True

        # Left diagonal, toward the upper left
        for i, j in zip(range(current_i - 1, -1, -1), range(current_j - 1, -1, -1)):
            temp = self.__chessMap[i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            lbhcount = lbhcount + 1
        # Left diagonal, toward the lower right
        for i, j in zip(range(current_i + 1, N), range(current_j + 1, N)):
            temp = self.__chessMap[i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            lbhcount = lbhcount + 1
        # Left diagonal result
        if lbhcount >= 5:
            return True

        # Right diagonal, toward the upper right
        for i, j in zip(range(current_i - 1, -1, -1), range(current_j + 1, N)):
            temp = self.__chessMap[i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            rbhcount = rbhcount + 1
        # Right diagonal, toward the lower left
        for i, j in zip(range(current_i + 1, N), range(current_j - 1, -1, -1)):
            temp = self.__chessMap[i][j]
            if temp == ChessboardState.EMPTY or temp != self.__currentState:
                break
            rbhcount = rbhcount + 1
        # Right diagonal result
        if rbhcount >= 5:
            return True

        return False

 

 

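So are we done, with all the Gomoku logic implemented?

NO, don't celebrate too soon. Honestly, that code makes me feel sick. It is hideous. Look again: all those repeated for loops, all those ifs, all those duplicated blocks. Give me a minute to go throw up...

All right, let's think about how to fix it. There are 4 axes, and the code for each one is a repeat, right? And each axis is counted in its positive and negative directions and then summed, so the two directions are repeats too, right?

So can we write the code for a single direction, call it twice per axis, and do that for each of the 4 axes? 2*4=8, so let's try to get it done in 8 lines of code in total.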

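Because of the two diagonal axes at 45° and 135°, the direction has to be controlled by a sign on both the x axis and the y axis. So we can first write a function that counts along a given direction:

xdirection=0, ydirection=1 counts along the positive y axis;

xdirection=0, ydirection=-1 counts along the negative y axis;

xdirection=1, ydirection=1 counts along the positive direction of the 45° diagonal;

……

I won't list them all. Add the boundary checks and we get the following function: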

    def count_on_direction(self, i, j, xdirection, ydirection, color):
        count = 0
        for step in range(1, 5):  # besides the current position, look 4 more steps in this direction
            if xdirection != 0 and (j + xdirection * step < 0 or j + xdirection * step >= N):
                break
            if ydirection != 0 and (i + ydirection * step < 0 or i + ydirection * step >= N):
                break
            if self.__chessMap[i + ydirection * step][j + xdirection * step] == color:
                count += 1
            else:
                break
        return count
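To see what this function returns for a concrete position, here is a standalone variant that can be run on its own. It is a sketch under a few assumptions: the board is passed in as an argument instead of living on `self`, plain integers stand in for `ChessboardState`, and the two boundary checks are folded into one range test.

```python
N = 15
EMPTY, BLACK, WHITE = 0, 1, 2  # plain ints standing in for ChessboardState


def count_on_direction(board, i, j, xdirection, ydirection, color):
    """Count consecutive `color` stones starting next to (i, j), up to 4 steps."""
    count = 0
    for step in range(1, 5):
        row = i + ydirection * step
        col = j + xdirection * step
        if not (0 <= row < N and 0 <= col < N):
            break  # walked off the board
        if board[row][col] != color:
            break  # run interrupted
        count += 1
    return count


board = [[EMPTY] * N for _ in range(N)]
# A diagonal run of black stones from (3, 3) to (7, 7).
for k in range(3, 8):
    board[k][k] = BLACK

# Standing on (5, 5): two stones toward the upper left, two toward the lower right.
up_left = count_on_direction(board, 5, 5, -1, -1, BLACK)
down_right = count_on_direction(board, 5, 5, 1, 1, BLACK)
print(up_left, down_right, 1 + up_left + down_right)
```

Looking only 4 steps out is enough for a win check, since 4 neighbours plus the stone itself already make five. It also means a longer line still counts as a win, which matches the no-forbidden-moves variant used here.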

 

 

And so the earlier have_five gets a little better looking and becomes:

    def have_five(self, i, j, color):
        # Counters for the four axes: horizontal, vertical, and the two diagonals
        hcount = 1
        vcount = 1
        lbhcount = 1
        rbhcount = 1

        hcount += self.count_on_direction(i, j, -1, 0, color)
        hcount += self.count_on_direction(i, j, 1, 0, color)
        if hcount >= 5:
            return True

        vcount += self.count_on_direction(i, j, 0, -1, color)
        vcount += self.count_on_direction(i, j, 0, 1, color)
        if vcount >= 5:
            return True

        lbhcount += self.count_on_direction(i, j, -1, 1, color)
        lbhcount += self.count_on_direction(i, j, 1, -1, color)
        if lbhcount >= 5:
            return True

        rbhcount += self.count_on_direction(i, j, -1, -1, color)
        rbhcount += self.count_on_direction(i, j, 1, 1, color)
        if rbhcount >= 5:
            return True

        return False

 

 

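Still a long row of repeated code. I still think it is ugly. I'm really not a Virgo, but this function is truly ugly. Can we make it a bit more handsome? Of course. Collapse the 4 repeated blocks into one and loop over it 4 times. Can that work? Sure, let's do it. And so have_five gets a little prettier again: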

    def have_five(self, i, j, color):
        # Four axes, each given as a pair of opposite (xdirection, ydirection) steps
        directions = [[(-1, 0), (1, 0)],
                      [(0, -1), (0, 1)],
                      [(-1, 1), (1, -1)],
                      [(-1, -1), (1, 1)]]

        for axis in directions:
            axis_count = 1
            for (xdirection, ydirection) in axis:
                axis_count += self.count_on_direction(i, j, xdirection, ydirection, color)
                if axis_count >= 5:
                    return True

        return False

 

 

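Much better. Now we have the logic for checking for 5 stones of the same color. Add one more function that returns the result to the UI layer, and the logic code is more or less complete: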

    def get_chess_result(self):
        if self.have_five(self.__currentI, self.__currentJ, self.__currentState):
            return self.__currentState
        else:
            return ChessboardState.EMPTY
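As a rough picture of how a caller might drive these functions, the sketch below plays a scripted sequence of moves. Several things here are assumptions rather than part of the original code: the `GoBang` class is a condensed copy so the block runs on its own, `play` is a hypothetical driver, and the occupied-cell check is an extra guard that the original setter does not perform.

```python
from enum import Enum

N = 15


class ChessboardState(Enum):
    EMPTY = 0
    BLACK = 1
    WHITE = 2


class GoBang:
    """Condensed stand-in for the class in this post, so the sketch is self-contained."""

    AXES = [[(-1, 0), (1, 0)], [(0, -1), (0, 1)],
            [(-1, 1), (1, -1)], [(-1, -1), (1, 1)]]

    def __init__(self):
        self.board = [[ChessboardState.EMPTY] * N for _ in range(N)]
        self.last = (-1, -1, ChessboardState.EMPTY)

    def get_chessboard_state(self, i, j):
        return self.board[i][j]

    def set_chessboard_state(self, i, j, state):
        self.board[i][j] = state
        self.last = (i, j, state)

    def count_on_direction(self, i, j, dx, dy, color):
        count = 0
        for step in range(1, 5):
            r, c = i + dy * step, j + dx * step
            if not (0 <= r < N and 0 <= c < N) or self.board[r][c] != color:
                break
            count += 1
        return count

    def get_chess_result(self):
        i, j, color = self.last
        for axis in self.AXES:
            if 1 + sum(self.count_on_direction(i, j, dx, dy, color)
                       for dx, dy in axis) >= 5:
                return color
        return ChessboardState.EMPTY


def play(game, moves):
    """Hypothetical driver: apply (i, j) moves alternately, black first."""
    color = ChessboardState.BLACK
    for i, j in moves:
        if game.get_chessboard_state(i, j) != ChessboardState.EMPTY:
            raise ValueError(f"cell ({i}, {j}) is already occupied")
        game.set_chessboard_state(i, j, color)
        if game.get_chess_result() != ChessboardState.EMPTY:
            return color
        color = (ChessboardState.WHITE if color == ChessboardState.BLACK
                 else ChessboardState.BLACK)
    return ChessboardState.EMPTY  # nobody has five in a row yet


# Black fills row 7, columns 3..7; white answers on row 8.
moves = [(7, 3), (8, 3), (7, 4), (8, 4), (7, 5), (8, 5), (7, 6), (8, 6), (7, 7)]
winner = play(GoBang(), moves)
print(winner)
```

Returning `ChessboardState.EMPTY` for "no winner yet" mirrors get_chess_result, so the caller only has to compare against a single value after each move.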

 

 

And with that, the Gomoku logic is finished. The complete gobang.py is as follows:

#coding:utf-8

from enum import Enum
from consts import *

class GoBang(object):
    def __init__(self):
        self.__chessMap = [[ChessboardState.EMPTY for j in range(N)] for i in range(N)]
        self.__currentI = -1
        self.__currentJ = -1
        self.__currentState = ChessboardState.EMPTY

    def get_chessMap(self):
        return self.__chessMap

    def get_chessboard_state(self, i, j):
        return self.__chessMap[i][j]

    def set_chessboard_state(self, i, j, state):
        self.__chessMap[i][j] = state
        self.__currentI = i
        self.__currentJ = j
        self.__currentState = state

    def get_chess_result(self):
        if self.have_five(self.__currentI, self.__currentJ, self.__currentState):
            return self.__currentState
        else:
            return ChessboardState.EMPTY

    def count_on_direction(self, i, j, xdirection, ydirection, color):
        count = 0
        for step in range(1, 5):  # besides the current position, look 4 more steps in this direction
            if xdirection != 0 and (j + xdirection * step < 0 or j + xdirection * step >= N):
                break
            if ydirection != 0 and (i + ydirection * step < 0 or i + ydirection * step >= N):
                break
            if self.__chessMap[i + ydirection * step][j + xdirection * step] == color:
                count += 1
            else:
                break
        return count

    def have_five(self, i, j, color):
        # Four axes, each given as a pair of opposite (xdirection, ydirection) steps
        directions = [[(-1, 0), (1, 0)],
                      [(0, -1), (0, 1)],
                      [(-1, 1), (1, -1)],
                      [(-1, -1), (1, 1)]]

        for axis in directions:
            axis_count = 1
            for (xdirection, ydirection) in axis:
                axis_count += self.count_on_direction(i, j, xdirection, ydirection, color)
                if axis_count >= 5:
                    return True

        return False

 

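Background voice: Dude, after holding it in for ages, all you squeezed out is fewer than 60 lines of code?

Me: It is not about how much code there is; what counts is that it works...

 

Tomorrow I will add a render to it so there is a front-end UI, and then it is a simple but complete game. As for the AI, don't rush.

All right, that's it for now...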

 

Source: http://www.cnblogs.com/erwin/p/7828956.html
