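There is a discussion of 2048 AI on Stack Overflow: http://stackoverflow.com/questions/22342854/what-is-the-optimal-algorithm-for-the-game-2048
The top-voted answer is based on a minimax tree with alpha-beta pruning, and its heuristic function is very well designed.
In fact, you can write an AI without designing a heuristic function at all. The method I used is a classic from computer Go: Monte Carlo position evaluation plus UCT search.
For an introduction to the algorithm, see a blog post I wrote a few years ago: http://www.cnblogs.com/qswang/archive/2011/08/28/2360489.html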
In short, it comes down to two points:
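For 2048, I made one change to the algorithm: I replaced the Min-Max-Tree with a Random-Max-Tree. The new numbers are added at random rather than by a rational opponent, so I suspected that a Min-Max-Tree would lean toward an overly conservative strategy and not dare to go after bigger gains.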
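To make the difference concrete, here is a minimal sketch of the two ways a node's value can be backed up from the outcomes of the number-adding step. The function names `MinBackup` and `RandomBackup` are hypothetical and do not come from the project. In the project itself the averaging happens implicitly through the running average stored per node.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Min-Max view: assume the number-adding side always picks the outcome
// that is worst for the moving side.
double MinBackup(const std::vector<double>& outcome_values) {
  assert(!outcome_values.empty());
  return *std::min_element(outcome_values.begin(), outcome_values.end());
}

// Random-Max view: the number is added at random, so the node is worth
// the plain average of the sampled outcomes.
double RandomBackup(const std::vector<double>& outcome_values) {
  assert(!outcome_values.empty());
  double sum =
      std::accumulate(outcome_values.begin(), outcome_values.end(), 0.0);
  return sum / static_cast<double>(outcome_values.size());
}
```

Suppose a risky move leads to outcomes worth 10, 200, and 300, and a safe move leads to 50, 60, and 70. `MinBackup` rates them 10 and 50 and so prefers the safe move, while `RandomBackup` rates them 170 and 60 and prefers the risky one.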
The UCT search code:
    Orientation UctPlayer::NextMove(const FullBoard& full_board) const {
      int mc_count = 0;
      while (mc_count < kMonteCarloGameCount) {
        FullBoard current_node;
        Orientation orientation = MaxUcbMove(full_board);
        current_node.Copy(full_board);
        current_node.PlayMovingMove(orientation);
        NewProfit(&current_node, &mc_count);
      }
      return BestChild(full_board);
    }
The NewProfit function updates the records along the path from this node down to a leaf node. It is implemented recursively:
    float UctPlayer::NewProfit(board::FullBoard *node, int* mc_count) const {
      float result;
      HashKey hash_key = node->ZobristHash();
      auto iterator = transposition_table_.find(hash_key);
      if (iterator == transposition_table_.end()) {
        // Position not seen before: evaluate it with one Monte Carlo game
        // and store a fresh record.
        FullBoard copied_node;
        copied_node.Copy(*node);
        MonteCarloGame game(move(copied_node));
        if (!HasGameEnded(*node)) game.Run();
        result = GetProfit(game.GetFullBoard());
        ++(*mc_count);
        NodeRecord node_record(1, result);
        transposition_table_.insert(make_pair(hash_key, node_record));
      } else {
        NodeRecord *node_record = &(iterator->second);
        int visited_times = node_record->VisitedTimes();
        if (HasGameEnded(*node)) {
          // Finished game: reuse the stored average.
          ++(*mc_count);
          result = node_record->AverageProfit();
        } else {
          // Add a number at random, play the move with the highest UCB
          // value, and recurse.
          AddingNumberRandomlyPlayer player;
          AddingNumberMove move = player.NextMove(*node);
          node->PlayAddingNumberMove(move);
          Orientation max_ucb_move = MaxUcbMove(*node);
          node->PlayMovingMove(max_ucb_move);
          result = NewProfit(node, mc_count);
          // Fold the new result into the running average.
          float previous_profit = node_record->AverageProfit();
          float average_profit = (previous_profit * visited_times + result) /
              (visited_times + 1);
          node_record->SetAverageProfit(average_profit);
        }
        node_record->SetVisitedTimes(visited_times + 1);
      }
      return result;
    }
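The body of `MaxUcbMove` is not shown in this post. As a rough illustration only, the textbook UCB1 rule can be sketched as below. `ChildStats`, `Ucb1`, `MaxUcbIndex`, `UpdatedAverage`, and the `exploration` parameter are all hypothetical names chosen for this sketch, and the project's actual formula and constant may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the per-node record kept in the table.
struct ChildStats {
  int visited_times;
  double average_profit;
};

// Textbook UCB1: average profit plus an exploration bonus that shrinks
// as the child is visited more often. Unvisited children score infinity.
double Ucb1(const ChildStats& child, int parent_visits, double exploration) {
  if (child.visited_times == 0) {
    return std::numeric_limits<double>::infinity();
  }
  double log_parent = std::log(static_cast<double>(parent_visits));
  return child.average_profit +
         exploration * std::sqrt(log_parent / child.visited_times);
}

// Index of the child with the highest UCB1 score (first one wins ties).
std::size_t MaxUcbIndex(const std::vector<ChildStats>& children,
                        double exploration) {
  assert(!children.empty());
  int parent_visits = 0;
  for (const ChildStats& child : children) {
    parent_visits += child.visited_times;
  }
  std::size_t best = 0;
  double best_score = -std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < children.size(); ++i) {
    double score = Ucb1(children[i], parent_visits, exploration);
    if (score > best_score) {
      best_score = score;
      best = i;
    }
  }
  return best;
}

// Same running-average update that NewProfit applies to a node record.
double UpdatedAverage(double previous_average, int visited_times,
                      double result) {
  return (previous_average * visited_times + result) / (visited_times + 1);
}
```

The exploration constant has to be scaled to the range of the profit. A value that suits profits between 0 and 1 would be far too small when the profit is a move count in the hundreds.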
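At first I used the largest number at the end of the game as the score. I later found that once the board reached 512, the Monte Carlo games never produced a larger number, so the nodes became indistinguishable. I then changed the score to the number of moves, which improved things greatly.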
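The two candidate scores can both be read off a single random playout. The sketch below is a standalone illustration and is not the project's `MonteCarloGame`. Every name in it (`Board`, `SlideRowLeft`, `Move`, `AddRandomTile`, `RandomPlayout`, `PlayoutResult`) is hypothetical, and it assumes the usual 2048 rules, including a 10% chance that a new tile is a 4.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Board = std::array<std::array<int, 4>, 4>;  // 0 means an empty cell

// Slides one line toward index 0; each tile merges at most once.
std::array<int, 4> SlideRowLeft(const std::array<int, 4>& row) {
  std::array<int, 4> out{};
  int write = 0;
  int pending = 0;  // last tile seen that has not merged yet
  for (int value : row) {
    if (value == 0) continue;
    if (pending == value) {
      out[write++] = 2 * value;
      pending = 0;
    } else {
      if (pending != 0) out[write++] = pending;
      pending = value;
    }
  }
  if (pending != 0) out[write++] = pending;
  return out;
}

// direction: 0 = left, 1 = right, 2 = up, 3 = down
Board Move(const Board& board, int direction) {
  Board result = board;
  for (int i = 0; i < 4; ++i) {
    std::array<int, 4> line{};
    for (int j = 0; j < 4; ++j) {
      int k = (direction % 2 == 0) ? j : 3 - j;  // reversed for right/down
      line[j] = (direction < 2) ? board[i][k] : board[k][i];
    }
    line = SlideRowLeft(line);
    for (int j = 0; j < 4; ++j) {
      int k = (direction % 2 == 0) ? j : 3 - j;
      if (direction < 2) result[i][k] = line[j]; else result[k][i] = line[j];
    }
  }
  return result;
}

// Puts a 2 (90%) or a 4 (10%) on a random empty cell; false if the board is full.
bool AddRandomTile(Board* board, std::mt19937* rng) {
  std::vector<int*> empty;
  for (auto& row : *board)
    for (int& cell : row)
      if (cell == 0) empty.push_back(&cell);
  if (empty.empty()) return false;
  std::uniform_int_distribution<std::size_t> pick(0, empty.size() - 1);
  std::uniform_int_distribution<int> ten(0, 9);
  *empty[pick(*rng)] = (ten(*rng) == 0) ? 4 : 2;
  return true;
}

struct PlayoutResult {
  int move_count;  // score used after the change described above
  int max_tile;    // score used at first
};

// Plays uniformly random legal moves until no direction changes the board.
PlayoutResult RandomPlayout(Board board, std::mt19937* rng) {
  PlayoutResult result{0, 0};
  std::array<int, 4> directions = {0, 1, 2, 3};
  for (;;) {
    std::shuffle(directions.begin(), directions.end(), *rng);
    bool moved = false;
    for (int direction : directions) {
      Board next = Move(board, direction);
      if (next != board) { board = next; moved = true; break; }
    }
    if (!moved) break;
    ++result.move_count;
    AddRandomTile(&board, rng);
  }
  for (const auto& row : board)
    for (int cell : row) result.max_tile = std::max(result.max_tile, cell);
  return result;
}
```

`max_tile` can only take a handful of values (powers of two), so many different playouts tie on it, whereas `move_count` changes by one with every extra move survived and separates nodes more finely.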
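The program is organized into three modules: board, player, and game. board handles the board logic, player handles the logic of moving or adding numbers, and game ties board and player together.
The Game class is declared as follows: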
    class Game {
     public:
      typedef std::unique_ptr<player::AddingNumberPlayer>
          AddingNumberPlayerUniquePtr;
      typedef std::unique_ptr<player::MovingPlayer> MovingPlayerUniquePtr;

      Game(Game &&game) = default;
      virtual ~Game();

      const board::FullBoard& GetFullBoard() const {
        return full_board_;
      }

      void Run();

     protected:
      Game(board::FullBoard &&full_board,
           AddingNumberPlayerUniquePtr &&adding_number_player,
           MovingPlayerUniquePtr &&moving_player);

      virtual void BeforeAddNumber() const { }
      virtual void BeforeMove() const { }

     private:
      board::FullBoard full_board_;
      AddingNumberPlayerUniquePtr adding_number_player_unique_ptr_;
      MovingPlayerUniquePtr moving_player_unique_ptr_;

      DISALLOW_COPY_AND_ASSIGN(Game);
    };
The implementation of the Run function:
    void Game::Run() {
      while (!HasGameEnded(full_board_)) {
        if (full_board_.LastForce() == Force::kMoving) {
          BeforeAddNumber();
          AddingNumberMove move =
              adding_number_player_unique_ptr_->NextMove(full_board_);
          full_board_.PlayAddingNumberMove(move);
        } else {
          BeforeMove();
          Orientation orientation =
              moving_player_unique_ptr_->NextMove(full_board_);
          full_board_.PlayMovingMove(orientation);
        }
      }
    }
This way, you can derive from the Game class and supply different constructors to compose different kinds of Game. For example, the MonteCarloGame constructor:
    MonteCarloGame::MonteCarloGame(FullBoard &&full_board)
        : Game(move(full_board),
               std::move(Game::AddingNumberPlayerUniquePtr(
                   new AddingNumberRandomlyPlayer)),
               std::move(Game::MovingPlayerUniquePtr(
                   new MovingRandomlyPlayer))) {}
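A new 2048 game starts with two numbers already on the board, so a new game should be easy to build. By default the two numbers are added at random. The builder class can be written like this: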
    template<class G>
    class NewGameBuilder {
     public:
      NewGameBuilder();
      ~NewGameBuilder() = default;

      NewGameBuilder& SetLastForce(board::Force last_force);
      NewGameBuilder& SetAddingNumberPlayer(
          game::Game::AddingNumberPlayerUniquePtr &&initialization_player);

      G Build() const;

     private:
      game::Game::AddingNumberPlayerUniquePtr initialization_player_;
    };

    template<class G>
    NewGameBuilder<G>::NewGameBuilder()
        : initialization_player_(game::Game::AddingNumberPlayerUniquePtr(
              new player::AddingNumberRandomlyPlayer)) {
    }

    template<class G>
    NewGameBuilder<G>& NewGameBuilder<G>::SetAddingNumberPlayer(
        game::Game::AddingNumberPlayerUniquePtr &&initialization_player) {
      initialization_player_ = std::move(initialization_player);
      return *this;
    }

    template<class G>
    G NewGameBuilder<G>::Build() const {
      board::FullBoard full_board;

      for (int i = 0; i < 2; ++i) {
        board::AddingNumberMove move =
            initialization_player_->NextMove(full_board);
        full_board.PlayAddingNumberMove(move);
      }

      return G(std::move(full_board));
    }
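Long ago, efficient C++ code discouraged returning a non-heap-allocated object by value from a function. Now that we have rvalue references, this is much more convenient.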
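As a small illustration of that point, the sketch below returns a move-only object by value, much as `Build()` returns a game. `Player`, `FixedPlayer`, `TinyGame`, and `BuildTinyGame` are hypothetical names invented for this example and are not part of the project.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

class Player {
 public:
  virtual ~Player() = default;
  virtual int NextMove() const = 0;
};

// Trivial player that always answers with the same move.
class FixedPlayer : public Player {
 public:
  explicit FixedPlayer(int move) : move_(move) {}
  int NextMove() const override { return move_; }

 private:
  int move_;
};

// Owns its player through a unique_ptr, so it can be moved but not copied.
class TinyGame {
 public:
  explicit TinyGame(std::unique_ptr<Player> player)
      : player_(std::move(player)) {}
  TinyGame(TinyGame&&) = default;
  TinyGame(const TinyGame&) = delete;

  int Play() const { return player_->NextMove(); }

 private:
  std::unique_ptr<Player> player_;
};

static_assert(!std::is_copy_constructible<TinyGame>::value,
              "TinyGame cannot be copied");
static_assert(std::is_move_constructible<TinyGame>::value,
              "TinyGame can be moved");

// Returning by value works because the result is moved (or the move is
// elided), never copied.
TinyGame BuildTinyGame() {
  return TinyGame(std::unique_ptr<Player>(new FixedPlayer(7)));
}
```

A caller simply writes `TinyGame game = BuildTinyGame();` and owns the result, with no heap allocation of the game object itself and no output parameter.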
The main function:
    int main() {
      InitLogConfig();
      AutoGame game = NewGameBuilder<AutoGame>().Build();
      game.Run();
    }
./fool2048:
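This AI's moves are less regular than those of an AI built on hand-crafted heuristics, and it does not keep the largest number pinned in a corner. It still ends up with fairly good results, and the game is more entertaining to watch.
Project page: https://github.com/chncwang/fool2048
Finally, a recruiting link: http://www.kujiale.com/about/join
My work here is mainly on-site search, recommendation algorithms, and so on. Strong candidates are welcome to send a résumé to the HR mailbox.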