Reading Notes: Game Theory: An Introduction - 12 - Static Games of Incomplete Information: Bayesian Games

Date: 2020-07-21



Bayesian Games

These are study notes on Game Theory: An Introduction by Steven Tadelis.

Static Games of Incomplete Information

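In a game of incomplete information, some features of the game are not common knowledge among the players: a player may not know the other players' action sets, outcome sets, or payoff functions. The standard treatment of such games is due to Harsanyi, who introduced two concepts to handle the problem.

type space: the information an opponent hides (action sets, outcome sets, payoff functions, and so on) is encoded as a set of possible types, where everything about each individual type is known.

belief: since a player does not know which type an opponent actually is, a probability distribution describes how likely each type is. Expected payoffs can then be computed from these probabilities.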

The normal-form representation of a static Bayesian game of incomplete information is

$$ \left\langle N, \{A_i\}_{i=1}^n, \{\Theta_i\}_{i=1}^n, \{v_i(\cdot; \theta_i), \theta_i \in \Theta_i\}_{i=1}^n, \{\phi_i\}_{i=1}^n \right\rangle $$

where
$N = \{1, 2, \cdots, n\}$ is the set of players;
$A_i$ is the action set of player $i$;
$\Theta_i$ is the type space of player $i$;
$v_i : A \times \Theta_i \to \mathbb{R}$ is the type-dependent payoff function of player $i$;
$\phi_i$ is the belief of player $i$ with respect to the uncertainty over the other players' types;
$\phi_i(\theta_{-i} | \theta_i)$ is the posterior conditional distribution on $\theta_{-i}$, the types of all other players, given that player $i$ knows his own type $\theta_i$.
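
A static Bayesian game of incomplete information proceeds as follows:
1. Nature chooses a profile of types $(\theta_1, \theta_2, \cdots, \theta_n)$.
2. Each player $i$ learns his own type $\theta_i$ and uses his prior $\phi_i$ to form beliefs over the other players' types.
3. The players choose their actions simultaneously.
4. Given the players' actions $a = (a_1, a_2, \cdots, a_n)$, each player $i$ receives the payoff $v_i(a; \theta_i)$.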
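
The four steps above can be sketched as a small simulation. This is only an illustration: the two-player game, the uniform prior, the strategies, and the function names `play_once` and `coordination_payoff` are assumptions made up for the example, not taken from the book.

```python
import random

def play_once(joint_prior, strategies, payoffs, rng):
    """Simulate one play of a static Bayesian game (illustrative sketch)."""
    # Step 1: Nature draws a type profile from the prior.
    profiles = list(joint_prior)
    weights = [joint_prior[p] for p in profiles]
    theta = rng.choices(profiles, weights=weights, k=1)[0]
    # Steps 2-3: each player sees only their own type and acts on it.
    actions = tuple(strategies[i][theta[i]] for i in range(len(theta)))
    # Step 4: each payoff depends on the action profile and the player's own type.
    realized = tuple(payoffs[i](actions, theta[i]) for i in range(len(theta)))
    return theta, actions, realized

def coordination_payoff(actions, own_type):
    # Hypothetical payoff: matching actions pay 2 to an "H" type and 1 to an "L" type.
    bonus = 2 if own_type == "H" else 1
    return bonus if actions[0] == actions[1] else 0

# Hypothetical uniform prior over type profiles (theta_1, theta_2).
joint_prior = {("H", "H"): 0.25, ("H", "L"): 0.25, ("L", "H"): 0.25, ("L", "L"): 0.25}
# A pure strategy maps each of a player's types to an action.
strategies = [{"H": "C", "L": "D"}, {"H": "C", "L": "D"}]
payoffs = [coordination_payoff, coordination_payoff]

theta, actions, realized = play_once(joint_prior, strategies, payoffs, random.Random(0))
print(theta, actions, realized)
```

Note that a player's action is looked up from their own type alone, which is what makes the other players' types private information in the simulation.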
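
Conditional probability: given that event $S$ has occurred (with $\phi(S) > 0$), the conditional probability that event $H$ has occurred is

$$ \Pr\{H | S\} = \frac{\phi(S \land H)}{\phi(S)} $$

where $\phi$ denotes the prior probability of an event.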
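
As a worked illustration of this formula, the sketch below derives a player's posterior belief from a joint prior over type profiles. The prior values and the helper name `posterior` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from fractions import Fraction

# Hypothetical joint prior over (theta_1, theta_2); the numbers are made up.
joint_prior = {
    ("H", "H"): Fraction(3, 8),
    ("H", "L"): Fraction(1, 8),
    ("L", "H"): Fraction(1, 8),
    ("L", "L"): Fraction(3, 8),
}

def posterior(joint, own_type):
    """Player 1's belief over player 2's type, given player 1's own type."""
    # phi(S): prior probability of the conditioning event theta_1 == own_type.
    p_own = sum(p for (t1, _), p in joint.items() if t1 == own_type)
    if p_own == 0:
        raise ValueError("conditioning event has zero prior probability")
    # phi(S and H) / phi(S) for each possible type of player 2.
    return {t2: p / p_own for (t1, t2), p in joint.items() if t1 == own_type}

belief_H = posterior(joint_prior, "H")
print(belief_H)
```

Here the event $S$ is "player 1 is of type H", which has prior probability 1/2, so the belief that player 2 is of type H is (3/8)/(1/2) = 3/4.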
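
Static Bayesian game of incomplete information: pure strategies. Given the game

$$ \left\langle N, \{A_i\}_{i=1}^n, \{\Theta_i\}_{i=1}^n, \{v_i(\cdot; \theta_i), \theta_i \in \Theta_i\}_{i=1}^n, \{\phi_i\}_{i=1}^n \right\rangle $$

a pure strategy for player $i$ is a function $s_i : \Theta_i \to A_i$ that assigns an action $s_i(\theta_i) \in A_i$ to each of his types $\theta_i$.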
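
Because a pure strategy is a map from types to actions, a player with finitely many types and actions has $|A_i|^{|\Theta_i|}$ pure strategies. A minimal sketch of the enumeration follows; the type and action labels and the helper name `pure_strategies` are assumptions for illustration.

```python
from itertools import product

def pure_strategies(types, actions):
    """List every map from types to actions, each represented as a dict."""
    return [dict(zip(types, choice)) for choice in product(actions, repeat=len(types))]

# Hypothetical player with two types and two actions.
strategies = pure_strategies(["H", "L"], ["enter", "stay_out"])
for s in strategies:
    print(s)
```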
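
Static Bayesian game of incomplete information: mixed strategies. A mixed strategy for player $i$ is a probability distribution over his pure strategies.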
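
Static Bayesian game of incomplete information: pure-strategy Bayesian Nash equilibrium. A strategy profile $s^* = (s_1^*, \cdots, s_n^*)$ is a pure-strategy Bayesian Nash equilibrium if, for every player $i$, every type $\theta_i \in \Theta_i$ of that player, and every action $a_i \in A_i$,

$$ \sum_{\theta_{-i} \in \Theta_{-i}} \phi_i(\theta_{-i}|\theta_i) v_i(s_i^*(\theta_i), s_{-i}^*(\theta_{-i}); \theta_i) \geq \sum_{\theta_{-i} \in \Theta_{-i}} \phi_i(\theta_{-i}|\theta_i) v_i(a_i, s_{-i}^*(\theta_{-i}); \theta_i) $$

where $v_i(a_i, s_{-i}^*(\theta_{-i}); \theta_i)$ is player $i$'s payoff function, which depends only on his own type $\theta_i$.

In words: for every player and every type of that player, the action $s_i^*(\theta_i)$ maximizes the expected payoff, that is, the sum of payoffs weighted by the player's beliefs about the other players' types.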
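
The inequality can be checked by brute force in a small finite game. The sketch below is one possible way to do it, assuming every type has positive prior probability; the two-player game, its payoffs, and all function names are invented for illustration and do not come from the book.

```python
from fractions import Fraction
from itertools import product

# Hypothetical game: player 0 has one type; player 1 wants to meet or avoid player 0.
types = [["t"], ["meet", "avoid"]]
actions = [["O", "F"], ["O", "F"]]
prior = {("t", "meet"): Fraction(1, 2), ("t", "avoid"): Fraction(1, 2)}
n = len(types)

def payoff(i, a, own_type):
    """v_i(a; theta_i): depends on the action profile and player i's own type only."""
    if i == 0:
        return {("O", "O"): 2, ("F", "F"): 1}.get(a, 0)
    if own_type == "meet":
        return {("O", "O"): 1, ("F", "F"): 2}.get(a, 0)
    return {("O", "F"): 2, ("F", "O"): 1}.get(a, 0)  # the "avoid" type

def expected_payoff(i, own_type, own_action, profile):
    """One side of the inequality: player i plays own_action, the others follow profile."""
    # Condition the prior on player i's own type to get phi_i(theta_{-i} | theta_i).
    consistent = {th: p for th, p in prior.items() if th[i] == own_type}
    total = sum(consistent.values())
    value = 0
    for theta, p in consistent.items():
        a = tuple(own_action if j == i else profile[j][theta[j]] for j in range(n))
        value += (p / total) * payoff(i, a, own_type)
    return value

def is_bne(profile):
    """True if no type of any player gains by deviating to another action."""
    for i in range(n):
        for th in types[i]:
            eq_value = expected_payoff(i, th, profile[i][th], profile)
            if any(expected_payoff(i, th, a, profile) > eq_value for a in actions[i]):
                return False
    return True

def pure_strategies(i):
    """All maps from player i's types to player i's actions."""
    return [dict(zip(types[i], c)) for c in product(actions[i], repeat=len(types[i]))]

equilibria = [p for p in product(*(pure_strategies(i) for i in range(n))) if is_bne(p)]
print(equilibria)
```

In this made-up game the search finds a single profile: player 0 plays O, the "meet" type of player 1 plays O, and the "avoid" type plays F. Player 0's expected payoff from O is then 1, against 1/2 from deviating to F.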