ijcccv11n4.dvi INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL ISSN 1841-9836, 11(4):472-479, August 2016. Direct Evolutionary Search for Nash Equilibria Detection R.I. Lung Rodica Ioana Lung Babeş-Bolyai University of Cluj-Napoca Faculty of Economics and Business Administration T. Mihali 58-60, Cluj-Napoca rodica.lung@econ.ubbcluj.ro Abstract: A Direct method of computing mixed form Nash equilibria of a normal form game by using a simple evolutionary algorithm is proposed. The Direct Evo- lutionary Search algorithm (DES) uses a generative relation for Nash equilibria with binary tournament selection and uniform mutation. Numerical experiments are used to illustrate the efficiency of the method. Keywords: Mixed form Nash equilibria, Evolutionary algorithms, generative relation 1 Introduction The problem of computing Nash equilibria of normal form games is one of the most challenging problems faced by computer scientists. The complexity of this problem - varying hugely from a type of game to another - is still studied. Normal form games can be solved by modern heuristics by transforming them into an op- timization or fixed point problem [2]. An interesting challenge consists on designing a method that solves the game directly, as a game and not as a corresponding optimization problem. The main problem in directly approaching the Nash equilibria is caused by the fact that there does not exist an order (or even preorder) relation defined for game situations that can guide a search operator towards the Nash equilibrium. While any kind of optimization endeavor is driven by a corresponding order (or preorder) relation defined in the objective space, in the case of Nash equilibria the lack of such a relation limits the design possibilities of computational heuristics. However, recently [1] a generative relation for Nash equilibria - called the Nash ascendancy relation - has been proposed. The Nash ascendancy relation is defined on strategy profiles. Even if it does not induce an actual order, numerical results indicate that it is capable of guiding a search operator towards Nash equilibria. Nash equilibria - definition and generative relation Nash equilibrium [4] is the most popular solution concept in noncooperative game theory. A finite strategic game is defined by a set o players, a set of strategies available to each player and a set of payoff functions for each player and denoted by Γ = (N,S,U) where: • N represents the set of players, N = {1, ....,n}, n is the number of players; • for each player i ∈ N, Si represents the set of actions available to him, Si = {si1,si2, ...,simi} where mi represents the number of strategies available to player i and S = S1×S2×...×SN is the set of all possible situations of the game; • for each player i ∈ N denote by pij the probability that he selects its j-th action, j ∈ {1, ...,mi}. Then Pi = (pi1, ...,pimi) represents a probability distribution over the set of actions of player i and P = (P1, ...,Pn) represents a mixed strategy profile for the game, where pij ∈ [0,1] and ∑mi j=1 pij = 1, ∀i = 1,N and ∀j = 1,mi; Copyright © 2006-2016 by CCC Publications Direct Evolutionary Search for Nash Equilibria Detection 473 • for each player i ∈ N, ui(P) represents the expected payoff for the mixed strategy P ; Denote by (Qi,P∗−i) the strategy profile obtained from P ∗ by replacing the probability dis- tribution of player i with Qi i.e. (Qi,P ∗ −i) = (P ∗ 1 ,P ∗ 2 , ...,P ∗ i−1,Qi,P ∗ i+1, ...,P ∗ n). A strategy profile P ∈ S for the game Γ represents a Nash equilibrium [2, 4] if no player has anything to gain by unilaterally changing his own strategy while the others do not modify theirs. Several methods to compute NE of a game have been developed. For a review on computing techniques for the NE see [2] and [6]. Nash ascendancy relation A generative relation for Nash equilibria is a relation between two strategy profiles that enables their comparison with respect to the Nash solution concept, i.e. it evaluates which one is "closer" to equilibrium. In [1] such a generative relation has been introduced and shown that solutions that are non-dominated/ascended with respect to this relation are exactly the Nash equilibria of the game. Consider two mixed strategy profiles P and Q and the operator κ that associates the cardi- nality of the set κ(P,Q) = |{i ∈{1, ...,n}|ui(Qi,P−i) ≥ ui(P),Qi 6= Pi}| to the pair (P,Q), i.e. the number of players i that would benefit from unilaterally switching their strategies from Pi to Qi. The operator κ can be used to induce a relation on the set of mixed strategy profiles in the following manner: we can say that the strategy profile P Nash ascends the strategy profile Q in and we write P ≺N Q if the inequality κ(P,Q) < κ(Q,P) holds, i.e. there are les players that can increase their payoffs by switching their strategy from P to Q than vice-versa. It can be said that strategy profile P is more stable (closer to equilibrium) then strategy Q. Regarding the Nash ascendancy relation, two strategy profiles P and Q may be in the fol- lowing situation: 1. P ascends Q: P ≺N Q (κ(P,Q) < κ(Q,P)) 2. Q ascends P : Q ≺N P (κ(P,Q) > κ(Q,P)) 3. or κ(P,Q) = κ(Q,P) and P and Q are considered indifferent (neither P ascends Q nor Q ascends P). The strategy profile P∗ is called non-ascended in Nash sense (NAS) if there does not exist any mixed strategy profile Q such that Q 6= P∗ and Q ≺N P∗. A very important result in [1] shows that for pure strategies all non-ascended strategies are NE and also all NE are non-ascended strategies. The proof in the case of mixed strategies is 474 R.I. Lung similar and direct. Thus the Nash ascendancy relation can be used to characterize the equilibria of a game and can be considered as a generative relation for NEs. The Nash ascendancy concept was introduced with the purpose to compare two strategy profiles [1] during the search of an evolutionary algorithm in order to compute NEs of a game. Consider two strategy profiles P∗ and P from ∆. Then k : ∆ × ∆ → N associates the pair (P∗,P) the cardinality of the set k(P∗,P) = card{i ∈{1, ...,n}|ui(Pi,P∗−i) > ui(P∗),Pi 6= P∗i }. This set is composed by the players i that would benefit if - given the strategy profile P∗ - would change their strategy from P∗i to Pi. It is obvious that for any P∗,P ∈ S, we have 0 ≤ k(P∗,P) ≤ n. Definition 1. Let P,Q ∈ ∆. We say the strategy profile P Nash ascends Q and we write P ≺ Q if the inequality k(P,Q) < k(Q,P), holds. Remark 1.1. Two strategy profiles P,Q ∈ ∆ can have the following relation: 1. either P Nash ascends Q, 2. either Q Nash ascends P, 3. if k(P,Q) = k(Q,P) then P and Q are indifferent Definition 2. A strategy profile P∗ ∈ S is called non-dominated with respect to the Nash ascendancy relation (NNS) if ∄Q ∈ ∆,Q 6= P∗such that Q ≺ P∗. Definition 3. The set of all Nash nondominated strategy profiles with respect to the Nash ascendancy relation is the set containing all nondominated strategies i.e. NND = {s ∈ S|s Nash non-dominated with respect to the Nash ascendancy relation} Proposition 1. A strategy profile P∗ ∈ ∆ is a NE iff the equality k(P∗,Q) = 0,∀Q ∈ ∆, holds. Proof: Let P∗ ∈ ∆ be a NE. Suppose there exists Q ∈ ∆ such that k(P∗,Q) = w, w ∈ {1, ...,n}. Therefore there exists i ∈ {1, ...,n} such that ui(Qi,P∗−i) > ui(P∗) and Qi 6= P∗i , which contradicts the definition of NE. For the second implication, let P∗ ∈ ∆ such that ∀Q ∈ ∆,k(P∗,Q) = 0. This means that for all i ∈{1, ...,n} and for any Qi ∈Pi we have ui(Qi,P∗−i) ≤ ui(P∗). It follows that P∗ is a NE. ✷ Proposition 2. All NE are Nash nondominated solutions (NND) i.e. NE ⊆ NND. Direct Evolutionary Search for Nash Equilibria Detection 475 Proof: Let P∗ ∈ ∆ be a NE. Suppose that there exists a strategy profile P ∈ ∆ such that P ≺ P∗. It follows that k(P,P∗) < k(P∗,P). But k(P∗,P) = 0, therefore we must have k(P,P∗) < 0 which is not possible since k(P,P∗) denotes the cardinality of a set. ✷ Proposition 3. All Nash nondominated solutions are NE, i.e. NND ⊆ NE. Proof: Let P∗ be a nondominated strategy profile. Suppose P∗ is not NE. Therefore there must exist (at least) one i ∈{1, ...,n} and a strategy Pi ∈Pi such that ui(Pi,P ∗ −i) > ui(P ∗), holds. Let’s denote by Q = (Pi,P∗−i). It means that k(P ∗,Q) = 1. But k(Q,P∗) = 0. Therefore k(Q,P∗) < k(P∗,Q) which means that Q ≺ P∗ thus the hypothesis that P∗ is nondominated is contradicted. ✷ Using propositions 2 and 3 it is obvious that the next result holds: Proposition 4. The following relation holds: NE = NND, i.e. all NE are also Nash nondominated and also all Nash nondominated strategies are NE. Direct evolutionary search The Direct Evolutionary Search (DES) algorithm is a simple evolutionary algorithm based on tournament selection and uniform mutation designed for Nash equilibria detection. Individuals represent strategy profiles of the game. Real valued encoding is used. Each individual is represented as a vector composed of ( ∑n i=1 mi) components between 0 and 1. For each individual n corresponding payoffs are computed. Selection Binary tournament selection is used in the following manner: For each individual i, another one k is selected randomly. If individual k Nash ascends individual i, k is selected and copied in a separate population of children P ′. Mutation Uniform mutation with probability pm is applied to all children in P ′. With prob- ability pm all strategies are modified (±) with ε. If the resulting value is lower than 0, it is set to 0. If it is higher than 1, it is set to 1. Termination condition DES runs either a maximum number of generations which is a pa- rameter of the algorithm and depends on the problem, either until no child can replace a parent for 100 generations. For this the variables CountReplacements and control in Algorithm 1 are used. 476 R.I. Lung Algorithm 1 Direct Evolutionary Search algorithm Randomly generate population; Evaluate population; CountReplacements = 1; control = true; for nrgen = 0; ((nrgen < MaxNoGenerations) and (control));nrgen + + do Apply Selection; Apply Mutation(pm); Replace Children; CountReplacements+ =no of children that replace a parent; if nrgen = 100 then if CountReplacements = 0 then control = false; else CountReplacements = 0; end if end if end for Return Non-Ascended Solutions; =0 Table 1: Description of the seven games tested Name of game No. of players No. of strategies Type of NE GAME1 2 2,2 totally mixed GAME2 2 2,2 totally mixed Matching pennies 2 2,2 totally mixed G1 3 2,2,2 totally mixed G2 3 3,3,3 mixed O’Neill 2 4,4 totally mixed Poker 2 4,2 totally mixed Direct Evolutionary Search for Nash Equilibria Detection 477 Table 2: Descriptive statistics of numerical results obtained by DES using the following param- eters: population size 50, pm = 0.5, ε = 0.5. GAME1 Avg. dist: 0 St dev: 0 Avg. no. gen.: 844.33 St dev: 476.57 Avg. eval.: 447,117.47 St dev: 271909.88 GAME3 Avg. dist: 7.75094E-17 St dev: 3.13E-17 Avg. no. gen.: 501.00 St dev: 141.42 Avg. eval.: 243,736.93 St dev: 74416.31 Matching pennies Avg. dist: 0 St dev: 0 Avg. no. gen.: 497.67 St dev: 125.12 Avg. eval.: 233,854.53 St dev: 56641.84 G1 Avg. dist: 4.04061E-11 St dev: 2.25E-17 Avg. no. gen.: 11,941 St dev: 13424.39 Avg. eval.: 15,541,134 St dev: 17317631 G2 Avg. dist: 4.04061E-11 St dev: 6.98E-18 Avg. no. gen.: 931 St dev: 324.70 Avg. eval.: 1,076,041 St dev: 374858.2 Oneill Avg. dist: 2.79146E-05 St dev: 0.0001 Avg. no. gen.: 8,217.63 St dev: 26628.89 Avg. eval.: 6,556,652.7 St dev: 21598601 Poker Avg. dist: 3.70074E-18 St dev: 1.99E-17 Avg. no. gen.: 794.33 St dev: 240.73 Avg. eval.: 419,827.93 St dev: 126713.48 2 Numerical results DES was first tested on a set of simple well known 2 or 3-players with up to 4 actions available to them. The games have been chosen from the GAMBIT [3] set of normal form games. The characteristics of the games are presented in Table 1. Table 2 presents descriptive statistics of the results obtained using DES for the seven games: average and standard deviation values for the minimum distance to the NE of the game for the Non Ascended solutions in the final population, number of evaluations and of generations until DES stopped the search. The second set of games that was used to test if DES was generated by using the GAMUT distribution [5]. The distribution names, characteristics and rates of success of DES are presented in Table 3. 478 R.I. Lung Table 3: GAMUT distributions and results obtained by DES. Rate of success represents percent of runs in which DES detected a NE. Parameters of DES: population size 100, pm = 0.1 Distribution No. of players No. of actions Rate of success Bertrand 2 10/player 100% Ologopoly 2 20/player 100% 2 50/player 100% 2 100/player 100% 5 2/player 100% 5 4/player 100% 5 6/player 100% Bidirectional LEG 2 10/player 100% Complete Graph 2 20/player 100% 2 50/player 100% 2 100/player 100% 5 2/player 100% 5 4/player 100% 5 6/player 100% Bidirectional LEG 2 10/player 100% Random Graph 2 20/player 100% 2 50/player 100% 2 100/player 100% 5 2/player 100% 5 4/player 100% 5 6/player 100% Bidirectional LEG 2 10/player 100% Star Graph 2 20/player 100% 2 50/player 100% 2 100/player 100% 5 2/player 100% 5 4/player 100% 5 6/player 100% Covariance Game 2 10/player 100% ρ = 0.9 2 20/player 100% 2 50/player 100% 2 100/player 100% 5 2/player 100% 5 4/player 100% 5 6/player 100% Direct Evolutionary Search for Nash Equilibria Detection 479 Conclusion The presented results indicate that it it possible for an evolutionary algorithm to directly detect Nash equilibria of normal form games. A population composed of situations of the game is randomly generated; using selection and mutation operators and the Nash ascendancy relation individuals of the populations are guided towards the Nash equilibrium of a normal form game. Two sets of problems have been chosen to test the method: the first one consists of games presenting a mixed Nash equilibrium. Results indicate that DES is capable of locating mixed equilibria for the selected games. The second group of games is used to test the scalability of the method: the number of actions available to each player, as well as the number of players are increased for five distributions available in the GAMUT package. Results also indicate the potential of the method, inspite of well known scalability issues associated with evolutionary algorithms. Acknowledgment This work was supported by a grant of the Romanian National Authority for Scientific Re- search and Innovation, CNCS - UEFISCDI, project number PN-II-RU-TE-2014-4-2560. Bibliography [1] Rodica Ioana Lung and D. Dumitrescu (2008); Computing nash equilibria by means of evolu- tionary computation, Intrenational Journal of Computers Communications & Control, Suppl. issue, 3(5):364-368. [2] Richard D. McKelvey and Andrew McLennan (1996); Computation of equilibria in finite games. In H. M. Amman, D. A. Kendrick, and J. Rust, editors, Handbook of Computational Economics, volume 1 of Handbook of Computational Economics, chapter 2, pp. 87-142, Elsevier, 1996. [3] Richard D. McKelvey, Andrew M. McLennan, and Theodore L. Turocy (2010); Gambit: Software tools for game theory, Technical report. [4] John F. Nash (1951) Non-cooperative games, Annals of Mathematics, 54:286-295. [5] Eugene Nudelman, Jennifer Wortman, Yoav Shoham, and Kevin Leyton-Brown(2004); Run the gamut: A comprehensive approach to evaluating game-theoretic algorithms, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, Washington, DC, USA, 2004. IEEE Computer Society, AAMAS04, 2: 880-887. [6] Ryan Porter, Eugene Nudelman, and Yoav Shoham (2008); Simple search methods for finding a nash equilibrium, Games and Economic Behavior, 63(2): 642-662.