Int. J. of Computers, Communications & Control, ISSN 1841-9836, E-ISSN 1841-9844 Vol. V (2010), No. 5, pp. 693-700 Meta-Rationality in Normal Form Games D. Dumitrescu, R.I. Lung, T.-D. Mihoc Dan Dumitru Dumitrescu, Rodica Ioana Lung, Tudor Dan Mihoc “Babes Bolyai” University Romania, Cluj-Napoca, St. Universitatii 5, E-mail: {ddumitr, mihoct}@cs.ubbcluj.ro, rodica.lung@econ.ubbcluj.ro Abstract: A new generative relation for Nash equilibrium is proposed. Dif- ferent types of equilibria are considered in order to incorporate players different rationality types for finite non cooperative generalized games with perfect in- formation. Proposed equilibria are characterized by use of several generative relations with respect to players rationality. An evolutionary technique for detecting approximations for equilibria is used. Numerical experiments show the potential of the method. Keywords: non-cooperative games, evolutionary equilibrium detection, gen- erative relations, Nash-Pareto, meta-strategy. 1 Introduction The most common solutions proposed in Game Theory are the equilibrium concepts. Within the present day approaches each equilibrium concept is addressed separately, meaning that in a particular game players interact accordingly to a unique equilibrium concept. This restriction induces unrealistic results. For example, the concept of Nash equilibrium, alone, sometimes can lead to deceptive results so we need to cope with more complex situations. In real life players can be more or less cooperative, more or less competitive and more or less rational, therefore agents guided by the different kind of equilibrium concepts should be allowed to interact. We consider a generalized game where players are allowed to have different behaviours accord- ing to their rationality type. Players can have different behaviours/rationality types resulting in an adequate meta-strategy concept. Game equilibria can be characterized using appropriate generative relations [4]. Thus Nash equilibrium is characterized by the ascendancy relation [6] and Pareto equilibrium by the Pareto domination. Combining the two relations may lead to different types of joined Nash–Pareto equilibria. We introduce a new generative relation for Nash equilibrium and we use it to compose a new joined Nash-Pareto equilibria. An evolutionary technique for detecting the two joined Nash–Pareto equilibria for generalized games is used. 2 Generalized games In order to cope with different rationality types the concept of generalized game is defined [4]. Definition 1. A finite strategic generalized game is defined as a system by G = (N, M, U) where: • N = {1, ..., n}, represents the set of players, n is the number of players; Copyright c⃝ 2006-2010 by CCC Publications 694 D. Dumitrescu, R.I. Lung, T.-D. Mihoc • for each player i ∈ N, Si represents the set of actions available to him, Si = {si1, si2, ..., simi }; S = S1 × S2 × ... × Sn is the set of all possible situations of the game; • for each player i ∈ N, Mi represents the set of available meta-strategies, a meta-strategy is a system (si|ri) where si ∈ Si and ri is the ith player rationality type; • M = M1 × M2 × ... × MN is the set of all possible situations of the generalized game and (s1|r1, s2|r2, ..., sn|rn) ∈ M is a meta-strategy profile. • for each player i ∈ N , ui : S → R represents the payoff function. U = {u1, ..., un}. Remark 2. In a generalized game the set of all possible meta-strategies represents the meta- strategy search space. The rationality type of a player usually represents the player bias towards a certain equilib- rium. 3 Generative relations for generalized games Three generative relations are considered in this section. Two of them correspond to Pareto and Nash equilibria. The third induces a new type of joined Nash–Pareto equilibrium. 3.1 nP–strict Pareto domination We consider the nP–strict Pareto domination in order to be able to combine several concepts of Nash and Pareto domination. In a finite strategic generalized game consider the set of players Pareto biased IP = {j ∈ {1, ..., n}|rj = Pareto} and nP = cardIP. Let us consider two meta strategy profiles x and y from M. Definition 3. The meta strategy profile x nP–strict Pareto dominates the meta strategy profile y if the payoff of each Pareto biased player from IP using meta strategy x is strictly greater than the payoff associated to the meta strategy y, i.e. ui(x) > ui(y), ∀i ∈ IP. Remark 4. The set of non dominated meta strategies with respect to the nP–strict Pareto dom- ination relation when nP = n is a subset of the Pareto front. 3.2 Nash - ascendancy Similar to Pareto equilibrium a particular relation between strategy profiles can be used in order to describe Nash rationality. This relation is called Nash-ascendancy (NA). A strategy is called Nash equilibrium [5] if each player has no incentive to unilaterally deviate i.e. it can not improve the payoff by modifying its strategy while the others do not modify theirs. We denote by (sij, s ∗ −i) the strategy profile obtained from s ∗ by replacing the strategy of player i with sij i.e. (sij, s ∗ −i) = (s ∗ 1, s ∗ 2, ..., s ∗ i−1, sij, s ∗ i+1, ..., s ∗ n). Meta-Rationality in Normal Form Games 695 Definition 5. The strategy profile x Nash-ascends the strategy profile y, and we write x uj(y), x ̸= y} and respectively kN(x, y) = card{i ∈ IN|ui(yi, x−i) ≥ ui(x), xi ̸= yi}. Remark 8. kP(x, y) measures the relative efficiency of the meta strategies x and y with respect to Pareto rationality and kN(x, y) measures the relative efficiency of the meta strategies x and y with respect to Nash rationality. Definition 9. The meta strategy x N–P dominates the meta strategy y if and only if the following statements hold 1. kP(x, y) = nP 2. kN(x, y) < kN(y, x) In what follows we consider that efficiency relation induces a new type of equilibrium called joined Nash-Pareto equilibrium. 696 D. Dumitrescu, R.I. Lung, T.-D. Mihoc 3.5 Joint Differential Nash Pareto domination A new domination relation with respect to Nash-Pareto equilibrium is introduced by using differential generative relation of Nash equilibrium. Definition 10. The meta strategy x DGN–P dominates the meta strategy y if and only if the following statements hold 1. kP(x, y) = nP 2. m(x, y) < m(y, x) 4 Detecting joint N–P equilibria in generalized games Consider a three player non-cooperative game. Let ri be the rationality type of player i. If r1 = r2 = r3 = Nash then all players are Nash biased and the corresponding solution concept is the Nash equilibrium. If r1 = r2 = r3 = Pareto then all players are Pareto biased and the corresponding equilibria are described by the set of strictly non dominated strategies (Pareto front). We also intend to explore the joint cases where one of the players is Nash biased and others are Pareto and the one where one is Pareto and the others are Nash biased. In order to detect the joined Nash–Pareto equilibria of the generalized game an evolutionary approach is used. For a certain equilibrium the corresponding generative relation allows the comparison of two meta-strategies. This comparison may guide the search towards the game equilibrium. Let us consider an initial population of meta strategies for the generalized three player game. Each member of the population has the form x = (s1|r1, s2|r2, s3|r3). Non domination (with respect to a generative relation) is considered for fitness assignment purposes. Evolutionary Multiobjective Optimization Algorithms [3] are efficient tools for evolving strategies based on a non domination relation. The state of the art NSGA2 [2] has been considered to illustrate how generative relations can be used for evolutionary detection of proposed equilibria. A population of 100 strategies has been evolved using a rank based fitness assignment tech- nique. In all experiments the process converges in less than 30 generations. 5 Numerical experiments In order to illustrate the proposed concepts the oligopoly Cournot model is considered (see for instance [4]). Let q1, q2 and q3 denote the quantities of an homogeneous product - produced by three companies respectively. The market clearing price is P(Q) = a − Q, where Q = q1 + q2 + q3, is the aggregate quantity on the market. Hence we have P(Q) = { a − Q, for Q < a, 0, for Q ≥ a. Let us assume that the total cost for the company i of producing quantity qi is Ci (qi) = ciqi. Therefore, there are no fixed costs and the marginal cost ci is constant, ci < a. Suppose that Meta-Rationality in Normal Form Games 697 the companies choose their quantities simultaneously. The payoff for the company i is its profit, which can be expressed as: πi(qi, qj) = qiP(Q) − Ci(qi) = qi [a − (qi + qj) − ci] . Several experiments have been performed for this game by using RED technique [4]. The symmetric Cournot model with parameters a = 24 and c1 = c2 = c3 = 9 is considered. According to the data from the Table 1 in less than 30 generations the algorithm converges to the Nash equilibrium point (14.00, 14.00, 14.00) for each relation. We observe that the differential Nash domination provides more accurate results than the Nash ascendency. We must consider however the particular nature of this Cournot game. For other types of games a normalisation of the deviations must be done in order to sum them. Figure 1: The payoffs for the Nash-Nash- Pareto front detected in less than 30 itera- tions for the symmetric Cournot game with the Nash–Pareto generative relation Figure 2: The payoffs for the Nash-Nash- Pareto front detected in less than 30 iterations for the symmetric Cournot game with the dif- ferential Nash–Pareto generative relation Figure 3: The payoffs for the Nash-Pareto- Pareto front detected in less than 30 itera- tions for the symmetric Cournot game with the Nash–Pareto generative relation. Figure 4: The payoffs for the Nash-Pareto- Pareto front detected in less than 30 iterations for the symmetric Cournot game with the dif- ferential Nash–Pareto generative relation. The resulting front in the Nash-Nash-Pareto case spreads from the standard Nash equilib- rium corresponding to the two player–Cournot game (25.00, 25.00) to the Nash equilibrium corresponding to the three player–Cournot game, and from there to the edges of Pareto front for 698 D. Dumitrescu, R.I. Lung, T.-D. Mihoc Table 1: Average payoff and standard deviation of the final populations in 30 runs with 100 meta-strategies after 30 generations for the symmetric Cournot model where all three players are Nash biased using Nash ascendency and differential Nash generative relations. N-N-N Average payoff St. dev. Maximum payoff Minimum payoff player p1 p2 p3 p1 p2 p3 p1 p2 p3 p1 p2 p3 Nash ascendency relation Average 14.05 14.06 14.05 0.03 0.04 0.04 14.85 15.57 15.00 12.25 12.49 12.45 St. Dev. 0.02 0.02 0.02 0.08 0.09 0.08 1.39 2.80 1.83 3.25 3.00 3.05 Differential Nash relation Average 14.06 14.06 14.06 0.00 0.00 0.00 14.06 14.06 14.06 14.06 14.06 14.06 St. Dev. 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Table 2: Average payoff and standard deviation of the final populations in 30 runs with 100 meta-strategies after 30 generations for the symmetric Cournot model where two player are Nash biased and one is Pareto for both joint Nash–Pareto and joint Differential Nash–Pareto generative relations. N-N-P Average payoff St. dev. Maximum payoff Minimum payoff player p1 p2 p3 p1 p2 p3 p1 p2 p3 p1 p2 p3 Joint Nash–Pareto relation Average 10.99 11.01 29.80 52.81 53.02 182.28 25.92 25.71 56.24 0.00 0.00 0.49 St. Dev. 0.36 0.33 0.78 1.75 2.33 17.62 0.92 0.88 0.00 0.00 0.00 1.67 Joint Differential Nash–Pareto relation Average 8.50 8.48 35.53 22.76 22.67 165.84 18.47 18.92 56.24 0.00 0.00 7.60 St. Dev. 0.35 0.31 0.85 0.25 0.25 0.54 2.88 2.93 0.00 0.00 0.00 5.60 the Nash–Pareto equilibria (see Figure 1). For differential Nash-Pareto (see Figure 2) the front spreads from vicinity of the Nash equilibrium for Cournot game to the edge of the Pareto front corresponding to the Pareto player. The numerical results are presented in Table 2. As we can see in the Figure 3 in the Nash-Pareto-Pareto case for Nash–Pareto generative relation the result is similar to the Pareto front. In the same case for differential Nash–Pareto generative relation (Figure 4) the Pareto front is deformed in the Nash player’s corresponding edge. The numerical results are presented in Table 3. 6 Conclusions and future work A new generative relation for Nash equilibrium based on differences between perturbations is introduced. Generative relations between meta strategies induce corresponding solutions con- cepts named Joined Nash–Pareto equilibrium, respectively joint differential Nash–Pareto equi- librium. An evolutionary technique for detecting approximations of the generalized equilibria is used. The ideas are exemplified for Cournot games with three players and two types of rational- ity.Results indicate the potential of the proposed technique. Future work will address generalized games having other rationality types and other methods Meta-Rationality in Normal Form Games 699 Table 3: Average payoff and standard deviation of the final populations in 30 runs with 100 meta-strategies after 30 generations for the symmetric Cournot model where one player is Nash biased and the other two Pareto for both joint Nash–Pareto and joint Differential Nash–Pareto generative relations. N-P-P Average payoff St. dev. Maximum payoff Minimum payoff player p1 p2 p3 p1 p2 p3 p1 p2 p3 p1 p2 p3 Joint Nash–Pareto relation Average 17.74 18.52 18.44 242.42 247.83 247.97 56.23 56.24 56.24 0.00 0.00 0.00 St. Dev. 0.40 0.36 0.42 7.36 6.98 6.82 0.04 0.00 0.00 0.00 0.00 0.00 Joint Differential Nash–Pareto relation Average 15.13 19.83 19.76 154.37 250.62 248.44 48.90 56.24 56.24 0.00 0.00 0.00 St. Dev. 0.86 0.77 0.56 0.78 0.29 0.32 3.02 0.00 0.01 0.00 0.00 0.00 of combining them. 7 Acknowledgements This research is supported partially by the CNCSIS Grant ID508 "New Computational paradigms for dynamic complex problems" funded by the MEC and from the SECTORAL OP- ERATIONAL PROGRAMME HUMAN RESOURCES DEVELOPMENT, Contract POSDRU 6/1.5/S/3 "Doctoral studies: through science towards society", Babeş - Bolyai University, Cluj - Napoca, România. Bibliography [1] Bade, S., Haeringer, G., Renou, L.: More strategies, more Nash equilibria, Working Paper 2004-15, School of Economics University of Adelaide University, 2004. [2] Deb, K., Agrawal, S., Pratab, A., Meyarivan, T.: A Fast Elitist Non-Dominated Sorting Ge- netic Algorithm for Multi-Objective Optimization: NSGA-II, Marc Schoenauer, Kalyanmoy Deb, Günter Rudolph, Xin Yao, Evelyne Lutton, Juan Julian Merelo, and Hans-Paul Schwe- fel, editors, Proceedings of the Parallel Problem Solving from Nature VI Conference, Paris, France, 2000. Springer, Lecture Notes in Computer Science, 1917, 849-858. [3] Deb, K.: Multi-objective optimization using evolutionary algorithms, Wiley, 2001. [4] Dumitrescu, D., Lung, R.I., Mihoc, T.D.: Evolutionary Equilibria Detection in Non- cooperative Games, Book Series: LNCS, Publisher Springer Berlin / Heidelberg, Volume 5484 / 2009, Book: Applications of Evolutionary Computing, 2009, 253-262. [5] Lung, R. I., Muresan, A. S., and Filip, D. A.: Solving multi-objective optimization problems by means of natural computing with application in finance, In Aplimat 2006 (Bratislava, February 2006), pp. 445-452. [6] Lung, R., I., Dumitrescu, D.: Computing Nash Equilibria by Means of Evolutionary Compu- tation, Int. J. of Computers, Communications & Control, 2008, 364-368 700 D. Dumitrescu, R.I. Lung, T.-D. Mihoc [7] Maskin, E. : The theory of implementation in Nash equilibrium:A survey, in: L. Hurwicz, D. Schmeidler and H. Sonnenschein, eds., Social Goals and Social Organization (Cambridge University Press), 1985,173-204 [8] McKelvey, R., D., McLennan, A.: Computation of equilibria in finite games, In H. M. Amman, D. A. Kendrick, and J. Rust, editors, Handbook of Computational Economics, Elsevier,1996. [9] Nash.,J.,F.: Non-cooperative games, Annals of Mathematics, 54:286-295, 1951. [10] Osborne, M. J., Rubinstein, A.: A Course in Game Theory, MIT Press, Cambridge, MA, 1994