Int. J. of Computers, Communications & Control, ISSN 1841-9836, E-ISSN 1841-9844
Vol. V (2010), No. 5, pp. 693-700

Meta-Rationality in Normal Form Games

D. Dumitrescu, R.I. Lung, T.-D. Mihoc

Dan Dumitru Dumitrescu, Rodica Ioana Lung, Tudor Dan Mihoc
“Babes Bolyai” University
Romania, Cluj-Napoca, St. Universitatii 5,
E-mail: {ddumitr, mihoct}@cs.ubbcluj.ro, rodica.lung@econ.ubbcluj.ro

Abstract: A new generative relation for Nash equilibrium is proposed. Dif-
ferent types of equilibria are considered in order to incorporate players different
rationality types for finite non cooperative generalized games with perfect in-
formation. Proposed equilibria are characterized by use of several generative
relations with respect to players rationality. An evolutionary technique for
detecting approximations for equilibria is used. Numerical experiments show
the potential of the method.
Keywords: non-cooperative games, evolutionary equilibrium detection, gen-
erative relations, Nash-Pareto, meta-strategy.

1 Introduction

The most common solutions proposed in Game Theory are the equilibrium concepts. Within
the present day approaches each equilibrium concept is addressed separately, meaning that in a
particular game players interact accordingly to a unique equilibrium concept. This restriction
induces unrealistic results. For example, the concept of Nash equilibrium, alone, sometimes can
lead to deceptive results so we need to cope with more complex situations.

In real life players can be more or less cooperative, more or less competitive and more or less
rational, therefore agents guided by the different kind of equilibrium concepts should be allowed
to interact.

We consider a generalized game where players are allowed to have different behaviours accord-
ing to their rationality type. Players can have different behaviours/rationality types resulting in
an adequate meta-strategy concept.

Game equilibria can be characterized using appropriate generative relations [4]. Thus Nash
equilibrium is characterized by the ascendancy relation [6] and Pareto equilibrium by the Pareto
domination. Combining the two relations may lead to different types of joined Nash–Pareto
equilibria.

We introduce a new generative relation for Nash equilibrium and we use it to compose a new
joined Nash-Pareto equilibria.

An evolutionary technique for detecting the two joined Nash–Pareto equilibria for generalized
games is used.

2 Generalized games

In order to cope with different rationality types the concept of generalized game is defined [4].

Definition 1. A finite strategic generalized game is defined as a system by G = (N, M, U) where:

• N = {1, ..., n}, represents the set of players, n is the number of players;

Copyright c⃝ 2006-2010 by CCC Publications


694 D. Dumitrescu, R.I. Lung, T.-D. Mihoc

• for each player i ∈ N, Si represents the set of actions available to him, Si = {si1, si2, ..., simi };
S = S1 × S2 × ... × Sn is the set of all possible situations of the game;

• for each player i ∈ N, Mi represents the set of available meta-strategies, a meta-strategy
is a system (si|ri) where si ∈ Si and ri is the ith player rationality type;

• M = M1 × M2 × ... × MN is the set of all possible situations of the generalized game and
(s1|r1, s2|r2, ..., sn|rn) ∈ M is a meta-strategy profile.

• for each player i ∈ N , ui : S → R represents the payoff function.
U = {u1, ..., un}.

Remark 2. In a generalized game the set of all possible meta-strategies represents the meta-
strategy search space.

The rationality type of a player usually represents the player bias towards a certain equilib-
rium.

3 Generative relations for generalized games

Three generative relations are considered in this section. Two of them correspond to Pareto
and Nash equilibria. The third induces a new type of joined Nash–Pareto equilibrium.

3.1 nP–strict Pareto domination

We consider the nP–strict Pareto domination in order to be able to combine several concepts
of Nash and Pareto domination.

In a finite strategic generalized game consider the set of players Pareto biased

IP = {j ∈ {1, ..., n}|rj = Pareto}

and nP = cardIP.
Let us consider two meta strategy profiles x and y from M.

Definition 3. The meta strategy profile x nP–strict Pareto dominates the meta strategy profile
y if the payoff of each Pareto biased player from IP using meta strategy x is strictly greater than
the payoff associated to the meta strategy y, i.e.

ui(x) > ui(y), ∀i ∈ IP.

Remark 4. The set of non dominated meta strategies with respect to the nP–strict Pareto dom-
ination relation when nP = n is a subset of the Pareto front.

3.2 Nash - ascendancy

Similar to Pareto equilibrium a particular relation between strategy profiles can be used in
order to describe Nash rationality. This relation is called Nash-ascendancy (NA).

A strategy is called Nash equilibrium [5] if each player has no incentive to unilaterally deviate
i.e. it can not improve the payoff by modifying its strategy while the others do not modify theirs.

We denote by (sij, s
∗
−i) the strategy profile obtained from s

∗ by replacing the strategy of
player i with sij i.e.

(sij, s
∗
−i) = (s

∗
1, s

∗
2, ..., s

∗
i−1, sij, s

∗
i+1, ..., s

∗
n).


Meta-Rationality in Normal Form Games 695

Definition 5. The strategy profile x Nash-ascends the strategy profile y, and we write x <NA y
if there are less players i that can increase their payoffs by switching their strategy from xi to yi
then vice versa.

In [6] is introduced an operator
k : S × S → N,

k(y, x) = card{i ∈ {1, ..., n}|ui(xi, y−i) ≥ ui(y), xi ̸= yi}.
k(y, x) denotes the number of players which benefit by switching from y to x.

Proposition 6. The strategy x Nash-ascends y (x is NA-preferred to y), and we write x <NA y,
if the inequality

k(x, y) < k(y, x),

holds.

According to [6] the set of all strategies from S non-dominated by respect of Nash ascendancy
relation equals the set of Nash equilibria.

This result proves that the Nash ascendancy is the generative relation for the Nash equilib-
rium.

3.3 Differential generative relation of Nash equilibrium (DGN)

A new generative relation for Nash equilibrium is proposed. This relation relies on the payoff
difference between perturbed and non perturbed strategies.

We introduce the measure

m(y, x) =
∑
i∈N

(ui(xi, y−i) − ui(y))

Definition 7. The strategy x dominates y, and we write x <DGN y, if the inequality

m(x, y) < m(y, x),

holds.

3.4 Joint Nash–Pareto domination

Let us consider two meta-strategies x = (x1|r1, x2|r2, ..., xn|rn) and y = (y1|r1, y2|r2, ..., yn|rn).
Let us denote by IN the set of Nash biased players (N-players) and by IP the set of Pareto

biased players (P-players). Therefore we have IN = {i ∈ {1, ..., n}|ri = Nash}.
We consider the operators kP and kN defined as:

kP(x, y) = card{j ∈ IP|uj(x) > uj(y), x ̸= y} and respectively kN(x, y) = card{i ∈ IN|ui(yi, x−i) ≥
ui(x), xi ̸= yi}.
Remark 8. kP(x, y) measures the relative efficiency of the meta strategies x and y with respect
to Pareto rationality and kN(x, y) measures the relative efficiency of the meta strategies x and
y with respect to Nash rationality.

Definition 9. The meta strategy x N–P dominates the meta strategy y if and only if the
following statements hold

1. kP(x, y) = nP

2. kN(x, y) < kN(y, x)

In what follows we consider that efficiency relation induces a new type of equilibrium called
joined Nash-Pareto equilibrium.


696 D. Dumitrescu, R.I. Lung, T.-D. Mihoc

3.5 Joint Differential Nash Pareto domination

A new domination relation with respect to Nash-Pareto equilibrium is introduced by using
differential generative relation of Nash equilibrium.

Definition 10. The meta strategy x DGN–P dominates the meta strategy y if and only if the
following statements hold

1. kP(x, y) = nP

2. m(x, y) < m(y, x)

4 Detecting joint N–P equilibria in generalized games

Consider a three player non-cooperative game. Let ri be the rationality type of player i. If
r1 = r2 = r3 = Nash then all players are Nash biased and the corresponding solution concept
is the Nash equilibrium. If r1 = r2 = r3 = Pareto then all players are Pareto biased and the
corresponding equilibria are described by the set of strictly non dominated strategies (Pareto
front).

We also intend to explore the joint cases where one of the players is Nash biased and others
are Pareto and the one where one is Pareto and the others are Nash biased.

In order to detect the joined Nash–Pareto equilibria of the generalized game an evolutionary
approach is used. For a certain equilibrium the corresponding generative relation allows the
comparison of two meta-strategies. This comparison may guide the search towards the game
equilibrium.

Let us consider an initial population of meta strategies for the generalized three player game.
Each member of the population has the form

x = (s1|r1, s2|r2, s3|r3).

Non domination (with respect to a generative relation) is considered for fitness assignment
purposes. Evolutionary Multiobjective Optimization Algorithms [3] are efficient tools for evolving
strategies based on a non domination relation.

The state of the art NSGA2 [2] has been considered to illustrate how generative relations can
be used for evolutionary detection of proposed equilibria.

A population of 100 strategies has been evolved using a rank based fitness assignment tech-
nique. In all experiments the process converges in less than 30 generations.

5 Numerical experiments

In order to illustrate the proposed concepts the oligopoly Cournot model is considered (see
for instance [4]).

Let q1, q2 and q3 denote the quantities of an homogeneous product - produced by three
companies respectively. The market clearing price is P(Q) = a − Q, where Q = q1 + q2 + q3, is
the aggregate quantity on the market. Hence we have

P(Q) =

{
a − Q, for Q < a,

0, for Q ≥ a.

Let us assume that the total cost for the company i of producing quantity qi is Ci (qi) = ciqi.
Therefore, there are no fixed costs and the marginal cost ci is constant, ci < a. Suppose that


Meta-Rationality in Normal Form Games 697

the companies choose their quantities simultaneously. The payoff for the company i is its profit,
which can be expressed as:

πi(qi, qj) = qiP(Q) − Ci(qi)

= qi [a − (qi + qj) − ci] .

Several experiments have been performed for this game by using RED technique [4].
The symmetric Cournot model with parameters a = 24 and c1 = c2 = c3 = 9 is considered.
According to the data from the Table 1 in less than 30 generations the algorithm converges to

the Nash equilibrium point (14.00, 14.00, 14.00) for each relation. We observe that the differential
Nash domination provides more accurate results than the Nash ascendency. We must consider
however the particular nature of this Cournot game. For other types of games a normalisation
of the deviations must be done in order to sum them.

Figure 1: The payoffs for the Nash-Nash-
Pareto front detected in less than 30 itera-
tions for the symmetric Cournot game with the
Nash–Pareto generative relation

Figure 2: The payoffs for the Nash-Nash-
Pareto front detected in less than 30 iterations
for the symmetric Cournot game with the dif-
ferential Nash–Pareto generative relation

Figure 3: The payoffs for the Nash-Pareto-
Pareto front detected in less than 30 itera-
tions for the symmetric Cournot game with the
Nash–Pareto generative relation.

Figure 4: The payoffs for the Nash-Pareto-
Pareto front detected in less than 30 iterations
for the symmetric Cournot game with the dif-
ferential Nash–Pareto generative relation.

The resulting front in the Nash-Nash-Pareto case spreads from the standard Nash equilib-
rium corresponding to the two player–Cournot game (25.00, 25.00) to the Nash equilibrium
corresponding to the three player–Cournot game, and from there to the edges of Pareto front for


698 D. Dumitrescu, R.I. Lung, T.-D. Mihoc

Table 1: Average payoff and standard deviation of the final populations in 30 runs with 100
meta-strategies after 30 generations for the symmetric Cournot model where all three players are
Nash biased using Nash ascendency and differential Nash generative relations.

N-N-N Average payoff St. dev. Maximum payoff Minimum payoff

player p1 p2 p3 p1 p2 p3 p1 p2 p3 p1 p2 p3
Nash ascendency relation

Average 14.05 14.06 14.05 0.03 0.04 0.04 14.85 15.57 15.00 12.25 12.49 12.45
St. Dev. 0.02 0.02 0.02 0.08 0.09 0.08 1.39 2.80 1.83 3.25 3.00 3.05

Differential Nash relation

Average 14.06 14.06 14.06 0.00 0.00 0.00 14.06 14.06 14.06 14.06 14.06 14.06
St. Dev. 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Table 2: Average payoff and standard deviation of the final populations in 30 runs with 100
meta-strategies after 30 generations for the symmetric Cournot model where two player are
Nash biased and one is Pareto for both joint Nash–Pareto and joint Differential Nash–Pareto
generative relations.

N-N-P Average payoff St. dev. Maximum payoff Minimum payoff

player p1 p2 p3 p1 p2 p3 p1 p2 p3 p1 p2 p3
Joint Nash–Pareto relation

Average 10.99 11.01 29.80 52.81 53.02 182.28 25.92 25.71 56.24 0.00 0.00 0.49
St. Dev. 0.36 0.33 0.78 1.75 2.33 17.62 0.92 0.88 0.00 0.00 0.00 1.67

Joint Differential Nash–Pareto relation

Average 8.50 8.48 35.53 22.76 22.67 165.84 18.47 18.92 56.24 0.00 0.00 7.60
St. Dev. 0.35 0.31 0.85 0.25 0.25 0.54 2.88 2.93 0.00 0.00 0.00 5.60

the Nash–Pareto equilibria (see Figure 1). For differential Nash-Pareto (see Figure 2) the front
spreads from vicinity of the Nash equilibrium for Cournot game to the edge of the Pareto front
corresponding to the Pareto player. The numerical results are presented in Table 2.

As we can see in the Figure 3 in the Nash-Pareto-Pareto case for Nash–Pareto generative
relation the result is similar to the Pareto front. In the same case for differential Nash–Pareto
generative relation (Figure 4) the Pareto front is deformed in the Nash player’s corresponding
edge. The numerical results are presented in Table 3.

6 Conclusions and future work

A new generative relation for Nash equilibrium based on differences between perturbations
is introduced. Generative relations between meta strategies induce corresponding solutions con-
cepts named Joined Nash–Pareto equilibrium, respectively joint differential Nash–Pareto equi-
librium.

An evolutionary technique for detecting approximations of the generalized equilibria is used.
The ideas are exemplified for Cournot games with three players and two types of rational-
ity.Results indicate the potential of the proposed technique.

Future work will address generalized games having other rationality types and other methods


Meta-Rationality in Normal Form Games 699

Table 3: Average payoff and standard deviation of the final populations in 30 runs with 100
meta-strategies after 30 generations for the symmetric Cournot model where one player is Nash
biased and the other two Pareto for both joint Nash–Pareto and joint Differential Nash–Pareto
generative relations.

N-P-P Average payoff St. dev. Maximum payoff Minimum payoff

player p1 p2 p3 p1 p2 p3 p1 p2 p3 p1 p2 p3
Joint Nash–Pareto relation

Average 17.74 18.52 18.44 242.42 247.83 247.97 56.23 56.24 56.24 0.00 0.00 0.00
St. Dev. 0.40 0.36 0.42 7.36 6.98 6.82 0.04 0.00 0.00 0.00 0.00 0.00

Joint Differential Nash–Pareto relation

Average 15.13 19.83 19.76 154.37 250.62 248.44 48.90 56.24 56.24 0.00 0.00 0.00
St. Dev. 0.86 0.77 0.56 0.78 0.29 0.32 3.02 0.00 0.01 0.00 0.00 0.00

of combining them.

7 Acknowledgements

This research is supported partially by the CNCSIS Grant ID508 "New Computational
paradigms for dynamic complex problems" funded by the MEC and from the SECTORAL OP-
ERATIONAL PROGRAMME HUMAN RESOURCES DEVELOPMENT, Contract POSDRU
6/1.5/S/3 "Doctoral studies: through science towards society", Babeş - Bolyai University, Cluj -
Napoca, România.

Bibliography

[1] Bade, S., Haeringer, G., Renou, L.: More strategies, more Nash equilibria, Working Paper
2004-15, School of Economics University of Adelaide University, 2004.

[2] Deb, K., Agrawal, S., Pratab, A., Meyarivan, T.: A Fast Elitist Non-Dominated Sorting Ge-
netic Algorithm for Multi-Objective Optimization: NSGA-II, Marc Schoenauer, Kalyanmoy
Deb, Günter Rudolph, Xin Yao, Evelyne Lutton, Juan Julian Merelo, and Hans-Paul Schwe-
fel, editors, Proceedings of the Parallel Problem Solving from Nature VI Conference, Paris,
France, 2000. Springer, Lecture Notes in Computer Science, 1917, 849-858.

[3] Deb, K.: Multi-objective optimization using evolutionary algorithms, Wiley, 2001.

[4] Dumitrescu, D., Lung, R.I., Mihoc, T.D.: Evolutionary Equilibria Detection in Non-
cooperative Games, Book Series: LNCS, Publisher Springer Berlin / Heidelberg, Volume
5484 / 2009, Book: Applications of Evolutionary Computing, 2009, 253-262.

[5] Lung, R. I., Muresan, A. S., and Filip, D. A.: Solving multi-objective optimization problems
by means of natural computing with application in finance, In Aplimat 2006 (Bratislava,
February 2006), pp. 445-452.

[6] Lung, R., I., Dumitrescu, D.: Computing Nash Equilibria by Means of Evolutionary Compu-
tation, Int. J. of Computers, Communications & Control, 2008, 364-368


700 D. Dumitrescu, R.I. Lung, T.-D. Mihoc

[7] Maskin, E. : The theory of implementation in Nash equilibrium:A survey, in: L. Hurwicz,
D. Schmeidler and H. Sonnenschein, eds., Social Goals and Social Organization (Cambridge
University Press), 1985,173-204

[8] McKelvey, R., D., McLennan, A.: Computation of equilibria in finite games, In H. M. Amman,
D. A. Kendrick, and J. Rust, editors, Handbook of Computational Economics, Elsevier,1996.

[9] Nash.,J.,F.: Non-cooperative games, Annals of Mathematics, 54:286-295, 1951.

[10] Osborne, M. J., Rubinstein, A.: A Course in Game Theory, MIT Press, Cambridge, MA,
1994