Unlike in the extensive form of games I discussed in Part 91, the strategic or 'normal' form of games does not involve details like
'positions' and 'moves'. Rather, we speak in terms of 'strategy' and 'payoff'. The best known example of such a game is chess, but there are many
other examples as well.
One specifies a strategy set (also called action
space) Ai for each player i (i = 1, 2, . . . n). Thus, each
player can choose a strategy from this possible set of strategies, after
considering all the players and their strategies. All players make their
choices of strategy simultaneously. At the end of the game, each player
receives some payoff. The choice of strategy made by each player may influence
the final outcome for all the players.
If the players are assumed to be
rational, they make choices which result in the outcome that they prefer the
most, given what their opponents do. For example, a player may have two
strategies A and B such that, given any combination of strategies of the other
players, the outcome from strategy A is better than the outcome from B. Then
strategy A is said to dominate B. A rational player will always play a dominant
strategy if possible.
The payoffs are generally modelled as having numerical values. For
player 1, the strategy set is, say, A1. Suppose he/she chooses
strategy a1 from this set.
Similarly, any player i may choose a strategy ai
from
the set Ai. Then the payoff to a player j (j = 1, 2, . . . n) is a
function fj(a1, a2, . . . an), called the payoff function for player j.
The strategic form of a game is defined by the set N of players, the sequence A1, A2, ... An of the set of strategies of the players, and the sequence f1(a1, a2, ... an), f2(a1, a2, ... an), ... fn(a1, a2, ... an) of the set of
payoff functions for the n players.
A game in strategic form is called a zero-sum
game if
the sum of the payoffs to all the players adds up to zero.
Two-player zero-sum games are the best investigated. Each such game has
a value, and the two players have optimal strategies that guarantee the value.
Because of the minimax theorem, such games have clear-cut solutions.
Two-person non-zero-sum games can be
tricky business. Usually such games do not have values or optimal strategies
for the players. Two types of theoretical approaches can be used for dealing
with such games: Noncooperative theory, and cooperative theory.
In the noncooperative form of the game, the players
may not form binding agreements. This can happen, for example, during
negotiations between sovereign nations when there is no overseeing body to
enforce agreements. Similarly, in business dealings, trading companies may be
forbidden by law to enter into certain types of agreements.
John Nash, John Harsanyi and
Reinhard Selten were awarded the Nobel Prize in economics in 1995 for work in
this area. In their formalism, the concept of strategic equilibrium or Nash equilibrium replaces value and optimal strategy: a group of
players is said to be in Nash equilibrium if each of them is making the best
decision that he or she can, taking into account the decisions of the others.
http://www.econ.canterbury.ac.nz/personal_pages/john_fountain/econ223/week6/lect112010v1handout.html
In cooperative game theory (the
other approach to non-zero-sum games), the players can form binding agreements.
Thus there is a strong incentive to cooperate for maximizing the total payoff.
But now the problem is about how to split the payoffs between the players.
There are two kinds of
cooperative games. The game may have transferable utility, or it may have non-transferable utility. Transferable utility is relevant when the players measure utility of
the payoff in the same units and there is a means of exchange of utility, such
as side payments.
The payoffs in the strategic form
of a game can be presented in the form of a table, each cell of which
represents a distinct strategy combination. Such a tabular presentation becomes
possible because the players choose their strategies simultaneously. Let us
consider an example.
There are two prisoners
suspected of a serious crime. There is no judicial evidence for the crime
except if one of the prisoners testifies against the other. The two are
interrogated separately. If one of them incriminates the other, he will be let
off without any punishment (amounting to a payoff of, say, 4), whereas the
other prisoner will be awarded a long prison sentence (payoff 0). If both
testify against each other, their punishment will be somewhat less severe
(payoff 2 for each). However, if they both ‘cooperate’ with each other by not
testifying at all (i.e. by choosing to remain silent), they will only be given
a very short term in jail (due to lack of strong evidence), for example for
possession of illegal weapons (payoff 3 for each, with a total payoff 6). This
is a non-zero-sum strategic game: The total payoff is 4 if both incriminate
each other. It is 4 if any one of them testifies against the other; the
defecting prisoner is rewarded with immunity from prosecution, and the other
gets a long term in jail. If they cooperate with each other (unknowingly, of
course), both get only a very short term of imprisonment, and the total payoff
is 6.
In this
two-player game, let the first prisoner be called player I, and the other one
is player II. And let C or c stand for cooperation, and D or d for defection: C
and D are the options for player I, and c and d those for player II. The payoff
table for the PD game is shown in the payoff matrix below.
II→
I↓
|
c
|
d
|
C
|
3
3
|
4
0
|
D
|
0
4
|
2
2
|
Player I can
choose either row C or row D. At the same time, player II chooses column c or
column d. The possible strategy combinations for the two players are: (C, c),
(D, d), (C, d), and (D, c). For (C, c) the payoff for each player is 3, and
that for (D, d) it is 2 for each player.
If only player
I defects, i.e. strategy (D, c), the payoff to player I is 4, and that to
player II is 0. If only player II defects (strategy (C, d)), he gets a payoff
of 4, and player I gets 0.
We note that the
‘defect’ strategy dominates over the ‘cooperate’ strategy. Consider player I.
If II chooses c, the payoff for I is 4 for D and 3 for C. If II chooses d, the
payoff for I is 2 for D and 0 for C. In either case, D dominates over C for
player I. And the symmetry of the payoff matrix implies that for player II
also, ‘defect’ dominates over ‘cooperate.’
Thus the
selfishness of the independently acting players in the PD game makes defection
a dominant strategy, and cooperation a dominated strategy resulting in a lower
total payoff for the two players. Cooperation would have given them the best
total payoff of 6.
There are many
real-life versions of the PD game wherein individual defections at the expense
of others lead to outcomes which are, on the whole, less profitable. Here are
some examples: arms races; litigation instead of settlement; environmental
pollution. Treaties or laws are introduced sometimes to enforce cooperation.
The PD game
gets fundamentally changed if played more than once by the same players. In
such a repeated game, patterns of
cooperation emerge as rational behaviour. The fear of punishment in the future
outweighs the gain from immediate defection.
I shall dwell
on this some more in the next post, after discussing the Traveller's Dilemma (TD)
game, crafted by Kaushik Basu. PD is a special case of TD.
No comments:
Post a Comment