In zero-sum
two-player games a single number, i.e. the amount won by the
first player, and therefore the amount lost by the second player (or vice
versa), determines the payoff. If it is a finite game, its normal form is a matrix of payoffs, such as that shown below
(Owen 1992).
1 3
5
2* 4
3
0 1 -3
The entry with
a star is a 'saddle point' (see below).
Here each row
corresponds to the strategies of player I, and each column to those of player
II. Suppose player I chooses a strategy corresponding to row 2. The payoffs are
2, 4, or 3, depending on the strategy chosen by player 2. Thus, player I can be
certain of winning at least 2 units. By contrast, if player I chooses row 1,
what he can win at least is 1 unit. Similarly, if he chooses row 3, what he can
win at the most is 1 unit, or he may lose as much as 3 units. Thus row-2
strategy is his 'maximin strategy' (it maximizes his minimum
winnings), and is called the security
level of this player. The security level of a player is the largest amount
he can be sure of winning, no matter what games the other player plays.
Now consider
player 2. For columns 1, 2, and 3, his maximum payoffs are 2, 4, and 5,
respectively. Thus column 1 corresponds to his 'minimax strategy'
(it gives the minimum of the three maximum payoffs).
In this
example, 2 units is both the maximin and the minimax payoff. By choosing row 2,
player I is sure to win at least 2 units, and by choosing column 1, player II
is sure to lose no more than 2 units. It is a zero-sum game, and row 2 and
column 1 are optimal strategies. A
game of this type in which the maximin of one player coincides with the minimax
of the other player has a saddle point
in the payoff matrix; the saddle-point payoff is 2 units in our example.
Suppose we
make a 3-dimensional plot of the payoff to player I as a function of the two
strategies plotted along the other two axes. The maximin strategy will be seen
at the valley in this plot. Now suppose we plot the payoff for player II
as a function of the two strategies. The minimax strategy will show as a peak
in this plot. When the maximin and the minimax coincide, we see a saddle point
in the plot. If the two players can be presumed to be rational, they would both
hit the saddle point without any need to
communicate with each other, except through the consequences of the games
played by each.
This is a
notable result from the point of view of emergence of swarm intelligence and complex behaviour. Each bee in a beehive is
obviously ‘rational’ in the sense that it has no emotions to speak of. Its
actions are determined by the innate tendencies (simple rules) dictated by its genes. Although there is an apparent
visual ‘communication’ with other bees (the waggle dance), the fact remains
that even the response to the dance is nothing more than an operation of simple
local rules in a rational manner. The existence of the saddle point, although
arrived at above in the context of a zero-sum two-player game, points to the
existence of a sustainable level of survival of the bee species, each bee
playing the role of a rational player at the individual level.
It was von
Neumann who showed in his pioneering work on game theory (cf. Part 91) that for
two-person zero-sum games, the idea of rational behaviour can be extended by
using the concept of individual
rationality.
In general,
the maximin and minimax strategies may not give the same payoff, and this is
illustrated below in the payoff matrix for a rudimentary version of the normal
form of an elementary game of poker (Owen 1995). Here each
row corresponds to the strategies of player 1, and each column to those of
player 2. Player 1 gets to choose from four possible strategies, and player II
must choose from two possible strategies. Here the maximin is -0.6 in row 2,
and the minimax is -0.2 in column 1. The difference is not zero, and represents
an indeterminacy of the game.
-1.8 1
-0.2 -0.6
-2.2 1
-0.6 -0.6
In such
situations it is advisable for the two players to adopt mixed strategies. The idea is
to prevent the other player from guessing the strategy by choosing a row or a
column according to some randomizing scheme. For example, suppose player 1
plays row 1 with probability 1/8, row 2 with probability 7/8, and does not play
rows 3 and 4 at all. The four choices can be represented by a vector x =
(1/8, 7/8, 0, 0), and give player 1 the winnings of at least -0.4. This means
that he expects to lose no more than 0.4 units, irrespective of what game the
opponent plays. Similarly, player 2 may be advised to play either column with a
probability 1/2 (y = (1/2, 1/2)). The expected payoff for this approach is
at the most 0.4.
With this
mixed strategy, player 1 expects to lose no more than 0.4, and player 2 expects
to win at least 0.4. The number -0.4 is the
value of the game.
We can
generalize. Consider a game in which the payoff matrix A is an m x n matrix. The mixed-strategy for player 1
is a vector x = (x1, x2, . . ., xm).
The components of this vector, which are probabilities, are nonnegative, and
add up to 1. For player 2 there is a similar mixed-strategy vector y = (y1,
y2, ... yn). The expected payoff for mixed
strategies used by the two players can be written as
P(x,
y) = xAyt
The maximin in
this general case is
VI =
maxx miny xAyt
And the
minimax is
VII = miny
maxx xAyt
The minimax
theorem
The minimax
theorem says that for all matrix games,
VI = VII.
The common value of the payoff is called the value of the game. The maximization of x and the
minimization of y are optimal
mixed strategies.
Linear programming is the most common technique used for the
computation of optimal strategies in a matrix game.
Chess is an
example of a zero-sum two-player game. There is a theorem in game theory
according to which any finite, zero-sum, two-player game like chess has an
optimal solution. That is, there is an optimal way of choosing moves in
chess that will make a player do better than any other set of moves. But the
total possible number of moves in chess is huge: ~10120. The problem
is that no one knows what that optimal solution is. One never plays the same
game twice.
Suppose there
are more than two players in a game; i.e. it is a plural game. In any game, whether dual or plural, the total payoff
may be zero (zero-sum games) or nonzero (nonzero-sum games). For the latter
category, the interests of the players may not be opposite always; i.e. both
noncooperation and cooperation are conceivable. I shall focus on noncooperative games in the next post.