From Optimality to Equilibrium

Recap
Pareto Optimality
Best Response and Nash Equilibrium
Mixed Strategies
Lecture 4
Lecture Overview
1. Recap
2. Pareto Optimality
3. Best Response and Nash Equilibrium
4. Mixed Strategies
Non-Cooperative Game Theory
What is it?
mathematical study of interaction between rational,
self-interested agents
Why is it called non-cooperative?
while it’s most interested in situations where agents’ interests
conflict, it’s not restricted to these settings
the key is that the individual is the basic modeling unit, and
that individuals pursue their own interests
cooperative/coalitional game theory has teams as the central
unit, rather than agents
Defining Games
Finite, n-person game: $\langle N, A, u \rangle$:
$N$ is a finite set of $n$ players, indexed by $i$
$A = A_1 \times \ldots \times A_n$, where $A_i$ is the action set for player $i$
$a \in A$ is an action profile, and so $A$ is the space of action profiles
$u = \langle u_1, \ldots, u_n \rangle$, a utility function for each player, where $u_i : A \to \mathbb{R}$
Writing a 2-player game as a matrix:
row player is player 1, column player is player 2
rows are actions $a \in A_1$, columns are $a' \in A_2$
cells are outcomes, written as a tuple of utility values for each player
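The definition above can be made concrete with a small sketch. The encoding below (a list of action sets plus a dictionary from action profiles to utility tuples) is only one possible choice; the names `players`, `actions`, `payoffs`, `action_profiles`, `utility` and `print_matrix` are hypothetical, and the payoff numbers are illustrative.

```python
from itertools import product

# Hypothetical encoding of a finite 2-player game <N, A, u>.
players = [0, 1]                       # N, indexed by i
actions = [["C", "D"], ["C", "D"]]     # A_1 and A_2

# u maps each action profile a to the tuple (u_1(a), u_2(a)).
# The payoff numbers are illustrative only.
payoffs = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}

def action_profiles(actions):
    """Enumerate A = A_1 x ... x A_n."""
    return list(product(*actions))

def utility(payoffs, i, a):
    """u_i(a): player i's utility at action profile a."""
    return payoffs[a][i]

def print_matrix(actions, payoffs):
    """Rows are player 1's actions, columns are player 2's actions."""
    for r in actions[0]:
        print(r, [payoffs[(r, c)] for c in actions[1]])

print_matrix(actions, payoffs)
```

Storing one tuple per cell mirrors the matrix form: each cell is an outcome written as a tuple of utility values, one per player.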
Prisoner’s dilemma
Prisoner’s dilemma is any game

         C        D
C      a, a     b, c
D      c, b     d, d

with $c > a > d > b$.
Figure 3.3 Any $c > a > d > b$ define an instance of Prisoner’s Dilemma.
To fully understand the role of the payoff numbers we would need to enter into a discussion of utility theory. Here, let us just mention that for most purposes, the analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation.
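As a quick illustration of the condition $c > a > d > b$, the sketch below checks whether four payoff values define a Prisoner's Dilemma and shows that the ordering survives a positive affine transformation. The function name `is_prisoners_dilemma` and the sample numbers are assumptions made for this example.

```python
def is_prisoners_dilemma(a, b, c, d):
    """True if the payoffs satisfy c > a > d > b.

    Cell layout: (C,C) -> (a, a), (C,D) -> (b, c),
                 (D,C) -> (c, b), (D,D) -> (d, d).
    """
    return c > a > d > b

# Illustrative payoffs.
a, b, c, d = -1, -4, 0, -3
print(is_prisoners_dilemma(a, b, c, d))

# A positive affine transformation x -> 2x + 10 preserves the ordering.
scaled = [2 * x + 10 for x in (a, b, c, d)]
print(is_prisoners_dilemma(*scaled))
```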
Games of Pure Competition
Players have exactly opposed interests
There must be precisely two players (otherwise they can’t have exactly opposed interests)
For all action profiles $a \in A$, $u_1(a) + u_2(a) = c$ for some constant $c$
Special case: zero sum

... competition; one player’s gain must come at the expense of the other player.
As in the case of common-payoff games, we can use an abbreviated matrix form to represent zero-sum games, in which we write only one payoff value in each cell. This value represents the payoff of player 1, and thus the negative of the payoff of player 2. Note, though, that whereas the full matrix representation is unambiguous, when we use the abbreviation we must explicitly state whether this matrix represents a common-payoff game or a zero-sum one.
A classical example of a zero-sum game is the game of matching pennies. In this game, each of the two players has a penny, and independently chooses to display either heads or tails. The two players then compare their pennies. If they are the same then player 1 pockets both, and otherwise player 2 pockets them. The payoff matrix is shown in Figure 3.5.

          Heads   Tails
Heads       1      −1
Tails      −1       1

Figure 3.5 Matching Pennies game.
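The constant-sum condition $u_1(a) + u_2(a) = c$ can be checked mechanically. The sketch below is a minimal illustration, assuming a 2-player game stored as a dictionary from action profiles to payoff pairs; the name `constant_sum` is hypothetical. Matching Pennies is written out in full, so the abbreviated cell value is $u_1$ and $u_2 = -u_1$.

```python
def constant_sum(payoffs):
    """Return c if u_1(a) + u_2(a) = c for every action profile a, else None."""
    sums = {u1 + u2 for (u1, u2) in payoffs.values()}
    return sums.pop() if len(sums) == 1 else None

# Matching Pennies in full (non-abbreviated) form.
matching_pennies = {
    ("Heads", "Heads"): (1, -1),
    ("Heads", "Tails"): (-1, 1),
    ("Tails", "Heads"): (-1, 1),
    ("Tails", "Tails"): (1, -1),
}

print(constant_sum(matching_pennies))  # c = 0, i.e. the zero-sum special case
```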
Games of Cooperation
Players have exactly the same interests.
no conflict: all players want the same things
$\forall a \in A,\ \forall i, j,\ u_i(a) = u_j(a)$

Common-payoff games are also called pure coordination games, since in such games the agents have no conflicting interests; their sole challenge is to coordinate on an action that is maximally beneficial to all.
Because of their special nature, we often represent common value games with an abbreviated form of the matrix in which we list only one payoff in each of the cells.
As an example, imagine two drivers driving towards each other in a country without traffic rules, and who must independently decide whether to drive on the left or on the right. If the players choose the same side (left or right) they have some high utility, and otherwise they have a low utility. The game matrix is shown in Figure 3.4.

          Left   Right
Left        1      0
Right       0      1

Figure 3.4 Coordination game.
At the other end of the spectrum from pure coordination games lie zero-sum games, which (bearing in mind the comment we made earlier about positive affine transformations) are more properly called constant-sum games.
General Games: Battle of the Sexes
The most interesting games combine elements of cooperation and competition.

             Rock   Paper   Scissors
Rock           0     −1        1
Paper          1      0       −1
Scissors      −1      1        0

Figure 3.6 Rock, Paper, Scissors game.

         B        F
B      2, 1     0, 0
F      0, 0     1, 2

Figure 3.7 Battle of the Sexes game.
Analyzing Games
We’ve defined some canonical games, and thought about how
to play them. Now let’s examine the games from the outside
From the point of view of an outside observer, can some
outcomes of a game be said to be better than others?
we have no way of saying that one agent’s interests are more
important than another’s
intuition: imagine trying to find the revenue-maximizing
outcome when you don’t know what currency has been used to
express each agent’s payoff
Are there situations where we can still prefer one outcome to
another?
Pareto Optimality
Idea: sometimes, one outcome $o$ is at least as good for every agent as another outcome $o'$, and there is some agent who strictly prefers $o$ to $o'$
in this case, it seems reasonable to say that $o$ is better than $o'$
we say that $o$ Pareto-dominates $o'$.
An outcome $o^*$ is Pareto-optimal if there is no other outcome that Pareto-dominates it.
can a game have more than one Pareto-optimal outcome?
does every game have at least one Pareto-optimal outcome?
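The two definitions translate directly into code. The sketch below is one possible rendering, assuming outcomes are stored as tuples of utilities (one entry per agent); `pareto_dominates`, `pareto_optimal` and the payoff numbers are hypothetical names and values chosen for illustration.

```python
def pareto_dominates(o, o_prime):
    """o Pareto-dominates o_prime: at least as good for every agent,
    and strictly better for at least one agent."""
    at_least_as_good = all(x >= y for x, y in zip(o, o_prime))
    strictly_better = any(x > y for x, y in zip(o, o_prime))
    return at_least_as_good and strictly_better

def pareto_optimal(outcomes):
    """Keys of the outcomes that no other outcome Pareto-dominates."""
    return [k for k, o in outcomes.items()
            if not any(pareto_dominates(other, o) for other in outcomes.values())]

# Illustrative Prisoner's-Dilemma-style payoffs.
pd = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}
print(pareto_optimal(pd))
```

In this example three of the four outcomes come out Pareto-optimal, which suggests how the first question on the slide can be explored by experiment.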
Pareto Optimal Outcomes in Example Games
(Figures: Shoham and Leyton-Brown, 2006)
In Figure 3.1, your options are the two rows, and your colleague’s options are the columns. In each cell, the first number represents your payoff (or, minus your delay), and the second number represents your colleague’s payoff.

          C          D
C      −1, −1     −4, 0
D       0, −4     −3, −3

Figure 3.1 The TCP user’s (aka the Prisoner’s) Dilemma.

          Left   Right
Left        1      0
Right       0      1

Figure 3.4 Coordination game.

         B        F
B      2, 1     0, 0
F      0, 0     1, 2

Figure 3.7 Battle of the Sexes game.

          Heads   Tails
Heads       1      −1
Tails      −1       1

Figure 3.5 Matching Pennies game.
Best Response
If you knew what everyone else was going to do, it would be
easy to pick your own action
Let $a_{-i} = \langle a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n \rangle$.
now $a = (a_{-i}, a_i)$
Best response: $a_i^* \in BR(a_{-i})$ iff $\forall a_i \in A_i,\ u_i(a_i^*, a_{-i}) \ge u_i(a_i, a_{-i})$
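The best-response definition can be sketched for a 2-player game as follows. The function `best_responses` and its argument layout are assumptions made for this example, and the payoffs are those of the Battle of the Sexes game.

```python
def best_responses(payoffs, actions, i, a_minus_i):
    """All a_i* in A_i with u_i(a_i*, a_-i) >= u_i(a_i, a_-i) for every a_i.

    2-player sketch: i is 0 or 1, and a_minus_i is the other player's action.
    """
    def profile(a_i):
        # Rebuild the full action profile a = (a_-i, a_i) in player order.
        return (a_i, a_minus_i) if i == 0 else (a_minus_i, a_i)

    best = max(payoffs[profile(a_i)][i] for a_i in actions[i])
    return [a_i for a_i in actions[i] if payoffs[profile(a_i)][i] == best]

actions = [["B", "F"], ["B", "F"]]
bos = {
    ("B", "B"): (2, 1),
    ("B", "F"): (0, 0),
    ("F", "B"): (0, 0),
    ("F", "F"): (1, 2),
}
print(best_responses(bos, actions, 0, "B"))  # player 1's best response to B
print(best_responses(bos, actions, 1, "F"))  # player 2's best response to F
```

The function returns a list because $BR(a_{-i})$ is a set: several actions can tie for the highest utility.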
Nash Equilibrium
Now let’s return to the setting where no agent knows
anything about what the others will do
What can we say about which actions will occur?
Idea: look for stable action profiles.
$a = \langle a_1, \ldots, a_n \rangle$ is a (“pure strategy”) Nash equilibrium iff $\forall i,\ a_i \in BR(a_{-i})$.
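One simple way to find stable action profiles in a small game is to enumerate every profile and test each unilateral deviation. The sketch below assumes the dictionary encoding used for illustration here; `pure_nash_equilibria` is a hypothetical name, and brute-force enumeration is only practical for small games.

```python
from itertools import product

def pure_nash_equilibria(actions, payoffs):
    """Action profiles a where every a_i is a best response to a_-i."""
    equilibria = []
    for a in product(*actions):
        stable = True
        for i, A_i in enumerate(actions):
            for alt in A_i:
                # Player i deviates to alt while everyone else keeps a_-i.
                deviation = a[:i] + (alt,) + a[i + 1:]
                if payoffs[deviation][i] > payoffs[a][i]:
                    stable = False
        if stable:
            equilibria.append(a)
    return equilibria

pd_actions = [["C", "D"], ["C", "D"]]
pd = {("C", "C"): (-1, -1), ("C", "D"): (-4, 0),
      ("D", "C"): (0, -4), ("D", "D"): (-3, -3)}

mp_actions = [["Heads", "Tails"], ["Heads", "Tails"]]
mp = {("Heads", "Heads"): (1, -1), ("Heads", "Tails"): (-1, 1),
      ("Tails", "Heads"): (-1, 1), ("Tails", "Tails"): (1, -1)}

print(pure_nash_equilibria(pd_actions, pd))
print(pure_nash_equilibria(mp_actions, mp))
```

Matching Pennies yields an empty list: it has no pure-strategy Nash equilibrium.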
Nash Equilibria of Example Games
(Figures: Shoham and Leyton-Brown, 2006)

          C          D
C      −1, −1     −4, 0
D       0, −4     −3, −3

Figure 3.1 The TCP user’s (aka the Prisoner’s) Dilemma.

          Left   Right
Left        1      0
Right       0      1

Figure 3.4 Coordination game.

         B        F
B      2, 1     0, 0
F      0, 0     1, 2

Figure 3.7 Battle of the Sexes game.

          Heads   Tails
Heads       1      −1
Tails      −1       1

Figure 3.5 Matching Pennies game.

The paradox of Prisoner’s dilemma: the Nash equilibrium is the only non-Pareto-optimal outcome!
Mixed Strategies
It would be a pretty bad idea to play any deterministic
strategy in matching pennies
Idea: confuse the opponent by playing randomly
Define a strategy $s_i$ for agent $i$ as any probability distribution over the actions $A_i$.
pure strategy: only one action is played with positive probability
mixed strategy: more than one action is played with positive probability
these actions are called the support of the mixed strategy
Let the set of all strategies for $i$ be $S_i$
Let the set of all strategy profiles be $S = S_1 \times \ldots \times S_n$.
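These definitions can be sketched by storing a strategy as a dictionary from actions to probabilities. This representation and the helper names `is_strategy`, `support` and `is_pure` are assumptions made for illustration.

```python
import math

def is_strategy(s):
    """s is a probability distribution over actions: non-negative, sums to 1."""
    return all(p >= 0 for p in s.values()) and math.isclose(sum(s.values()), 1.0)

def support(s):
    """Actions played with positive probability."""
    return {a for a, p in s.items() if p > 0}

def is_pure(s):
    """Pure strategy: exactly one action has positive probability."""
    return len(support(s)) == 1

uniform = {"Heads": 0.5, "Tails": 0.5}
always_heads = {"Heads": 1.0, "Tails": 0.0}

print(is_strategy(uniform), support(uniform), is_pure(uniform))
print(is_strategy(always_heads), support(always_heads), is_pure(always_heads))
```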
Utility under Mixed Strategies
What is your payoff if all the players follow mixed strategy
profile s ∈ S?
We can’t just read this number from the game matrix
anymore: we won’t always end up in the same cell
Instead, use the idea of expected utility from decision theory:
$$u_i(s) = \sum_{a \in A} u_i(a) \Pr(a \mid s)$$
$$\Pr(a \mid s) = \prod_{j \in N} s_j(a_j)$$
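The expected-utility formula can be computed by summing over all action profiles and weighting each cell by the product of the players' probabilities. The sketch below assumes strategies are dictionaries from actions to probabilities and payoffs are keyed by action profile; `expected_utility` is a hypothetical name.

```python
from itertools import product
from math import prod

def expected_utility(payoffs, s, i):
    """u_i(s) = sum over a in A of u_i(a) * Pr(a | s),
    with Pr(a | s) = product over players j of s_j(a_j).

    s is a list with one mixed strategy (dict action -> probability) per player.
    """
    total = 0.0
    for a in product(*(s_j.keys() for s_j in s)):
        pr = prod(s_j[a_j] for s_j, a_j in zip(s, a))
        total += payoffs[a][i] * pr
    return total

mp = {("Heads", "Heads"): (1, -1), ("Heads", "Tails"): (-1, 1),
      ("Tails", "Heads"): (-1, 1), ("Tails", "Tails"): (1, -1)}
uniform = {"Heads": 0.5, "Tails": 0.5}

print(expected_utility(mp, [uniform, uniform], 0))
```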
Best Response and Nash Equilibrium
Our definitions of best response and Nash equilibrium generalize
from actions to strategies.
Best response:
$s_i^* \in BR(s_{-i})$ iff $\forall s_i \in S_i,\ u_i(s_i^*, s_{-i}) \ge u_i(s_i, s_{-i})$
Nash equilibrium:
$s = \langle s_1, \ldots, s_n \rangle$ is a Nash equilibrium iff $\forall i,\ s_i \in BR(s_{-i})$
Every finite game has a Nash equilibrium! [Nash, 1950]
e.g., matching pennies: both players play heads/tails 50%/50%
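The matching pennies claim can be checked numerically. The sketch below relies on a standard fact not stated on the slide: expected utility is linear in a player's own mixed strategy, so it is enough to test deviations to pure strategies. The names `expected_utility` and `is_nash` are hypothetical, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import product
from math import prod

def expected_utility(payoffs, s, i):
    """u_i(s) with Pr(a | s) the product of each player's probability of a_j."""
    return sum(payoffs[a][i] * prod(s_j[a_j] for s_j, a_j in zip(s, a))
               for a in product(*(s_j.keys() for s_j in s)))

def is_nash(payoffs, s):
    """True if no player gains by deviating to any pure strategy.

    s is a list of dicts; each dict must list every action of that player,
    including those with probability 0.
    """
    for i, s_i in enumerate(s):
        current = expected_utility(payoffs, s, i)
        for a_i in s_i:
            pure = {b: Fraction(int(b == a_i)) for b in s_i}
            deviation = s[:i] + [pure] + s[i + 1:]
            if expected_utility(payoffs, deviation, i) > current:
                return False
    return True

mp = {("Heads", "Heads"): (1, -1), ("Heads", "Tails"): (-1, 1),
      ("Tails", "Heads"): (-1, 1), ("Tails", "Tails"): (1, -1)}
half = Fraction(1, 2)
uniform = {"Heads": half, "Tails": half}

print(is_nash(mp, [uniform, uniform]))
```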
From Optimality to Equilibrium
Lecture 4, Slide 20
0
−1
1
1
0
−1
−1
1
0
Rock
Recap
Pareto Optimality
Best Response and Nash Equilibrium
Paper
Mixed Strategies
Computing Mixed Nash Equilibria: Battle of the Sexes
Scissors
Figure 3.6
B
F
B
2, 1
0, 0
F
0, 0
1, 2
Figure 3.7
3.2.2
pure strategy
Rock, Paper, Scissors game.
Battle of the Sexes game.
It’s hard in general to compute Nash equilibria, but it’s easy
Strategies in normal-form games
when you can guess the support
We have so far defined the actions available to each player in a game, but not yet his
set ofBoS,
strategies,
or his
available
choices.
Certainly onewhere
kind of strategy
is to select
For
let’s
look
for an
equilibrium
all actions
are
a single action and play it; we call such a strategy a pure strategy, and we will use
part
of
the
support
the notation we have already developed for actions to represent it. There is, however,
another, less obvious type of strategy; a player can choose to randomize over the set of
available actions according to some probability distribution; such a strategy is called
mixed strategy
a mixed strategy. Although it may not be immediately obvious why a player should
introduce randomness into his choice of action, in fact in a multi-agent setting the role
of mixed strategies is critical. We will return to this when we discuss solution concepts
From Optimality for
to Equilibrium
Lecture 4, Slide 21
games in the next section.
        B       F
B     2, 1    0, 0
F     0, 0    1, 2
Let player 2 play B with p, F with 1 − p.
If player 1 best-responds with a mixed strategy, player 2 must
make him indifferent between F and B (why?)
u_1(B) = u_1(F)
2p + 0(1 − p) = 0p + 1(1 − p)
p = 1/3
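One way to answer the "why?" is sketched below. It assumes q denotes the probability that player 1 plays B, and p the probability that player 2 plays B. Player 1's expected utility is linear in q, so a strictly mixed q can be a best response only when both pure actions earn the same expected utility.

```latex
\begin{align*}
u_1(q) &= q\,u_1(B) + (1-q)\,u_1(F), \qquad u_1(B) = 2p,\quad u_1(F) = 1-p \\
u_1(B) > u_1(F) &\;\Rightarrow\; \text{best response is } q = 1 \text{ (pure } B\text{)} \\
u_1(B) < u_1(F) &\;\Rightarrow\; \text{best response is } q = 0 \text{ (pure } F\text{)} \\
u_1(B) = u_1(F) &\;\Rightarrow\; \text{every } q \in [0,1] \text{ is a best response}
\end{align*}
```

With the Battle of the Sexes payoffs, equality holds exactly when 2p = 1 − p, that is, p = 1/3.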
Likewise, player 1 must randomize to make player 2
indifferent.
Why is player 1 willing to randomize?
Let player 1 play B with q, F with 1 − q.
u_2(B) = u_2(F)
q + 0(1 − q) = 0q + 2(1 − q)
q = 2/3
Thus the mixed strategies (2/3, 1/3) for player 1 and (1/3, 2/3)
for player 2 are a Nash equilibrium.
From Optimality to Equilibrium
Lecture 4, Slide 21
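The indifference computation for Battle of the Sexes can be checked mechanically. The following is a minimal sketch, not part of the lecture: the helper names `indifference_prob`, `transpose`, and `pure_payoffs` are hypothetical, and the approach only covers 2×2 games in which both players mix over both actions.

```python
from fractions import Fraction

# Battle of the Sexes payoffs, indexed [row action][column action],
# with action 0 = B and action 1 = F. Player 1 picks the row, player 2 the column.
U1 = [[2, 0], [0, 1]]
U2 = [[1, 0], [0, 2]]


def transpose(m):
    """Swap rows and columns so a player's own action becomes the first index."""
    return [list(col) for col in zip(*m)]


def indifference_prob(u):
    """Probability x on the opponent's first action that makes the owner of
    u[own action][opponent action] indifferent between his two actions.

    Solves u[0][0]*x + u[0][1]*(1 - x) == u[1][0]*x + u[1][1]*(1 - x).
    """
    denom = u[0][0] - u[0][1] - u[1][0] + u[1][1]
    if denom == 0:
        raise ValueError("no unique indifference point for these payoffs")
    x = Fraction(u[1][1] - u[0][1], denom)
    if not 0 < x < 1:
        raise ValueError("indifference point is not a full-support mixture")
    return x


def pure_payoffs(u, opp_prob_first):
    """Expected utility of each own pure action against the opponent's mixture."""
    return [row[0] * opp_prob_first + row[1] * (1 - opp_prob_first) for row in u]


# Player 2's mixture must make player 1 indifferent, and vice versa.
p = indifference_prob(U1)             # probability that player 2 plays B
q = indifference_prob(transpose(U2))  # probability that player 1 plays B

print(p, q)  # 1/3 2/3
print(pure_payoffs(U1, p), pure_payoffs(transpose(U2), q))
```

Exact fractions are used so that the equality of the two pure-action payoffs can be tested without floating-point tolerance.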
Interpreting Mixed Strategy Equilibria
What does it mean to play a mixed strategy? Different
interpretations:
Randomize to confuse your opponent
  consider the matching pennies example
Players randomize when they are uncertain about the other’s
action
  consider battle of the sexes
Mixed strategies are a concise description of what might
happen in repeated play: the relative frequency of each pure
strategy in the limit
Mixed strategies describe population dynamics: 2 agents
chosen from a population, all having deterministic strategies.
The mixed strategy is the probability of getting an agent who
will play one pure strategy or another.
From Optimality to Equilibrium
Lecture 4, Slide 22
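The repeated-play interpretation can be illustrated with a small simulation. This is a rough sketch under stated assumptions: each round is an independent draw from the Battle of the Sexes equilibrium mixtures (2/3, 1/3) and (1/3, 2/3), no learning is modeled, and the function name `empirical_frequencies` is hypothetical.

```python
import random
from collections import Counter


def empirical_frequencies(prob_b, rounds, rng):
    """Draw `rounds` independent actions from a mixed strategy that plays B
    with probability prob_b (otherwise F) and return the observed frequencies."""
    counts = Counter("B" if rng.random() < prob_b else "F" for _ in range(rounds))
    return {action: counts[action] / rounds for action in ("B", "F")}


rng = random.Random(0)  # fixed seed so the run is repeatable
freq1 = empirical_frequencies(2 / 3, 100_000, rng)  # player 1's mixture
freq2 = empirical_frequencies(1 / 3, 100_000, rng)  # player 2's mixture

print(freq1)
print(freq2)
```

As the number of rounds grows, the observed frequency of each pure strategy approaches its probability under the mixed strategy.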
