Game playing

Artificial Intelligence
Lecture 5
IT-EEPIS
Why study games?
• The criteria for winning or losing are clear
• They are a good vehicle for studying problem solving
• They are fun
• They usually have a large search space (for example, chess
has about 35^100 nodes in its search tree and about 10^40
legal states)
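To get a feel for these magnitudes, the short Python sketch below computes the power of ten that 35^100 corresponds to. Reading 35 as a typical branching factor and 100 as a typical game length in half-moves is the usual textbook interpretation and is an assumption here; the slide only gives the figures. The function name is illustrative.

```python
BRANCHING_FACTOR = 35    # assumed reading: typical number of legal moves per position
GAME_LENGTH_PLIES = 100  # assumed reading: typical game length in half-moves


def order_of_magnitude(base: int, exponent: int) -> int:
    """Return floor(log10(base ** exponent)) using exact integer arithmetic."""
    return len(str(base ** exponent)) - 1


tree_exponent = order_of_magnitude(BRANCHING_FACTOR, GAME_LENGTH_PLIES)
print(f"35^100 is about 10^{tree_exponent}")  # prints: 35^100 is about 10^154
```

The search tree is far larger than the set of legal states because the same position can be reached along many different move sequences.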
How good are computer game players?
– Chess:
• Deep Blue defeated Garry Kasparov in 1997
• Garry Kasparov vs. Deep Junior (Feb 2003): draw
– Checkers:
• Chinook is the world champion
– Go:
• Computer players are very strong
– Bridge:
• Computer players play at "expert level"
Chess
Deep Blue
• Deep Blue is a chess computer built by IBM.
• Deep Blue was the first computer to win a chess game
against a world champion (Garry Kasparov) under standard
tournament time controls. Its first win (in the first game
of the match) came on 10 February 1996 and is a very
famous game. Kasparov, however, went on to win 3 of the
remaining games and draw the other 2, defeating Deep Blue
4-2.
• Deep Blue was then heavily upgraded and played Kasparov
again in May 1997. Deep Blue won the six-game match
3.5-2.5; the final game was played on 11 May. Deep Blue
thereby became the first computer to defeat a reigning
world champion in a match.
• The machine has since been "retired" and is on display at
the National Museum of American History in the United
States.
Figure: Garry Kasparov and Deep Blue. © 1997, GM Gabriel Schwartzman's Chess Camera, courtesy IBM.
Figure: Ratings of human and computer chess champions (January/February 2003)
Common characteristics of games
• 2 players
• Players take turns
• Zero-sum: one player's loss is the other player's gain
• Perfect information: both players know the complete
state of the game
• Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim,
Othello
• No element of chance (such as dice)
• This class of games excludes Bridge, Solitaire,
Backgammon, and the like
How do we play a game?
• How to play:
– Consider all possible moves
– Assign a value to each possible move
– Play the move with the best value
– Wait for the opponent to move
– Repeat the steps above
• Key problems:
– Represent the "board" or "state"
– Generate the legal next boards
– Evaluate positions
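The three key problems can be sketched for tic-tac-toe as follows. This is a minimal illustration, not the lecture's own code: the nine-character string representation and the names `legal_moves`, `apply_move`, `next_boards`, and `best_move` are assumptions made for this example. `best_move` looks only one move ahead and takes the evaluator as a parameter.

```python
from typing import Callable, List, Optional

Board = str  # nine characters, row by row: 'X', 'O', or ' ' for an empty square

EMPTY_BOARD: Board = " " * 9


def legal_moves(board: Board) -> List[int]:
    """Indices (0-8) of the empty squares."""
    return [i for i, cell in enumerate(board) if cell == " "]


def apply_move(board: Board, index: int, player: str) -> Board:
    """Return the new board after `player` marks square `index`."""
    if board[index] != " ":
        raise ValueError(f"square {index} is already taken")
    return board[:index] + player + board[index + 1:]


def next_boards(board: Board, player: str) -> List[Board]:
    """All boards reachable in one move by `player`."""
    return [apply_move(board, i, player) for i in legal_moves(board)]


def best_move(board: Board, player: str,
              evaluate: Callable[[Board], float]) -> Optional[int]:
    """Score every legal move with `evaluate` and return the best index."""
    moves = legal_moves(board)
    if not moves:
        return None  # board is full
    return max(moves, key=lambda i: evaluate(apply_move(board, i, player)))


# Toy evaluator (hypothetical): prefer positions where X holds the center.
def prefer_center(board: Board) -> float:
    return 1.0 if board[4] == "X" else 0.0


print(best_move(EMPTY_BOARD, "X", prefer_center))  # prints: 4
```

The sketch does not check whether the game is already over; a full player would stop generating moves once one side has three in a row.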
Evaluation function
• An evaluation function, or static evaluator, estimates
how good a position is
• The zero-sum assumption lets us use a single evaluation
function to describe the value of a position:
– f(n) >> 0: position n is good for me and bad for the opponent
– f(n) << 0: position n is bad for me and good for the opponent
– f(n) near 0: position n is neutral (roughly even)
– f(n) = +infinity: I win
– f(n) = -infinity: the opponent wins
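As an illustration, here is a sketch of one common tic-tac-toe evaluator: the number of lines still open for me minus the number still open for the opponent, with infinite values for decided games. The slides do not say which evaluator the lecture uses, so this choice and the names `LINES`, `winner`, and `evaluate` are assumptions.

```python
import math

# The eight winning lines of a 3x3 board, squares indexed 0-8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board: str):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def evaluate(board: str, me: str = "X", opponent: str = "O") -> float:
    """f(n) > 0 favors `me`, f(n) < 0 favors `opponent`."""
    won = winner(board)
    if won == me:
        return math.inf
    if won == opponent:
        return -math.inf
    # A line is "open" for a player if the other player has no mark on it.
    open_for_me = sum(all(board[i] != opponent for i in line) for line in LINES)
    open_for_opponent = sum(all(board[i] != me for i in line) for line in LINES)
    return open_for_me - open_for_opponent


print(evaluate("    X    "))  # X alone in the center: 8 - 4 = 4
```

Because the game is zero-sum, the opponent's view of the same position is simply the negated value.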
First three levels of tic-tac-toe state space reduced
by symmetry
The “most wins” heuristic
Heuristically reduced state
space for tic-tac-toe
Consider this position
We are playing X, and it is now our turn.
X = Computer, O = opponent
Let’s write out all possibilities
X move
Each number represents a position after each legal
move we have.
Now let’s look at their options
O move
Here we are looking at all of the opponent responses
to the first possible move we could make.
Now let’s look at their options
Opponent options after our second
possibility. Not good again…
Now let’s look at their options
Struggling…
More interesting case
Now they don’t have a way to win on their next
move. So now we have to consider our responses to
their responses.
Our options
We have a win for any move they make.
So the original position in purple is an X win.
Finishing it up…
They win again if we take our fifth move.
Summary of the Analysis
So which move should we make? ;-)
The game of Nim
• The game starts with a single pile of sticks
• On each turn, a player must split one pile into 2 piles;
the two new piles must be of different sizes and neither
may be empty
A variant of the game nim
• A number of tokens are placed on a table between the
two opponents
• A move consists of dividing a pile of tokens into two
nonempty piles of different sizes
• For example, 6 tokens can be divided into piles of 5 and
1 or 4 and 2, but not 3 and 3
• The first player who can no longer make a move loses
the game
• For a reasonable number of tokens, the state space can
be exhaustively searched
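The move rule and the exhaustive search can be sketched as below. States are stored as tuples of pile sizes in descending order, which is a representation chosen for this example; the names `successors` and `reachable_states` are likewise illustrative.

```python
from typing import List, Set, Tuple

State = Tuple[int, ...]  # pile sizes, sorted in descending order


def successors(state: State) -> List[State]:
    """States reachable by splitting one pile into two unequal, nonempty piles."""
    result: Set[State] = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # `small` runs over 1 .. (pile - 1) // 2, so small < pile - small.
        for small in range(1, (pile - 1) // 2 + 1):
            piles = rest + (pile - small, small)
            result.add(tuple(sorted(piles, reverse=True)))
    return sorted(result, reverse=True)


def reachable_states(start: State) -> Set[State]:
    """Exhaustively collect every state reachable from `start`."""
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


print(successors((6,)))              # prints: [(5, 1), (4, 2)]
print(len(reachable_states((7,))))   # prints: 14
```

Storing piles in sorted order means that a state reached along different paths is recognized as the same state.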
State space for a variant of nim
• Note that state 4-2-1 is repeated. We can simplify the
structure by drawing a general graph.
State space for a variant of nim
Search techniques for 2-person games
• The search tree is slightly different: It is a
two-ply tree where levels alternate between players
• Canonically, the first level is “us” or the player whom we
want to win.
• Each final position is assigned a payoff:
– win (say, 1)
– lose (say, -1)
– draw (say, 0)
• We would like to maximize the payoff for the first player,
hence the names MAX & MINIMAX
Minimax
• In 1944, John von Neumann described a search
algorithm for games, known as Minimax, that
maximizes the player's position and minimizes
the opponent's position
The search algorithm
• The root of the tree is the current board position; it is
MAX's turn to play
• MAX generates as much of the tree as it can and picks the
best move, assuming that MIN will also choose the best
moves for herself.
• This is the Minimax algorithm which was invented by
Von Neumann and Morgenstern in 1944, as part of
game theory.
• The same problem with other search trees: the tree
grows very quickly, exhaustive search is usually
impossible.
Special technique
• MAX generates the full search tree (up to the leaves or
terminal nodes or final game positions) and chooses
the best one:
win or tie
• To choose the best move, values are propagated
upward from the leaves:
– MAX chooses the maximum
– MIN chooses the minimum
• This assumes that the full tree is not prohibitively big
• It also assumes that the final positions are easily
identifiable
• We can make these assumptions for now, so let’s look
at an example
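The propagation rule can be written as a short recursive sketch. The nested-list tree encoding (an integer is a leaf payoff, a list is an internal node) and the names `minimax` and `best_child` are assumptions for this example, and the tree below is a small hypothetical one.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]  # a leaf payoff, or a list of child subtrees


def minimax(node: Tree, maximizing: bool = True) -> int:
    """Back up leaf values: MAX takes the maximum, MIN the minimum."""
    if isinstance(node, int):
        return node  # terminal position
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_child(root: List[Tree]) -> int:
    """Index of the move MAX should play at the root."""
    values = [minimax(child, maximizing=False) for child in root]
    return values.index(max(values))


# Hypothetical two-level tree: MAX at the root, MIN below, then leaves.
tree = [[2, 7], [1, 8]]
print(minimax(tree))     # prints: 2  (max of min(2, 7) and min(1, 8))
print(best_child(tree))  # prints: 0
```

Note that MAX does not go for the leaf worth 8: MIN would never allow it, so the left move, which guarantees 2, is the better choice.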
Figure: example game tree with alternating MAX and MIN levels (legend: terminal position, agent, opponent)
Figure: two-level example with static evaluator values 2, 7, 1, 8 at the leaves, showing the path chosen by Minimax
Minimax applied to a hypothetical
state space (Fig. 4.15)
Assumptions
• MIN moves first
• Evaluation function:
– 0: MIN wins
– 1: MAX wins
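Under these assumptions, minimax for the nim variant with 7 tokens can be sketched as follows. The tuple representation of states and the names `successors` and `value` are choices made for this example; the rule that the player who cannot move loses comes from the game description.

```python
from functools import lru_cache
from typing import List, Tuple

State = Tuple[int, ...]  # pile sizes, sorted in descending order


def successors(state: State) -> List[State]:
    """States reachable by splitting one pile into two unequal, nonempty piles."""
    result = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        for small in range(1, (pile - 1) // 2 + 1):
            piles = rest + (pile - small, small)
            result.add(tuple(sorted(piles, reverse=True)))
    return sorted(result, reverse=True)


@lru_cache(maxsize=None)
def value(state: State, min_to_move: bool) -> int:
    """Minimax value of `state`: 0 means MIN wins, 1 means MAX wins."""
    children = successors(state)
    if not children:
        # The player to move cannot split any pile and therefore loses.
        return 1 if min_to_move else 0
    child_values = [value(child, not min_to_move) for child in children]
    return min(child_values) if min_to_move else max(child_values)


print(value((7,), True))  # prints: 1
```

The root value of 1 means that MAX can force a win from 7 tokens when MIN moves first: every first move by MIN (6-1, 5-2, or 4-3) leads to a state with value 1.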
Complete State Space for Nim
Figure: state space for 7 tokens, with levels alternating MIN and MAX from the root and each state labeled with its minimax value (0 or 1). States by level: 7; 6-1, 5-2, 4-3; 5-1-1, 4-2-1, 3-2-2, 3-3-1; 4-1-1-1, 3-2-1-1, 2-2-2-1; 3-1-1-1-1, 2-2-1-1-1; 2-1-1-1-1-1
Minimax for Tic Tac Toe
• In our tic tac toe example,
– player 1 is 'X'
– player 2 is 'O'
• The only three scores we will have are
– +1 for a win by 'X',
– -1 for a win by 'O',
– 0 for a draw.
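With this scoring, a full minimax search of tic-tac-toe fits in a few lines. The sketch below assumes a nine-character string board ('X', 'O', or a space per square, row by row) and uses memoization to avoid re-searching repeated positions; both are choices made for this example, as are the names `LINES`, `winner`, and `minimax`.

```python
from functools import lru_cache

# The eight winning lines of a 3x3 board, squares indexed 0-8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board: str):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board: str, player: str) -> int:
    """Value of `board` with `player` to move: +1 X wins, -1 O wins, 0 draw."""
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if " " not in board:
        return 0  # board full, nobody won
    other = "O" if player == "X" else "X"
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == " "]
    # 'X' is MAX and 'O' is MIN.
    return max(values) if player == "X" else min(values)


print(minimax(" " * 9, "X"))  # prints: 0
```

The value 0 for the empty board reflects the well-known fact that tic-tac-toe is a draw when both sides play perfectly.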
Minimax for Tic Tac Toe (ex 1)
Figure: game tree with alternating MAX and MIN levels
Minimax for Tic Tac Toe (ex 2)
Figure: game tree with MAX and MIN levels and backed-up values of +1, 0, and -1
Special technique
• Use alpha-beta pruning
• Basic idea: if a portion of the tree is obviously
good (bad) don’t explore further to see how
terrific (awful) it is
• Remember that the values are propagated
upward. Highest value is selected at MAX’s
level, lowest value is selected at MIN’s level
• Call the values at MAX levels α values, and the
values at MIN levels β values
The rules
• Search can be stopped below any MIN
node having a beta value less than or
equal to the alpha value of any of its MAX
ancestors (MIN node: β ≤ α)
• Search can be stopped below any MAX
node having an alpha value greater than
or equal to the beta value of any of its MIN
ancestors (MAX node: α ≥ β)
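The two rules translate into the following sketch. The nested-list tree encoding (an integer is a leaf, a list is an internal node), the function name `alphabeta`, and the optional `visited` list used to record which leaves were examined are assumptions for this example; the trees are small hypothetical ones.

```python
import math
from typing import List, Optional, Union

Tree = Union[int, List["Tree"]]  # a leaf value, or a list of child subtrees


def alphabeta(node: Tree, maximizing: bool = True,
              alpha: float = -math.inf, beta: float = math.inf,
              visited: Optional[List[int]] = None) -> float:
    """Minimax value of `node`, skipping subtrees that cannot affect it."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)  # record that this leaf was examined
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # MAX node: alpha >= beta of a MIN ancestor
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if beta <= alpha:  # MIN node: beta <= alpha of a MAX ancestor
            break
    return value


# MAX at the root: the second MIN node is abandoned after its first leaf (2).
seen: List[int] = []
print(alphabeta([[3, 4, 5], [2, 9, 9]], True, visited=seen))  # prints: 3
print(seen)                                                   # prints: [3, 4, 5, 2]

# MIN at the root: the second MAX node is abandoned after its first leaf (6).
seen = []
print(alphabeta([[3, 4, 5], [6, 0, 0]], False, visited=seen))  # prints: 5
print(seen)                                                    # prints: [3, 4, 5, 6]
```

In both runs the result equals the plain minimax value; pruning changes only how many leaves are examined, not the answer.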
Example with MAX
Figure: MAX root (α≥3) over two MIN nodes (β=3 and β≤2), with leaves 3, 4, 5, 2; cutoff label "MAX node α>β".
Figure annotations:
• (Some of) these still need to be looked at
• As soon as the node with value 2 is generated, we
know that the beta value will be less than 3, so we
don't need to generate these nodes (and the subtree
below them)
Example with MIN
Figure: MIN root (β≤5) over two MAX nodes (α=5 and α≥6), with leaves 3, 4, 5, 6; cutoff label "MIN node β<α".
Figure annotations:
• (Some of) these still need to be looked at
• As soon as the node with value 6 is generated, we
know that the alpha value will be larger than 5, so we
don't need to generate these nodes (and the subtree
below them)
Alpha-beta Pruning
Figure: alpha-beta pruning traced step by step on a tree with root A (MAX), MIN nodes B and C, MAX nodes D, E, F, G, and leaves H = 6, I = 5, J = 8, K, L = 2, M = 1 (legend: agent, opponent)
• Step 1: D = 6 (from H = 6 and I = 5), so B ≤ 6; J = 8 gives E ≥ 8
• Step 2: B = 6, so A ≥ 6; F = 2 (from L = 2 and M = 1), so C ≤ 2
• Step 3: C = 2, with A ≥ 6
• Step 4: A = 6; the figure marks an alpha cutoff at E (K is not examined) and a beta cutoff at C (G is not examined)
Alpha-beta pruning
Figure: annotated tree with cutoff labels (MIN node β<α, MAX node α>β) and node bounds α≥3, β≤3, β≤0, β≤2, α=3, α≥5, α=0, α=2