Questions - The MIT Press

AA228/CS238 Decision Making under Uncertainty
Questions
Probabilistic Models
Decision Problems
Sequential Problems
Model Uncertainty
State Uncertainty

Probabilistic Models
Problem
Consider the definition of conditional probability:
$$P(A, B) = P(A \mid B)\,P(B)$$
Can you come up with a simple explanation in words as to why this works? Use similar reasoning to come up with an expression for $P(A, B \mid C)$.
Problem
[…] of women at age forty who participate in routine screening have breast cancer. […] of women with breast cancer will get positive mammographies. […] of women without breast cancer will also get positive mammographies. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer?
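The reasoning behind the screening question is Bayes' rule combined with the law of total probability. The sketch below shows that calculation in general form. The function name `posterior` and the three input values are placeholders chosen for illustration; they are assumptions, not the figures from the problem.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Return P(condition | positive test) using Bayes' rule."""
    # Law of total probability:
    # P(+) = P(+ | condition) P(condition) + P(+ | no condition) P(no condition)
    evidence = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    return true_positive_rate * prior / evidence


# Placeholder inputs, chosen only to exercise the function.
p = posterior(prior=0.02, true_positive_rate=0.9, false_positive_rate=0.1)
print(f"P(condition | positive) = {p:.3f}")
```

Even with a fairly accurate test, a small prior keeps the posterior well below the true positive rate, which is the effect the problem is designed to expose.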
Problem
There is a […] chance there is both life and water on Mars, a […] chance there is life but no water, and a […] chance there is no life and no water. What is the probability that there is life on Mars given that there is water?
http://web.stanford.edu/class/aa228/questions.md
Problem
In the textbook it is stated that if all variables in a Bayesian network are binary, the probability distribution over some variable $X$ with $n$ parents $\mathrm{Pa}_X$ can be represented by $2^n$ independent parameters.
Imagine that $X$ is a binary variable with two parent variables that are not necessarily binary. Imagine that the first parent can assume three different values, and that the second can assume two values. How many independent parameters are needed to represent this distribution, $P(X \mid \mathrm{Pa}_X)$? How many would it be if you added another parent that can assume four different values?
Now assume that $X$ itself is not binary but can assume three different values, and still has the three parents as specified above. How many values are needed to represent this distribution? Can you come up with a general rule for the number of independent parameters needed to represent a distribution over some variable $X$ with parents $\mathrm{Pa}_X$?
Problem
[Figure: Bayes net]
Given the displayed Bayes net, determine whether the following are true or false:
$(B \perp D \mid A)$
$(B \perp D \mid C)$
$(B \perp D \mid E)$
$(B \perp C \mid A)$
Problem
It is known that […]-foot blue whales consume on average […] kg of krill per day. […]-footers consume on average […] kg of krill per day. Assume that the mean daily krill consumption varies linearly with whale length and that the daily consumption for a given whale follows a Gaussian distribution with a standard deviation of […] kg of krill per day. Define the linear Gaussian distribution, $P(k \mid l)$, relating the rate of krill consumption $k$ to whale length $l$.
Problem
Assuming a hidden Markov model with states $s_{0:t}$ and observations $o_{0:t}$, prove the following:
$$P(s_t \mid o_{0:t}) \propto P(o_t \mid s_t, o_{0:t-1})\,P(s_t \mid o_{0:t-1})$$
Starting from the previous equation, prove the following:
$$P(s_t \mid o_{0:t}) \propto P(o_t \mid s_t) \sum_{s_{t-1}} P(s_t \mid s_{t-1})\,P(s_{t-1} \mid o_{0:t-1})$$
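The second relation is the recursive filtering update for a discrete hidden Markov model. A minimal sketch of one step follows, assuming a two-state model. The transition matrix, the observation likelihoods, and the name `filter_step` are made up for illustration.

```python
def filter_step(belief, transition, observation_likelihood):
    """One recursive Bayesian update for a discrete HMM.

    belief[i]                 = P(s_{t-1} = i | o_{0:t-1})
    transition[i][j]          = P(s_t = j | s_{t-1} = i)
    observation_likelihood[j] = P(o_t | s_t = j) for the observed o_t
    """
    n = len(belief)
    # Prediction: sum over s_{t-1} of P(s_t | s_{t-1}) P(s_{t-1} | o_{0:t-1}).
    predicted = [sum(transition[i][j] * belief[i] for i in range(n))
                 for j in range(n)]
    # Correction: weight by P(o_t | s_t), then normalize (the proportionality).
    unnormalized = [observation_likelihood[j] * predicted[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical two-state model.
T = [[0.7, 0.3], [0.2, 0.8]]
likelihood = [0.9, 0.1]
b = filter_step([0.5, 0.5], T, likelihood)
```

The normalization in the last step is what lets the derivation use a proportionality sign instead of carrying $P(o_t \mid o_{0:t-1})$ explicitly.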
Problem
What is the Markov blanket for some node $o_t$ of the hidden Markov model below? Explain why this is so.
[Figure: hidden Markov model (images/hmm.png)]
Problem
One possible representation of the law of total probability is:
$$P(A) = \sum_{B \in B_{\text{set}}} P(A \mid B)\,P(B)$$
where $B_{\text{set}}$ is a set of mutually exclusive and exhaustive propositions. Can you find a similar expression for $P(A \mid C)$?
Problem
What is a topological sort? Why is it important to perform a topological sort before sampling from a Bayesian network? Does a topological sort always exist? Is a topological sort always unique?
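To make the idea concrete, here is a sketch of one common way to compute a topological order (Kahn's algorithm). The four-node graph and the function name `topological_sort` are hypothetical and are not taken from the course figures.

```python
from collections import deque


def topological_sort(parents):
    """Return one topological order of a DAG given as {node: [parents]}."""
    remaining = {node: len(ps) for node, ps in parents.items()}
    children = {node: [] for node in parents}
    for node, ps in parents.items():
        for p in ps:
            children[p].append(node)
    # Nodes with no parents can be placed first.
    ready = deque(sorted(n for n, k in remaining.items() if k == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:  # all parents already placed
                ready.append(child)
    if len(order) != len(parents):
        raise ValueError("graph has a cycle, so no topological order exists")
    return order


# Hypothetical DAG: A -> B, A -> C, B -> D, C -> D.
dag = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
order = topological_sort(dag)
```

Sampling a Bayesian network in such an order guarantees that every node's parents already have values when the node itself is sampled.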
Problem
Formulate the following SAT problem as a Bayesian network:
$$F(x_1, x_2, x_3, x_4) = (x_1 \lor x_2 \lor x_3) \land (\lnot x_1 \lor x_2 \lor \lnot x_4) \land (x_2 \lor x_3 \lor x_4)$$
This shows that inference in Bayesian networks is at least as hard as SAT. If SAT is NP-complete, what does that make inference in Bayesian networks?
Problem
What are the differences between inference, parameter learning, and structure learning? What are you looking for in each case, and what is assumed to be known? When might you use each of them?
Problem
What is a classification task? Assume you are classifying using a naive Bayes model. What assumptions are you making? Draw a naive Bayes model using the compact representation shown in class. What is the name of this kind of representation?
Problem
What is an important drawback of maximum likelihood estimation?
Problem
Bayesian parameter learning estimates a posterior $p(\theta \mid D)$ for the parameters $\theta$ given the data $D$. What are some of its advantages and drawbacks?
Problem
What is the gamma function $\Gamma$? What is $\Gamma(5)$?
Problem
Imagine that you want to estimate $\theta$, the probability that one baseball team (call them Team A) beats another team (call them Team B). Assume you know nothing else about the two teams. What is a reasonable prior distribution?
Now imagine that you know the two teams well and are confident that they are evenly matched. Would a prior of Beta([…]) be better than a Beta([…]) in this case? If so, why?
Now imagine that you know that Team A is more likely to win (maybe they are the Warriors). What kind of prior might you use in this case? Imagine that the teams are going to play many games against each other. What does this mean for the prior you select?
Problem
Consider the two Beta distributions Beta([…]) and Beta([…]). Beta([…]) gives much more weight to $\theta = 0.5$. Can you explain intuitively why this is so?
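One way to build intuition is to evaluate the Beta density directly. The sketch below compares two symmetric Beta distributions at $\theta = 0.5$. The parameter pairs (2, 2) and (10, 10) are assumptions chosen for illustration and are not necessarily the pair in the problem. The helper `beta_pdf` is a hypothetical name.

```python
from math import gamma


def beta_pdf(theta, a, b):
    """Density of Beta(a, b) at theta, written with the gamma function."""
    normalizer = gamma(a + b) / (gamma(a) * gamma(b))
    return normalizer * theta ** (a - 1) * (1 - theta) ** (b - 1)


# Same mean (0.5) but different pseudocount totals.
weak = beta_pdf(0.5, 2, 2)       # few pseudo-observations
strong = beta_pdf(0.5, 10, 10)   # many pseudo-observations
print(weak, strong)
```

Reading the parameters as pseudocounts of wins and losses, larger counts with the same ratio concentrate the density around that ratio.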
Problem
Suppose you have a lot of data and are trying to learn the structure of a Bayesian network that fits this data. Consider two arbitrary Bayesian network designs. One is relatively sparse, whereas the other has many connections between its nodes.
Imagine that your data consists of very few samples. Which Bayesian network would you expect to achieve a better Bayesian score? How would this change if there were many samples?
Problem
How many members are there in the Markov equivalence class represented by the partially directed graph shown below?
[Figure: partially directed graph (images/equivalence_class.png)]
Problem
Gibbs sampling offers a fast way to produce samples with which to estimate a distribution. What are some downsides of Gibbs sampling and how are they handled?
Problem
What is a topological sorting of the nodes shown in Question […] (recreated below)?
[Figure: Bayes net, recreated from the earlier question]

Decision Problems
Problem
What does it mean to be rational?
Problem
Explain the value of information in words. What is the value of information of an observation that does not change the optimal action? Imagine that the optimal action changes after an observation. What does this say about the value of information of that observation?
Problem
The prisoner's dilemma is an example of a game with a dominant strategy equilibrium. Imagine that the game is modified so that if one prisoner testifies, the other only gets four years of prison instead of ten. Does this game still have a dominant strategy equilibrium? Are there any other equilibria?
Problem
Explain why the traveler's dilemma has a unique Nash equilibrium of […]. Draw the utility matrix and use it to show the equilibrium.

Sequential Problems
Problem
What is the Markov assumption? What does a Markov decision process consist of? What is a stationary MDP? Draw a compact representation of a stationary MDP.
Problem
What is the purpose of the discount factor in infinite horizon problems? What is an alternative to using a discount factor in infinite horizon problems? What effect does a small discount factor have? What about a large one? When is one preferable to the other?
Problem
Does the optimal policy have to be unique? Does the optimal value for each state have to be unique?
Problem
What is the Bellman equation? How does it simplify if transitions are deterministic?
Problem
The policy evaluation equation in matrix form is
$$U^\pi = (I - \gamma T^\pi)^{-1} R^\pi$$
where $U^\pi$ and $R^\pi$ are the utility and reward functions represented as vectors. What is the meaning of $T^\pi$? How does this relate Markov decision processes to Markov chains?
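The matrix form can be evaluated numerically by solving the linear system $(I - \gamma T^\pi) U^\pi = R^\pi$ instead of forming the inverse. The sketch below does this in plain Python for a made-up two-state chain. The matrices, the discount factor, and the helper names are all assumptions for illustration.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def evaluate_policy(T_pi, R_pi, gamma):
    """Return U^pi satisfying (I - gamma T^pi) U^pi = R^pi."""
    n = len(R_pi)
    A = [[(1.0 if i == j else 0.0) - gamma * T_pi[i][j] for j in range(n)]
         for i in range(n)]
    return solve_linear_system(A, R_pi)


# Hypothetical two-state chain induced by some fixed policy.
T_pi = [[0.9, 0.1], [0.5, 0.5]]   # T_pi[i][j] = P(s' = j | s = i, pi(i))
R_pi = [1.0, 0.0]
U = evaluate_policy(T_pi, R_pi, gamma=0.9)
```

Once the policy is fixed, the rows of `T_pi` no longer depend on an action choice, which is the sense in which the process behaves like a Markov chain with rewards.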
Problem
What is dynamic programming? Can you give an example? Why is dynamic programming more efficient than brute force methods for solving MDPs?
Problem
Can you explain what policy iteration and value iteration are? What are their similarities and differences?
Problem
What is the difference between open and closed-loop planning?
Problem
Consider the simple gridworld shown below. An agent in this world can move to the cell to its immediate left or to the cell to its immediate right, and the transitions are deterministic. Moving left in $s_1$ gives a reward of […] and terminates the game. Moving right in $s_4$ does nothing. Perform value iteration and determine the utility of being in each state assuming a discount factor of […].
[Figure: simple gridworld (images/simplegrid.png)]
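A minimal sketch of synchronous value iteration on a four-cell line world like the one described. The terminal reward of 1 and the discount factor of 0.5 are placeholders and not the values from the problem. The sketch also assumes every other move has zero reward.

```python
def value_iteration(reward, gamma, num_states=4, tol=1e-9):
    """Synchronous value iteration on a 1-D gridworld.

    Index 0 is s1 (leftmost). Moving left from s1 collects `reward` and ends
    the episode; moving right from the last state leaves the agent in place.
    """
    U = [0.0] * num_states
    while True:
        new_U = []
        for s in range(num_states):
            # Left: terminal reward from s1, otherwise discounted neighbor.
            left = reward if s == 0 else gamma * U[s - 1]
            # Right: the last state maps to itself.
            right = gamma * U[min(s + 1, num_states - 1)]
            new_U.append(max(left, right))
        if max(abs(a - b) for a, b in zip(new_U, U)) < tol:
            return new_U
        U = new_U  # every state was backed up from the old estimates


# Placeholder reward and discount factor.
utilities = value_iteration(reward=1.0, gamma=0.5)
print(utilities)
```

Each step away from $s_1$ multiplies the utility by the discount factor, because the reward is only collected after walking back to the left edge.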
Problem
How does asynchronous value iteration differ from standard value iteration? What is the importance of the state ordering?
Apply Gauss-Seidel value iteration to the simple gridworld from the previous problem. First, use a state ordering of $s_1, s_2, s_3, s_4$. Then use an ordering of $s_4, s_3, s_2, s_1$. How many iterations did each ordering take to converge?
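The effect of state ordering can be explored with an in-place variant of value iteration. The sketch below counts full sweeps for a given ordering on the same kind of four-cell line world. The reward of 1 and discount factor of 0.5 are again placeholders, the function name is hypothetical, and the sweep count includes the final sweep that detects no change, which is one of several reasonable conventions.

```python
def gauss_seidel_value_iteration(order, reward, gamma, num_states=4, tol=1e-9):
    """In-place (Gauss-Seidel) value iteration; returns (utilities, sweeps)."""
    U = [0.0] * num_states
    sweeps = 0
    while True:
        sweeps += 1
        largest_change = 0.0
        for s in order:
            left = reward if s == 0 else gamma * U[s - 1]
            right = gamma * U[min(s + 1, num_states - 1)]
            updated = max(left, right)
            largest_change = max(largest_change, abs(updated - U[s]))
            U[s] = updated  # visible to states visited later in this sweep
        if largest_change < tol:
            return U, sweeps


# Index 0 is s1, index 3 is s4; reward and discount are placeholders.
U_fwd, sweeps_fwd = gauss_seidel_value_iteration([0, 1, 2, 3], 1.0, 0.5)
U_bwd, sweeps_bwd = gauss_seidel_value_iteration([3, 2, 1, 0], 1.0, 0.5)
print(sweeps_fwd, sweeps_bwd)
```

Both orderings reach the same utilities. What differs is how far the reward information propagates within a single sweep.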
Problem
In what cases would you prefer to use dynamic programming? Approximate dynamic programming? Online methods?

Model Uncertainty
Problem
For what types of problems do we use reinforcement learning? What are the two main approaches?
Problem
Why is the concept of exploration versus exploitation so important in reinforcement learning?
What is a multi-armed bandit? Describe the various parameters involved in a multi-armed bandit problem.
Imagine you have a two-armed bandit and are convinced that one of the levers yields a payout of […] with probability […]. You have never pulled the other lever, and are unsure if it has any payout. Relate this to the problem of exploration and exploitation.
Problem
Suppose we have a two-armed bandit. Our estimate of the payout rate of the first lever is 0.7, and our estimate of the payout rate for the second lever is 0.6. That is, $\rho_1 = 0.7$ and $\rho_2 = 0.6$. Our […] confidence intervals for $\theta_1$ and $\theta_2$ are […] and […], respectively.
What is the difference between $\theta_i$ and $\rho_i$? Suppose you used an $\epsilon$-greedy strategy with $\epsilon = 0.5$. How might you decide what lever to pull? Suppose you used an interval exploration strategy with […] confidence intervals. What lever would you pull?
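The two strategies named in this problem can be sketched as small selection rules. The point estimates come from the problem text. The upper confidence bounds are made up, since the intervals did not survive in this copy, and the function names are hypothetical.

```python
import random


def epsilon_greedy(estimates, epsilon, rng=random):
    """With probability epsilon pull a random lever, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda i: estimates[i])


def interval_exploration(upper_bounds):
    """Pull the lever whose confidence interval has the highest upper bound."""
    return max(range(len(upper_bounds)), key=lambda i: upper_bounds[i])


rho = [0.7, 0.6]                    # point estimates from the problem
hypothetical_upper = [0.75, 0.90]   # made-up bounds, NOT the problem's intervals

lever = epsilon_greedy(rho, epsilon=0.5)        # random half of the time
optimistic = interval_exploration(hypothetical_upper)
```

With these made-up bounds, interval exploration prefers the second lever even though its point estimate is lower, because its interval is wider and reaches higher.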
Problem
What are Q-values and how do they differ from utility values $U$? Imagine you have a model of the reward and transition functions. If you were to run a value iteration using the Q-values instead of the utility values $U$, what would be the update equation?
Problem
What is the central equation behind incremental estimation? Identify the temporal difference error and the learning rate. Imagine you have an estimate of some random variable $X$. Imagine that this estimate is $\hat{x} = 3$. If the learning rate is […], what happens to your estimate after observing a new sample $x = 7$? What happens if the learning rate is […]? Comment on the effect that learning rate has on incremental estimation.
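A common form of the incremental estimation update is $\hat{x} \leftarrow \hat{x} + \alpha\,(x - \hat{x})$. The sketch below applies it to the estimate and sample from the problem. The two learning rates are placeholders, since the original values are missing from this copy.

```python
def incremental_update(estimate, sample, learning_rate):
    """Move the estimate toward the sample by a fraction of the error."""
    temporal_difference_error = sample - estimate
    return estimate + learning_rate * temporal_difference_error


# Estimate 3 and sample 7 come from the problem; the rates are placeholders.
small_step = incremental_update(3.0, 7.0, learning_rate=0.25)
full_step = incremental_update(3.0, 7.0, learning_rate=1.0)
print(small_step, full_step)
```

A learning rate of 1 discards the old estimate entirely, and a rate of 0 ignores the new sample.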
Problem
What are the similarities and differences between Q-learning and Sarsa?
Problem
Use Q-values, the Bellman equation, and the incremental update equation to derive the update equations for Q-learning and Sarsa.
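For comparison with your derivation, here is a sketch of the two updates in their usual tabular form. The table layout (a dictionary keyed by state-action pairs), the function names, and the numbers in the example are assumptions for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha, gamma):
    """Off-policy target: bootstrap from the best next action."""
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy target: bootstrap from the action actually taken next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def example_table():
    Q = defaultdict(float)  # unseen pairs default to 0
    Q[(1, "L")] = 2.0
    Q[(1, "R")] = 4.0
    return Q


# Same transition (s=0, a="R", r=1, s'=1) under both rules.
Q_ql = example_table()
q_learning_update(Q_ql, 0, "R", 1.0, 1, ["L", "R"], alpha=0.5, gamma=0.9)

Q_sarsa = example_table()
sarsa_update(Q_sarsa, 0, "R", 1.0, 1, "L", alpha=0.5, gamma=0.9)
```

The only difference between the two functions is the bootstrap term in `target`, which is where the on-policy versus off-policy distinction lives.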
Problem
What is the difference between Sarsa and Sarsa($\lambda$)? What types of problems can be solved more efficiently using eligibility traces?
Problem
What are the differences between model-based reinforcement learning and model-free reinforcement learning in terms of the quality of the learned policy and computational cost?

State Uncertainty
Problem
What is a POMDP and how does it differ from an MDP? Draw the structure of a POMDP and compare it to that of an MDP.
Problem
Examine the two gridworlds shown below. In the left-most gridworld, you know the position of the agent, represented by the red square. In the right-most gridworld, you only have a probability distribution over possible states. How might you represent the state for each case? Use this to explain why POMDPs are sometimes called belief-state MDPs and are generally intractable.
[Figure: two gridworlds (images/twogrid.png)]
Problem
A key to solving POMDPs is the ability to maintain a belief, or probability distribution, over states. What methods can be used to update beliefs? When might one be preferred over the others?
Problem
Derive the following equation for a discrete state filter:
$$b'(s') \propto O(o \mid s', a) \sum_{s} T(s' \mid s, a)\,b(s)$$
from the definition of a belief update $b'(s') = P(s' \mid o, a, b)$.
Problem
Why would you use a particle filter with rejection? Why would you use a particle filter without rejection? Why is it better to use a larger number of particles in your particle filter? What is particle deprivation, and how can you prevent it?
Problem
Work through the crying baby example presented in the textbook. Work through the math, updating your belief with the actions and observations given. Verify that your numbers match those in the text.
Problem
In MDPs, the policy is a mapping from states to actions. What does a POMDP policy look like? How do you use this policy to find the utility of a belief state?
Problem
Imagine you have an exam tomorrow, but there is a non-negligible chance the professor and TAs forgot about the exam. You have a choice: you can study, or you can take the evening off. If you study and there is an exam, you get a reward of […]. If you study and there is no exam, you receive no reward. If you take the evening off and there is no exam, the enjoyment of not studying gives you a reward of […]. But if you take the evening off and there is an exam, the certain F and associated stress give you a reward of […].
Write down the alpha vectors for this problem. How sure should you be that there will be no exam before you take the evening off? Imagine you have a third option, which is to drop out of school and live in the wilderness. This simple lifestyle would give you a reward of […], regardless of whether the exam takes place or not. What can you say about this option? Would you ever take it?
Problem
Imagine that you have already solved for the policy of a 3-state POMDP, and you have the following alpha vectors:
$$\begin{pmatrix} 300 \\ 100 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 167 \\ 10 \\ 100 \end{pmatrix}, \quad \begin{pmatrix} 27 \\ 50 \\ 50 \end{pmatrix}$$
The first and third alpha vectors correspond to action […], and the second alpha vector corresponds to action […]. Is this even a valid policy? Can you have multiple alpha vectors per action? If the policy is valid, determine the action you would take given you have the following belief: […] chance in state […], […] chance in state […], […] chance in state […].
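Evaluating an alpha-vector policy at a belief amounts to taking dot products and picking the largest. The sketch below uses the three alpha vectors from the problem. The belief is hypothetical, because the original belief values are missing from this copy, and the function returns the index of the winning alpha vector rather than an action label.

```python
def best_alpha_vector(belief, alpha_vectors):
    """Return (index, value) of the alpha vector maximizing alpha . belief."""
    values = [sum(b * a for b, a in zip(belief, alpha))
              for alpha in alpha_vectors]
    best = max(range(len(values)), key=lambda i: values[i])
    return best, values[best]


alpha_vectors = [(300, 100, 0), (167, 10, 100), (27, 50, 50)]

# Hypothetical belief over the three states (NOT the one from the problem).
belief = (0.2, 0.3, 0.5)
index, utility = best_alpha_vector(belief, alpha_vectors)
print(index, utility)
```

The action to take is the one associated with the winning vector, and the maximum dot product is the utility of the belief under this policy.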
Problem
What does it mean to solve a POMDP offline versus solving it online? What are the advantages and disadvantages of each? How do QMDP, FIB, and point-based value iteration work? What are the advantages and disadvantages of each?
Problem
The update equation for QMDP is shown below.
$$\alpha_a^{(k+1)}(s) = R(s, a) + \gamma \sum_{s'} T(s' \mid s, a) \max_{a'} \alpha_{a'}^{(k)}(s')$$
How many operations does each iteration take? Compare this with the number of operations required per iteration for FIB, whose update equation is shown below.
$$\alpha_a^{(k+1)}(s) = R(s, a) + \gamma \sum_{o} \max_{a'} \sum_{s'} O(o \mid s', a)\,T(s' \mid s, a)\,\alpha_{a'}^{(k)}(s')$$
Write code that applies both QMDP and FIB to the crying baby problem.
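As a starting point, the sketch below implements both updates for a two-state crying baby model. All model parameters here are assumptions recalled from the usual textbook formulation and should be checked against the text: a fed baby is never hungry, an unfed sated baby becomes hungry with probability 0.1, a hungry baby cries with probability 0.8 and a sated one with probability 0.1, hunger costs 10, feeding costs 5, and the discount factor is 0.9. State, action, and function names are hypothetical.

```python
STATES = ["hungry", "sated"]
ACTIONS = ["feed", "ignore"]
OBSERVATIONS = ["cry", "quiet"]
GAMMA = 0.9  # assumed discount factor


def T(s2, s, a):
    """Assumed transition model P(s2 | s, a)."""
    if a == "feed":
        p_hungry = 0.0
    elif s == "hungry":
        p_hungry = 1.0
    else:
        p_hungry = 0.1
    return p_hungry if s2 == "hungry" else 1.0 - p_hungry


def O(o, s2, a):
    """Assumed observation model P(o | s2, a)."""
    p_cry = 0.8 if s2 == "hungry" else 0.1
    return p_cry if o == "cry" else 1.0 - p_cry


def R(s, a):
    """Assumed reward: -10 when hungry, additional -5 for feeding."""
    return (-10.0 if s == "hungry" else 0.0) + (-5.0 if a == "feed" else 0.0)


def qmdp_backup(alpha):
    """One QMDP update: max over a' sits inside the sum over s'."""
    return {a: {s: R(s, a) + GAMMA * sum(
                    T(s2, s, a) * max(alpha[a2][s2] for a2 in ACTIONS)
                    for s2 in STATES)
                for s in STATES}
            for a in ACTIONS}


def fib_backup(alpha):
    """One FIB update: max over a' is taken once per observation."""
    new = {}
    for a in ACTIONS:
        new[a] = {}
        for s in STATES:
            total = 0.0
            for o in OBSERVATIONS:
                total += max(
                    sum(O(o, s2, a) * T(s2, s, a) * alpha[a2][s2]
                        for s2 in STATES)
                    for a2 in ACTIONS)
            new[a][s] = R(s, a) + GAMMA * total
    return new


def solve(backup, iterations=200):
    """Apply a backup repeatedly, starting from all-zero alpha vectors."""
    alpha = {a: {s: 0.0 for s in STATES} for a in ACTIONS}
    for _ in range(iterations):
        alpha = backup(alpha)
    return alpha


qmdp = solve(qmdp_backup)
fib = solve(fib_backup)
```

Comparing the nested loops in the two backups gives a direct handle on the operation counts: FIB adds a loop over observations around the maximization.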