Psych 156A/Ling 150: Acquisition of Language II
Lecture 8: Word meaning I

Announcements
Review questions available for word meaning
HW1 returned
Be working on HW2 (due 5/5/16)
Midterm review in class on 4/28/16
Midterm exam during class on 5/3/16

What does “gavagai” mean?
Gavagai!
Rabbit? Mammal? Gray rabbit? Animal? Carrot eater? Vegetarian? Ears? Long ears? Is it gray? Fluffy? What a cutie! Thumping? Hopping? Scurrying?
http://www.thelingspace.com/episode-35
https://www.youtube.com/watch?v=Ci-5dVVvf0U (~2:03-2:32)
Stay! Look! Meal! Rabbit only until eaten! Cheeks and left ear! That’s not a dog!

Same problem the child faces
A little more context…
“Look! There’s a goblin!”
Goblin = ????

The mapping problem
Even if something is explicitly labeled in the input (“Look! There’s a goblin!”), how does the child know what specifically that word refers to? (Is it the head? The feet? The staff? The combination of eyes and hands? Attached goblin parts? …)
Quine (1960): An infinite number of hypotheses about word meaning are possible given the input the child has. That is, the input underspecifies the word’s meaning.

So how do children figure it out? Obviously, they do….
Even by 6 to 9 months, infants recognize many familiar words in their language, like body parts and food items, that is, concrete objects (Bergelson & Swingley 2012, 2015).
Examples: eyes, mouth, hands, …; milk, spoon, juice, cookie, …
By 10 to 13 months old, infants understand words like “all gone”, “hug”, “bye”, and “wet” (Bergelson & Swingley 2013).

Computational problem
“I love my dax and my kleeg.”
dax = ??
kleeg = ??

One solution: Fast mapping
Children begin by making an initial fast mapping between a new word they hear and its likely meaning. They guess, and then modify the guess as more input comes in.
Experimental evidence of fast mapping (Carey & Bartlett 1978, Dollaghan 1985, Mervis & Bertrand 1994, Medina, Snedeker, Trueswell, & Gleitman 2011)
[Figure: a 20-month-old sees familiar objects (ball, bear, kitty) and an [unknown] object, and hears “Can I have the ball?” and “Can I have the zib?”]

A slight problem…
“…not all opportunities for word learning are as uncluttered as the experimental settings in which fast-mapping has been demonstrated. In everyday contexts, there are typically many words, many potential referents, limited cues as to which words go with which referents, and rapid attentional shifts among the many entities in the scene.” - Smith & Yu (2008)

“…many studies find that children even as old as 18 months have difficulty in making the right inferences about the intended referents of novel words…infants as young as 13 or 14 months…can link a name to an object given repeated unambiguous pairings in a single session. Overall, however, these effects are fragile with small experimental variations often leading to no learning.” - Smith & Yu (2008)

Cross-situational learning
Different approach: infants accrue statistical evidence across multiple trials that are individually ambiguous but can be disambiguated when the information from the trials is aggregated.

How does learning work?
Bayesian inference is one way.
In Bayesian inference, the belief in a particular hypothesis (H) (or the probability of that hypothesis), given the data observed (D), can be calculated the following way:

P(H|D) = [P(D|H) * P(H)] / P(D)

P(H|D): Posterior probability of hypothesis H, given that data D have been observed
P(D|H): Likelihood of seeing data D, given that H is true
P(H): Prior probability of hypothesis H
P(D): Probability of observing the data, no matter what hypothesis is true. Calculate by summing over all hypotheses h: P(D) = Σh P(D|h) * P(h)

Written out in full:

P(H|D) = [P(D|H) * P(H)] / [Σh P(D|h) * P(h)]
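The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `posterior` and the dictionary layout are assumptions made for this sketch, not something from the lecture.

```python
def posterior(priors, likelihoods):
    """Return P(h|D) for every hypothesis h, using Bayes' rule.

    priors:      dict mapping hypothesis name -> P(h)
    likelihoods: dict mapping hypothesis name -> P(D|h)
    (Both dictionaries are a hypothetical layout chosen for this sketch.)
    """
    # Numerator for each hypothesis: P(D|h) * P(h)
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    # Denominator P(D): the sum of the numerators over all hypotheses
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the data are impossible under every hypothesis")
    return {h: joint[h] / evidence for h in joint}


# Two equally probable hypotheses, both fully compatible with the data.
result = posterior({"H1": 0.5, "H2": 0.5}, {"H1": 1.0, "H2": 1.0})
print(result)  # {'H1': 0.5, 'H2': 0.5}
```

When every hypothesis predicts the data equally well, the posterior simply equals the prior, which is what the printed result shows.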
Cross-situational learning
Let’s apply Bayesian inference to this scenario.

[Figure: observable data, a single scene in which “ball” is heard; the object images were not preserved.]

Hypothesis 1 (H1): “ball” = [image]
Hypothesis 2 (H2): “ball” = [image]

Since there are two hypotheses in the hypothesis space at this point:
P(H1) = 1/2 = 0.5
P(H2) = 1/2 = 0.5

If this is the only data available:
P(D|H1) = would this be observed if H1 were true? Yes. Therefore P(D|H1) = 1.0.
P(D|H2) = would this be observed if H2 were true? Yes. Therefore P(D|H2) = 1.0.

P(D) = Σh P(D|h) * P(h):
P(D|H1) * P(H1) = 1.0 * 0.5 = 0.5
P(D|H2) * P(H2) = 1.0 * 0.5 = 0.5
so Σh P(D|h) * P(h) = 0.5 + 0.5 = 1.0

Posterior probability that “ball” refers to the H1 object:
P(H1|D) = [P(D|H1) * P(H1)] / P(D) = (1.0 * 0.5) / 1.0 = 0.5

This feels intuitively right, since “ball” could refer to either object, given this data point.

Now suppose the observable data are two scenes, with “ball” said in both.

[Figure: observable data, two scenes; the object images were not preserved.]

Hypothesis 1 (H1): “ball” = [image]
Hypothesis 2 (H2): “ball” = [image]
Hypothesis 3 (H3): “ball” = [image]

Since there are three hypotheses in the hypothesis space at this point:
P(H1) = 1/3 = 0.33
P(H2) = 1/3 = 0.33
P(H3) = 1/3 = 0.33

If this is the only data available:
P(D|H1) = would this be observed if H1 were true? Yes. Therefore P(D|H1) = 1.0.
P(D|H2) = would this be observed if H2 were true? No. (Why would “ball” be said in the second scene?) Therefore P(D|H2) = 0.0.
P(D|H3) = would this be observed if H3 were true? No. (Why would “ball” be said in the first scene?) Therefore P(D|H3) = 0.0.

P(D) = Σh P(D|h) * P(h):
P(D|H1) * P(H1) = 1.0 * 0.33 = 0.33
P(D|H2) * P(H2) = 0.0 * 0.33 = 0.0
P(D|H3) * P(H3) = 0.0 * 0.33 = 0.0
so Σh P(D|h) * P(h) = 0.33 + 0.0 + 0.0 = 0.33

P(H1|D) = [P(D|H1) * P(H1)] / P(D) = (1.0 * 0.33) / 0.33 = 1.0

This feels intuitively right, since “ball” could only refer to the ball, when these two scenes are reconciled with each other.

Smith & Yu (2008)
Yu & Smith (2007): Adults seem able to do cross-situational learning (in experimental setups).
Smith & Yu (2008) ask: Can 12- and 14-month-old infants do this? (Relevant age for beginning word-learning.)

Smith & Yu (2008): Experiment
Infants were trained on six novel words obeying phonotactic probabilities of English: bosa, gasser, manu, colat, kaki, regli
These words were associated with six brightly colored shapes (sadly greyscale in the paper).
[Figure from paper: the six shapes]
[Figure: what the shapes are probably more like]

Training: 30 slides with 2 objects named with two words (total time: 4 min)
Example training slides: “manu” “colat”; “bosa” “manu”
Testing: 12 trials with one word repeated 4 times and 2 objects (correct one and distracter) present
Example test trial: “manu manu manu manu”. Which one does the infant think is manu? That should be the one the infant prefers to look at.

Results: Infants preferentially look at target over distracter, and 14-month-olds looked longer than 12-month-olds. This means they were able to tabulate distributional information across situations.
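To make “tabulating distributional information across situations” concrete, here is a minimal sketch of a word-object co-occurrence count. It is an assumption-laden illustration, not Smith & Yu’s analysis and not a claim about the mechanism infants actually use. The three slides and the object labels (`MANU_OBJ` and so on) are hypothetical.

```python
from collections import Counter

# Hypothetical training slides: (words heard, objects shown).
# Each slide is ambiguous on its own: two words, two objects.
slides = [
    ({"manu", "colat"}, {"MANU_OBJ", "COLAT_OBJ"}),
    ({"manu", "bosa"}, {"MANU_OBJ", "BOSA_OBJ"}),
    ({"colat", "bosa"}, {"COLAT_OBJ", "BOSA_OBJ"}),
]

# Count how often each word is heard while each object is in view.
cooccur = Counter()
for words, objects in slides:
    for word in words:
        for obj in objects:
            cooccur[(word, obj)] += 1


def best_referent(word):
    """Return the object that co-occurred most often with the word."""
    candidates = {o: n for (w, o), n in cooccur.items() if w == word}
    return max(candidates, key=candidates.get)


print(best_referent("manu"))  # MANU_OBJ
```

Across the three slides, “manu” is paired with its own object twice and with each other object only once, so the correct pairing stands out once the counts are aggregated.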
Implication: 12- and 14-month-old infants can do cross-situational learning.

Something to think about…
The real world isn’t necessarily as simple as these experimental setups: oftentimes, there will be many potential referents. (A similar issue to the one fast mapping has.)

A strategy where learners hang on to one hypothesis at a time until it’s proven incorrect, and only then switch to a different one, may work better because of this. There’s some evidence that it matches infant behavioral results quite well (Stevens, Trueswell, Yang, & Gleitman 2013) and may be more effective for navigating the hypothesis space (Romberg & Yu 2014).

Some more discussion about this: http://facultyoflanguage.blogspot.com/2013/03/learningfast-and-slow-i-how-children.html

Something else to think about…
Having more referents may not be a bad thing. Why not?
It’s easier for the correct associations to emerge from spurious associations when there are more word-referent pairing opportunities. Let’s see an example of this.

Why more may not always be harder…
Suppose there are six objects total, the amount used in the Smith & Yu (2008) experiment.

First, let’s consider their condition, where two objects are shown at a time. Let’s say we get three slides/scenes of data:
Scene 1: “manu” “colat”
Scene 2: “bosa” “gasser”
Scene 3: “kaki” “regli”

Can we tell whether “manu” refers to [image] or [image]?
No: both hypotheses are equally compatible with these data.

Now, let’s consider a more complex condition, where four objects are shown at a time. Let’s say we get three slides/scenes of data:
Scene 1: “manu” “colat” “bosa” “regli”
Scene 2: “bosa” “gasser” “manu” “colat”
Scene 3: “manu” “gasser” “kaki” “regli”

Can we tell whether “manu” refers to [image] or [image] or [image] or [image]?
Well, the first slide isn’t helpful in distinguishing between these four hypotheses…

The second slide suggests “manu” can’t be [image]: otherwise, that object would appear in the second slide.

Can we tell whether “manu” refers to [image] or [image] or [image]?
The third slide suggests “manu” can’t be [image] or [image]: otherwise, those objects would appear in the third slide.

Therefore, “manu” is [image].

This shows us that having more things appear (and be named) at once actually offers more opportunities for the correct associations to emerge.

Let’s walk through this scenario using Bayesian inference.

P(H|D) = [P(D|H) * P(H)] / [Σh P(D|h) * P(h)]

We’ll see an example of how sequential updating would work (instead of calculating the posterior just once, based on all of the data).
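The informal elimination argument for “manu” (an object is ruled out if it is missing from any scene where “manu” is heard) can be sketched as a set intersection. The object images are not preserved in these notes, so the labels `obj1` to `obj6` and their assignment to scenes are hypothetical. They were chosen only to agree with the statements that one candidate is missing from the second slide and two are missing from the third.

```python
# Each scene: (words heard, objects in view). Object labels are hypothetical.
four_object_scenes = [
    ({"manu", "colat", "bosa", "regli"}, {"obj1", "obj2", "obj3", "obj4"}),
    ({"bosa", "gasser", "manu", "colat"}, {"obj1", "obj2", "obj3", "obj5"}),
    ({"manu", "gasser", "kaki", "regli"}, {"obj2", "obj4", "obj5", "obj6"}),
]

two_object_scenes = [
    ({"manu", "colat"}, {"obj1", "obj2"}),
    ({"bosa", "gasser"}, {"obj3", "obj4"}),
    ({"kaki", "regli"}, {"obj5", "obj6"}),
]


def surviving_referents(word, scenes):
    """Objects present in every scene in which the word is heard."""
    candidates = None
    for words, objects in scenes:
        if word not in words:
            continue  # this scene says nothing about the word
        if candidates is None:
            candidates = set(objects)
        else:
            candidates &= objects  # drop objects absent from this scene
    return candidates or set()


print(surviving_referents("manu", four_object_scenes))        # {'obj2'}
print(sorted(surviving_referents("manu", two_object_scenes)))  # ['obj1', 'obj2']
```

In the two-object condition “manu” is heard only once, so both objects from that scene remain candidates. In the four-object condition it is heard three times, and only one object is present every time.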
Sequential updating

Data point 1: “manu” “colat” “bosa” “regli”

Hypothesis 1 (H1): “manu” = [image]
Hypothesis 2 (H2): “manu” = [image]
Hypothesis 3 (H3): “manu” = [image]
Hypothesis 4 (H4): “manu” = [image]

Since there are four hypotheses in the hypothesis space at this point, the priors are:
P(H1) = 1/4 = 0.25
P(H2) = 1/4 = 0.25
P(H3) = 1/4 = 0.25
P(H4) = 1/4 = 0.25

We can calculate the likelihoods, given this data point:
P(D|H1) = 1
P(D|H2) = 1
P(D|H3) = 1
P(D|H4) = 1

We can calculate the likelihood * prior for each hypothesis:
P(D|H1) * P(H1) = 1 * 0.25 = 0.25
P(D|H2) * P(H2) = 1 * 0.25 = 0.25
P(D|H3) * P(H3) = 1 * 0.25 = 0.25
P(D|H4) * P(H4) = 1 * 0.25 = 0.25
The sum Σh P(D|h) * P(h) (which we’ll need for the denominator of the posterior) = 1

We can now calculate the posterior for each hypothesis:
P(H1|D) = 0.25/1 = 0.25
P(H2|D) = 0.25/1 = 0.25
P(H3|D) = 0.25/1 = 0.25
P(H4|D) = 0.25/1 = 0.25

These become the priors for the next data point.
P(H1) = 0.25
P(H2) = 0.25
P(H3) = 0.25
P(H4) = 0.25

Data point 2: “bosa” “gasser” “manu” “colat”

We can calculate the likelihoods, given this data point:
P(D|H1) = 1
P(D|H2) = 1
P(D|H3) = 1
P(D|H4) = 0 (doesn’t appear)

We can calculate the likelihood * prior for each hypothesis:
P(D|H1) * P(H1) = 1 * 0.25 = 0.25
P(D|H2) * P(H2) = 1 * 0.25 = 0.25
P(D|H3) * P(H3) = 1 * 0.25 = 0.25
P(D|H4) * P(H4) = 0 * 0.25 = 0
The sum Σh P(D|h) * P(h) (which we’ll need for the denominator of the posterior) = 0.75

We can now calculate the posterior for each hypothesis:
P(H1|D) = 0.25/0.75 = 0.33
P(H2|D) = 0.25/0.75 = 0.33
P(H3|D) = 0.25/0.75 = 0.33
P(H4|D) = 0/0.75 = 0

These become the priors for the next data point.
P(H1) = 0.33
P(H2) = 0.33
P(H3) = 0.33
P(H4) = 0

Data point 3: “manu” “gasser” “kaki” “regli”

We can calculate the likelihoods, given this data point:
P(D|H1) = 0 (doesn’t appear)
P(D|H2) = 1
P(D|H3) = 0 (doesn’t appear)
P(D|H4) = 1

We can calculate the likelihood * prior for each hypothesis:
P(D|H1) * P(H1) = 0 * 0.33 = 0
P(D|H2) * P(H2) = 1 * 0.33 = 0.33
P(D|H3) * P(H3) = 0 * 0.33 = 0
P(D|H4) * P(H4) = 1 * 0 = 0
The sum Σh P(D|h) * P(h) (which we’ll need for the denominator of the posterior) = 0.33

We can now calculate the posterior for each hypothesis:
P(H1|D) = 0/0.33 = 0
P(H2|D) = 0.33/0.33 = 1
P(H3|D) = 0/0.33 = 0
P(H4|D) = 0/0.33 = 0

The utility of probabilities
[Slide content not preserved.]

Some other factors in cross-situational learning
Even if there are more referents, cross-situational learning is more successful when some referents are immediately repeated from situation to situation (Kachergis, Yu, & Shiffrin 2012).
Example scenes: “manu” “colat” “bosa” “regli”; “bosa” “gasser” “manu” “colat”; “manu” “gasser” “kaki” “regli”

Partial knowledge of some words appears to be very helpful for helping learners figure out the meaning of words they don’t know yet (Yurovsky, Fricker, & Yu 2013).
Posteriors after data point 2 (“bosa” “gasser” “manu” “colat”):
P(H1|D) = 0.25/0.75 = 0.33
P(H2|D) = 0.25/0.75 = 0.33
P(H3|D) = 0.25/0.75 = 0.33
P(H4|D) = 0/0.75 = 0

Some other factors in cross-situational learning
The child’s perspective of real world events may make cross-situational learning more feasible, as compared to a neutral third party (the way a photograph represents the world). This is likely because certain things are more salient from a child’s perspective due to object foregrounding and degree of clutter in line of sight (Yurovsky, Smith, & Yu 2013).

Recap: Word-meaning mapping
Cross-situational learning, which relies on distributional information across situations, can help children learn which words refer to which things in the world.
One way to implement the reasoning process behind cross-situational learning is Bayesian inference. It can be done in a batch over all the data observed, or sequentially as the data are observed one by one.
Experimental evidence suggests that infants are capable of this kind of reasoning in controlled experimental setups.

Questions?
You should be able to do up through question 7 on HW2 and up through question 5 on the word meaning review questions.
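The recap’s point that Bayesian inference can be run sequentially or in a batch can be checked with a short sketch using the four “manu” hypotheses and the 0/1 likelihoods from the sequential updating walkthrough. The helper name `update` is hypothetical. The batch version also assumes the scenes are independent given a hypothesis, so their likelihoods multiply; that assumption is not stated in the lecture.

```python
import math


def update(priors, likelihoods):
    """One Bayesian update: normalize likelihood * prior over hypotheses."""
    joint = [lik * p for lik, p in zip(likelihoods, priors)]
    total = sum(joint)  # P(D), the denominator of the posterior
    return [j / total for j in joint]


# P(D|H1..H4) for data points 1, 2, and 3, as in the walkthrough:
# an object that "doesn't appear" in a scene gets likelihood 0.
likelihoods_per_scene = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
]
uniform_prior = [0.25, 0.25, 0.25, 0.25]

# Sequential: each posterior becomes the prior for the next data point.
beliefs = uniform_prior
for scene_likelihoods in likelihoods_per_scene:
    beliefs = update(beliefs, scene_likelihoods)

# Batch: multiply each hypothesis's likelihoods across scenes
# (independence assumption), then update once.
batch_likelihoods = [math.prod(col) for col in zip(*likelihoods_per_scene)]
batch = update(uniform_prior, batch_likelihoods)

print(beliefs)  # [0.0, 1.0, 0.0, 0.0]
print(batch)    # [0.0, 1.0, 0.0, 0.0]
```

Both routes end with all of the probability on H2, matching the final posterior in the walkthrough.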