Variational Autoencoders Write Poetry (Generating Sentences from a Continuous Space)
Elsbeth Turcan and Fei-Tzin Lee
Paper by Sam Bowman, Luke Vilnis et al., 2016

Motivation
– Generative models for natural language sentences
  – Machine translation
  – Image captioning
  – Dataset summarization
  – Chatbots
  – Etc.
– Want to capture high-level features of text, such as topic and style, and keep them consistent when generating text

Related work – RNNLM
– In the words of Bowman et al., "A standard RNN language model predicts each word of a sentence conditioned on the previous word and an evolving hidden state."
– In other words, it only looks at the relationships between consecutive words, and so does not contain or observe any global features
– But what if we want global information?

Other related work
– Skip-thought
  – Generates sentence codes in the style of word embeddings to predict context sentences
– Paragraph vector
  – A vector representing the paragraph is incorporated into single-word embeddings

Autoencoders
– Typically composed of two RNNs
– The first RNN encodes a sentence into an intermediate vector
– The second RNN decodes the intermediate representation back into a sentence, ideally the same as the input

Variational Autoencoders (VAEs)
– Regular autoencoders learn only discrete mappings from point to point
– However, if we want to learn holistic information about the structure of sentences, we need to be able to fill sentence space better
– In a VAE, we replace the hidden vector z with a posterior probability distribution q(z|x) conditioned on the input, and sample our latent z from that distribution at each step
– We ensure that this distribution has a tractable form by enforcing its similarity to a defined prior distribution, typically some form of Gaussian

Modified loss function
– The regular autoencoder's loss function would encourage the VAE to learn posteriors as close to discrete as possible – in other words, Gaussians that are clustered extremely tightly around their means
– In order to enforce our posterior's similarity to a well-formed Gaussian, we introduce a KL divergence term into our loss, as below:

  \mathcal{L}(\theta; x) = \mathbb{E}_{q_\theta(z|x)}[\log p_\theta(x|z)] - \mathrm{KL}(q_\theta(z|x) \,\|\, p(z))

Reparameterization trick
– In the original formulation, the encoder net encodes the sentence into a probability distribution (usually Gaussian); practically speaking, it encodes the sentence into the parameters of that distribution (i.e., µ and σ)
– However, this poses a challenge when backpropagating: we can't backpropagate over the jump from µ and σ to z, since it's random
– Solution: extract the randomness from the Gaussian by reformulating z as a function of µ, σ, and a separate random variable: z = µ + σ ⊙ ε, with ε ∼ N(0, I) – sketched in code below
(Diagram from Stack Overflow.)
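To make the trick concrete, here is a minimal sketch (ours, not the authors'; PyTorch and the batch/latent sizes are assumptions) of sampling z = µ + σ·ε and of the closed-form Gaussian KL term in the loss above:

import torch

def reparameterize(mu, logvar):
    # z = mu + sigma * eps with eps ~ N(0, I): the randomness is isolated
    # in eps, so gradients can flow through mu and sigma as usual.
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + std * eps

def kl_to_standard_normal(mu, logvar):
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims.
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)

# Illustrative shapes: a batch of 8 sentences, 16 latent dimensions.
mu = torch.zeros(8, 16, requires_grad=True)
logvar = torch.zeros(8, 16, requires_grad=True)
z = reparameterize(mu, logvar)          # (8, 16): one latent sample per sentence
kl = kl_to_standard_normal(mu, logvar)  # (8,): per-sentence KL penalty

The full training loss is then the decoder's reconstruction negative log-likelihood plus this KL penalty.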
Specific architecture
– Single-layer LSTM for encoder and decoder

Issues and fixes
– The decoder is too strong: without any limitations it just doesn't use z at all
– Fix: KL annealing
– Fix: word dropout
– (A sketch of both fixes appears after the final slide)

Experiments – Language modeling
– Used the VAE to build language models on the Penn Treebank dataset, with an RNNLM as the baseline
– Task: train an LM on the training set and have it designate the test set as highly probable
– The RNNLM outperformed the VAE in the traditional setting
– However, when handicaps were imposed on both models (inputless decoder), the VAE was significantly better able to overcome them

Experiments – Imputing missing words
– Task: infer missing words in a sentence given some known words (imputation)
– Place the unknown words at the end of the sentence for the RNNLM
– The RNNLM and the VAE performed beam search (with VAE decoding broken into three steps) to produce the most likely words to complete a sentence
– Precise evaluation of these results is computationally difficult

Adversarial evaluation
– Instead, create an adversarial classifier, trained to distinguish real sentences from generated sentences, and score the model on how well it fools the adversary
– Adversarial error is defined as the gap between chance accuracy (50%) and the actual accuracy of the adversary – ideally this error will be minimized

Experiments – Other
– Several other experiments in the appendix showed the VAE to be applicable to a variety of tasks
  – Text classification
  – Paraphrase detection
  – Question classification

Analysis – Word dropout
– Keep rate too low: sentence structure suffers
– Keep rate too high: no creativity; stifles the variation
– Effects on the cost function components (plots omitted)

Extras: sampling from the posterior and homotopies
– Sampling from the posterior: examples of sentences adjacent in sentence space
– Homotopies: linear interpolations in sentence space between the codes for two sentences (see the sketch below)

Even more homotopies
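A homotopy is easy to express in code. The sketch below is our illustration, not the paper's code: encode and decode are hypothetical wrappers around the LSTM encoder (returning the posterior mean) and the LSTM decoder (e.g., greedy or beam-search decoding):

import torch

def homotopy(encode, decode, sentence_a, sentence_b, steps=5):
    # Linearly interpolate between the latent codes of two sentences and
    # decode each intermediate point back into a sentence.
    z_a, z_b = encode(sentence_a), encode(sentence_b)
    outputs = []
    for t in torch.linspace(0.0, 1.0, steps):
        z = (1 - t) * z_a + t * z_b  # a point on the line from z_a to z_b
        outputs.append(decode(z))
    return outputs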
Thanks for listening!
– Any questions?
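Backup: a minimal sketch (again ours, with hypothetical names and schedule constants) of the two decoder-weakening fixes, KL annealing and word dropout:

import torch

def kl_weight(step, total_anneal_steps=10000):
    # KL annealing: scale the KL term up from 0 to 1 early in training, so
    # the model first learns to reconstruct and only gradually pays the KL
    # cost. (A linear schedule; the paper anneals along a sigmoid instead.)
    return min(1.0, step / total_anneal_steps)

def word_dropout(tokens, keep_rate, unk_id):
    # Word dropout: randomly replace the decoder's input tokens with <unk>,
    # forcing it to rely on z rather than on the previous gold word.
    mask = torch.rand(tokens.shape) < keep_rate
    return torch.where(mask, tokens, torch.full_like(tokens, unk_id))

# Hypothetical use inside a training step:
#   decoder_inputs = word_dropout(gold_tokens, keep_rate=0.75, unk_id=UNK)
#   loss = reconstruction_nll + kl_weight(step) * kl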