Takeaway MSRA at NTCIR-‐10 1CLICK-‐2 ▶ AutomaFc query aHribute extracFon is effecFve especially for celebrity queries. Kazuya Narita (Tohoku University, Japan), Tetsuya Sakai, Zhicheng Dou (MSRA, China), Young-‐In Song (NHN Corpora;on, Korea) System Architecture I. Query Type Classifier Classify by some heurisFc rules For example: Query length QuesFon parFcles GEO clue suffixes at middle “市” (city), “駅” (sta;on), “区” (ward), “町” (town) … FACILITY clue suffixes “園” (park), “学校” (school), “病院” (hospital) … NE tagger Clue words in wiki pages “小説家” (novelist), “選手” (player), “女優” (actress) … Celebrity dicFonary (made by Wikipedia Infobox) Else Query Type II. A;ribute Extractor Input: Query QA Sentences Sentences Web search results Find query aHributes Ichiro has aHributes Born Height Future Science Museum GEO Sentences Sentences Sentences Address Approach: III. Sentence Ranker FACILITY Rank sentences by some heurisFc rules Content Similarity cosine s1 similarity s2 LexRank algorithm (similar to other sentences s4 s3 ARTIST = high score) ACTOR Query a@ributes POLITICIAN weight for each aHribute as importance ATHLETE like TF-‐IDF (based on classifier results) DEFINITION Manual clue words and URLs “birthday”: more relevant for celebrity “amazon.co.jp”: less relevant score Search Rank 1 linear score with search rank 0 search rank Sentences Sentences Ranked Sentences Output: X-‐string Posi.on has aHributes Opening Hours Tel Web search results Extract tables or text containing “:” Born Height Posi.on October 22, 1973 (age 38) 5 g 11 in (1.80m) Ouhielder Extract a Hributes Born: October 22, 1973 (age 38) Height: 5 g 11 in (1.80m) Posi.on: Ouhielder Height Posi.on Born Sentences Sentences A;ributes IV. Post-‐Processor Diversify ranked sentences sentence1 sentence2 sentence3 … sentencen compare with already selected sentences (MMR algorithm) X-‐string sentence1 sentence3 … Similar sentences are eliminated EvaluaFon Results and Discussion ▶ Query classificaFon accuracy: 0.83 ▷ FACILITY: insufficient suffixes and other rules ▶ Query aHributes are effecFve especially for celebrity queries ▷ E.g. ATHLETE query “イチロー” (Ichiro; professional baseball player): e ffecFve a Hributes “ 所属” ( affilia;on) a nd “ 打率” ( ba[ng a verage) gold \ sys ARTACTPOLATHFACGEODEF QA ▶ D EFINITION a nd Q A: a Hributes d o n ot w ork w ell ( by t he n ature o f t ypes) ARTIST 9 1 0 0 0 0 0 0 0.3 ACTOR 1 9 0 0 0 0 0 0 with aHributes 0.25 without aHributes POLITICIAN 1 0 8 0 0 0 1 0 0.2 without query type ATHLETE 0 0 0 10 0 0 0 0 0.15 FACILITY 1 0 0 0 7 1 5 1 0.1 0.05 GEO 1 0 0 0 0 14 0 0 0 DEFINITION 0 0 0 1 1 1 12 0 Overall ARTIST ACTOR POLITICIAN ATHLETE FACILITY GEO DEFINITION QA QA 0 0 0 0 0 0 1 14 mean S#-‐measure Conclusion and Future work ▶ AutomaFc query aHribute extracFon is effecFve especially for celebrity queries. ▶ Future work: automaFc extracFon of clue words for query classificaFon, new framework instead of aHributes for DEFINITION and QA
© Copyright 2026 Paperzz