IBM Research

IBM Watson and Jeopardy: Was it only a game?
Barclay Brown, IBM
© 2010 IBM Corporation

IBM Watson
– The Enormity of the Challenge
– The Evolution of DeepQA
– How Watson Plays Jeopardy
– What else can we do with this technology?

$400
Watson’s ancestor defeated the world champion at this in 1997, also a Broadway musical.

What is “Chess”?
Source: UPI http://www.upi.com/News_Photos/view/6a0429fa1ada13e8ae7f5558b9deaf88/Champion-Garry-Kasparov-plays-IBMs-Deep-Blue/

The Jeopardy! Challenge posed a unique and compelling artificial intelligence question…
“Can a computer system be designed to compete against the best humans at a task thought to require high levels of human intelligence, and if so, what kind of technology, algorithms, and engineering is required?”
Ferrucci, et al, Building Watson, AI Magazine, Fall 2010

Want to Play Chess or Just Chat?
Chess
– A finite, mathematically well-defined search space
– Limited number of moves and states
– All the symbols are completely grounded in the mathematical rules of the game
Human Language
– Words by themselves have no meaning
– Only grounded in human cognition
– Words navigate, align and communicate an infinite space of intended meaning
– Computers can not ground words to human experiences to derive meaning

Easy Questions?
(ln(12,546,798 * π))^2 / 34,567.46 = 0.00885

Select Payment where Owner=“David Jones” and Type(Product)=“Laptop”

Owner          Serial Number
David Jones    45322190-AK

Serial Number  Type      Invoice #
45322190-AK    LapTop    INV10895

Invoice #      Vendor    Payment
INV10895       MyBuy     $104.56

David Jones = David Jones
Dave Jones ≠ David Jones

Hard Questions?
Computer programs are natively explicit, fast and exacting in their calculation over numbers and symbols. But natural language is implicit, highly contextual, ambiguous and often imprecise.

Where was X born?
Structured:
Person         Birth Place
A. Einstein    ULM
Unstructured: One day, from among his city views of Ulm, Otto chose a water color to send to Albert Einstein as a remembrance of Einstein’s birthplace.

X ran this?
Structured:
Person         Organization
J. Welch       GE
Unstructured: If leadership is an art then surely Jack Welch has proved himself a master painter during his tenure at GE.

The Jeopardy! Challenge: A compelling and notable way to drive and measure the technology of automatic Question Answering along 5 Key Dimensions
– Broad/Open Domain
– Complex Language
– High Precision
– Accurate Confidence
– High Speed

$200: If you're standing, it's the direction you should look to check out the wainscoting.
$1000: The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author.
$600: In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus.
$2000: Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that’s farthest north.

Broad Domain
We do NOT attempt to anticipate all questions and build specialized databases.
In a random sample of 20,000 questions we found 2,500 distinct types*. The most frequent occurred less than 3% of the time. The distribution has a very long tail. And for each of these types, 1000s of different things may be asked.
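The long-tail claim above can be checked on any labeled question sample by asking how much of the sample the k most frequent answer types cover. The following is a minimal sketch, not the method used in the talk: the function name `head_coverage` and the ten-item `sample` list are hypothetical, and the sample is a toy stand-in for the 20,000-question set, which is not reproduced here.

```python
from collections import Counter


def head_coverage(type_labels, k):
    """Return the fraction of questions whose answer type is among the
    k most frequent types in the sample (the "head" of the distribution)."""
    counts = Counter(type_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(k) yields the k (type, count) pairs with the highest counts.
    covered = sum(count for _, count in counts.most_common(k))
    return covered / total


# Hypothetical toy sample: one answer-type label per question.
sample = ["he", "film", "he", "capital", "song",
          "fruit", "he", "film", "planet", "dog"]

print(head_coverage(sample, 1))
print(head_coverage(sample, 2))
```

On a real long-tailed sample the coverage would grow slowly as k increases, which is the sense in which building specialized handling for the most frequent types covers only a small share of questions.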
Even going for the head of the tail will barely make a dent.
[Chart: frequency of each answer type. Types on the x-axis, in the order shown: he, film, group, capital, woman, song, singer, show, composer, title, fruit, planet, there, person, language, holiday, color, place, son, tree, line, product, birds, animals, site, lady, province, dog, substance, insect, way, founder, senator, form, disease, someone, maker, father, words, object, writer, novelist, heroine, dish, post, month, vegetable, sign, countries, hat, bay]
*13% are non-distinct (e.g., it, this, these or NA)
Our focus is on reusable NLP technology for analyzing volumes of as-is text. Structured sources (DBs and KBs) are used to help interpret the text.

$600
Watson’s spare-time hobby, second only to reading millions of books (which it doesn’t really understand).

What is “parsing”?
Volumes of Text -> Syntactic Frames -> Semantic Frames
– Inventors patent inventions (.8)
– Officials submit resignations (.7)
– People earn degrees at schools (0.9)
– Fluid is a liquid (.6)
– Liquid is a fluid (.5)
– Vessels sink (0.7)
– People sink 8-balls (0.5) (in pool/0.8)

Keyword Evidence
Clue: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.
Passage: In May, Gary arrived in India after he celebrated his anniversary in Portugal.
Keyword matching (clue term / passage term):
– “In May 1898” / “In May”
– “Portugal” / “in Portugal”
– “celebrated” / “celebrated”
– “400th anniversary” / “anniversary”
– “arrival in” / “arrived in”
– “India” / “India”
– “explorer” / “Gary”
This evidence suggests “Gary” is the answer, BUT the system must learn that keyword matching may be weak relative to other types of evidence.

Deeper Evidence
In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.
On the 27th of May 1498, Vasco da Gama landed in Kappad Beach.
– Search Far and Wide
– Explore many hypotheses
– Find & Judge Evidence
– Many inference algorithms
Evidence linking clue to passage (clue term / passage term: technique):
– “May 1898” and “400th anniversary” / “27th May 1498”: Temporal Reasoning (Date Math)
– “arrival in” / “landed in”: Statistical Paraphrasing (Paraphrases)
– “India” / “Kappad Beach”: GeoSpatial Reasoning (GeoKB)
– “explorer” / “Vasco da Gama”
– “celebrated” and “Portugal” appear only in the clue
Stronger evidence can be much harder to find and score.
The evidence is still not 100% certain.

Not Just for Fun
Some Questions require Decomposition and Synthesis
Category: Edible Rhyme Time
Clue: A long, tiresome speech delivered by a frothy pie topping.
– A long, tiresome speech: Diatribe, Harangue, ...
– A frothy pie topping: Whipped Cream, Meringue, ...
Answer: Meringue Harangue

Divide and Conquer (Typical in Final Jeopardy!)
Must identify and solve sub-questions from different sources to answer the top level question.
Clue: When "60 Minutes" premiered this man was U.S. president.
Sub-question: When did "60 Minutes" premiere?
Rewritten clue: In 1968 this man was U.S. president.
Answer: Lyndon B Johnson
The DeepQA architecture attempts different decompositions and recursively applies the QA algorithms.

The Missing Link
Shirts, TV remote controls, Telephones
Missing link: Buttons
On hearing of the discovery of George Mallory's body, he told reporters he still thinks he was first.
Missing link: Mt Everest (he was first)
Answer: Edmund Hillary

The Best Human Performance: Our Analysis Reveals the Winner’s Cloud
Each dot represents an actual historical human Jeopardy! game.
Top human players are remarkably good.
Chart labels: Winning Human Performance; Grand Champion Human Performance; Computers?
2007 QA Computer System
[Chart axis: More Confident to Less Confident]

Some Growing Pains (early answers)
Category: THE AMERICAN DREAM
Clue: Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them.
Correct answer: the People. Early Watson answer: No One.
Category: NEW YORK TIMES HEADLINES
Clue: An exclamation point was warranted for the "end of" this! In 1918.
Correct answer: WW I. Early Watson answer: a sentence.
Category: MILESTONES
Clue: In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean”.
Correct answer: Apollo 11 moon landing. Early Watson answer: the Big Bang.
Category: THE QUEEN'S ENGLISH
Clue: Give a Brit a tinkle when you get into town & you've done this.
Correct answer: Call on the phone. Early Watson answer: urinate.
Category: FATHERLY NICKNAMES
Clue: This Frenchman was "The Father of Bacteriology".
Correct answer: Louis Pasteur. Early Watson answer: How Tasty Was My Little Frenchman.

Grouping Features to produce Evidence Profiles
Clue: Chile shares its longest land border with this country.
[Chart: evidence profiles for Argentina and Bolivia, with positive and negative evidence per dimension on a scale from -0.2 to 1]
Bolivia is more popular due to a commonly discussed border dispute.

Evidence: Time, Popularity, Source, Classification etc.
Clue: You’ll find Bethel College and a Seminary in this “holy” Minnesota city.
Candidates: Saint Paul, South Bend
There’s a Bethel College and a Seminary in both cities. The system is not weighing location evidence high enough to give St. Paul the edge.

Evidence: Puns
Clue: You’ll find Bethel College and a Seminary in this “holy” Minnesota city.
Candidates: Saint Paul, South Bend
Humans may get this based on the pun, since St. Paul is a “holy” city. We added a Pun Scorer that discovers and scores Pun relationships.
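The Bethel College example above turns on how much weight each kind of evidence receives. The sketch below is only an illustration of that idea, assuming a plain weighted sum over named features: the function names `score` and `best_candidate`, the feature names, and every number are hypothetical, and nothing here is taken from Watson's actual scorers or learned weights.

```python
def score(features, weights):
    """Weighted sum of evidence features; features without a weight count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def best_candidate(candidates, weights):
    """Return the candidate answer with the highest weighted evidence score."""
    return max(candidates, key=lambda name: score(candidates[name], weights))


# Hypothetical feature values for the two candidate answers.
candidates = {
    "Saint Paul": {"passage_support": 0.6, "location": 1.0, "pun": 1.0},
    "South Bend": {"passage_support": 0.7, "location": 0.0, "pun": 0.0},
}

# Hypothetical weights: location evidence barely counts, no pun scorer yet.
weak_location = {"passage_support": 1.0, "location": 0.05}

# Hypothetical weights after raising location and adding a pun scorer.
tuned = {"passage_support": 1.0, "location": 0.5, "pun": 0.3}

print(best_candidate(candidates, weak_location))
print(best_candidate(candidates, tuned))
```

With the first weight set the slightly stronger passage support decides the ranking; with the second, the location and pun evidence outweigh it. In practice such weights would be learned from training data rather than set by hand.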
DeepQA: Incremental Progress in Answering Precision on the Jeopardy Challenge: 6/2007-11/2010
IBM Watson Playing in the Winners Cloud
[Chart: Precision (0% to 100%) versus % Answered (0% to 100%), one curve per version: Baseline 12/06, v0.1 12/07, v0.2 05/08, v0.3 08/08, v0.4 12/08, v0.5 05/09, v0.6 10/09, v0.7 04/10, v0.8 11/10]

Precision, Confidence & Speed
Deep Analytics – Combining many analytics in a novel architecture, we achieved very high levels of Precision and Confidence over a huge variety of as-is content.
Speed – By optimizing Watson’s computation for Jeopardy! on over 2,800 processing cores we went from 2 hours per question on a single CPU to an average of just 3 seconds.
Results – In 55 real-time sparring games against former Tournament of Champions players last year, Watson put on a very competitive performance in all games, placing 1st in 71% of them!

Watson Workload Optimized System (Power 750)
– 90 x IBM Power 750 servers (see note 1)
– 2880 POWER7 cores
– POWER7 3.55 GHz chip
– 500 GB per sec on-chip bandwidth
– 10 Gb Ethernet network
– 15 Terabytes of memory
– 20 Terabytes of disk, clustered
– Can operate at 80 Teraflops
– Runs IBM DeepQA software
– Scales out and searches vast amounts of unstructured information with UIMA & Hadoop open source components
– SUSE Linux provides a cost-effective open platform which is performance-optimized to exploit POWER7 systems
– 10 racks include servers, networking, shared disk system, cluster controllers
Note 1: The Power 750 featuring POWER7 is a commercially available server that runs AIX, IBM i and Linux and has been in market since Feb 2010.

$800
This computer does NOT hear, see, connect to the Internet or watch Jeopardy!

What is “Watson”?
[Diagram: Watson is insulated and self-contained. Components and data flows shown:]
– Clue Grid
– Human Player 1, Human Player 2
– Jeopardy! Game Control System: sends Clues, Scores & Other Game Data
– Watson’s Game Controller: Strategy, Decisions to Buzz and Bet, Text-to-Speech
– Watson’s QA Engine: receives Clue & Category, returns Answers & Confidences; 2,880 IBM Power750 Compute Cores, 15 TB of Memory
– Analysis of natural language content equivalent to 1 Million Books

Generalized DeepQA Reasoning Paradigm
[Diagram: pipeline stages, in the order shown:]
– Case/Question/Topic Data
– Question/Case Analysis
– Fact Finding and Question Decomposition
– Hypothesis Generation: Primary Search (Answer Sources), Candidate Answer Generation
– Hypothesis and Evidence Scoring: Hypothesis Scoring, Evidence Retrieval (Evidence Sources), Deep Evidence Scoring
– Synthesis
– Final Merging and Confidence & Ranking
– Confidence-Weighted Differential Diagnosis
Learned Models help combine and weigh the Evidence.

To leverage our lead we must cross this trough very quickly through education and training
[Chart label: Growth]

Dave: How do I get 8% return in this economy?
Watson: What is Toronto????

Watson’s real value proposition: Efficient decision support over unstructured (and structured) content
Deeper Understanding, Higher Precision and Broader, Timely Coverage at lower costs
Jeopardy! Challenge
Unstructured Data: Shallow Understanding, Low Precision, Broad Coverage
– Broad, rich in context
– Rapidly growing, current
– Invaluable yet under utilized
Structured Data: Deeper Understanding but Brittle, High Precision at High Cost, Narrow Limited Coverage
– Precise, explicit
– Narrow, expensive

Search vs. Database vs. Data Mining vs.
Watson (Watson uses unstructured and structured sources)

Search
Quickly finding documents from large volumes of text based on keywords
Search term: “Rudolph”
Result: Document List Based on Keywords

Database
Finding precise information based on precise queries over tables
Query: Select Payment where Owner=“David Jones” and Type(Product)=“Laptop”
Tables:
Name           Serial Number
David Jones    45322190-AK

Serial Number  Type      Invoice #
45322190-AK    LapTop    INV10895

Invoice #      Vendor    Payment
INV10895       MyBuy     $104.56
Result: $104.56

Data Mining
Uncovering knowledge insights based on searching for correlations within large number of cases
Example: LDL > 220 and Patient suffered a heart attack (both in green)

Watson
Efficiently apply existing knowledge to help inform decision making
Inputs shown: text documents, insights from other analytics, and tables:
Name           Serial Number
David Jones    45322190-AK
Marion Barber  88769234-NY

Evidence Profiles from disparate data is a powerful idea
Each dimension contributes to
supporting or refuting hypotheses based on
– Strength of evidence
– Importance of dimension for diagnosis (learned from training data)
Evidence dimensions are combined to produce an overall confidence.
[Chart: Overall Confidence, with Positive Evidence and Negative Evidence per dimension]

What’s next for Watson? Some ideas from students.
IBM Announces Student Winners of Watson Case Competition from Cornell University
Faculty award winners from Nine Universities Also Announced; Professors to Receive $10,000 Grants for Watson Curriculums

First Place: Customer Service: Say Hello to Watson - Consumers with urgent questions about their electronic devices call, e-mail, post and chat with customer service representatives. Watson could look at unstructured and structured information to help consumer electronics firms answer inquiries with greater accuracy and faster response times, ultimately creating happy customers.

Second Place: Watson Helps You Plan Your Next Beach Getaway Faster - Planning business trips and vacations, including websites that aggregate flight and hotel rates, tourism sites packed with things to do, and millions of online reviews from fellow travelers.

Third Place: You're Hired! Big Data Helping HR - Adaptive human capital management model that calls upon Watson's ability to quickly comb through data and make informed recommendations, to help businesses match open jobs with the best candidates.
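The evidence-profile slide earlier in the deck says each dimension contributes according to the strength of its evidence and a learned importance, and that the dimensions are combined into one overall confidence. The sketch below shows one simple way that could look, under stated assumptions: the slides do not name the combining model, so the logistic function is chosen here only for illustration, and the names `evidence_profile` and `overall_confidence`, the dimension names, and all numbers are hypothetical.

```python
import math


def evidence_profile(strengths, importance):
    """Per-dimension contribution: learned importance times evidence strength.
    Positive values support the hypothesis, negative values refute it."""
    return {dim: importance.get(dim, 0.0) * s for dim, s in strengths.items()}


def overall_confidence(profile, bias=0.0):
    """Combine the contributions into a single confidence in (0, 1).
    Assumption: a logistic function of the summed contributions."""
    total = bias + sum(profile.values())
    return 1.0 / (1.0 + math.exp(-total))


# Hypothetical evidence strengths for one candidate answer.
strengths = {"location": 0.9, "popularity": -0.4, "source": 0.5}

# Hypothetical importances, standing in for weights learned from training data.
importance = {"location": 2.0, "popularity": 0.5, "source": 1.0}

profile = evidence_profile(strengths, importance)
confidence = overall_confidence(profile)

for dim, contribution in profile.items():
    kind = "positive" if contribution >= 0 else "negative"
    print(f"{dim}: {contribution:+.2f} ({kind} evidence)")
print(f"overall confidence: {confidence:.2f}")
```

Keeping the per-dimension contributions alongside the final number is what makes the profile inspectable: a reader can see which dimensions pushed the confidence up and which pulled it down.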