What makes Watson work?
A sketch of question answering technology
James Allan
Center for Intelligent Information Retrieval
February 15, 2011
Department of Computer Science
Caveats, credits, and disclaimers
Did not work on Watson project directly
• Most of the engineering was done by IBM staff
• Includes UMass Amherst alums, interns, and software
• We collaborated on supporting research
Some of this presentation is speculation
• Based on knowledge of QA technology, which has been heavily studied since about 2000
• Based on some information from IBM and its staff
Examples of how Watson works are illustrations
• They explain the types of things that might have happened
• Somewhat simplified (only a 12-minute presentation!)
• Definitely not precisely what happened on the show
What is Watson?...
Image copyright © IBM 2011
Broadly speaking…
1. What are the plausible targets?
2. Build queries to find answers
3. Search unstructured text for matching texts
4. Extract candidates from text
5. Look for evidence for each candidate
6. Score candidates
7. Rank and decide if confident enough
Image copyright © IBM 2011
An example from last night…
Plausible target for answer (question, here)
Something that…
• Can be wanted
• Has an appearance
• Is involved in killing
• Has a personality (a split one)
Looking for any string that fits all of those
Probably a noun
Possibly a person, though other animate objects fit
Category is “literary” and “character” and “APB”
Build a query
“wanted for killing sir danvers carew:
appearance.. pale & dwarfish; seems to have a
split personality”
Some words/phrases are more important
• Killing, Danvers Carew, pale, dwarfish, split personality
Might add synonyms (not really needed here)
Category words (literary character APB) useful
Plausible simple query
• Killing “danvers carew” pale dwarfish “split personality” literary character APB
• Most likely includes weights and backoffs
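A minimal sketch of how a weighted query like this could be assembled. The `build_query` helper and every weight value are assumptions made for illustration; the string is written in the style of Indri's `#weight` and `#1` (exact phrase) operators, and the queries Watson really issued are not described here.

```python
def build_query(phrases, terms, category_terms,
                phrase_weight=2.0, term_weight=1.0, category_weight=0.5):
    """Assemble an Indri-style weighted query string (weights are made up)."""
    parts = []
    for phrase in phrases:
        # #1(...) asks for the words to appear adjacent and in order
        parts.append(f"{phrase_weight} #1({phrase})")
    for term in terms:
        parts.append(f"{term_weight} {term}")
    for term in category_terms:
        # Category words help, but are trusted less than clue words here
        parts.append(f"{category_weight} {term}")
    return "#weight( " + " ".join(parts) + " )"


query = build_query(
    phrases=["danvers carew", "split personality"],
    terms=["killing", "pale", "dwarfish"],
    category_terms=["literary", "character", "apb"],
)
print(query)
```

Giving phrases a higher weight than single words is one simple way to express "some words/phrases are more important"; backoffs (for example, also matching the phrase words individually) are left out of this sketch.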
Search text sources
Primarily unstructured or semi-structured text
• Encyclopedia articles, dictionaries
• Books, news
• Movie scripts
Added material needed for Jeopardy
• Can’t get by without the works of Shakespeare
• Gazetteers for geographical information
Search is similar to Google/Bing/Yahoo
• Except no links or query logs like on the Web
• Focus on finding small passages likely to have answer
Uses CIIR’s Indri search engine!
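To illustrate passage search, here is a toy ranker that scores short passages by query likelihood with Dirichlet smoothing, a standard language-modeling approach of the kind Indri builds on. It is a sketch, not Indri's API: the function names, the smoothing value `mu`, the 0.5 floor for unseen words, and the three passages are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def rank_passages(query, passages, mu=10.0):
    """Rank passages by smoothed query log-likelihood, best first."""
    tokenized = [tokenize(p) for p in passages]
    collection = Counter(tok for toks in tokenized for tok in toks)
    collection_len = sum(collection.values())
    scored = []
    for passage, tokens in zip(passages, tokenized):
        counts = Counter(tokens)
        log_prob = 0.0
        for term in tokenize(query):
            # Background probability; the 0.5 floor for unseen words is ad hoc
            p_background = (collection[term] + 0.5) / (collection_len + 0.5)
            log_prob += math.log(
                (counts[term] + mu * p_background) / (len(tokens) + mu)
            )
        scored.append((log_prob, passage))
    return sorted(scored, reverse=True)


passages = [
    "mr hyde was pale and dwarfish",
    "sir danvers carew was murdered by hyde",
    "the weather was pale and grey",
]
ranked = rank_passages("hyde pale dwarfish", passages)
print(ranked[0][1])
```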
Extract candidates from text
Consider spans of text that match
• “Sir Danvers Carew: member of Parliament who is murdered by Hyde”
• “…Mr. Hyde was pale and dwarfish…”
• “…Mr. Hyde-type split personality…”
• “…Sherlock Holmes solves the mystery surrounding Jekyll and Hyde…”
Candidates from these few samples
• Sir Danvers Carew, member of Parliament, Parliament
• Murdered, Hyde
• Sherlock Holmes, mystery
• Jekyll
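As a rough illustration of candidate extraction, the sketch below pulls capitalized spans out of the sample passages and counts them. This is a deliberately crude heuristic chosen for the example (it would miss candidates such as "murdered" or "mystery"), and the regular expression and helper name are assumptions, not a description of Watson's extractor.

```python
import re
from collections import Counter

# A run of capitalized words, optionally preceded by "Mr." or "Dr."
NAME = re.compile(r"(?:(?:Mr|Dr)\.\s+)?[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*")


def extract_candidates(passages):
    """Count capitalized spans across passages as candidate answers."""
    counts = Counter()
    for passage in passages:
        for match in NAME.finditer(passage):
            # Drop "Mr." / "Dr." so "Mr. Hyde" and "Hyde" count together
            name = re.sub(r"^(?:Mr|Dr)\.\s+", "", match.group())
            counts[name] += 1
    return counts


passages = [
    "Sir Danvers Carew: member of Parliament who is murdered by Hyde",
    "Mr. Hyde was pale and dwarfish",
    "Mr. Hyde-type split personality",
    "Sherlock Holmes solves the mystery surrounding Jekyll and Hyde",
]
candidates = extract_candidates(passages)
print(candidates.most_common())
```

Counting how often a span recurs gives a first, weak signal; the evidence and scoring steps are what separate a frequent candidate from a correct one.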
Look for evidence
What does Watson “know” about candidates?
• Associated words
• Association with words in the clue
• Nature of association
For example
• Parliament – noun, no personality, no killing, real
• Murdered – not a noun, related to killing, not a character
• Sherlock Holmes – person, killings, appearance, wanted, fictional
• Mystery – killings, not a character
• Jekyll – (is a) person, (connection to) Hyde, Carew, fictional
• Hyde – (is a) person, (connection to) Jekyll, Carew (killer of), wanted, (has a) split personality, fictional
• …
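One simple way to picture the evidence step is to treat what is "known" about each candidate as a set of properties and compare it with what the clue asks for. The property labels below are a hand-made encoding of the examples above, and the `coverage` measure is an assumption for illustration only.

```python
# Properties the clue asks for (hand-picked labels, not Watson's features)
TARGET = {"person", "fictional", "wanted", "killing",
          "split personality", "carew"}

# Toy encoding of the evidence listed for each candidate
EVIDENCE = {
    "Parliament": {"noun", "real"},
    "Murdered": {"killing"},
    "Sherlock Holmes": {"person", "killing", "appearance", "wanted", "fictional"},
    "Mystery": {"killing"},
    "Jekyll": {"person", "hyde", "carew", "fictional"},
    "Hyde": {"person", "jekyll", "carew", "killing", "wanted",
             "split personality", "fictional"},
}


def coverage(candidate):
    """Fraction of the target properties supported by the evidence."""
    return len(EVIDENCE[candidate] & TARGET) / len(TARGET)


ranked = sorted(EVIDENCE, key=coverage, reverse=True)
for name in ranked:
    print(f"{name}: {coverage(name):.2f}")
```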
Score candidates
Combine thousands of features of evidence
Decide which candidate best matches
Machine learning trained on many past questions
• What do good answers look like
• What sort of evidence is needed
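The idea of learning from past questions how to combine evidence can be sketched with a tiny logistic regression. Everything here is an assumption for illustration: the two features (`passage_support`, `type_match`), the six training examples, and the choice of logistic regression itself, which stands in for whatever models Watson actually used over its thousands of features.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(features, weights, bias):
    """Confidence in [0, 1] that a candidate with these features is right."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(examples, epochs=2000, lr=0.1):
    """Fit logistic-regression weights by batch gradient descent."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in examples:
            error = score(features, weights, bias) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Toy "past questions": (passage_support, type_match) -> was it correct?
PAST = [
    ([1.0, 1.0], 1), ([0.9, 1.0], 1),
    ([0.2, 0.0], 0), ([0.8, 0.0], 0), ([0.1, 1.0], 0), ([0.3, 0.0], 0),
]
weights, bias = train(PAST)
strong = score([0.95, 1.0], weights, bias)
weak = score([0.2, 0.0], weights, bias)
print(f"strong={strong:.2f} weak={weak:.2f}")
```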
Rank and decide if confident enough
Here, top three candidates were:
Evidence for the second two was underwhelming
• e.g., “…power of Dracula’s remains. Because of this Maxim ended up developing a split personality…”
• e.g., “Dr. Jekyll and Mr. Holmes, a novel written in 1980 by Loren D. Estleman. Sherlock Holmes solves the mystery surrounding Jekyll and Hyde.”
Hyde was the best choice and over the threshold
• Yellow indicates only somewhat above it
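The final decision can be pictured as ranking candidates by confidence and answering only if the top one clears a threshold. The confidence values, the threshold, and the placeholder names "candidate B" and "candidate C" below are invented for the sketch; the real numbers from the show are not given here.

```python
def decide(confidences, threshold):
    """Rank candidates and report whether the best one clears the threshold."""
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    best_answer, best_confidence = ranked[0]
    return ranked, best_answer, best_confidence >= threshold


# Made-up confidences for illustration only
confidences = {"Hyde": 0.62, "candidate B": 0.10, "candidate C": 0.07}
ranked, answer, buzz = decide(confidences, threshold=0.50)
print(answer, "answer" if buzz else "stay silent")
```

Raising the threshold trades fewer attempts for fewer wrong answers, which matters in a game where a wrong answer costs money.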
An even easier one for Watson (and anyone?)…
An even easier one…
Beatles people “and anytime you feel the pain, hey” “refrain, don’t carry the world upon your shoulders”
+ target: “guy” (person or noun)
Many, many occurrences of this:
• “And anytime you feel the pain, hey Jude, refrain: don't carry the world upon your shoulders”
One that got Watson…
Stylish elegance was more important
Chic is strongly associated with style and elegance.
• Also connected to year (“year to be chic”)
Class (correct answer) has a similar meaning but has many other meanings, too
Stylish elegance was more important
Panache is strongly associated with style
Also means “plume” and is associated with Cyrano de Bergerac, but even then means “style”
Stylish elegance was more important
Vera Wang?
• …didn’t graduate from design school…
• …graduation dresses…
Image from http://www.verawang.com which presumably holds copyright
Evaluation of Watson during development
Consider past games of Jeopardy
For each contestant
• How many questions did they (try to) answer?
• Number tried / number they could have tried
• How often were they right when they answered?
• Number right / number of attempts
Compare Watson’s abilities to those numbers
(Also ran actual trials, which were shown as “bloopers” last night)
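The two ratios above are simple to compute. In this sketch the function names and the contestant's counts are hypothetical; only the two definitions (tried / could have tried, right / attempts) come from the slide.

```python
def percent_answered(attempted, available):
    """Number tried / number they could have tried."""
    return attempted / available if available else 0.0


def precision(correct, attempted):
    """Number right / number of attempts."""
    return correct / attempted if attempted else 0.0


# Hypothetical contestant: 60 clues available, 30 attempted, 27 correct
attempted, available, correct = 30, 60, 27
answered = percent_answered(attempted, available)
right = precision(correct, attempted)
print(f"answered {answered:.0%} of clues, {right:.0%} of attempts correct")
```

Plotting precision against the fraction answered for many past contestants gives a cloud of points against which a system's own curve can be compared.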
Humans way better than Watson 4 years ago
Image copyright © IBM 2011
Rapid progress into “winners cloud”
Image copyright © IBM 2011
Doesn’t hurt to have a big machine
90 x IBM Power 750 servers
2880 POWER7 cores
POWER7 3.55 GHz chip
500 Gb/sec on-chip bandwidth
16 Terabytes of memory
4 Terabytes of disk
80 Teraflops
Linux-based
Image copyright © IBM 2011
Has IBM/Watson solved question answering?
No! Many open challenges
• And UMass Amherst courses that touch on these topics!
Faster computers (of course)
• That fit in a shoebox and run on a tuna fish sandwich?
Speech recognition (“the 1920s” mistake)
Machine learning at scale and speed
Improved search, particularly to find candidates
Improved natural language processing
• Moving closer to natural language understanding?
Automated reasoning
…