Vienna University of Technology
Watson Jeopardy!
A Thinking Machine
Louise Beltzung, 0350634
[email protected]
16th December, 2013
Abstract
In 2011 the US-American quiz show ‘Jeopardy!’ had its first non-human
competitor: ‘Watson’, a computer developed by IBM. ‘Watson’ had been
conceived specifically for this ultimate test. The aim was to demonstrate a
machine's ability not only to decrypt, but to understand natural language. This
seminar paper explores the topic from a philosophical perspective.
The focus is on whether the artificial intelligence of ‘Watson’ is
comparable to the human capability to understand language. An outlook on
the current use of ‘Watson’ in the medical sector is followed by
preliminary conclusions on AI technologies and humans.
1 Introduction
In 1997 IBM's research team achieved what would previously have been considered
impossible: their computer ‘Deep Blue’ won a chess match against the world champion Garry
Kasparov. For the first time a machine had beaten someone who had until then been
considered unbeatable. Nonetheless, the Deep Blue project was suspended
after this success. Subsequently, the company's artificial intelligence research department
came under substantial pressure to find a project capable of matching the
public attention ‘Deep Blue’ had enjoyed. A new idea was born: IBM's next pioneering
AI project would be a computer that could win the TV quiz show Jeopardy!.
‘Jeopardy!’ is a US-American quiz show that attracts 25 million viewers each week
(Jeopardy!, www). Compared to the chess-winning computer, however, the task at first
seemed both trivial in appearance and too complex to attain (Baker 2011). It was more than
difficult to make the project look attractive and worth the effort to researchers. They
doubted why they should spend so much time developing a highly complex architecture
only to perform a task that seemed to be handled well enough by search engines such as
Google.
‘Jeopardy, with its puns and strangely phrased clues, seemed too hard for a
computer. IBM was already building machines to answer questions, and their
performance, in speed and precision, came nowhere close to that of even a
moderately informed person. How could the next machine grow so much smarter?
And while researchers regarded the challenge as daunting, many people, Horn
knew, saw it precisely the other way. Answering questions? Didn’t Google already
do that?’ (Baker 2011)
Finally, a chief scientist was found for the team with the mission to build ‘Watson Jeopardy!’:
David Ferrucci. He had previously worked on ‘Brutus’, a computer program that wrote fiction.
‘Watson’, in turn, was to be presented to the general public in a match against two legends of
the show, but it was meant to be more than a mere search engine.
The match should culminate in showing that a machine could not only find answers to
precise questions, as if it were a machinic version of the Library of Babel of the Argentinean
author Jorge Luis Borges (1998/1944), but that a machine could be equipped in a way that
even undertones and suggestive hints could be decrypted. Therefore, Watson would need to
judge and guess; it would need some kind of ‘confidence’ in its knowledge (Baker 2011).
In this seminar paper, I focus on Watson as an example of artificial intelligence. First, I
describe the Jeopardy! match and explain the architecture of Watson. Second, I lay out the
debate on whether Watson is to be regarded as a machine that understands language,
drawing on philosophical inquiries into what human thinking means. Third, I explore the
current application of Watson within the medical sector and, finally, draw preliminary
conclusions.
2 The Match
‘Jeopardy!’ does not have a conventional quiz show format. The clues are not phrased as
questions; they are formulated as answers, and the contestant's response must in turn be
phrased as a question. Moreover, the sentences are full of clues, nuances and jokes (Irwin
and Johnson 2011). Unsurprisingly, many people are not even able to decrypt and
understand what is actually being asked for. To illustrate this, a sample clue: ‘Nicholas II was
the last ruling czar of this royal family.’ The corresponding ‘answer’, which has to be
formulated as a question, would hence be: ‘Who are the Romanovs?’¹
¹ http://www.jeopardy.com/beacontestant/contestantsearches/practicetest/
The ‘Watson Jeopardy!’ match was scheduled to take place over three consecutive days in
January 2011, against two champions of the quiz show. Ken Jennings had won more
than 2.5 million dollars in 2004 and 2005, in a series of 74 games. Brad Rutter had earned
the most in the history of the show, nearly 3.5 million dollars (Jeopardy!, www).
While each clue was read aloud, all three competitors were able to read it; Watson obtained
the clue as a text file (IBM, www). The first day ended with Watson and Brad Rutter tied, but
by the second day Watson had taken the lead with 35,734 dollars, three times more than
Mr. Rutter. Finally, after three days, Watson had won 77,147 dollars, followed by Mr. Jennings
with 24,000 dollars and Mr. Rutter with 21,600 dollars.
2.1 The Deep Natural Language Processing System of Watson
Understanding language for the purpose of winning a contest such as ‘Jeopardy!’ requires a
language processing system able to discern what lies beneath and beyond the spoken words.
The challenge, hence, is not so much to be precise as to be accurate (High 2012, 3).
The Deep Natural Language Processing (NLP) System of Watson is meant to guarantee this.
The process of decrypting a clue within a very short amount of time during the Jeopardy!
quiz was the following (High 2012, 5–6), with a schematic code sketch after the list:
1. Watson extracts the major contents of the question.
2. The computer then searches within its database for possible relevant text extracts and
generates hypotheses.
3. Reasoning algorithms compare the question and the candidate responses with regard to
whether they match. This is done along various dimensions, e.g. with a focus on temporal
or spatial aspects, or on terms and synonyms.
4. The analysis results in scores for every question-answer combination, which are then
weighted according to a model to establish the level of confidence.
5. This process is repeated until the response is found probable enough.
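To make these steps concrete, here is a minimal, illustrative sketch in Python of such a generate-and-score loop. It is emphatically not IBM's DeepQA code: the names (Hypothesis, extract_focus, answer_clue), the two toy scorers and the fixed weights are hypothetical simplifications standing in for the hundreds of evidence scorers and the trained statistical model that Watson actually combined (High 2012, 5–6).

```python
# Hypothetical sketch of a DeepQA-style pipeline as described in steps 1-5.
# All names, scorers and weights are illustrative inventions, not IBM's API.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    answer: str              # candidate answer (step 2)
    evidence: str            # supporting passage from the text base
    confidence: float = 0.0  # combined score (step 4)

STOPWORDS = {"the", "of", "this", "was", "is", "a", "an"}

def extract_focus(text: str) -> set[str]:
    """Step 1: keep only the major content words of a clue or passage."""
    return {w.strip(".,!?'\"").lower() for w in text.split()} - STOPWORDS

def generate_hypotheses(focus: set[str], corpus: dict[str, str]) -> list[Hypothesis]:
    """Step 2: every passage sharing content words with the clue becomes a hypothesis."""
    return [Hypothesis(answer, passage) for answer, passage in corpus.items()
            if focus & extract_focus(passage)]

def score(hyp: Hypothesis, focus: set[str], weights: dict[str, float]) -> float:
    """Steps 3-4: run simple 'reasoning' scorers (term overlap and passage
    length, standing in for temporal/spatial/synonym checks) and merge them
    into one confidence value using the model weights."""
    overlap = len(focus & extract_focus(hyp.evidence)) / len(focus)
    brevity = 1.0 / (1.0 + len(hyp.evidence.split()))
    return weights["overlap"] * overlap + weights["brevity"] * brevity

def answer_clue(clue: str, corpus: dict[str, str], threshold: float = 0.3) -> str | None:
    """Step 5: respond only if the best hypothesis is probable enough."""
    focus = extract_focus(clue)
    weights = {"overlap": 0.9, "brevity": 0.1}  # in Watson, learned from training games
    hypotheses = generate_hypotheses(focus, corpus)
    for h in hypotheses:
        h.confidence = score(h, focus, weights)
    best = max(hypotheses, key=lambda h: h.confidence, default=None)
    if best and best.confidence >= threshold:
        return f"Who are the {best.answer}?"  # response phrasing is simplified here
    return None

# Toy text base: one passage per candidate answer.
corpus = {"Romanovs": "Nicholas II was the last ruling czar of the Romanov royal family."}
print(answer_clue("Nicholas II was the last ruling czar of this royal family.", corpus))
```

Even in this toy form, the design choice behind steps 4 and 5 is visible: no single scorer decides; confidence emerges from a weighted combination of scores, and the system answers only when that combined confidence clears a threshold.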
3 The Intelligent Machine
The question of whether a machine can think did not emerge with the technological
developments towards artificial intelligence or with the imagination of science fiction authors
and film-makers; for centuries it has been central to philosophical inquiries into what human
thinking means and to what extent the mind could be replicated.
The French philosopher René Descartes (1596–1650) assumed that machines, no
matter how evolved, could never understand language and think as humans do,
because this happened ‘in the immaterial soul and that souls could not attach to machines’
(Irwin and Johnson 2011). Interestingly enough, the critical commentaries in the debate
after the ‘Watson Jeopardy!’ match raised this classical issue once more: the idea of a
thinking machine seems obscene, frightening or utopian, because it implies that the human
might be something one could build in its entirety. The human would lose its uniqueness,
which is more and more attached not to the body, but to the ‘more’ beyond it.
The question of how intelligent artificial machines may ever become is hence also related to
the dualist debates on body and mind, and to whether the perception humans have, the
experience they share and the feelings they hold may be explained by mechanical principles
or not. Perception, and all that relates to this primary world-relation of humans, cannot be
explained that way, argued Gottfried Wilhelm Leibniz in his Monadology. He imagines that
one could walk into a machine and open its mechanical black box; still, the parts and
mechanics would not account for what human perception finally is.
‘Imagine there were a machine whose structure produced thought, feeling, and
perception; we can conceive of its being enlarged while maintaining the same
relative proportions ·among its parts·, so that we could walk into it as we can walk
into a mill. Suppose we do walk into it; all we would find there are cogs and levers
and so on pushing one another, and never anything to account for a perception. So
perception must be sought in simple substances, not in composite things like
machines. And that is all that can be found in a simple substance — perceptions
and changes in perceptions; and those changes are all that the internal actions of
simple substances can consist in.’ (Leibniz 2004/1714, 17)
This argument was used to dispute that Watson actually understood anything. Thereby, the
focus was on what his ability to understand language actually means (Baker 2011).
3.1 The Chinese Room
The US-American philosopher John Searle deploys his argument in four points: for him,
programs such as Watson are purely formal, whereas the human mind has ‘mental contents’
(Stanford Encyclopedia of Philosophy 2004). Behind this lies the difference between the
syntax and the semantics of language.
Understanding the syntax of language means understanding its grammatical structure,
whereas semantics refers to the meaning beyond the grammar. In other words, a sentence
may be arranged in a way that is syntactically correct but makes no sense and is hence
semantically worthless; Chomsky's famous ‘Colorless green ideas sleep furiously’ is a classic
example of a grammatically well-formed but meaningless sentence.
With regard to Watson this means that he may be very well prepared and equipped to
process the syntax of language; but lacking self-consciousness of his act of decrypting the
semantics, he does not understand in a human sense.
‘Watson did not understand the questions, nor its answers, nor that some of its
answers were right and some wrong, nor that it was playing a game, nor that it
won—because it doesn't understand anything.’ (Searle 2011)
This is well explained by what is referred to as the Chinese Room thought experiment. The
idea is that a person finds themselves in a room and does not know a word of Chinese. The
assumption is that the person has a set of rules formulated in English which enables them to
correlate the Chinese symbols. This way, the person could even answer questions formulated
in Chinese without understanding anything except the syntax. Searle used this as an
argument to explain how the symbols that Watson analyzed had been ‘meaningless’ to him.
‘The bottom line can be put in the form of a four-word sentence: Symbols are not meanings.’
(Searle 2011) The conclusion would be that computers cannot understand, but they may be
perfectly well built to simulate thinking.
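The purely formal character of such rule-following can be made concrete in a few lines of code. The following toy ‘Chinese Room’ is a hypothetical sketch, in no way a model of Watson's architecture: it produces fluent replies by lookup alone, and nothing in it attaches meaning to the symbols it shuffles.

```python
# A toy 'Chinese Room': replies are produced by matching symbol shapes against
# a rule book. The rule book below is a hypothetical stand-in for Searle's
# English instructions; the program never attaches meaning to the characters.
RULE_BOOK = {
    "你好吗？": "我很好。",          # "How are you?" -> "I am fine."
    "你是谁？": "我是一个房间。",    # "Who are you?" -> "I am a room."
}

def chinese_room(symbols: str) -> str:
    # Look up the input shape and copy out the prescribed reply; the default
    # ("Please say that again.") is just another meaningless shape here.
    return RULE_BOOK.get(symbols, "请再说一遍。")

print(chinese_room("你好吗？"))  # fluent output, zero understanding
```

On Searle's view, Watson differs from such a room only in scale and sophistication, not in kind: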
‘IBM's computer was not and could not have been designed to understand. Rather, it was
designed to simulate understanding, to act as if it understood. It is an evasion to say, as some
commentators have put it, that computer understanding is different from human
understanding. Literally speaking, there is no such thing as computer understanding. There is
only simulation.’ (Searle 2011)
Still, the irony about simulations has been well pointed out by the French philosopher Jean
Baudrillard. The virtual refers to being ‘almost-there’ and ‘almost-so’, but when the ‘actual is
ignored in favour of the virtual’, simulations become ‘more real and real’ (Shields 2003, 4),
and perhaps the question as to whether they are simulation or actual is irrelevant.
This is well underlined if one pays attention to how those defending Watson's performance
and ability, interestingly, deployed their argument: he was human-like because of his
technical weaknesses.
‘Even more remarkable: His mistakes were all too human. They were the same
mistakes that we've seen humans make on Jeopardy before [...] Here is the
mistake we found most extraordinary. This was the $400 clue in the category Final
Frontiers:"From the Latin for ‘end.' This is where trains can also originate.” Watson
answered finis, the Latin word for end. He did not take note of the latter half of the
clue, or realize that the clue was asking for an English word, not a Latin one.
Mostly interesting though, Ken made the latter mistake as well. His answer was
terminus instead of terminal. Trebek still gave Ken the score, but it's remarkable
that both Ken and Watson misunderstood what the clue was asking for in the same
way. And if they misunderstood it in the same way, didn't they understand it in the
same way? Isn't Watson understanding?’ (Irwin and Johnson 2011)
This illustrates how, in the end, it might be completely futile to think about whether Watson
truly understands what he does, what he says, and what he knows and does not know. If his
human counterparts assume he is one of their own, then again, why should he not be one?
Finally, as Donna Haraway (2007) shows, the closest companions of humans might as well
be animals, cyborgs and other creatures.
4 Outlook and Conclusion
The match was won, but the Watson project did not stop there. Today, it is used for industrial
purposes, for instance within the medical sector. The ‘bionic Dr. House’ (Baker 2011) serves
to support doctors in their decision-making.
‘An advanced question-answering machine could serve as a bionic Dr. House.
Unlike humans, it could stay on top of the tens of thousands of medical research
papers published every year. And, just as in Jeopardy, it could come up with lists of
potential answers, or diagnoses, for each patient’s ills. It could also direct doctors
toward the evidence it had considered and provide its reasoning. The machine,
lacking common sense, would be far from perfect. Just as the Jeopardy computer
was certain to botch a fair number of clues, the diagnoses coming from a digital Dr.
House would sometimes be silly. So people would still run the show, but they’d be
assisted by a powerful analytical tool.’ (Baker 2011)
As a pilot, it was implemented at the Cedars-Sinai research hospital in Los Angeles, with the
server accessible via the internet. The complex language processing architecture should
make sure that questions asked differently did not lead to contradicting responses (Keene
2011). The advantage here is that, in contrast to the IBM Jeopardy! challenge, the time for
decision-making may be extended (Clark Estes 2013).
The potential of the application lies in the limits of a doctor's time and capacity to scan all
the knowledge available on illnesses. With Watson, research databases, prior cases and
other related texts could be scanned automatically to support evidence-based medicine
(Mearian 2011).
The Deep Natural Language Processing System implemented in Watson is able to judge and
understand meanings beyond the syntax; but does it get the semantics? In a narrow sense, it
does: it understands jokes, at least in the sense that it may decode them. IBM emphasizes
this with the following statement: ‘Without context, we would be lost.’ (High 2012, 4)
Still, this statement leaves space to think about whether the context and the semantics may
ever be reduced to decoding a joke, understanding a hint or a language trap. With regard to
the application within the medical sector, for example, the context might be much broader
than the one extractable from language analysis: a patient's history is more than the
prescribed medications, treatments and previous diagnoses, and doctors already know quite
well that even the spoken words of humans tend not to deliver the context needed to
understand them.
Still, the challenge Jeopardy! imposed on the technology developers also led to the greatest
strength of the system: its possible guidance towards asking the questions needed to find the
right answers.
‘One of greatest revelations about Watson is that, by using Watson to help answer
questions, you might realize that you are fundamentally asking the wrong questions.
When Watson responds to your questions, even answering you correctly, you might
realize that you need to ask other, better, and more important questions to help consider
your business problem in a whole new way. You start to think in ways that help you to
understand the competitive threats and opportunities in your marketplace that never
occurred to you before.’ (High 2012, 9)
5 References
Baker, Stephen (2011). Final Jeopardy. Man vs. Machine and the Quest to Know Everything. New York:
Houghton Mifflin Harcourt Publishing.
Borges, Jorge Luis (1998). ‘The Library of Babel.’ Collected Fictions. Trans. Andrew Hurley. New York:
Penguin. Originally published in 1944.
Clark Estes, Adam (2013). Watson, the Jeopardy-Winning Computer, Will Help Doctors Fight Cancer
with Data, (13.12.2013).
Haraway, Donna (2007). When Species Meet. Minneapolis: University of Minnesota Press.
High, Rob (2012). The Era of Cognitive Systems: An Inside Look at IBM Watson and How it Works.
IBM Redguides for Business Leaders.
IBM: Jeopardy! – Watson Match. The IBM Challenge.
http://www.youtube.com/watch?v=YLR1byL0U8M.
Irwin, William and Johnson, David Kyle (2011). Watson in Philosophical Jeopardy? Does Watson, the
Jeopardy playing computer, have a mind? http://www.psychologytoday.com/blog/platopop/201102/watson-in-philosophical-jeopardy, (13.12.2013).
Jeopardy!: Did you know… History of Jeopardy!
http://www.jeopardy.com/showguide/abouttheshow/showhistory/, (14.12.2013).
Keene, Jamie (2011). IBM's Watson supercomputer turns to treating cancer. The Verge.
http://www.theverge.com/2011/12/27/2663313/ibm-watson-cancer-treatment-cedars-sinai.
Leibniz, G. W. (2004/1714). The Principles of Philosophy known as Monadology.
Mearian, Lucas (2011). IBM's Watson Shows up for Work at Cedars-Sinai's Cancer Center.
Computerworld.
http://www.pcworld.com/article/246976/ibms_watson_shows_up_for_work_at_cedarssinais_cancer_center.html, (13.12.2013).
Searle, John (2011). Watson Doesn't Know It Won on ‘Jeopardy!’. IBM invented an ingenious program, not a computer that can think. The Wall Street Journal.
http://online.wsj.com/news/articles/SB10001424052748703407304576154313126987674,
(13.12.2013).
Shields, Rob (2003). The Virtual. New York: Routledge.
Stanford Encyclopedia of Philosophy (2004). The Chinese Room Argument. Stanford Encyclopedia of
Philosophy. http://plato.stanford.edu/entries/chinese-room/#2.1, last update 22.09.2009.