- Backpack

Aakarsh Malhotra (2011002)
Gandharv Kapoor (2011047)

Introduction
• A cognitive technology that processes
information more like a human than a
computer.
• Named after IBM founder Thomas
J. Watson.
• It is IBM’s Question Answering
(QA) project led by David Ferrucci.
• It answers questions asked in
natural language with speed,
accuracy and confidence.

Natural Language
◦ Watson can read and understand natural language,
important in analyzing unstructured data that make up
as much as 80 percent of data today.

Hypothesis Generation
◦ When asked a question, Watson relies on hypothesis
generation and evaluation to rapidly parse relevant
evidence and evaluate responses from disparate data.

Dynamic Learning
◦ Through repeated use, Watson literally gets smarter by
tracking feedback from its users and learning from both
successes and failures.



Pizza box sized.
Understands complex human language
well, including slang, metaphors and
badly framed questions, and answers
precisely with confidence.
Moves from keyword-based search to intuitive,
personalised search, with confidence-ranked responses.
Watson competed on Jeopardy! against
former winners Brad Rutter and Ken Jennings.
A short video….
http://youtu.be/7CypnrFVS1U





IBM developed DeepQA, a massively parallel
software architecture that examined natural
language content in both the clues set by
Jeopardy! and in Watson's own stored data.
DeepQA worked out what the question was asking,
then worked out some possible answers based on
the information it had stored.
It then generated a ranked list of answers, with
evidence for each of its options.
All the information had to be stored locally.
Watson wasn't allowed to connect to the Internet
during the quiz.
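
The flow described above (analyse the clue, propose candidate answers from local data, rank them with supporting evidence) can be illustrated with a toy sketch. This is not IBM's DeepQA code: the two-entry `CORPUS`, the stopword list, and the keyword-overlap score are all assumptions made up for illustration, standing in for far richer analysis.

```python
# Toy illustration only: a hypothetical local corpus mapping each
# candidate answer to passages stored about it (no Internet access).
CORPUS = {
    "Thomas J. Watson": ["Thomas J. Watson was the founder of IBM."],
    "David Ferrucci": ["David Ferrucci led the IBM question answering project."],
}

# Hypothetical stopword list used to keep only content words.
STOPWORDS = {"the", "of", "a", "an", "was", "who", "is", "this"}


def keywords(text):
    """Lowercase the text, strip punctuation, and drop stopwords."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]


def rank_answers(clue):
    """Return (answer, score, evidence) tuples, best score first."""
    clue_words = set(keywords(clue))
    ranked = []
    for answer, passages in CORPUS.items():
        # Evidence is the passage sharing the most keywords with the clue.
        best = max(passages, key=lambda p: len(clue_words & set(keywords(p))))
        overlap = len(clue_words & set(keywords(best)))
        # Assumed score: fraction of clue keywords found in the evidence.
        score = overlap / len(clue_words) if clue_words else 0.0
        ranked.append((answer, score, best))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


for answer, score, evidence in rank_answers("This man was the founder of IBM"):
    print(f"{answer}: {score:.2f} ({evidence})")
```

Each result carries its own evidence passage, mirroring the idea of a ranked list in which every option is backed by supporting text.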



Articles
Wikipedia
Internal organization documents
◦ For example, a doctor’s notes and recorded experience.

Encyclopaedias, Dictionaries, Thesauri

Watson is composed of:

Hardware
o Cluster of ninety IBM Power 750 servers.
o Each server uses 3.5 GHz POWER7 eight-core processors,
with four threads per core.
o In total, the system has 2,880 POWER7 processor cores and
16 TB of RAM.

Software
o Watson uses IBM's DeepQA software and the Apache UIMA
(Unstructured Information Management Architecture)
framework.
o The system is written in various languages, including Java, C++
and Prolog.
o It runs on the SUSE Linux Enterprise Server 11 operating system,
using the Apache Hadoop framework to provide distributed
computing.
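
The core count quoted above can be checked with a little arithmetic. The sketch below assumes four eight-core processors per server; that figure is not stated on the slide but is what the totals imply, since 2,880 / (90 × 8) = 4. The thread total is derived here and is likewise not a figure from the slide.

```python
# Figures from the slide.
SERVERS = 90
CORES_PER_PROCESSOR = 8
THREADS_PER_CORE = 4
QUOTED_TOTAL_CORES = 2880

# Assumption: processors per server, inferred from the quoted total.
PROCESSORS_PER_SERVER = QUOTED_TOTAL_CORES // (SERVERS * CORES_PER_PROCESSOR)

total_cores = SERVERS * PROCESSORS_PER_SERVER * CORES_PER_PROCESSOR
total_threads = total_cores * THREADS_PER_CORE

print(PROCESSORS_PER_SERVER)  # processors per server implied by the totals
print(total_cores)            # should match the quoted core count
print(total_threads)          # hardware threads across the cluster
```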


The components of the DeepQA pipeline each produce
scoring features for computing confidence. Hierarchical machine
learning is used to combine these features.
After searching with keywords from the question, candidate
answers are generated. Those with very low
confidence are rejected. Those with high
confidence are moved to the final merging stage.
Medium-confidence results are checked in a soft
filtering phase based on machine learning on
training data.
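
The three-way split by confidence can be sketched as follows. The thresholds `low` and `high` and the default `soft_filter` rule are hypothetical placeholders; in the real system the soft filter is a model learned from training data, not a fixed cut-off.

```python
def triage(candidates, low=0.2, high=0.8,
           soft_filter=lambda answer, conf: conf >= 0.5):
    """Split (answer, confidence) pairs into merged and rejected lists.

    low, high and soft_filter are illustrative assumptions, not
    values used by Watson.
    """
    merged, rejected = [], []
    for answer, conf in candidates:
        if conf < low:
            # Very low confidence: rejected outright.
            rejected.append(answer)
        elif conf >= high:
            # High confidence: goes straight to final merging.
            merged.append(answer)
        elif soft_filter(answer, conf):
            # Medium confidence that passes the soft filter.
            merged.append(answer)
        else:
            # Medium confidence that fails the soft filter.
            rejected.append(answer)
    return merged, rejected


merged, rejected = triage([("A", 0.9), ("B", 0.1), ("C", 0.6), ("D", 0.3)])
print(merged)    # ['A', 'C']
print(rejected)  # ['B', 'D']
```

Passing the soft filter in as a function keeps the sketch honest about the part it does not model: any trained classifier could be substituted without changing the triage logic.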




http://craigrhinehart.com/2011/01/17/10-things-you-need-to-know-about-the-technology-behind-watson/
http://en.wikipedia.org/wiki/Watson_(computer)
http://www.ibm.com/smarterplanet/us/en/ibmwatson
http://www.aaai.org/Magazine/Watson/watson.php
Thanks!!