How does Watson work?

Watson represents an innovation in data analysis computing, but what does that
mean? It means Watson can parse natural language better than anything else out
there. Think of it this way: in the system we use today, the user formulates a
question, pulls out two or three keywords, and sends them to a search engine.
The search engine finds documents that contain those keywords and sends them
back to the user, who reads the documents and analyzes the results. All of the
real "work" is done by the user. A better system is one in which the user asks
a natural language question and an "expert" understands the question, produces
an answer with supporting evidence, analyzes the results, states how confident
it is in the answer, and returns its findings to the user. This "expert" role
used to be filled by librarians. Now Watson can do the same job in less time
and with a much larger set of resources than any human librarian could.
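
To make the contrast concrete, here is a minimal sketch in Python. It is not Watson's code: the three sample documents, the `keyword_search` function, and the `ExpertAnswer` record are all made up for illustration. The first half hands back whole documents for the user to read, while the second half shows the shape of what an "expert" system would return instead.

```python
from dataclasses import dataclass, field

# Tiny made-up document collection, for illustration only.
DOCS = {
    "doc1": "Bordeaux is a port city in southwestern France",
    "doc2": "France borders Spain along the Pyrenees",
    "doc3": "Madrid is the capital of Spain",
}


def keyword_search(keywords):
    """Return the ids of documents that contain every keyword.

    This is the "today" model: the engine only matches words, and the
    user still has to read the documents and work out the answer.
    """
    wanted = [k.lower() for k in keywords]
    return [
        doc_id
        for doc_id, text in DOCS.items()
        if all(k in text.lower().split() for k in wanted)
    ]


@dataclass
class ExpertAnswer:
    """Hypothetical shape of an "expert" response: answer, confidence, evidence."""
    answer: str
    confidence: float                      # 0.0 (no idea) to 1.0 (certain)
    evidence: list = field(default_factory=list)


# Keyword search returns documents, not answers.
print(keyword_search(["Bordeaux", "France"]))

# An expert-style response to "What country is Bordeaux in?" (values invented).
print(ExpertAnswer("France", 0.9, ["doc1"]))
```

The point of the second record is that the reading and judging has moved from the user to the system.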

This type of computing is called Automatic Open-Domain Question Answering. The
basic expectations for a system like this are that, given a natural language
question and a set of resources, it should return precise answers, accurate
confidence, justifications, and a fast response. Before Watson, the best a
program could manage was either the first two of these or the last, but rarely
all four. Watson can do all four, and this is how.

As soon as the prompt is chosen and the clue revealed, Watson receives the clue
as text. While Alex is reading the clue aloud (approximately 3 seconds), Watson
parses it, looking for the type of answer expected, things to search for, and
relationships to earlier clues, especially those in the same category. It then
searches through its reference materials, which are sorted into
subject/verb/object style syntactic frames, semantic frames, and pure data. It
retrieves candidate answers, analyzes them, and synthesizes a final list of
answers that match the type hinted at in the question. Next it calculates its
confidence in each answer based on keyword matching, temporal reasoning (if the
king of Spain lived for so many years and died on such a date, he was born on
such a date), statistical paraphrasing ("fluid" and "liquid" are technically
different, but are commonly used interchangeably), and geospatial reasoning
(Bordeaux is in France). If it broke the question into smaller pieces, it may
repeat this process for each piece. Finally it decides on the most
statistically probable answer and buzzes in as the buzzer is turned on, moments
after Alex finishes speaking.
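
The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not IBM's implementation: the `Candidate` record, the scorer weights in `WEIGHTS`, the `BUZZ_THRESHOLD` cutoff, and all of the evidence scores are assumptions invented for the example. The sketch filters candidates by the expected answer type, combines the four kinds of evidence into one confidence number, and picks the most probable answer. A small helper shows the "king of Spain" style of temporal reasoning.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer: str
    answer_type: str   # e.g. "country", "city", "person"
    evidence: dict     # scorer name -> score between 0.0 and 1.0


# Assumed weights for each kind of evidence; the real weighting is not
# described in the text above.
WEIGHTS = {"keyword": 0.4, "temporal": 0.2, "paraphrase": 0.2, "geospatial": 0.2}

# Assumed cutoff for deciding whether an answer is worth buzzing in on.
BUZZ_THRESHOLD = 0.5


def confidence(candidate):
    """Weighted sum of the evidence scores; missing scorers count as 0."""
    return sum(w * candidate.evidence.get(name, 0.0) for name, w in WEIGHTS.items())


def best_answer(candidates, expected_type):
    """Keep candidates of the expected type and return the most confident one."""
    typed = [c for c in candidates if c.answer_type == expected_type]
    if not typed:
        return None, 0.0
    top = max(typed, key=confidence)
    return top.answer, confidence(top)


def should_buzz(conf):
    return conf >= BUZZ_THRESHOLD


def possible_birth_years(death_year, age_at_death):
    """Temporal reasoning: age and death year pin the birth year down to two
    options, depending on whether the birthday had passed in the final year."""
    return (death_year - age_at_death - 1, death_year - age_at_death)


# Invented candidates for a clue whose answer should be a country.
CANDIDATES = [
    Candidate("France", "country", {"keyword": 0.9, "geospatial": 1.0}),
    Candidate("Spain", "country", {"keyword": 0.3}),
    Candidate("Bordeaux", "city", {"keyword": 1.0}),   # wrong type, filtered out
]

answer, conf = best_answer(CANDIDATES, "country")
print(answer, round(conf, 2), "buzz" if should_buzz(conf) else "stay quiet")
print(possible_birth_years(1700, 38))
```

Note that "Bordeaux" has the strongest keyword match but is discarded because it is the wrong type of answer, which is why the type check comes before the confidence calculation.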

How does Watson accomplish this feat in just three seconds? On a single
processor, this task might take anywhere between 20 minutes and 3 hours, but
IBM has again found a solution. Using a method called asynchronous scale-out,
these jobs are doled out across an entire computing cluster made up of 90
Power 750 servers with 2,880 POWER7 processor cores and 15 terabytes of RAM,
capable of operating at 80 teraFLOPS. On this cluster, the same process takes
less than 3 seconds and is done in plenty of time for Watson to activate its
buzzer. Between the software and the hardware of Watson, it's easy to see how
advanced this technology has become.
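
A quick back-of-the-envelope check shows why spreading the work out matters. The Python sketch below assumes perfect linear scaling (no communication overhead) and treats "a single processor" as one core; neither assumption comes from IBM, so read the results as rough orders of magnitude. The second half is a toy scatter/gather loop: jobs are handed to a pool of workers and results are collected in whatever order they finish. The `score_candidate` function is a hypothetical stand-in for the real scoring work.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

CORES = 2880  # POWER7 cores in the cluster, from the figures above


def ideal_parallel_seconds(serial_seconds, cores=CORES):
    """Run time under perfect linear scaling (an idealized assumption)."""
    return serial_seconds / cores


for label, serial in [("20 minutes", 20 * 60), ("2 hours", 2 * 3600), ("3 hours", 3 * 3600)]:
    print(f"{label} on one core -> {ideal_parallel_seconds(serial):.2f} s on {CORES} cores")


def score_candidate(candidate):
    """Hypothetical stand-in for an expensive evidence-scoring job."""
    return candidate, len(candidate)


def scatter_gather(candidates, workers=4):
    """Hand each job to a worker pool and gather results as they complete."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(score_candidate, c) for c in candidates]
        for future in as_completed(futures):   # completion order, not submit order
            name, score = future.result()
            results[name] = score
    return results


print(scatter_gather(["France", "Spain", "Bordeaux"]))
```

Under these assumptions, a 20-minute job drops to well under a second and a 2-hour job to 2.5 seconds. The 3-hour case comes out slightly above 3 seconds, a reminder that the idealized model is only a rough guide.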

Click here for part 3.
