[ExI] Watson On Jeopardy

Samantha Atkins sjatkins at mac.com
Wed Feb 16 23:48:03 UTC 2011

Thanks for the excellent article, Eugen.  Watson certainly is not 
simplistic, and some of its capabilities are ones I did not know we had 
a good enough handle on.  Of course, working with such beefy hardware is 
a big part of its real-time success.  Until the combined cost of the 
algorithms and hardware drops by orders of magnitude, I can't see such 
capabilities making much of a difference more broadly any time soon.

- s

On 02/16/2011 05:10 AM, Eugen Leitl wrote:
> On Wed, Feb 16, 2011 at 07:55:56AM -0400, Darren Greer wrote:
>> Annihilated them Spike. App. 35,000 for Watson, 4000 for Jennings and 10000
>> for Rutter.  He got final Jeopardy wrong but was parsimonious with his
>> wager  -- just 900 odd dollars.  Alex Trebek laughed and called him a
>> 'sneak'  because of the clever wager.  The category was which U.S. city has
>> an  airport named after a war hero and a WWII battle. Watson said Toronto. I
>> got  a good laugh. I didn't know we'd been annexed.   Another
>> interesting detail. Ratings for Jeopardy have soared into the
>> stratosphere because of Watson. It moved into the number two spot in TV
>> land behind a Charlie Sheen sitcom last night.
> Not to mention bringing Big Blue back into the limelight.
> By the way, Watson is not nearly as dumb (and far more usable) than I
> thought. According to
> http://www.hpcwire.com/features/Must-See-TV-IBM-Watson-Heads-for-Jeopardy-Showdown-115684499.html?viewAll=y
> February 09, 2011
> Must See TV: IBM Watson Heads for Jeopardy Showdown
> Michael Feldman, HPCwire Editor
> Next week the IBM supercomputer known as "Watson" will take on two of the
> most accomplished Jeopardy players of all time, Ken Jennings and Brad Rutter,
> in a three-game match starting on February 14. If Watson manages to best the
> humans, it will represent the most important advance in machine intelligence
> since IBM's "Deep Blue" beat chess grandmaster Garry Kasparov in 1997. But
> this time around, the company also plans to make a business case for the
> technology. Trivial pursuit this is not.
> And impressive technology it is. On the hardware side, Watson comprises
> 90 Power 750 servers, 16 TB of memory and 4 TB of disk storage, all housed in
> a relatively compact ten racks. The 750 is IBM's elite Power7-based server
> targeted for high-end enterprise analytics. (The Power 755 is geared toward
> high performance technical computing and differs only marginally in CPU
> speed, memory capacity, and storage options.) Although the enterprise version
> can be ordered with 1 to 4 sockets of 6-core or 8-core Power7 chips, Watson
> is maxed out with the 4-socket, 8-core configuration using the top bin 3.55
> GHz processors.
> The 360 Power7 chips that make up Watson's brain represent IBM's best and
> brightest processor technology. Each Power7 is capable of over 500 GB/second
> of aggregate bandwidth, making it particularly adept at manipulating data at
> high speeds. FLOPS-wise, a 3.55 GHz Power7 delivers 218 Linpack gigaflops.
> For comparison, the POWER2 SC processor, which was the chip that powered
> cyber-chessmaster Deep Blue, managed a paltry 0.48 gigaflops, with the whole
> machine delivering a mere 11.4 Linpack gigaflops.
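The per-chip and whole-system figures quoted above imply a gap of several hundredfold per chip, and several thousandfold overall; a quick arithmetic check using only the article's numbers:

```python
# Rough comparison of the Linpack figures quoted in the article above.
power7_gflops = 218.0      # one 3.55 GHz Power7 chip
power2_sc_gflops = 0.48    # one POWER2 SC chip (Deep Blue era)

per_chip_ratio = power7_gflops / power2_sc_gflops
print(f"Per-chip speedup: ~{per_chip_ratio:.0f}x")

# Watson's 90 servers x 4 sockets = 360 chips, versus Deep Blue's
# 11.4 gigaflops for the whole machine.
watson_total_gflops = 360 * power7_gflops
deep_blue_gflops = 11.4
system_ratio = watson_total_gflops / deep_blue_gflops
print(f"Whole-system ratio: ~{system_ratio:.0f}x")
```

So one Power7 chip alone delivers roughly 450 times the Linpack throughput of Deep Blue's processor, and the full Watson system is several thousand times faster than the machine that beat Kasparov.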
> But FLOPS are not the real story here. Watson's question-answering software
> presumably makes little use of floating-point number crunching. To deal with
> the game scenario, the system had to be endowed with a rather advanced
> version of natural language processing. But according to David Ferrucci,
> principal investigator for the project, it goes far beyond language smarts.
> The software system, called DeepQA, also incorporates machine learning,
> knowledge representation, and deep analytics.
> Even so, the whole application rests on first understanding the Jeopardy
> clues, which, because they employ colloquialisms and often obscure
> references, can be challenging even for humans. That's why this is such a
> good test case for natural language processing. Ferrucci says the ability to
> understand language is destined to become a very important aspect of
> computers. "It has to be that way," he says. "We just can't imagine a future
> without it."
> But it's the analysis component that we associate with real "intelligence."
> The approach here reflects the open domain nature of the problem. According
> to Ferrucci, it wouldn't have made sense to simply construct a database
> corresponding to possible Jeopardy clues. Such a model would have supported
> only a small fraction of the possible topics available to Jeopardy. Rather,
> their approach was to use "as is" information sources -- encyclopedias,
> dictionaries, thesauri, plays, books, etc. -- and make the correlations
> dynamically.
> The trick of course is to do all the processing in real-time. Contestants, at
> least the successful ones, need to provide an answer in just a few seconds.
> When the software was run on a lone 2.6 GHz CPU, it took around 2 hours to
> process a typical Jeopardy clue -- not a very practical implementation. But
> when they parallelized the algorithms across the 2,880-core Watson, they were
> able to cut the processing time from a couple of hours to between 2 and 6
> seconds.
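The quoted scaling (about 2 hours on one core, 2 to 6 seconds on 2,880 cores) is close to, but short of, perfectly linear; an Amdahl's-law sketch shows how even a tiny serial fraction accounts for the spread. The serial fraction below is an illustrative assumption, not a published IBM figure:

```python
# Amdahl's-law sketch of the parallelization figures quoted above.
# serial_fraction is an illustrative assumption, not an IBM number.
def parallel_time(serial_time_s, cores, serial_fraction):
    """Predicted runtime when the parallelizable part scales across cores."""
    return serial_time_s * (serial_fraction + (1 - serial_fraction) / cores)

one_core_s = 2 * 3600  # ~2 hours per clue on a lone core

ideal = parallel_time(one_core_s, 2880, 0.0)         # perfectly parallel
realistic = parallel_time(one_core_s, 2880, 0.0005)  # 0.05% serial work
print(f"ideal: {ideal:.1f} s, with 0.05% serial: {realistic:.1f} s")
```

Perfect scaling gives 7200/2880 = 2.5 seconds; assuming just 0.05 percent of the work is serial pushes that to about 6 seconds, matching the 2-to-6-second range the article reports.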
> Even at that, Watson doesn't just spit out the answers. It forms hypotheses
> based on the evidence it finds and scores them at various confidence levels.
> Watson is programmed not to buzz in until it reaches a confidence of at least
> 50 percent, although this parameter can be self-adjusted depending on the
> game situation.
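The buzz-in rule described above amounts to a threshold test over scored hypotheses, with the threshold itself adjustable to the game situation. A minimal sketch of that logic; the function and parameter names here are hypothetical, not DeepQA's actual API:

```python
# Hedged sketch of the buzz-in rule described above: buzz only when the
# best-scored hypothesis clears a confidence threshold, which can shift
# with the game situation. All names are hypothetical.
def should_buzz(hypotheses, base_threshold=0.50, trailing_badly=False):
    """hypotheses: list of (answer, confidence) pairs, confidence in [0, 1].
    Returns the answer to give, or None to stay silent."""
    if not hypotheses:
        return None
    answer, confidence = max(hypotheses, key=lambda h: h[1])
    # A player far behind may accept more risk by lowering the bar.
    threshold = base_threshold - 0.10 if trailing_badly else base_threshold
    return answer if confidence >= threshold else None

print(should_buzz([("Toronto", 0.32), ("Chicago", 0.97)]))  # Chicago
print(should_buzz([("Toronto", 0.32)]))                     # None
```

The self-adjusting threshold is what lets the same hypothesis set produce a buzz late in a losing game but silence when the machine is safely ahead.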
> To accomplish all this, DeepQA employs an ensemble of algorithms -- about a
> million lines of code -- to gather and score the evidence. These include
> temporal reasoning algorithms to correlate times with events, statistical
> paraphrasing algorithms to evaluate semantic context, and geospatial
> reasoning to correlate locations.
> It can also dynamically form associations, both in training and at game time,
> to connect disparate ideas. For example it can learn that inventors can
> patent information or that officials can submit resignations. Watson also
> shifts the weight it assigns to different algorithms based on which ones are
> delivering the more accurate correlations. This aspect of machine learning
> allows Watson to get "smarter" the more it plays the game.
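The re-weighting behavior described above resembles a simple online update over an ensemble of evidence scorers: scorers whose judgments track actual outcomes gain influence. A minimal sketch under that assumption, not DeepQA's actual learning code:

```python
# Minimal sketch of weighted evidence combination with online re-weighting:
# scorers that agree with outcomes gain weight. Illustrative only.
def combine(scores, weights):
    """Weighted average of per-scorer confidence scores."""
    total = sum(weights.values())
    return sum(weights[name] * s for name, s in scores.items()) / total

def reweight(weights, scores, was_correct, lr=0.1):
    """Nudge each scorer's weight up or down depending on whether it
    agreed with the outcome (score > 0.5 counts as a 'yes' vote)."""
    for name, s in scores.items():
        agreed = (s > 0.5) == was_correct
        weights[name] *= (1 + lr) if agreed else (1 - lr)
    return weights

weights = {"temporal": 1.0, "paraphrase": 1.0, "geospatial": 1.0}
scores = {"temporal": 0.9, "paraphrase": 0.8, "geospatial": 0.2}
print(combine(scores, weights))
weights = reweight(weights, scores, was_correct=True)
print(combine(scores, weights))  # geospatial now counts for less
```

After one correct answer that the temporal and paraphrase scorers endorsed, the combined confidence for the same evidence rises, which is the "getting smarter with play" effect the article describes.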
> The DeepQA programmers have also been refining the algorithms themselves over
> the past several years. In 2007, Watson could only answer a small fraction of
> Jeopardy clues with reasonable confidence and even at that, was only correct
> 47 percent of the time. When forced to answer the majority of the clues, like
> a grand champion would, it could only answer 15 percent correctly. By IBM's
> own admission, Watson was playing "terrible." The highest performing Jeopardy
> grand champions, like Jennings and Rutter, typically buzz in on 70 to 80
> percent of the entries and give the correct answer 85 to 95 percent of the time.
> By 2010 Watson started playing at that level. Ferrucci says that while the
> system can't buzz in on every question, it can now answer the vast majority
> of them in competitive time. "We can compete with grand champions in terms of
> precision, in terms of confidence, and in terms of speed," he says.
> In dozens of practice rounds against former Jeopardy champs, the computer was
> beating the humans with a 65 percent win rate. Watson also prevailed in a
> 15-question round against Jennings and Rutter in early January of this year.
> See the performance below.
> None of this is a guarantee that Watson will prevail next week. But even if
> the machine just makes a decent showing, IBM will have pulled off quite
> possibly the best product placement in television history. Open domain
> question answering is not only one of the Holy Grails of artificial
> intelligence but has enormous potential for commercial applications. In areas
> as disparate as healthcare, tech support, business intelligence, security and
> finance, this type of platform could change those businesses irrevocably.
> John Kelly, senior vice president and director of IBM Research, boasts,
> "We're going to revolutionize industries at a level that has never been done
> before."
> In the case of healthcare, it's not a huge leap to imagine "expert" question
> answering systems helping doctors with medical diagnosis. A differential
> diagnosis is not much different from what Watson does when it analyzes a
> Jeopardy clue. Before it replaces Dr. House, though, the machine will have to
> prove itself in the game show arena.
> If Jennings and Rutter defeat the supercomputer this time around, IBM will
> almost certainly ask for a rematch, as it did when Deep Blue initially lost
> its first chess match with Kasparov in 1996. The engineers will keep stroking
> the code and retraining the computer until Watson is truly unbeatable.
> Eventually the machine will prevail.
> _______________________________________________
> extropy-chat mailing list
> extropy-chat at lists.extropy.org
> http://lists.extropy.org/mailman/listinfo.cgi/extropy-chat
