[ExI] Watson On Jeopardy.

Richard Loosemore rpwl at lightlink.com
Thu Feb 17 12:46:15 UTC 2011


Okay, first:  although I understand your position as an Agilista, and 
your earnest desire to hear about concrete code rather than theory ("I 
value working code over big ideas"), you must surely acknowledge that in 
some areas of scientific research and technological development, it is 
important to work out the theory, or the design, before rushing ahead to 
the code-writing stage.

That is not to say that I don't write code (I spent several years as a 
software developer, and I continue to write code), but that I believe 
the problem of building an AGI is, at this point in time, a matter of 
getting the theory right.  We have had over fifty years of AI people 
rushing into programs without seriously and comprehensively addressing 
the underlying issues.  Perhaps you feel that there are really not that 
many underlying issues to be dealt with, but after having worked in this 
field, on and off, for thirty years, it is my position that we need deep 
understanding above all.  Maxwell's equations, remember, were dismissed 
as useless for anything -- just idle theorizing -- for quite a few years 
after Maxwell came up with them.  Not everything that is of value *must* 
be accompanied by immediate code that solves a problem.

Now, with regard to the papers that I have written, I should explain 
that they are driven by the very specific approach described in the 
complex systems paper.  That paper described a methodological imperative:  if 
intelligent systems are complex (in the "complex systems" sense, which 
is not the "complicated systems", aka space-shuttle-like systems, 
sense), then we are in a peculiar situation that (I claim) has to be 
confronted in a very particular way.  If it is not confronted in that 
particular way, we will likely run around in circles getting nowhere -- 
and it is alarming that the precise way in which this running around in 
circles would happen bears a remarkable resemblance to what has been 
happening in AI for fifty years.  So, if my reasoning in that paper is 
correct, then the only sensible way to build an AGI is to do some very 
serious theoretical and tool-building work first.
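
(For readers who want a concrete handle on that distinction, the usual 
textbook illustration -- nothing from my own papers, just a made-up 
Python toy -- is a system whose local rule is trivially simple but whose 
global behaviour cannot be read off from the rule:

# The logistic map: a one-line update rule, nothing hidden, yet the
# long-run behaviour is effectively unpredictable from the rule itself.
def logistic(x, r=3.9):            # r = 3.9 puts the map in its chaotic regime
    return r * x * (1.0 - x)

x1, x2 = 0.500000, 0.500001        # two nearly identical starting points
for t in range(60):
    x1, x2 = logistic(x1), logistic(x2)
print(round(x1, 4), round(x2, 4))  # the trajectories have completely diverged

A "complicated" system like the space shuttle has vastly more parts, but 
each part is engineered so that its behaviour is analyzable and 
predictable from its specification; that is the opposite of "complex" in 
the sense I mean here.)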

And part of that theoretical work involves a detailed understanding of 
cognitive psychology AND computer science.  Not just a superficial 
acquaintance with a few psychology ideas, which many people have, but an 
appreciation for the enormous complexity of cog psych, and an 
understanding of how people in that field go about their research 
(because their protocols are very different from those of AI or computer 
science), and a pretty good grasp of the history of psychology (because 
there have been many different schools of thought, and some of them, 
like Behaviorism, contain extremely valuable and subtle lessons).

With regard to the specific comments I made below about McClelland and 
Rumelhart, what is going on there is that these guys (and several 
others) got to a point where the theories in cognitive psychology were 
making no sense, and so they started thinking in a new way, to try to 
solve the problem.  I can summarize it as "weak constraint satisfaction" 
or "neurally inspired" but, alas, these things can be interpreted in 
shallow ways that omit the background context ... and it is the 
background context that is the most important part of it.  In a 
nutshell, a lot of cognitive psychology makes a lot more sense if it can be 
re-cast in "constraint" terms.

The problem, though, is that the folks who started the PDP (aka 
connectionist, neural net) revolution in the 1980s could only express 
this new set of ideas in neural terms.  They made some progress, but then, 
just as the train appeared to be gathering momentum, it ran out of steam. 
There were some problems with their approach that could not be solved in 
a principled way.  They had hoped, at the beginning, that they were 
building a new foundation for cognitive psychology, but something went 
wrong.

What I have done is to think hard about why that collapse occurred, and 
to come to an understanding about how to get around it.  The answer has 
to do with building two distinct classes of constraint systems:  either 
non-complex, or complex (side note:  I will have to refer you to other 
texts to get the gist of what I mean by that... see my 2007 paper on the 
subject).  The whole PDP/connectionist revolution was predicated on a 
non-complex approach.  I have, in essence, diagnosed that as the 
problem.  Fixing that problem is hard, but that is what I am working on.

Unfortunately for you -- wanting to know what is going on with this 
project -- I have been studiously unprolific about publishing papers. 
So at this stage of the game all I can do is send you to the papers I 
have written and ask you to fill in the gaps from your knowledge of 
cognitive psychology, AI and complex systems.

Finally, bear in mind that none of this is relevant to the question of 
whether other systems, like Watson, are a real advance or just a symptom 
of a malaise.  John Clark has been ranting at me (and others) for more 
than five years now, so when he pulls the old bait-and-switch trick 
("Well, if you think XYZ is flawed, let's see YOUR stinkin' AI then!!") 
I just smile and tell him to go read my papers.  So we only got into 
this discussion because of that:  it has nothing to do with delivering 
critiques of other systems, whether they contain a million lines of code 
or not.  :-)   Watson still is a sleight of hand, IMO, whether my theory 
sucks or not.  ;-)



Richard Loosemore



Kelly Anderson wrote:
> On Wed, Feb 16, 2011 at 6:13 PM, Richard Loosemore <rpwl at lightlink.com> wrote:
>> Kelly Anderson wrote:
>>> Show me the beef!
>> So demanding, some people.  ;-)
> 
> I wouldn't be so demanding if you acknowledged the good work of
> others, even if it is just a "parlor trick".
> 
>> If you have read McClelland and Rumelhart's two-volume "Parallel Distributed
>> Processing",
> 
> I have read volume 1 (a long time ago), but not volume 2.
> 
>> and if you have then read my papers, and if you are still so
>> much in the dark that the only thing you can say is "I haven't seen anything
>> in your papers that rise to the level of computer science" then, well...
> 
> Your papers talk the talk, but they don't walk the walk as far as I
> can tell. There is not a single instance where you say, "And using
> this technique we can distinguish pictures of cats from pictures of
> dogs" or "This method leads to differentiating between the works of
> Bach and Mozart." Or even the ability to answer the question "What do
> grasshoppers eat?"
> 
>> (And, in any case, my answer to John Clark was as facetious as his question
>> was silly.)
> 
> Sidebar: I have found that humor and facetiousness don't work well on
> mailing lists.
> 
>> At this stage, what you can get is a general picture of the background
>> theory.  That is readily obtainable if you have a good knowledge of (a)
>> computer science,
> 
> Check.
> 
>> (b) cognitive psychology
> 
> Eh, so so.
> 
>> and (c) complex systems.
> 
> Like the space shuttle?
> 
>> It also
>> helps, as I say, to be familiar with what was going on in those PDP books.
> 
> Like I said, I read the first volume of that book a long time ago (I
> think I have a copy downstairs), nevertheless, I have a decent grasp
> of neural networks, relaxation, simulated annealing, pattern
> recognition, multidimensional search spaces, statistical and Bayesian
> approaches, computer vision, character recognition (published), search
> trees in traditional AI and massively parallel architectures. I'm not
> entirely unaware of various theories of philosophy and religion. I am
> weak in natural language processing, traditional databases, and sound
> processing.
> 
>> Do you have a fairly detailed knowledge of all three of these areas?
> 
> Fair to middling, although my knowledge is a little outdated. I'm not
> tremendously worried about that since I used a text book written in
> the late 1950s when I took pattern recognition in 1986 and you refer
> to a book published in the late 1980s... I kind of get the idea that
> progress is fairly slow in these areas except that now we have better
> hardware on which to run the old algorithms.
> 
>> Do you understand where McClelland and Rumelhart were coming from when they
>> talked about the relaxation of weak constraints, and about how a lot of
>> cognition seemed to make more sense when couched in those terms?
> 
> Yes, this makes a lot of sense. I don't see how it relates directly to
> your work. I actually like what you have to say about short vs. long
> term memory, I think that's a useful way of looking at things. The
> short term or "working" memory that uses symbols vs the long term
> memory that works in a more subconscious way is very interesting stuff
> to ponder.
> 
>> Do you
>> also follow the line of reasoning that interprets M & R's subsequent pursuit
>> of non-complex models as a mistake?
> 
> Afraid you lose me here.
> 
>> And the implication that there is a
>> class of systems that are as yet unexplored, doing what they did but using a
>> complex approach?
> 
> Still lost, but willing to listen.
> 
>> Put all these pieces together and we have the basis for a dialog.
>>
>> But ...  demanding a finished AGI as an essential precondition for behaving
>> in a mature way toward the work I have already published...?  I don't think
>> so.  :-)
> 
> If I have treated you in an immature way, I apologize. I just think
> that classifying four years of work and millions of dollars' worth of
> research -- 10,000,000 lines of actually working code -- as "trivial"
> is not a strong position to come from.
> 
> I am an Agilista. I value working code over big ideas. So while I
> acknowledge that you have some interesting big ideas, it escapes me
> how you are going to bridge the gap to achieve a notable result. Maybe
> it is clear to you, but if it is, you should publish something a
> little more concrete, IMHO.





