[ExI] Computational resources needed for AGI...
Richard Loosemore
rpwl at lightlink.com
Sun Feb 6 14:33:10 UTC 2011
Kelly Anderson wrote:
> On Sat, Feb 5, 2011 at 9:50 AM, Richard Loosemore <rpwl at lightlink.com> wrote:
>> Kelly Anderson wrote:
>>> On Wed, Feb 2, 2011 at 9:56 AM, Richard Loosemore <rpwl at lightlink.com>
>>> wrote:
>>>> Kelly Anderson wrote:
>>> I doubt it, only in the sense that we don't have anything with near
>>> the raw computational power necessary yet. Unless you have really
>>> compelling evidence that you can get human-like results without
>>> human-like processing power, this seems like a somewhat empty claim.
>> Over the last five years or so, I have occasionally replied to this question
>> with some back of the envelope calculations to back up the claim. At some
>> point I will sit down and do the job more fully, and publish it, but in the
>> mean time here is your homework assignment for the week.... ;-)
>>
>> There are approximately one million cortical columns in the brain. If each
>> of these is designed to host one "concept" at a time, but with at most half
>> of them hosting at any given moment, this gives (roughly) half a million
>> active concepts.
>
> I am not willing to concede that this is how it works. I tend to
> gravitate towards a more holographic view, i.e. that the "concept" is
> distributed across tens of thousands of cortical columns, and that the
> combination of triggers to a group of cortical columns is what causes
> the overall "concept" to emerge. This is a general idea, and may not
> apply specifically to cortical columns, but I think you get the idea.
> The reason for belief in the holographic model is that brain damage
> doesn't knock out all memory or ability to process if only part of the
> brain is damaged. This neat one to one mapping of concept to neuron
> has been debunked to my satisfaction some time ago.
The architecture I outlined above has a long pedigree (its main ancestor
being the parallel distributed processing ideas of Rumelhart, McClelland
et al.), so it is fine to suggest a different architecture, but there
does have to be some motivation for whatever suggestion is made about
the hardware-to-concept mapping.
That said, there are questions. If something is distributed, is it (a)
the dormant, generic "concepts" in long term memory, or (b) the active,
instance "concepts" of working memory? Very big difference. I believe
there are reasons to talk about the long term memory concepts as being
partially distributed, but that would not apply to the instances in
working memory..... and in the above architecture I was talking only
about the latter.
If you try to push the idea that the instance atoms (my term for the
active concepts) are in some sense "holographic" or distributed, you get
into all sorts of theoretical and practical snarls.
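To make the distinction concrete, here is a toy Python sketch. The
class names and fields are purely illustrative placeholders of mine,
not the actual formalism; the only point is that a dormant long-term
memory concept and an active working-memory instance are different
kinds of object.

from dataclasses import dataclass, field

@dataclass
class GenericConcept:
    """Dormant long-term-memory entry (the part that might plausibly
    be partially distributed)."""
    label: str
    associations: dict = field(default_factory=dict)

@dataclass
class InstanceAtom:
    """Active working-memory token: one unit hosting one concept,
    bound to the current situation."""
    concept: GenericConcept
    bindings: dict = field(default_factory=dict)
    activation: float = 1.0

cat = GenericConcept("cat")
this_cat = InstanceAtom(cat, bindings={"location": "on the mat"})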
I published a paper with Trevor Harley last year in which we analyzed a
paper by Quiroga et al. that made claims about the localization of
concepts to neurons. That paper contains a more detailed explanation of
the mapping, using ideas from my architecture. It is worth noting that
Quiroga et al.'s explanation of their own data made no sense, and that
the alternative that Trevor and I proposed actually did account for the
data rather neatly.
>> If each of these is engaging in simple adaptive interactions with the ten or
>> twenty nearest neighbors, exchanging very small amounts of data (each
>> cortical column sending out and receiving, say, between 1 and 10 KBytes,
>> every 2 milliseconds), how much processing power and bandwidth would this
>> require, and how big of a machine would you need to implement that, using
>> today's technology?
>
> You are speaking of only one of the thirty or so organelles in the
> brain. The cerebral cortex is only one part of the overall picture.
> Nevertheless, you are obviously not talking about very much
> computational power here. Kurzweil in TSIN does the back of the
> envelope calculations about the overall computational power of the
> human brain, and it's a lot more than you are presenting here.
Of course!
Kurzweil's (and others') calculations are based on the crudest possible
calculation of a brain emulation AGI, in which every wretched neuron in
there is critically important, and cannot be substituted for something
simpler. That is the dumb approach.
What I am trying to do is explain an architecture that comes from the
cognitive science level, and which suggests that the FUNCTIONAL role
played by neurons is such that it can be substituted very adequately by
a different computational substrate.
So, my claim is that, functionally, the human cognitive system may
consist of a network of about a million cortical column units, each of
which engages in relatively simple relaxation processes with its
neighbors. I am not saying that this is the exactly correct picture,
but so far this architecture seems to work as a draft explanation for a
broad range of cognitive phenomena.
And if it is correct, then the TSIN calculations are pointless.
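For anyone who wants to do the homework, here is a minimal sketch of
the arithmetic written out in Python. The numbers are the ones quoted
above (a million columns, at most half active at once, 1-10 KBytes
exchanged every 2 milliseconds); reporting a low and a high bound is
just my own way of bracketing the quoted range.

# Back-of-the-envelope traffic implied by the quoted figures.
columns = 1_000_000        # cortical column units
active_fraction = 0.5      # at most half hosting at any given moment
period_s = 0.002           # one exchange round every 2 milliseconds

for kbytes in (1, 10):     # low and high ends of the quoted range
    per_column_bps = kbytes * 1024 / period_s      # bytes/s per column
    total_bps = per_column_bps * columns * active_fraction
    print(f"{kbytes:2d} KB per round: "
          f"{per_column_bps / 1e6:5.2f} MB/s per column, "
          f"{total_bps / 1e12:5.2f} TB/s aggregate")

On those figures the aggregate traffic comes out somewhere between a
few hundred GB/s and a few TB/s, spread across a million very simple
units, each doing only a small amount of local computation per round.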
>> This architecture may well be all that the brain is doing. The rest is just
>> overhead, forced on it by the particular constraints of its physical
>> substrate.
>
> I have no doubt that as we figure out what the brain is doing, we'll
> be able to optimize. But we have to figure it out first. You seem to
> jump straight to a solution as a hypothesis. Now, having a hypothesis
> is a good part of the scientific method, but there is that other part
> of testing the hypothesis. What is your test?
Well, it may seem like I pulled the hypothesis out of a hat yesterday
morning, but this is actually just a summary of a project that started
in the late 1980s.
The test is an examination of the consistency of this architecture with
the known data from human cognition. (Bear in mind that most artificial
intelligence researchers are not "scientists" .... they do not propose
hypotheses and test them ..... they are engineers or mathematicians, and
what they do is play with ideas to see if they work, or prove theorems
to show that some things should work. From that perspective, what I am
doing is real science, of a sort that almost died out in AI a couple of
decades ago).
For an example of the kind of tests that are part of the research
program I am engaged in, see the Loosemore and Harley paper.
>> Now, if this conjecture is accurate, you tell me how long ago we had the
>> hardware necessary to build an AGI.... ;-)
>
> I'm sure we have that much now. The problem is whether the conjecture
> is correct. How do you prove the conjecture? Do something
> "intelligent". What I don't see yet in your papers, or in your posts
> here, are results. What "intelligent" behavior have you simulated with
> your hypothesis Richard? I'm not trying to be argumentative or
> challenging, just trying to figure out where you are in your work and
> whether you are applying the scientific method rigorously.
The problem of giving you an answer is complicated by the paradigm. I
am adopting a systematic top-down scan that starts at the framework
level and proceeds downward. The L & H paper shows an application of
the method to just a couple of neuroscience results. What I have here
are similar analyses of several dozen other cognitive phenomena, in
various amounts of detail, but these are not published yet. There are
other stages to the work that involve simulations of particular algorithms.
This is quite a big topic. You may have to wait for my thesis to be
published to get a full answer, because fragments of it can be confusing.
All I can say at the moment is that the architecture gives rise to
simple, elegant explanations, at a high level, of a wide range of
cognitive data, and the mere fact that one architecture can do such a
thing is, in my experience, unique. However, I do not want to publish
that as it stands, because I know what the reaction would be if there is
no further explanation of particular algorithms, down at the lowest
level. So, I continue to work toward the latter, even though by my own
standards I already have enough to be convinced.
>> The last time I did this calculation I reckoned (very approximately) that
>> the mid-1980s was when we crossed the threshold, with the largest
>> supercomputers then available.
>
> That may be the case. And once we figure out how it all works, we
> could well reduce it to this level of computational requirement. But
> we haven't figured it out yet.
>
> By most calculations, we spend an inordinate amount of our cerebral
> processing on image processing the input from our eyes. Have you made
> any image processing breakthroughs? Can you tell a cat from a dog with
> your approach? You seem to be focused on concepts and how they are
> processed. How does your method approach the nasty problems of image
> classification and recognition?
The term "concept" is a vague one. I used it in our discussion because
it is conventional. However, in my own writings I talk of "atoms" and
"elements", because some of those atoms correspond to very low-level
features such as the ones that figure in the visual system.
As far as I can tell at this stage, the visual system uses the same
basic architecture, but with a few wrinkles. One of those is a mechanism
to spread locally acquired features into a network of "distributed,
position-specific" atoms. This means that when visual regularities are
discovered, they percolate down in the system and become distributed
across the visual field, so they can be computed in parallel.
Also, the visual system does contain some specialized pathways (the
"what" and "where" pathways) that engage in separate computations.
These are already allowed for in the above calculations, but they are
specialized regions of that million-column system.
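As a rough illustration of what "spreading a locally acquired feature
into position-specific atoms" means, here is a toy Python sketch. The
function name and grid layout are hypothetical placeholders of mine;
the point being illustrated is only that a regularity learned at one
position gets copied to every position, so it can then be detected
everywhere in the visual field in parallel.

import numpy as np

def spread_feature(local_weights, field_shape):
    """Copy one locally acquired feature detector into a grid of
    position-specific atoms covering the whole visual field."""
    rows, cols = field_shape
    return {(r, c): local_weights.copy()
            for r in range(rows)
            for c in range(cols)}

# A small edge-like feature discovered at a single retinal location...
local_feature = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]])

# ...percolates down into a 32 x 32 grid of position-specific atoms,
# after which every position can apply it in parallel.
atoms = spread_feature(local_feature, (32, 32))
print(len(atoms), "position-specific atoms created")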
I had better stop. Must get back to work.
Richard Loosemore