[ExI] Vermis ex machina

Anders Sandberg anders at aleph.se
Sun Mar 1 11:54:04 UTC 2015


(Short version: I think you overestimate things a little bit, but not much)


Stuart LaForge <avant at sollegro.com>, 1/3/2015 1:42 AM:

Now the connectivity map of the worm's nervous system can be represented
by a graph with 302 vertices and 6393 edges. One can represent such a graph
as a 302 x 302 matrix of ones and zeroes called an adjacency matrix.
The adjacency matrix will be 91204 bits in size. 12786 of those bits
would be 1's and the remaining 78418 bits would be 0's.


This is actually the main problem: connectivity is not enough, we also need to know the synaptic weights - and those are hard to measure in C. elegans, which is why this was not done years ago.


But leaving that aside: the adjacency matrix can also be represented as a list of 6393 18-bit numbers (you need 9 bits to represent which neuron a synapse starts at and 9 for which it ends at - slightly clever encoding will save you 1.4 bits per synapse), 115,074 bits in total. This is *larger* than the full matrix: this is a less effective encoding.
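
A quick sketch of that arithmetic in Python (nothing project-specific, just the counts quoted above):

    # C. elegans connectome: dense adjacency matrix vs. edge list
    import math

    neurons, synapses = 302, 6393
    matrix_bits = neurons * neurons                 # 91,204 bits
    addr_bits = math.ceil(math.log2(neurons))       # 9 bits per neuron ID
    edgelist_bits = synapses * 2 * addr_bits        # 115,074 bits
    # At this connection density the dense matrix is the smaller representation.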


However, in the human brain the matrix is about 8.6*10^10 x 8.6*10^10, and each neuron has about 8000 connections. So you need 7.4*10^21 bits for the full matrix, but the edge representation takes just 8.6*10^10 * 8000 * 74 = 5.1*10^16 bits (you need 37 bits to address each neuron, hence 74 bits per synapse).
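
The same sketch for the human case (the neuron and synapse counts are the rough figures above):

    # Human brain: dense matrix vs. edge list
    neurons = 8.6e10
    synapses = neurons * 8000                  # ~6.9e14 edges
    addr_bits = 37                             # 2**37 > 8.6e10
    matrix_bits = neurons ** 2                 # ~7.4e21 bits
    edgelist_bits = synapses * 2 * addr_bits   # ~5.1e16 bits
    # Here the edge list wins by about five orders of magnitude.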


Here there are *definitely* going to be compression wins too, since you can use an encoding of neuron addresses that makes common pathway addresses shorter at the expense of the rare ones (basically a spatial hierarchical Huffman coding). Most connections are local (roughly half are within one cm, and most of them are likely within the same 1 mm region), so if a neuron ID starts with some bits of rough location information, most edges will have these ID bits identical and be highly compressible.

So if we parcel the brain into 1400 1 cm^3 voxels with 11-bit addresses, half of the connections will share the first 11 bits of their start and destination neuron IDs: we can indicate that by a '0' followed by the voxel (11 bits) and then 2 x 26 bits of intra-voxel ID - 64 bits in total rather than 74. This 10-bit win occurs for half of the connections, saving us 3.4*10^15 bits. The other half of the connections become one bit longer, so there we lose 3.4*10^14 bits - all in all we get a saving of about 6%. Doing this with the mm-scale connectivity likely gives about the same saving; then things get more complex. So all in all, I think it is plausible that the compression limit is on the order of just above 10^16 bits.
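
The saving can be checked with the same kind of sketch (the 50% locality fraction and the bit counts are the assumptions above, not measurements):

    # Voxel-prefix coding of neuron addresses
    edges = 8.6e10 * 8000                 # ~6.9e14 synapses
    plain_bits = edges * 74               # flat 37-bit addresses, ~5.1e16 bits
    local_bits = 1 + 11 + 2 * 26          # flag + shared voxel + two intra-voxel IDs = 64
    distant_bits = 1 + 74                 # flag + two flat addresses = 75
    coded_bits = edges * (0.5 * local_bits + 0.5 * distant_bits)
    print(1 - coded_bits / plain_bits)    # ~0.06, i.e. about a 6% saving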



So now that we have adjacency matrices for the worm brain and the human
brain, comparing the two can give us a scaling factor. For example, the
scaling factor S = 7.396 x 10^21 bits / 9.1204 x 10^4 bits = 8.10929 x 10^16.

Now to estimate the computing power required to simulate a human
brain, B(h), one can simply take the byte size of all the software
and data required to simulate the worm and multiply it by the scaling
factor: B(h) = S*B(w).


No. You are mixing up program and data. The code needed to run a neuron is only needed in one copy when calculating the Kolmogorov complexity: when you run the program you might want to copy it into 10^10 copies, one for each local node.


The complexity will be roughly the neural simulation code part of the worm project, plus the connectivity complexity. We can approximate OpenWorm: looking at the GitHub repository, I would be very surprised if it was beyond a million lines of code. There are about 43 directories, and each may have about 10 (say 30) source code files, typically with about 200 lines of code. That makes about 258,000 SLOC. http://arxiv.org/pdf/1502.01410v1.pdf shows that most Java code is chaff; only about 4% is functional. Assuming this to apply here, we get around 10,000 lines of core code. The full connectivity matrix would be about 11,000 bytes of this, but since a line is many bytes we can more or less ignore the connectivity size; it is likely <1%. So if we guess that a line of core code is about 10 characters (80 bits), the complexity of the project is very roughly 800,000 bits.
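
The same back-of-envelope in sketch form (every input here is a guess, as stated above):

    # OpenWorm code complexity, order of magnitude only
    sloc = 43 * 30 * 200          # ~258,000 lines
    core = 0.04 * sloc            # ~10,000 functional lines
    bits = core * 10 * 8          # ~10 chars/line, 8 bits/char -> ~800,000 bits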


For comparison, the NEURON simulator ( http://www.neuron.yale.edu/neuron/download ) comes in at around 30 MB; assuming the same 4% efficiency gives a size of 1.2 MB = 9,600,000 bits, about ten times bigger.


This is all *microscopic* compared to the human connection data, so one can approximate it to zero: the human upload Kolmogorov complexity is somewhere around 10^16 bits.


(This all ignores, as I said, connection weights - there are factors that up the complexity a bit.)


In any case my ability to give an actual estimate, instead of a
methodology to calculate it, is hampered by my inability to locate the
relevant data on the OpenWorm project. If somebody out there
knows how big the OpenWorm project software platform and data sizes
are, please post them. Then I could give an actual number that I could
cross-correlate with Moore's Law to get an estimate of when the
Singularity might occur.


Hmm, you are looking at data when you should be looking at processing power. Tables 8 and 9 in my old report http://www.fhi.ox.ac.uk/brain-emulation-roadmap-report.pdf suggest that storage requirements are going to be met much earlier than processing requirements. This was the basis for my estimates in http://www.aleph.se/papers/Monte%20Carlo%20model%20of%20brain%20emulation%20development.pdf


But I think the basic approach is sound: find software that does an emulation, scale up the computing requirements, and plug in estimates for Moore's law. As I argue in the Monte Carlo paper, data acquisition and neuroscience add uncertainty to the estimate: I would *love* to get better estimates there too.
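
The last step is just an exponential extrapolation; something like this toy function, where the FLOPS figures in the comment are placeholders rather than estimates from the report:

    import math

    def years_until_affordable(required_flops, available_flops, doubling_years=1.5):
        # Years until available compute reaches the requirement, assuming a
        # constant doubling time - itself a big assumption.
        if available_flops >= required_flops:
            return 0.0
        return math.log2(required_flops / available_flops) * doubling_years

    # e.g. years_until_affordable(1e22, 1e16) -> ~30 years at an 18-month doubling time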



Anders Sandberg, Future of Humanity Institute, Philosophy Faculty of Oxford University