[ExI] The NSA's new data center
Tomasz Rola
rtomek at ceti.pl
Sat Mar 31 19:45:07 UTC 2012
On Sat, 24 Mar 2012, Anders Sandberg wrote:
> I did a little calculation: at what point can governments spy 24/7 on their
> citizens and store all the data?
>
> I used the World Bank World Development Indicators and IMF predictors for
> future GDP growth and the United Nations median population forecasts, the fit
> 10^(-0.2502*(t-1980)+6.304) for the cost (in dollars) per gigabyte (found on
> various pages about Kryder's law) and the assumption that 24/7 video
> surveillance would require 10 TB per person per year.
>
> Now, if we assume the total budget is 0.1% of the GDP and the storage is just
> 10% of that (the rest is overhead, power, cooling, facilities etc), then the
> conclusion is that doing this becomes feasible around 2020. Bermuda,
> Luxembourg and Norway can do it in 2018; by 2019 most of Western Europe plus
> the US and Japan can do it. China gets there in 2022. The last countries to
> reach this level are Eritrea and Liberia in 2028, and finally Zimbabwe in
> 2031. By 2025 the US and China will be able to monitor all of humanity if they
> want to/are allowed.
>
> So at least data storage is not going to be any problem. It would be very
> interesting to get some estimates of the change in cost of surveillance
> cameras and micro-drones, since presumably they are the ones that are actually
> going to be the major hardware costs. Offset a bit because we are helpfully
> adding surveillance capabilities to all our must-have smartphones and smart
> cars. I suspect the hardware part will delay introduction a bit in countries
> that want it, but that just means there will be a hardware overhang once they
> get their smart dust, locators or gnatbots.
Anders, I admire your analysis and, even more, your ability to ask
questions like this (damn, why didn't I... ;-) ). I would like to add a
few points to your answer, of course from my limited "AFAIK" point of view.
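(Before I get to those points - a quick sanity check of your cost curve,
sketched in Common Lisp. The budget fractions are yours; the GDP-per-capita
figure in the example call is my own illustrative guess, not taken from
your data:)

;; Kryder's-law fit quoted above: dollars per gigabyte in a given year.
(defun price-per-gb (year)
  (expt 10 (+ (* -0.2502 (- year 1980)) 6.304)))

;; First year in which 10 TB/person/year of raw storage fits into
;; 0.01% of GDP per capita (0.1% total budget, 10% of it on storage).
(defun feasible-year (gdp-per-capita &key (start 2000))
  (loop for year from start
        when (<= (* 10000 (price-per-gb year))   ; 10 TB = 10000 GB
                 (* 1e-4 gdp-per-capita))
        return year))

;; (feasible-year 50000) => 2019, which matches your estimate for the US.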
First, while the cost of mere storage (a.k.a. price per gigabyte) is
dropping, there is more to running an extra-large database than just
stacking hard drives on top of each other. I think maintenance cost is
going to kill any such project very quickly. The biggest publicly
acknowledged databases nowadays range from a petabyte to somewhere around
10 PB (hard to tell where exactly this "around" is; the news is somewhat
dated) [1] [2]. Anyway, we could extrapolate from facts like "in 1992
Teradata built the first 1 TB system" and in 1997 a 24 TB one [3], which
gives a yearly growth factor of
[19]> (exp (/ (log 24) 5))
1.88
So, almost doubling every year, which should give about 26 PB in 2008:
[24]> (expt (exp (/ (log 24) 5)) 16)
26102.13
However, from the same source [3], they delivered "only" a 1 PB system in
that year. So the brutal "doubling every year" extrapolation doesn't work
in the real world, and what was close to 2 in the past is closer to 1.5
nowadays. I guess the reasons for this vary from technical issues to
maintenance costs.
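(Checking the actual 1997-2008 rate: 24 TB to 1 PB in 11 years, taking
binary prefixes so that 1 PB = 1024 TB, gives

[25]> (exp (/ (log (/ 1024 24)) 11))
1.40

i.e. indeed much closer to 1.4-1.5 per year than to 2.)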
There are also issues with actually processing your data. You can throw
tapes or disks into the basement, no problem; however, pulling anything
useful out of that pile is a totally different thing. The current
technological limit seems to be somewhere between 100 PB and 500 PB
[4] [5] [6], even though there is an exabyte tape library on sale [7] [8].
And this has nothing to do with the various estimates about exabytes of
content per month passing through the net.
Basically, from the maintenance point of view, at the current tech level
it takes a new building every year to house this much data, and it
requires actively checking your data for errors, making backups, and so
on. Which, optimistically, cuts real capacity roughly in half.
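(The +kilo+, +tera+, +peta+ etc. constants used in the REPL snippets below
are not defined anywhere in this post; for the printed results to come out
exactly as shown, they have to be binary prefixes - this is my
reconstruction:)

(defconstant +kilo+    (expt 2 10))
(defconstant +tera+    (expt 2 40))
(defconstant +peta+    (expt 2 50))
(defconstant +million+ (expt 10 6))   ; counts of people stay decimal
(defconstant +billion+ (expt 10 9))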
With the best tech available now (but not really deployed in the field
yet), you could provide total surveillance ("total-sur") for this many
people:
[27]> (/ (* 250 +peta+) (* 10 +tera+))
25600
Assuming it doubles every year (I think it will not), this gives ca. 26
million heads by 2022 (arithmetic below). If you could throw 100 times as
much money at the project, about 1/3 of humanity. However, with every year
into the project, the number of people needed to maintain the data storage
will grow. There will be costs of migrating from one storage
medium/technology to another about every 10-15 years. And some other costs
I am not aware of, because I am not very deep into the subject.
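For the record, the doubling arithmetic behind that figure:

[28]> (* 25600 (expt 2 10))
26214400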
The second problem I see is data transfer. If the current telecom
infrastructure says anything, the limit today is less than 10 Tbps per
cable on intercontinental links [9] [10]. If we assume the "central hub"
is located in the USA, then transferring live coverage of the whole
European population as 300 kbps streams will take
[39]> (floor (* 600 +million+ 300 +kilo+) (* 10 +tera+))
16 ;
8398139555840
So, rounding up, 17 of the best submarine telecom cables thinkable today -
which are not yet deployed; AFAIK the best existing ones run somewhere
around 5 Tbps, with faster ones still under construction. Even worse,
pushing live streams from all of humanity would require
[42]> (floor (* 8 +billion+ 300 +kilo+) (* 10 +tera+))
223 ;
5689070059520
About 224 such 10 Tbps cables, all terminating at one data storage site. I
wouldn't bet any money that this is possible now. Maybe 10 years from now
it will look better, but I won't bet on that either.
In a centralised scenario, it *might* be possible to "serve" about a
million heads per continent. In a decentralised one, maybe 2-10 times as
many. Ten years from now, multiply by 1.5^10 = ~58 (optimistic case) or
1.2^10 = ~6 (more realistic one).
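In the same REPL, for exactness:

[43]> (expt 1.5 10)
57.66
[44]> (expt 1.2 10)
6.19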
Assuming I didn't goof up anywhere - and of course I am using only
official data from public sources.
When it comes to the algorithmic advantage of the "unofficial" guys over
the "public" ones, it is hard to tell, but I wouldn't count on miracles.
Perhaps some problem whose best published algorithm has O(n^2) complexity
has been "solved" with an unpublished O((log2 n)^2) algorithm, but there
are limits on how good the good can be made. With the amounts of data we
are talking about, I guess this doesn't help so much. Or: you can speed up
the analysis of one second of material 1000-fold, yet with so many seconds
to analyse this is not going to be all that helpful.
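(To put "so many seconds" into numbers - my own back-of-envelope, using
the 8 billion figure from above:

[45]> (* 8 +billion+ 365 24 60 60)
252288000000000000

That is ~2.5*10^17 person-seconds of footage per year; divide by 1000 and
you are still left with ~2.5*10^14 to process.)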
Soo... I am sure total-sur might be possible in the future. But at the
same time I think the window of opportunity is probably a bit wider than
10 years.
I mean, we can talk about "intelligent dust" or "intelligent insects" etc.
etc., but AFAIK they are not fielded yet, and besides, even dust will have
to report back its recorded material, so a hub for data storage and
analysis still has to be built, which is going to be hard.
Besides, if you care, I guess a good vacuum cleaner and mosquito nets can
make the life of dust and insects much harder. So many research grants,
only to end up in a trash bin. Or be washed down the drain of a huge
shower cabin. Woo hoo, big deal.
Now, a few words for the supporters of the "down with privacy" side (not
you, Anders, but I am in a bit of a hurry now, so I may as well put it
here).
I don't think life in the past was in any way "normal". Dying from TB or
gangrene is not "normal". Being eaten alive by predators is not "normal".
I am a 20th/21st-century man, not some caveman. By extension, having no
privacy is not "normal" either.
Also, I wonder how many Galileos and Copernicuses we would have in a
tot-sur society ruled by the Inquisition? While we are at it, the
Inquisition might have started for religious reasons, but soon some folks
discovered it could suit their earthly needs very well, too. Hence it was
so cool to denounce a neighbour or a rival merchant, and so cool to
torture naked women, which AFAIK wasn't required by any religion at the
time. As soon as you create a powerful tool like a tot-sur system, expect
it to be taken over by all kinds of psychopathic elements, within about
10-20 years, or maybe even 0 years, really. Good luck living under their
rule for the next 10 thousand years.
BTW, if you think nowadays is any better because we have become civilised,
well, dream on. I wonder, for example, how many planes the Wright brothers
would have built with a mob coming to their workshop every day for a bit
of wooing and joking.
Regards,
Tomasz Rola
[1] http://www.focus.com/fyi/10-largest-databases-in-the-world/
[2] http://gadgetopia.com/post/1730
[3] http://en.wikipedia.org/wiki/Teradata
[4] http://en.wikipedia.org/wiki/Petabyte
[5] http://www.geek.com/articles/chips/blue-waters-petaflop-supercomputer-installation-begins-20120130/
[6] http://www.technologyreview.com/computing/38440/
[7] http://en.wikipedia.org/wiki/Exabyte
[8] http://www.oracle.com/us/corporate/press/302409
[9] http://en.wikipedia.org/wiki/TAT-14
[10] http://en.wikipedia.org/wiki/Transatlantic_communications_cable
--
** A C programmer asked whether computer had Buddha's nature. **
** As the answer, master did "rm -rif" on the programmer's home **
** directory. And then the C programmer became enlightened... **
** **
** Tomasz Rola mailto:tomasz_rola at bigfoot.com **