I had an idea which could make those of us who do data mining useful in the
current debate.  According to some reports, a bunch of private email from
the University of East Anglia's Climate Research Unit was stolen and is being
leaked.  CBS is covering it; Fox has mentioned it once or twice.  One of the
leaked memos is below, the one that gave me the idea.

Britain has a version of our Freedom of Information Act (Max or other
British national please verify or refute?).  So it looks like we (or
someone) should be able to FOIA the raw data upon which Mann-made climate
change theory is and has been based, and put it on a public access site.

But the real idea is this: we should be able to FOIA the actual computer
code used to reduce the data as well.  I know how to read Fortran.  Still!
Or if it is written in something else, I would learn that language and go
thru it, see if we can reproduce the hockey stick from the raw data.  I do
this kinda stuff for a living!  {8-]  Or used to.  {8-[  But I still know
how.  {8-]  We could write our own code, then compare to Mann et al.

This would be a relevant and timely task for a few dozen extropian-minded
people, to dig thru the data and the computer code.  Oh the fun we could
have: divide ourselves into three groups regarding the Mann-made climate
change theory: the faithful, the heretics and the not-sures.  Then everyone
reduces the data in whatever way makes sense, and we see if there is a
difference in the overall result of the three teams.  I volunteer to lead
the not-sure team.  We are looking at Ig Nobel prize material here.  {8-]

Leaked email from Programmer Harry at CRU:
I am seriously worried that our flagship gridded data product is produced by
Delaunay triangulation - apparently linear as well. As far as I can see,
this renders the station counts totally meaningless. It also means that we
cannot say exactly how the gridded data is arrived at from a statistical
perspective - since we're using an off-the-shelf product that isn't
documented sufficiently to say that. Why this wasn't coded up in Fortran I
don't know - time pressures perhaps? Was too much effort expended on
homogenisation, that there wasn't enough time to write a gridding procedure?
Of course, it's too late for me to fix it too. Meh.
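
[An aside on what "linear" means in the excerpt above.  This is a minimal
sketch in Python of linear (barycentric) interpolation inside ONE triangle
of a triangulation, not CRU's code and not the off-the-shelf product Harry
mentions.  Building the Delaunay triangulation itself is left out, and the
station coordinates and values are made up for illustration.  As I read it,
the point is that a grid value then depends only on the three stations at
the corners of the enclosing triangle, however many other stations sit
nearby.]

```python
def barycentric_weights(p, a, b, c):
    """Return the weights of point p relative to triangle corners a, b, c."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    wa = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    wb = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate(p, triangle, values):
    """Linearly interpolate the corner values at point p inside the triangle."""
    weights = barycentric_weights(p, *triangle)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical stations (x, y) and their readings; not real data.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
values = [1.0, 3.0, 5.0]

grid_point = (1.0, 1.0)
result = interpolate(grid_point, triangle, values)
print(result)
```

[Only three numbers go into each grid value here, which is one way to see
why a per-cell station count would not describe how the value was arrived
at.]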

I am very sorry to report that the rest of the databases seem to be in
nearly as poor a state as Australia was. There are hundreds if not thousands
of pairs of dummy stations, one with no WMO and one with, usually
overlapping and with the same station name and very similar coordinates. I
know it could be old and new stations, but why such large overlaps if that's
the case? Aarrggghhh! There truly is no end in sight... So, we can have a
proper result, but only by including a load of garbage!

One thing that's unsettling is that many of the assigned WMO codes for
Canadian stations do not return any hits with a web search. Usually the
country's met office, or at least the Weather Underground, show up - but for
these stations, nothing at all. Makes me wonder if these are
long-discontinued, or were even invented somewhere other than Canada!

Knowing how long it takes to debug this suite - the experiment endeth here.
The option (like all the anomdtb options) is totally undocumented so we'll
never know what we lost. 22. Right, time to stop pussyfooting around the
niceties of Tim's labyrinthine software suites - let's have a go at
producing CRU TS 3.0! since failing to do that will be the definitive
failure of the entire project.

Ulp! I am seriously close to giving up, again. The history of this is so
complex that I can't get far enough into it before my head hurts and I have
to stop. Each parameter has a tortuous history of manual and semi-automated
interventions that I simply cannot just go back to early versions and run
the update prog. I could be throwing away all kinds of corrections - to
lat/lons, to WMOs (yes!), and more. So what the hell can I do about all
these duplicate stations?...
[Programmer Harry]
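
This is the kind of thing we could start on as soon as we had the station
lists.  Here is a rough sketch in Python of flagging the pairs Harry
describes: same station name, very similar coordinates, one record with a
WMO code and one without.  The field names, the 0.1 degree tolerance and the
sample records are all my assumptions for illustration; the real CRU file
layout is not known to me.

```python
from collections import defaultdict
from itertools import combinations


def find_duplicate_pairs(stations, tol_deg=0.1):
    """Flag pairs with the same name, close coordinates, and exactly one WMO code.

    Returns a list of (station_a, station_b, overlap_years) tuples.
    """
    by_name = defaultdict(list)
    for s in stations:
        # Compare names case-insensitively, ignoring stray whitespace.
        by_name[s["name"].strip().upper()].append(s)

    pairs = []
    for group in by_name.values():
        for a, b in combinations(group, 2):
            close = (abs(a["lat"] - b["lat"]) <= tol_deg
                     and abs(a["lon"] - b["lon"]) <= tol_deg)
            one_has_wmo = (a["wmo"] is None) != (b["wmo"] is None)
            if close and one_has_wmo:
                # Number of years both records cover; 0 if they do not overlap.
                overlap = (min(a["last_year"], b["last_year"])
                           - max(a["first_year"], b["first_year"]) + 1)
                pairs.append((a, b, max(overlap, 0)))
    return pairs


# Made-up records, not real stations or real WMO codes.
stations = [
    {"name": "EXAMPLE TOWN", "wmo": None, "lat": 50.00, "lon": -100.00,
     "first_year": 1950, "last_year": 1990},
    {"name": "Example Town", "wmo": "9999901", "lat": 50.02, "lon": -100.01,
     "first_year": 1970, "last_year": 2008},
    {"name": "OTHER PLACE", "wmo": "9999902", "lat": 45.00, "lon": -75.00,
     "first_year": 1960, "last_year": 2008},
]

pairs = find_duplicate_pairs(stations)
for a, b, years in pairs:
    print(a["name"], "/", b["name"], "overlap years:", years)
```

Reporting the overlap rather than filtering on it keeps Harry's question
open: a long overlap argues against a simple old-station/new-station
explanation, but deciding what to merge would still need a human.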

