<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.hoenzb
{mso-style-name:hoenzb;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><div><div><div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in'><div><div><div><div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>We OCRed all of it. The results were better than nothing, not great. A lot of what I had left in my filing cabinet were hand-written notes from meetings. In the form they existed, they were not accessible anyway, so I guess it was no real loss.</span><o:p></o:p></p></div></div></div></div></div></div></blockquote></div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal><span style='color:#1F497D'>>…</span>Was it hand written stuff, or printed stuff? When did you last try OCR with the best available package? Was a partial index better than nothing?<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>-Kelly<o:p></o:p></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p></div><div><p class=MsoNormal><span style='color:#1F497D'>It was a mix of handwritten and printed stuff. The OCR produced a mixed bag of results, some of it surprising. We had cases where the OCR failed in printed material that scanned well. So we ended up keeping PDFs of the stuff and doing our best with the indexing. I don’t suppose it really matters anyway: it is clear enough to me that the knowledge needed for building a satellite is being ever more encoded in specialty software packages. We don’t really need to maintain human knowledge of the physics and chemistry behind it all if our software knows it well enough.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>spike<o:p></o:p></span></p></div></div></div></body></html>