Secrets of Microsoft recognized by Evince
I was viewing some confidential PDF documents from Microsoft related to a story on Slashdot. While that might be interesting to write about, I discovered a cool feature in Evince: it has built-in optical character recognition (OCR). Whenever I selected part of a scanned document, it was automatically converted to text I could copy, and I could even search the document. What a cool and useful feature!
Update:
…but it was too good to be true. Søren Hansen pointed out that the text had already been embedded into the document by the scanner that produced the PDF, and is not being OCR'ed by Evince. I confirmed that by running pdftotext on the document, which extracted the same text without Evince being involved. Someone really should implement OCR in Evince though! I'm sorry about the mistake.
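If you want to repeat the check on a scanned PDF of your own, here is a minimal Python sketch. It assumes pdftotext is installed and on your PATH, and the function names (extract_text, has_text_layer) are made up for illustration. It only tells you whether the file already carries a text layer, not how that layer got there.

```python
import subprocess
import sys


def extract_text(pdf_path: str) -> str:
    """Run `pdftotext <pdf> -` and return the text it writes to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],  # "-" sends the output to stdout
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout


def has_text_layer(extracted: str) -> bool:
    """True if the output holds anything besides whitespace.

    pdftotext separates pages with form feeds, so an image-only scan
    typically yields just whitespace and form-feed characters.
    """
    return bool(extracted.strip())


# Only run the check when a PDF path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    if has_text_layer(extract_text(sys.argv[1])):
        print("Embedded text found: the viewer is not doing OCR.")
    else:
        print("No embedded text: selectable text would need real OCR.")
```

Save it under any name you like and pass the PDF path as the only argument. If text comes out, the selectable text was in the file all along.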
http://slashdot.org/article.pl?sid=07/02/03/1524250
