Adobe Acrobat 9: OCR!

PDF files can contain searchable, editable, read-out-loudable text (that's an article for another week!). However, not all PDF files contain 'real' text. If the PDF file was converted from an image, such as a PC fax file or a screen capture, it will contain pictures of text instead of text. Our eyes and brains can certainly read the text, but the computer cannot.

Cataloguing, editing, searching, and reading aloud are not possible when the text is actually an image, unless the PDF is first run through some sort of Optical Character Recognition (OCR) process.
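If you are not sure whether a PDF holds real text or only pictures of text, one rough check is to try extracting text from each page and see which pages come back empty. The sketch below is only an illustration, not anything Acrobat does internally: `pages_needing_ocr` and `extract_page_texts` are hypothetical helper names, the `min_chars` threshold is an arbitrary assumption, and the extraction step assumes the third-party pypdf library is installed.

```python
from typing import Iterable, List


def pages_needing_ocr(page_texts: Iterable[str], min_chars: int = 1) -> List[int]:
    """Return 1-based page numbers whose extracted text is (nearly) empty.

    A page that yields no text on extraction is probably a picture of text
    and therefore a candidate for OCR. min_chars is an assumed threshold.
    """
    needing = []
    for number, text in enumerate(page_texts, start=1):
        if len((text or "").strip()) < min_chars:
            needing.append(number)
    return needing


def extract_page_texts(pdf_path: str) -> List[str]:
    """Extract text page by page using pypdf (assumption: pip install pypdf)."""
    # Imported lazily so pages_needing_ocr works even without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


if __name__ == "__main__":
    # Hypothetical extraction results: the second page came from a scan.
    sample = ["Quarterly report\nRevenue rose 4%.", "", "Appendix A"]
    print(pages_needing_ocr(sample))
```

For a real file you would pass `extract_page_texts("scan.pdf")` (a placeholder path) to `pages_needing_ocr`. Pages it reports are the ones worth running through OCR.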

Back in the day, I owned a utility called OmniPage Pro that miraculously turned images of text into editable text. The product is still available today, but if you need OCR you might be able to save some money because, believe it or not, Adobe has included document OCR in Acrobat 9 Professional.

With an image-based PDF file open, choose Document > OCR Text Recognition > Recognize Text Using OCR.

You will be presented with a dialog box where you can specify a desired page range. You can optionally click the Edit button and select OCR settings.

OCR dialog box

The PDF Output Style options let you decide how your file will be displayed once it has been processed. Try selecting ClearScan from the drop-down menu to give the text in your processed PDF a cleaner, less scanned look. This can also greatly reduce the processed file's size, since it will replace the original image of text with actual text. Don't worry about non-text items such as graphics in the original file. Acrobat's OCR engine will likely recognize each one as a picture and just leave it alone.

Once your file has been processed, you can click in the Find toolbar or use Acrobat's Search feature to locate desired words and phrases.
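To see why recognized text makes searching possible, here is a minimal sketch of a phrase search over per-page text. It is not how Acrobat's Find or Search features are implemented. `find_phrase` is a hypothetical helper, and the page strings stand in for whatever text the OCR step produced.

```python
from typing import Dict, List


def find_phrase(page_texts: List[str], phrase: str) -> Dict[int, int]:
    """Map 1-based page number to the count of case-insensitive matches."""
    needle = phrase.casefold()
    if not needle:
        return {}
    hits: Dict[int, int] = {}
    for number, text in enumerate(page_texts, start=1):
        count = text.casefold().count(needle)
        if count:
            hits[number] = count
    return hits


if __name__ == "__main__":
    # Hypothetical OCR output for a three-page document.
    pages = ["Invoice total", "no match here", "INVOICE invoice"]
    print(find_phrase(pages, "invoice"))
```

Before OCR, every page of an image-based PDF would contribute an empty string here, so no phrase could ever match.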

Find text after OCR conversion

Acrobat is loaded with gems like this. I am constantly hearing the words "I never knew that!" from seasoned Acrobat users. Come sign up for a class and see what tools and features are waiting to be discovered and used to increase your productivity… and marketability!

 
***
 
 
David R. Mankin is a Certified Technical Trainer, desktop publisher, computer graphic artist, and Web page developer. He is an Adobe-Certified Expert in Acrobat.

12 Replies to “Adobe Acrobat 9: OCR!”

  1. The online application Free OCR converts the contents of an image file into a text output format, though Microsoft Word is not currently supported.

  2. You have two options: purchase commercial OCR software, install it, and do the OCR job yourself, or just go to an OCR website like goodocr.com, upload your image, and wait for the result. The latter will certainly save you a lot of time if you only do this occasionally.
