You can generate a searchable text file from a scanned image. To do this, you run optical character recognition (OCR) processing on an image file. The original scanned document in the file repository remains unchanged.
Note: Typically, a case administrator performs this task. Group members and group leaders can run OCR processing if they have permissions to do so.
You can submit the following file formats for OCR processing: .bmp, .dcx, FAXserve Fax Document, .gif, JBIG2 compression standard, .jpeg, JPEG2000, .max, PDF, .png, .sim, TIFF, .wmp, .xif, .xps.
Note: The application does not support XFA-based PDF forms.
The maximum size for images that can be submitted for OCR processing is 8400 x 8400 pixels.
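The format list and size limit above can be checked before documents are submitted. The following is only a minimal sketch and is not part of the application: the function names are hypothetical, the limit is assumed to apply to each side of the image, and the extensions for PDF, TIFF, and .jpg are assumed spellings that the list above does not give.

```python
from pathlib import Path

# Limit stated above; assumed here to apply to each side of the image.
MAX_SIDE_PX = 8400

# Extensions spelled out in the format list above.
LISTED_EXTENSIONS = {
    ".bmp", ".dcx", ".gif", ".jpeg", ".max", ".png",
    ".sim", ".wmp", ".xif", ".xps",
}
# Assumed spellings for formats the list names without an extension.
# FAXserve Fax Document, JBIG2, and JPEG2000 are left out on purpose.
ASSUMED_EXTENSIONS = {".pdf", ".tif", ".tiff", ".jpg"}


def has_listed_extension(filename: str) -> bool:
    """Return True if the file extension matches a listed or assumed format."""
    suffix = Path(filename).suffix.lower()
    return suffix in LISTED_EXTENSIONS | ASSUMED_EXTENSIONS


def within_size_limit(width_px: int, height_px: int) -> bool:
    """Return True if both sides are positive and at most MAX_SIDE_PX."""
    return 0 < width_px <= MAX_SIDE_PX and 0 < height_px <= MAX_SIDE_PX


def precheck(filename: str, width_px: int, height_px: int) -> list[str]:
    """Collect human-readable reasons why a file might be rejected."""
    problems = []
    if not has_listed_extension(filename):
        problems.append(f"{filename}: extension is not in the supported list")
    if not within_size_limit(width_px, height_px):
        problems.append(
            f"{filename}: {width_px} x {height_px} exceeds "
            f"{MAX_SIDE_PX} x {MAX_SIDE_PX} pixels"
        )
    return problems
```

The pixel dimensions have to come from elsewhere, for example from an imaging library such as Pillow (`Image.open(path).size`). An empty list from `precheck` means only that this sketch found no problem, not that the application will accept the file.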
Use the following procedure to run OCR processing on documents.
1. In the List pane, select the check box next to the documents that you want to submit for OCR processing.
2. On the Tools menu, click OCR Processing.
3. By default, OCR processing also runs on documents that already have matching text files. To skip those documents, select the Skip documents with text files check box.
4. Choose the OCR processing options:
   - Embed text in PDF files (process will not update original file): Embeds text in a PDF file, rather than creating a separate text file.
   - Auto-rotate images: Rotates images in 90-degree increments and attempts to align them to their correct upright position.
   - Run spelling checker: Corrects misspellings that are caused when the OCR processor misreads an image. For example, corrects "speli" to "spell."
     Note: Spelling checker does not correct misspellings in the original document.
   - Auto-deskew images: If the image was scanned at an angle, adjusts the angle so that text is aligned horizontally.
   - Despeckle images: Removes spots or shading that occurred during scanning to improve the accuracy of the output.
   - Ignore OCR errors: Continues processing when the application encounters an error. To stop OCR processing when the application encounters an error, clear the check box. In that case, all unprocessed documents are marked "skipped due to error" and must be resubmitted.
   - Enable verbose logging: Sends job processing information to the system. You must select this option to view error information related to the job.
5. Under Recognized languages, select the language of the document that you are scanning:
   - To scan for English, select the check box.
   - To scan for a different language, click English, and then select a language from the list. Select the check box next to the language.
   - To scan for multiple languages, click the Add button. Click None, select a language from the list, and then select the check box next to the language that you added.
   Note: For the best speed and quality, process only one recognized language at a time.
When processing is complete, OCR text is searchable and appears in the Formatted content or Unformatted content view of the document after the next indexing and enrichment job runs. Your administrator can also manually update the index.
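The effect of the Ignore OCR errors option in step 4 can be pictured with a short sketch. This is an illustration of the described behavior, not the application's implementation: `run_batch`, `demo_ocr`, and the "processed" and "error" status strings are hypothetical, and only "skipped due to error" comes from the text above. How the application labels the document that actually failed is not stated, so the "error" label is an assumption.

```python
from typing import Callable, Dict, Iterable


def run_batch(
    documents: Iterable[str],
    process_one: Callable[[str], None],
    ignore_errors: bool = True,
) -> Dict[str, str]:
    """Process documents in order and return a status for each one."""
    docs = list(documents)
    status: Dict[str, str] = {}
    for index, doc in enumerate(docs):
        try:
            process_one(doc)
            status[doc] = "processed"
        except Exception:
            status[doc] = "error"  # assumed label for the failing document
            if not ignore_errors:
                # Option cleared: stop here and mark everything left over.
                for remaining in docs[index + 1:]:
                    status[remaining] = "skipped due to error"
                break
    return status


def demo_ocr(doc: str) -> None:
    """Stand-in for OCR processing; fails on one made-up file name."""
    if doc == "bad.tif":
        raise ValueError("simulated OCR failure")
```

With the option selected (`ignore_errors=True`), documents after the failing one are still processed. With it cleared, they are marked "skipped due to error" and would need to be resubmitted.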