These options are used when converting to PDF with any of the PEERNET built-in converters installed with Document Conversion Service. These settings do not apply to any other output format.
Caution: This feature is not supported on Microsoft® Windows Server 2008 R2 and Microsoft® Windows 7.
Optical Character Recognition (OCR) searches scanned pages or images for text, recognizes the characters, and extracts them as digital text. To recognize text, the OCR engine must know which languages to look for on the page. OCR works by analyzing the patterns, shapes, and curves of the characters on the page and matching them against predefined data for the characters of each language.
OCR increases the processing time for file conversion. Outside factors such as image quality, the font used, and any background image on the pages all affect the accuracy of the OCR results.
These options are used by the following converters:
•Built-in PDF Converter
•Built-in Image Converter
•Built-in Cadd Converter
Table values shown in bold text are the default values for their settings.
Adding Languages

Document Conversion Service comes with files to support recognizing Arabic, English, French, German, Hebrew, Hindi, Italian, and Spanish. You can download additional language files, or complete sets of language files, from Traineddata Files for Tesseract. To add them to Document Conversion Service, copy the desired *.traineddata files into the following folder:
|
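The copy step described above can also be scripted. The following is a minimal Python sketch and is not part of Document Conversion Service: the function name `copy_traineddata` and both folder arguments are hypothetical, chosen for illustration. The target must be the language folder named in this section; the sketch deliberately refuses to create or guess it.

```python
import shutil
from pathlib import Path


def copy_traineddata(source_dir, target_dir):
    """Copy every *.traineddata file from source_dir into target_dir.

    Both arguments are placeholders (assumptions): source_dir is wherever
    the downloaded language files were saved, and target_dir is the
    language folder used by Document Conversion Service.
    Returns the sorted list of file names that were copied.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    if not target.is_dir():
        # Do not create the folder: a missing target usually means a wrong path.
        raise FileNotFoundError(f"Target folder does not exist: {target}")

    copied = []
    for item in sorted(source.glob("*.traineddata")):
        if item.is_file():
            # copy2 keeps the file's timestamps along with its contents.
            shutil.copy2(item, target / item.name)
            copied.append(item.name)
    return copied
```

Only files matching `*.traineddata` are copied, so other files in the download folder are left alone. The returned list can be checked against the languages you intended to add.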