These options are used when converting to PDF with any of the PEERNET built-in converters installed with Document Conversion Service. These settings do not apply to any other output format.
Caution: This feature is not supported on Microsoft® Windows Server 2008 R2 and Microsoft® Windows 7.
Optical Character Recognition (OCR) searches for and recognizes text on scanned pages or images and extracts it as digital text. To recognize text, the OCR engine has to know which languages to look for on the page. OCR works by analyzing the patterns, shapes, and curves of the characters on the page and matching them against predefined data for the characters of each language.
OCR increases the processing time for file conversion. Outside factors such as image quality, the font used, and any background imagery on the pages all affect the accuracy of the OCR results.
These OCR options are used by the following converters:
• Built-in PDF Converter
• Built-in Image Converter
• Built-in Cadd Converter
Table values in bold text are the default values for each setting.
Sample Profile

<?xml version="1.0" encoding="utf-8"?>
<Profile Type="0"
         DisplayName="Adobe PDF OCR Searchable"
         Description="Converts to OCR (searchable) PDF.">
  <Settings>
    <add Name="ConverterPlugIn.PNBuiltinsOCRPDF.Enabled" Value="1"/>
    <add Name="ConverterPlugIn.PNBuiltinsOCRPDF.Languages" Value="eng+fra"/>
    <add Name="ConverterPlugIn.PNBuiltinsOCRPDF.FirstPageOnly" Value="0"/>

    <!-- Output file options -->
    <add Name="Devmode settings;Resolution" Value="300"/>
    <add Name="Save;Output File Format" Value="Adobe PDF Multipaged"/>
    ...
  </Settings>
</Profile>
Code Sample - C#

PNDocConvQueueServiceLib.IPNDocConvQueueItem item;

// Create the conversion item
item = new PNDocConvQueueServiceLib.PNDocConvQueueItem();

// Set conversion settings
item.Set("ConverterPlugIn.PNBuiltinsOCRPDF.Enabled", "1");
item.Set("ConverterPlugIn.PNBuiltinsOCRPDF.Languages", "eng+fra");
item.Set("ConverterPlugIn.PNBuiltinsOCRPDF.FirstPageOnly", "0");
item.Set("Devmode settings;Resolution", "300");

// Convert the file
item.Convert("Cadd - Builtin",
             @"C:\Test\BuildingPlan.dwf",
             @"C:\Test\Out\ConvertedDrawings");
Code Sample - VB.NET

Dim item As PNDocConvQueueServiceLib.IPNDocConvQueueItem

' Create the conversion item
item = New PNDocConvQueueServiceLib.PNDocConvQueueItem()

' Set conversion settings
item.Set("ConverterPlugIn.PNBuiltinsOCRPDF.Enabled", "1")
item.Set("ConverterPlugIn.PNBuiltinsOCRPDF.Languages", "eng+fra")
item.Set("ConverterPlugIn.PNBuiltinsOCRPDF.FirstPageOnly", "0")
item.Set("Devmode settings;Resolution", "300")
item.Set("Save;Output File Format", "Adobe PDF Multipaged")
...

' Convert the file
item.Convert("Cadd - Builtin", _
             "C:\Test\BuildingPlan.dwf", _
             "C:\Test\Out\ConvertedDrawings")
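
The Languages value in the samples above joins language codes with a plus sign ("eng+fra"). As a minimal sketch, the helper below builds that string from a list of codes. OcrLanguages and Build are hypothetical names used for illustration only; they are not part of the Document Conversion Service API. The sketch also assumes that each code matches the base name of an installed *.traineddata file (for example, eng for eng.traineddata).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the PEERNET API.
public static class OcrLanguages
{
    // Joins language codes into the "eng+fra" form shown in the samples.
    // Blank entries are dropped, surrounding whitespace is trimmed, and
    // duplicates (ignoring case) are removed, keeping the first occurrence.
    public static string Build(IEnumerable<string> codes)
    {
        if (codes == null)
            throw new ArgumentNullException(nameof(codes));

        List<string> cleaned = codes
            .Where(c => !string.IsNullOrWhiteSpace(c))
            .Select(c => c.Trim())
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();

        if (cleaned.Count == 0)
            throw new ArgumentException("At least one language code is required.", nameof(codes));

        return string.Join("+", cleaned);
    }
}

// Possible use with the conversion item from the samples:
//   item.Set("ConverterPlugIn.PNBuiltinsOCRPDF.Languages",
//            OcrLanguages.Build(new[] { "eng", "fra" }));
```

Building the value from a list keeps the separator in one place and avoids stray or doubled plus signs when the set of languages comes from configuration.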
Adding Languages

Document Conversion Service comes with files to support recognizing Arabic, English, French, German, Hebrew, Hindi, Italian, and Spanish. You can download additional language files, or complete sets of language files, from Traineddata Files for Tesseract. To add them to Document Conversion Service, copy the desired *.traineddata files into the following folder:
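
The copy step can also be scripted. The following is a minimal sketch, assuming the downloaded files sit together in one folder. TrainedDataInstaller and CopyLanguageFiles are hypothetical names, and the destination is passed in as a parameter because the Document Conversion Service language folder is not reproduced here. The sketch does not verify that the service picks up new files without further steps.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the PEERNET API.
public static class TrainedDataInstaller
{
    // Copies every *.traineddata file from sourceFolder into languageFolder.
    // Existing files are skipped unless overwrite is true.
    // Returns the language codes (file names without extension) that were copied.
    public static List<string> CopyLanguageFiles(string sourceFolder,
                                                 string languageFolder,
                                                 bool overwrite = false)
    {
        if (!Directory.Exists(sourceFolder))
            throw new DirectoryNotFoundException(sourceFolder);
        if (!Directory.Exists(languageFolder))
            throw new DirectoryNotFoundException(languageFolder);

        var copied = new List<string>();
        foreach (string file in Directory.GetFiles(sourceFolder, "*.traineddata"))
        {
            string target = Path.Combine(languageFolder, Path.GetFileName(file));
            if (File.Exists(target) && !overwrite)
                continue; // leave the installed file untouched

            File.Copy(file, target, overwrite);
            copied.Add(Path.GetFileNameWithoutExtension(file));
        }
        return copied;
    }
}
```

The returned codes are the names that would then appear in the Languages setting, for example por after copying por.traineddata.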