WIN32: OCR components
OCR stands for Optical Character Recognition: the recognition of characters in an image. Recognition is based on patterns and probabilities. You can use this function, for instance, if a control's text cannot be identified via the technical properties of the object.
Because both the quality of the source material and the technical constraints vary, this method is not guaranteed to succeed.
- Enable the OCR component in the Settings dialog (see setting "UseOCR").
- In Tosca Wizard, use the property OCRText to identify the control. Copy the property from the Control Properties tab and define it in the Automation Properties tab as the Technical ID (see chapter "Control Properties").
OCRText property in Tosca Wizard
Configure the OCR settings if necessary. You can find them in the Settings dialog at Settings->Engine->OCR->Tesseract, or Settings->Engine->OCR->Textract (see chapter "Settings - OCR").
Every configuration parameter can be overridden by a parameter in the Objectmap of a control. To do so, create a new parameter whose name consists of the prefix OCR followed by the name of the corresponding configuration parameter. Its value overrides the value that has been specified in the Settings dialog.
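The precedence rule for control-level parameters can be illustrated with a short Python sketch. This is not Tosca code and does not use any Tosca API: the function `effective_ocr_settings` and the setting names `Language` and `Threshold` are hypothetical and serve only to show how a parameter with the prefix OCR takes priority over the value from the Settings dialog for a single control.

```python
OCR_PREFIX = "OCR"


def effective_ocr_settings(global_settings, control_parameters):
    """Return the OCR settings that apply to one control.

    global_settings: values from the Settings dialog (hypothetical names).
    control_parameters: parameters defined on the control in the Objectmap.
    """
    # Start from a copy so the global settings stay unchanged.
    effective = dict(global_settings)
    for name, value in control_parameters.items():
        if not name.startswith(OCR_PREFIX):
            continue  # not an OCR override, ignore it
        setting_name = name[len(OCR_PREFIX):]
        # Only override parameters that exist as configuration parameters.
        if setting_name in effective:
            effective[setting_name] = value
    return effective


# Hypothetical values, for illustration only.
global_settings = {"Language": "eng", "Threshold": "50"}
control_parameters = {"OCRLanguage": "deu", "Timeout": "10"}

result = effective_ocr_settings(global_settings, control_parameters)
print(result)  # {'Language': 'deu', 'Threshold': '50'}
```

In this sketch the override affects only the one control. The values in `global_settings` are left as they are and continue to apply to every control that has no matching parameter.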
Under certain circumstances, it may be necessary to train Tesseract to enhance character recognition.
In the illustration below, the OK button may cause recognition problems if it is recognized as GK. Training can help Tesseract recognize it correctly.
Problematic button
Please contact Tricentis Support for further details on Tesseract training.