Compare PDF files

The PDF Engine 3.0 lets you compare two PDF files with the 1:1 Compare Module. The Module performs a visual comparison between two files you specify:

  • If the similarity between the two files is above the specified accuracy, the TestCase passes.

  • If the similarity between the two files is below the specified accuracy, the TestCase fails.
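The pass/fail rule above can be sketched in a few lines of Python. This is an illustration only, not how the PDF Engine is implemented: the function name `comparison_passes` is hypothetical, and treating a similarity exactly equal to the threshold as a pass is an assumption that the rule above does not settle.

```python
def comparison_passes(similarity_percent: float, accuracy_percent: float) -> bool:
    """Return True if the measured similarity meets the accuracy threshold.

    Assumption: a similarity exactly equal to the threshold counts as a pass.
    """
    if not 0.0 <= accuracy_percent <= 100.0:
        raise ValueError("accuracy must be between 0 and 100")
    return similarity_percent >= accuracy_percent


# With a 90 percent threshold, 95 percent similarity passes and 85.5 percent fails.
print(comparison_passes(95.0, 90.0))
print(comparison_passes(85.5, 90.0))
```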

You can also customize your comparison to exclude specific areas, text (including regular expressions), or entire pages. Once the comparison is finished, you can view a detailed comparison report to pinpoint the smallest differences between two PDF files.

Work with the comparison

To compare two PDF files, follow these steps:

  1. Create a TestCase from the 1:1 Compare Module.

  2. Tailor the comparison to your needs with the provided TestStepValues:

    • Reference PDF and Target PDF let you specify the path to the files you wish to compare.

      If your files are password-protected, specify their passwords in the Reference PDF Password and Target PDF Password TestStepValues.

    • Accuracy [%] lets you specify the comparison similarity threshold.

    • Text-only Comparison lets you specify whether to only compare text contents between files.

    • Excluded Pages, Excluded Areas, and Excluded Text let you ignore entire pages, areas, or text patterns during the comparison.

  3. Execute your TestCase.

  4. View the comparison report.

In this example, you compare the file base.pdf to the file target.pdf.

You specify the password for each file.

You expect that the files should be at least 90 percent similar.

You exclude page 2 and pages 5 to 8 from the comparison.

Compare two PDF files
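To make the page exclusion in this example concrete, here is a minimal Python sketch of which pages remain in the comparison. The notation `"2, 5-8"`, the function names, and the 10-page document length are assumptions made for illustration; they are not taken from the Module itself.

```python
def parse_page_spec(spec: str) -> set[int]:
    """Expand a hypothetical page spec such as "2, 5-8" into a set of page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid page range: {part}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return pages


def pages_to_compare(total_pages: int, excluded: set[int]) -> list[int]:
    """Return the 1-based page numbers that are still compared."""
    return [page for page in range(1, total_pages + 1) if page not in excluded]


excluded = parse_page_spec("2, 5-8")
print(pages_to_compare(10, excluded))  # prints [1, 3, 4, 9, 10]
```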

The comparison result shows that the files don't have the minimum similarity that you have specified.

Consequently, your TestCase fails.

Failed PDF comparison

Exclude areas from the comparison

You can exclude specific areas from the comparison to narrow down your comparison to areas of interest. This is especially useful if you only need to compare specific page elements, instead of entire pages.

To exclude areas from the comparison, follow these steps:

  1. If you haven't yet, create a TestCase from the 1:1 Compare Module.

  2. Specify one or more areas to exclude:

    • Scan the source PDF file with PDF Scan.

    • Select Text, Image, or Table in the ribbon menu and highlight one or more areas you want to exclude.

    • Select Save and Close to return to Tosca.

Highlight areas you want to exclude with the PDF Scan

  3. Get the dimensions of the areas you want to exclude:

    • Go to the Module you have just created with the PDF Scan in the Modules tab of Tosca Commander.

    • Right-click the ModuleAttribute that corresponds to one of the areas you have highlighted and select Copy Dimensions.

    • Paste the dimensions somewhere handy. You need them for the next step.

Copy the dimensions of the areas you want to exclude

  4. Specify the areas you want to exclude in your tests:

    • Go to the TestCase you created from the 1:1 Compare Module.

    • Paste the dimensions of the area you want to exclude in the Areas to Exclude > Area to Exclude > Dimensions TestStepValue.

    • Use the Areas to Exclude > Area to Exclude > Page(s) TestStepValue to specify the pages of the document on which the area is excluded.

    • For multiple excluded areas in the same document, repeat the previous steps as needed with additional Areas to Exclude > Area to Exclude TestStepValues.

  5. Execute your TestCase.

In this example, you compare two PDF files and expect 100% similarity between them. Moreover, you exclude three areas from the comparison:

  • The first area on pages 1, 2, 3, and 4.

  • The second area on pages 3, 4, 5, and 6.

  • The third area on every page of the document.

Compare two PDF files with excluded areas
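The following Python sketch models the three excluded areas from this example and shows which of them apply on a given page. It is only a way to reason about overlapping page assignments: the `ExcludedArea` class, its fields, and the use of `None` for "every page" are hypothetical, and the copied dimensions are left out because their format is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExcludedArea:
    label: str
    pages: Optional[frozenset[int]]  # None means "every page" in this sketch

    def applies_to(self, page: int) -> bool:
        return self.pages is None or page in self.pages


AREAS = [
    ExcludedArea("first area", frozenset({1, 2, 3, 4})),
    ExcludedArea("second area", frozenset({3, 4, 5, 6})),
    ExcludedArea("third area", None),
]


def excluded_on_page(page: int) -> list[str]:
    """Return the labels of all areas that are excluded on the given page."""
    return [area.label for area in AREAS if area.applies_to(page)]


for page in (1, 3, 7):
    print(page, excluded_on_page(page))
```

On pages 3 and 4 all three areas are excluded at once, while from page 7 onward only the third area applies.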

Exclude text patterns from the comparison

You can exclude text patterns from the comparison to narrow down your comparison to specific contents only. This is especially useful if you want to ignore unique strings of text or patterns during the comparison.

To exclude text patterns from the comparison, follow these steps:

  1. If you haven't yet, create a TestCase from the 1:1 Compare Module.

  2. Go to the TestCase you created and set the following TestStepValues:

    • Text to Exclude > Pattern to Exclude > Pattern, to specify the string of text that you wish to exclude from the comparison. You can use regular expressions to specify unique patterns. We recommend you use RegExr to verify your regular expressions.

      Note that if you use special characters in this pattern but intend to use them as ordinary characters, you have to escape them.

    • Text to Exclude > Pattern to Exclude > Use Regex, to specify whether the Pattern contains regular expressions.

  3. Execute your TestCase.
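The escaping rule from step 2 is easy to get wrong, so here is a short illustration using Python's `re` module. The sample text and pattern are invented for this sketch, and the regular expression flavor that the PDF Engine uses may differ from Python's in some details, so verify your final pattern separately.

```python
import re

# Hypothetical text as it might appear in a PDF.
text = "Price (net): 10.00"

# Unescaped, the parentheses form a group, so this pattern looks for "Price net".
unescaped = "Price (net)"
print(re.search(unescaped, text))  # prints None

# Escaped by hand, the parentheses are treated as ordinary characters.
escaped = r"Price \(net\)"
print(re.search(escaped, text).group(0))  # prints Price (net)

# re.escape produces an equivalent escaped pattern automatically.
auto_escaped = re.escape("Price (net)")
print(re.search(auto_escaped, text).group(0))  # prints Price (net)
```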

In this example, you only compare the text between two PDF files, expecting 100% similarity. Moreover, you exclude four different text patterns from the comparison:

  • The phrases "Good morning," and "Good afternoon,".

  • The sentence "Your assigned category is:", followed by any letter between A and Z.

  • The sentence "Your assigned token is:", followed by any combination of letters and numbers, followed by a line break with carriage return.

  • The word Tricentis.
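One possible way to write these patterns as regular expressions is sketched below in Python. The exact expressions, the single space after each colon, and the sample texts are assumptions made for illustration; the sketch only shows that two texts which differ solely in the excluded parts become identical once those parts are removed.

```python
import re

# Hypothetical regular expressions for the patterns listed above.
PATTERNS = [
    r"Good (?:morning|afternoon),",
    r"Your assigned category is: [A-Z]",
    r"Your assigned token is: [A-Za-z0-9]+\r\n",
    r"Tricentis",
]


def strip_excluded(text: str) -> str:
    """Remove every excluded pattern from the text."""
    for pattern in PATTERNS:
        text = re.sub(pattern, "", text)
    return text


reference = (
    "Good morning, Ana.\n"
    "Your assigned category is: B\n"
    "Your assigned token is: a1B2c3\r\n"
    "Powered by Tricentis"
)
target = (
    "Good afternoon, Ana.\n"
    "Your assigned category is: K\n"
    "Your assigned token is: Zz99\r\n"
    "Powered by Tricentis"
)

print(strip_excluded(reference) == strip_excluded(target))  # prints True
```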

View the comparison report

You can view a detailed comparison report to help you understand the comparison results. The report uses color coding to indicate differences, such as additions or deletions. This helps you pinpoint even the smallest differences between two PDF files.

Visualize the PDF Compare Report

To view the comparison report, right-click the TestCaseLogEntry of your comparison in your test results and select View PDF Compare Report. Note that you can't view the report if you run your comparison test in the Scratchbook.

Tricentis Tosca automatically stores pages that don't match during the comparison as BMP files in the %TRICENTIS_PROJECTS%\PdfCompareFailedPages directory.
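If you want to collect these images after a run, for example to attach them to a defect, a small script can list them. The Python sketch below assumes that the TRICENTIS_PROJECTS environment variable is set on the machine where the script runs; the function name `failed_page_images` is hypothetical, and no assumption is made about how the BMP files are named.

```python
import os
from pathlib import Path
from typing import Optional


def failed_page_images(projects_dir: Optional[str] = None) -> list[Path]:
    """List BMP files in the PdfCompareFailedPages directory, sorted by path."""
    base = projects_dir or os.environ.get("TRICENTIS_PROJECTS")
    if not base:
        return []  # Variable not set, nothing to list.
    folder = Path(base) / "PdfCompareFailedPages"
    if not folder.is_dir():
        return []  # No failed pages have been stored yet.
    return sorted(p for p in folder.iterdir() if p.suffix.lower() == ".bmp")


for image in failed_page_images():
    print(image)
```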