Data Integrity Reconciliation testing

Use Reconciliation tests if you want to perform an in-depth comparison of each individual source row with each target row. While aggregated tests are faster, Reconciliation tests are very precise. They provide you with a detailed report that identifies cell differences in two datasets. However, depending on the size of your datasets, this comparison can be time and resource consuming.

Types of datasets you can use

For Reconciliation tests, you can use the following types of datasets:

  • Database tables

  • Files - including files on a Hadoop system or on a Linux/Unix environment - connected via SSH

  • SSAS OLAP cubes

  • Other sources - such as Microsoft® Excel files - if you have installed an appropriate ODBC driver

  • Other sources - such as data that resides on a cloud platform - if you use the Preview: Tosca Data Integrity Agent that supports JDBC connections and runs on Windows and Linux

  • Custom data sources if you have a custom data source reader

Use Reconciliation tests

The main goal of reconciliation testing is to confirm that the source data matches the target data. Ideally, source and target are at minimum similar, both in content and in sorting. Tosca Data Integrity then helps you to find mismatches.

Reasons for mismatches

A mismatch means that the source data doesn't match the target data. Possible reasons for mismatches are:

  • A row in the source dataset is not present in the target dataset.

  • A row in the target dataset is not present in the source dataset.

  • A source row matches a target row by Row Key, but not on all other columns.

Work with a Row Key

You can specify a Row Key that identifies a row uniquely. Tosca DI uses Row Keys to find matching rows in source and target.

For detailed information on how to work with Row Keys, see chapter "Specify a Row Key".

How the Reconciliation test works

The system processes source and target from top to bottom. It takes the first source row and tries to find a matching Row Key within the target.

Once the system has found a matching Row Key in the target, it matches the rest of the row data. If both Row Key and row data match, Tosca DI considers it a true match.

If the Row Key matches but the data doesn't, the system reports an error for the source and target row.

Handle duplicate rows

By default, Tosca DI takes a source row and searches for a match in the target. It concludes the Row by Row Comparison once all source rows have been processed.

However, in some cases, the target may contain additional rows, for instance duplicate rows. By default, Tosca DI ignores them.

If you want to report on unmatched target rows, such as duplicates, enter the value True for the XTestStepValue Report Unmatched Target Rows.

Best practices for Reconciliation tests

Tricentis recommends that you follow the best practices outlined below to ensure that your Reconciliation tests run smoothly.

Make sure that source and target are sorted in the same way.

This speeds up the comparison and saves memory (RAM).

Define a useful Row Key.

See our example in chapter "Specify a Row Key" which illustrates the importance of a useful Row Key.

If you also want to test transformations, add the transformation logic to your SQL statement.

This allows you to perform the transformation and the comparison within the same test.

Get started