Data Integrity Reconciliation testing
Use Reconciliation tests if you want to perform an in-depth comparison of each individual source row with each target row. While aggregated tests are faster, Reconciliation tests are very precise. They provide you with a detailed report that identifies cell differences in two datasets. However, depending on the size of your datasets, this comparison can be time and resource consuming.
Types of datasets you can use
Reconciliation tests allow you to test the following types of datasets:
-
Any database connected to ODBC or JDBC.
-
Various file types, for example CSV, JSON, Avro, or Parquet. These files can be local or on blob storage.
-
If you have installed an appropriate ODBC driver, you can test other sources such as Microsoft Excel files.
-
If you use the Preview: Data Integrity Agent that supports JDBC connections, you can test data that resides on a cloud platform.
-
Other custom data sources if you have a custom data source reader.
Use Reconciliation tests
The main goal of reconciliation testing is to confirm that the source data matches the target data. Ideally, source and target are at minimum similar, both in content and in sorting. Data Integrity then helps you to find mismatches.
Reasons for mismatches
A mismatch means that the source data doesn't match the target data. Possible reasons for mismatches are:
-
A row in the source dataset is not present in the target dataset.
-
A row in the target dataset is not present in the source dataset.
-
A source row matches a target row by Row Key, but not on all other columns.
Work with a Row Key
You can specify a Row Key that identifies a row uniquely. Data Integrity uses Row Keys to find matching rows in source and target.
Click here for detailed information on how to work with Row Keys.
How the Reconciliation test works
The system processes source and target from top to bottom. It takes the first source row and tries to find a matching Row Key within the target.
Once the system has found a matching Row Key in the target, it matches the rest of the row data. If both Row Key and row data match, Data Integrity considers it a true match.
If the Row Key matches but the data doesn't, the system reports an error for the source and target row.
Handle duplicate rows
By default, Data Integrity takes a source row and searches for a match in the target. It concludes the Row by Row Comparison once all source rows have been processed.
However, in some cases, the target may contain additional rows, for instance duplicate rows. By default, Data Integrity ignores them.
If you want to report on unmatched target rows, such as duplicates, enter the value True for the TestStepValue Report Unmatched Target Rows.
Best practices for Reconciliation tests
Tricentis recommends that you follow the best practices outlined below to ensure that your Reconciliation tests run smoothly.
|
Make sure that source and target are sorted in the same way. This speeds up the comparison and saves memory (RAM). |
|
Define a useful Row Key. This example illustrates the importance of a useful Row Key. |
|
If you also want to test transformations, add the transformation logic to your SQL statement. This allows you to perform the transformation and the comparison within the same test. |
Get started
-
Learn how to use the Row by Row Comparison (SSAS OLAP) Wizard, if you have data in SQL Server Analysis Services OLAP cubes.
-
Learn how to run a comparison on a Preview: Data Integrity Agent that is installed either locally or on a remote computer.