Data Integrity - FAQ
This chapter provides answers to the most frequently asked questions about Tosca Data Integrity.
Out of the box, Tosca Data Integrity supports all databases for which an ODBC driver is available.
If you use JDBC drivers, you have to run your tests with the preview feature Tosca Data Integrity Agent.
Tricentis recommends that you use the latest ODBC driver that is compatible with the OS and database.
Tosca Data Integrity supports both 64-bit and 32-bit ODBC drivers.
Note that Tosca Data Integrity only uses ODBC drivers to access the data source. The ODBC driver capability decides what Tosca Data Integrity can support. For example, Oracle ODBC drivers don't support multiple SQL statements.
For Row by Row Comparison, Tosca Data Integrity supports both Windows and Linux.
For Windows, use ODBC drivers. For Linux, use the Data Integrity Agent with JDBC drivers, file, or SSH connections.
DI licenses are consumed if you have enabled the Data Integrity AddIn and start Tosca Commander.
Whenever you execute a DI Module, the system performs a license check and confirms that the license is still valid.
DI licenses are also consumed whenever you run tests that contain DI Modules without starting the GUI, that is, Tosca Commander. This applies, for example, if you use TC-Shell, TC API, or Tosca Distributed Execution instead.
The Tosca Data Integrity Agent requires a DI Designer license or DI Execution Only license depending on how the test is triggered.
For further information on licenses, contact your Tricentis sales person.
You only need Data Integrity Parallel Execution licenses if you want to run Data Integrity tests in parallel, that is, at the same time.
For further information on licenses, contact your Tricentis sales person.
The Row by Row Comparison algorithm applies the following principles:
- The algorithm uses a unique identifier called Row Key to compare rows.
- You can specify the Row Key. It can be one or more columns.
- If you specify a Row Key, the algorithm uses it in the source dataset to find the corresponding row in the target dataset. Then it compares all remaining columns with each other.
- If you don't specify a Row Key, the algorithm uses the entire row as identifier.
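The principles above can be illustrated with a minimal sketch. It is not Tosca Data Integrity's implementation: the function name compare_rows, the dictionary-based rows, and the wording of the findings are hypothetical, and the sketch assumes that each key occurs only once per dataset.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, object]
Finding = Tuple[tuple, str]


def compare_rows(
    source: List[Row],
    target: List[Row],
    row_key: Optional[Sequence[str]] = None,
) -> List[Finding]:
    """Compare source and target; return (key, finding) pairs for mismatches."""
    first = source or target
    columns = list(first[0].keys()) if first else []
    # Without a Row Key, the entire row acts as the identifier.
    key_cols = list(row_key) if row_key else columns

    def key_of(row: Row) -> tuple:
        return tuple(row[c] for c in key_cols)

    # Index the target by key (assumption: keys are unique in this sketch).
    target_by_key = {key_of(r): r for r in target}
    findings: List[Finding] = []
    for src in source:
        k = key_of(src)
        tgt = target_by_key.pop(k, None)
        if tgt is None:
            findings.append((k, "missing in target"))
            continue
        # Compare all remaining (non-key) columns with each other.
        diff = [c for c in columns if c not in key_cols and src[c] != tgt[c]]
        if diff:
            findings.append((k, "differs in: " + ", ".join(diff)))
    # Whatever is left in the index never matched a source row.
    findings.extend((k, "missing in source") for k in target_by_key)
    return findings
```

With a Row Key such as ["id"], a changed cell is reported as a difference in that column. Without a Row Key, the same change makes the two rows look like unrelated rows, one missing on each side.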
To achieve the best results, follow the best practices outlined below:
- Select a Row Key that uniquely identifies a row. Otherwise the algorithm potentially matches the wrong rows as it doesn't do fuzzy matching and processes source and target sequentially.
- Ensure that source and target are sorted in the same way. The algorithm reads chunks of source and target and immediately compares them. Correct sorting speeds up the comparison and saves memory (RAM).
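To show why identical sorting helps, here is a sketch of a sequential comparison that only ever holds one row from each side in memory. It is an illustration under assumptions, not the product's algorithm: merge_compare and its status strings are hypothetical, rows are plain tuples, and both inputs are assumed to be sorted ascending by a unique key.

```python
from typing import Callable, Iterable, Iterator, Tuple


def merge_compare(
    source: Iterable[tuple],
    target: Iterable[tuple],
    key: Callable[[tuple], object],
) -> Iterator[Tuple[str, object]]:
    """Yield (status, key) for every mismatch between two sorted row streams."""
    src_it, tgt_it = iter(source), iter(target)
    src, tgt = next(src_it, None), next(tgt_it, None)
    while src is not None and tgt is not None:
        ks, kt = key(src), key(tgt)
        if ks == kt:
            # Same key on both sides: compare the rows, then advance both.
            if src != tgt:
                yield ("different", ks)
            src, tgt = next(src_it, None), next(tgt_it, None)
        elif ks < kt:
            # The target has already moved past this key.
            yield ("missing in target", ks)
            src = next(src_it, None)
        else:
            yield ("missing in source", kt)
            tgt = next(tgt_it, None)
    # Drain whichever side still has rows.
    while src is not None:
        yield ("missing in target", key(src))
        src = next(src_it, None)
    while tgt is not None:
        yield ("missing in source", key(tgt))
        tgt = next(tgt_it, None)
```

If the two inputs were sorted differently, this kind of streaming comparison would report false mismatches or would have to buffer unmatched rows, which costs time and memory.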
If a Row Key is a duplicate and not unique, the algorithm performs automatic matching based on the remaining columns.
If automatic matching doesn't provide an exact match, the algorithm provides the most likely match based on the number of cells that match.
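The idea of a most likely match can be sketched as a simple score: count the cells that are equal and pick the candidate with the highest count. The function best_match below is hypothetical, and the tie-breaking rule (first candidate wins) is an assumption, not documented product behavior.

```python
from typing import Sequence


def best_match(source_row: Sequence[object], candidates: Sequence[Sequence[object]]) -> int:
    """Return the index of the candidate that shares the most cells with source_row."""
    if not candidates:
        raise ValueError("no candidate rows to match against")

    def score(candidate: Sequence[object]) -> int:
        # Number of positions where both rows hold the same value.
        return sum(1 for a, b in zip(source_row, candidate) if a == b)

    # max() keeps the first index when several candidates share the top score.
    return max(range(len(candidates)), key=lambda i: score(candidates[i]))
```

An exact match always wins because it has the highest possible score, which mirrors the order described above: exact matching first, most likely match as the fallback.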
By default, Tosca Data Integrity tries to determine the total number of rows from the source before it starts the comparison. This allows Tosca Data Integrity to display the remaining time.
If this process takes too long or you don't need the time frame, you can skip the row count.
The row count only provides more detailed information on the execution progress. It has no influence on the comparison itself.
Tosca Data Integrity offers multiple execution options. Ultimately, your setup determines which execution is suitable and most advantageous for you.
The list below gives an overview of the advantages and disadvantages of specific execution options:

Run local Tosca TBox execution and access data through ODBC
Advantages:
- Execution runs closer to data
- All execution happens on one machine
- Fast data access for local data
Disadvantages:
- Might be slow when you compare large remote data sets
- Only runs on Windows and not on Linux

Run local Tosca TBox execution with local Tosca Data Integrity Agent (JDBC)
Advantages:
- Execution runs closer to data
- All execution happens on one machine
- Fast data access for local data
- Can use JDBC driver instead of ODBC driver
Disadvantages:
- You must install an additional Tosca Data Integrity Agent for JDBC
- You must start the Tosca Data Integrity Agent
- Only runs on Windows since Tosca Commander only runs on Windows

Run local Tosca TBox execution with remote Tosca Data Integrity Agent (JDBC) close to the data source
Advantages:
- Faster comparison speed since the execution happens close to the data
- Runs on Windows and Linux
Disadvantages:
- You must install an additional Tosca Data Integrity Agent for JDBC
- You must start the Tosca Data Integrity Agent
- Difficult orchestration and scaling since the TestStep must know the Agent IP

Run local Tosca TBox execution through Tosca Distributed Execution (DEX), DI execution runs on DEX Agent
Advantages:
- Orchestration is available
- Requires no Tosca Data Integrity Agent (JDBC) when you run your execution on Windows with ODBC
Disadvantages:
- Can't run on Linux
To run your tests with the Data Integrity Agent, you have to specify the class name of your JDBC driver in one of the TestSteps (see chapter "JDBC connection"). If you don't know the class name, you can find it in the Java Archive (JAR) file of your JDBC driver.
See the examples below to learn how to find a JDBC driver class name in two different ways.
Example 1: Find the class name in the manifest file
This example uses the manifest file to find the driver class name. The manifest is a text file that contains essential information about a JAR file's content. You have to perform the following steps:
- To extract the content of your JAR file, rename it by adding the extension .zip. For example, rename your file from ojdbc7.jar to ojdbc7.jar.zip.
- Double-click the ZIP file and extract the content.
- Search the extracted content for a file called MANIFEST.MF.
- Open the manifest file and look for the Main-Class, which should specify the class name. In this example, the file states Main-Class: oracle.jdbc.OracleDriver.
The driver class name is oracle.jdbc.OracleDriver. You can now use this information in the TestStep.
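If you prefer a script to the manual steps, the same lookup can be sketched in Python. A JAR file is a ZIP archive, so the standard zipfile module can read it without renaming. The function name main_class_from_manifest is hypothetical, and the sketch assumes the manifest sits at the standard location META-INF/MANIFEST.MF.

```python
import zipfile
from typing import Optional


def main_class_from_manifest(jar_path: str) -> Optional[str]:
    """Return the Main-Class entry of a JAR manifest, or None if there is none."""
    with zipfile.ZipFile(jar_path) as jar:
        try:
            manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return None  # the archive has no manifest
    # Manifest lines longer than 72 bytes continue on the next line,
    # which then starts with a single space. Join them first.
    unfolded = manifest.replace("\r\n", "\n").replace("\n ", "")
    for line in unfolded.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "main-class":
            return value.strip()
    return None
```

For a manifest like the one in this example, a call such as main_class_from_manifest("ojdbc7.jar") would return the value that follows Main-Class. A return value of None means the manifest doesn't name a Main-Class.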
If the class name isn't specified in the Main-Class, try out the steps in the next example.
Example 2: Find the class name in the META-INF/services folder
This example uses the META-INF/services folder to find the driver class name. This folder contains information by the service provider. You have to perform the following steps:
- To extract the content of your JAR file, rename it by adding the extension .zip. In this example, you rename your file from mssql-jdbc-8.2.2.jre11.jar to mssql-jdbc-8.2.2.jre11.jar.zip.
- Double-click the ZIP file and extract the content.
- Search the extracted content for a folder called META-INF\services.
- In this folder, open the file java.sql.Driver, which should contain the class name. In this example, the file states: com.microsoft.sqlserver.jdbc.SQLServerDriver.
The driver class name is com.microsoft.sqlserver.jdbc.SQLServerDriver. You can now use this information in the TestStep.
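The lookup in the META-INF/services folder can also be scripted. The sketch below reads the file java.sql.Driver directly from the JAR with Python's zipfile module. The function name driver_classes_from_services is hypothetical. The sketch assumes the file follows the usual service provider format: one class name per line, with optional comments that start with #.

```python
import zipfile
from typing import List


def driver_classes_from_services(jar_path: str) -> List[str]:
    """Return the class names listed in META-INF/services/java.sql.Driver."""
    with zipfile.ZipFile(jar_path) as jar:
        try:
            content = jar.read("META-INF/services/java.sql.Driver").decode("utf-8")
        except KeyError:
            return []  # the JAR doesn't register a JDBC driver this way
    classes = []
    for line in content.splitlines():
        # Drop comments and surrounding whitespace, skip blank lines.
        name = line.split("#", 1)[0].strip()
        if name:
            classes.append(name)
    return classes
```

The function returns a list because the file can name more than one class. An empty list means the JAR contains no such file, in which case the manifest from Example 1 is the other place to look.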