Text Files

All operations in this section describe how to create different tests for text files.

Load your text files.
Run your tests on the files. Upon testing, Tosca loads text files into an SQLite database.
Run Pre-Screening tests on your text files to prevent loading the wrong data.

The following Modules are included in the Tosca BI Modules And Samples.tsu subset and located in the folder Modules->BI - DWH/BigData:

File Load into Caching Database
Defined File Tests against Caching Database
Complete Row by Row Comparison
Metadata Comparison
JSON/XML File Load into Caching Database

File Load into caching database

Use the File Load into caching database Module to load the content of a text file into an SQLite database. You can load multiple files with the File Load into caching database Module.

Configure the path of the SQLite caching database in the Settings dialog in Tosca Commander (see chapter "Settings - Tosca BI").

This example loads the file SalesLT.Address.SemiColon.txt into the table Address.

The columns of the file are separated by a semicolon.

The first row contains the header names.

Loading character-separated file

The File Load into caching database Module contains the following attributes:

ModuleAttribute	Description	Optional
Table Name	The name of the SQLite table into which Tosca BI loads the data.
Keep Existing Table Entries	If set to True, the data of the text file is added to the existing table. Otherwise, the table is dropped before loading new data.	X
File Name	Full file path and file name of the text file you want to load. You can use wildcards, if needed. For instance, you could enter C:\temp\myExcelFile*.csv.
File Mode	Defines whether the columns have a fixed length or are separated by commas.
Column Delimiter/Length Info	Indicates the starting point for each column in a comma-separated list or the separator symbol in character-separated files.
Columns To Process	Allows you to specify a semicolon-separated list of columns which should be loaded. By default, Tosca BI loads all columns.
Column Renaming - <Current Name>	Allows you to rename columns. The value in the Name column indicates the column to rename, the value in the Value column indicates the new name.
Header Row	Indicates with True whether the file has a header row. By default, there is no header row. In case the header is on a different row, specify the row number using #<n> instead of using True. Only single-row headers are supported.
Skip Lines Starting With	Allows you to specify a semicolon-separated list of values that are skipped.	X
Only Lines Starting With	Allows you to define a semicolon-separated list of characters that indicate a valid row. E.g.: process lines that start with _,- or < are specified as _;-;<	X
Cell Settings - All Columns - Option	Applies one of the actions below to all columns.	X
Cell Settings - Single Columns - <Name>	Applies one of the actions below to a specified column. To specify a column, replace <Name> with the column name.	X
Load Error Behavior - Ignore Load Errors	Defines whether load errors are ignored. The default value is False.	X
Load Error Behavior - Max Errors	Maximum number of errors before Tosca aborts execution. The default value is 100.	X
Load Error Behavior - File Name	Errors are logged to this file. A full path and file name are required. An existing file of the same name is overridden. By default, load errors are not logged.	X

Available actions for Cell Settings

The following actions are available for Cell Settings - All Columns and Cell Settings - Single Columns:

Action	Description
Trim	Removes all leading and trailing whitespace characters. Default scope: HeaderAndData
Trim[<character>]	Removes all leading and trailing occurrences of the specified character. Default scope: HeaderAndData Replace <character> with the character you want to remove. To remove a ", enter the " four times, e.g. Trim[""""].
TrimStart	Removes all leading whitespace characters. Default scope: HeaderAndData
TrimStart[<character>]	Removes all leading occurrences of the specified character. Default scope: HeaderAndData Replace <character> with the character you want to remove. To remove a ", enter the " four times, e.g. Trim[""""].
TrimEnd	Removes all trailing whitespace characters. Default scope: HeaderAndData
TrimEnd[<character>]	Removes all trailing occurrences of the specified character. Default scope: HeaderAndData Replace <character> with the character you want to remove. To remove a ", enter the " four times, e.g. Trim[""""].
Replace[<search string>][<replace string>]	Replaces all occurrences of the first string with the second string. Default scope: Data
Substring[<start index>]	Extracts a part of a longer string. The extract starts at the defined start index position and runs until the end of the string. Default scope: Data For example: Substring[9] with input Project Manager returns Manager
Substring[<start index>][<length>]	Extracts a part of a longer string. The extract starts at the defined start index position and contains the number of characters specified in length. Default scope: Data For example: Substring[9][3] with input Project Manager returns Man
Right[<length>]	Extracts a part of a longer string. The extract runs from the end of the string towards the beginning and contains the number of characters specified in length. Default scope: Data For example: Right[7] with input Project Manager returns Manager
Lowercase	Converts the string to lowercase using the currently active locale. Default scope: Data For example: Lowercase with input Project Manager returns project manager
Lowercase[Culture:<culture name>]	Converts the string to all lowercase. Lowercase[Culture:<culture name>] uses the given culture name to create a new locale. Default scope: Data For example: Lowercase[Culture:zh-Hans] uses the culture information of "Chinese(simplified)" to convert uppercase to lowercase characters.
Uppercase	Converts the string to uppercase using the currently active locale. Default scope: Data For example: Uppercase with input Project Manager returns PROJECT MANAGER
Uppercase[Culture:<culture name>]	Converts the string to all uppercase. Uppercase[Culture:<culture name>] uses the given culture name to create a new locale. Default scope: Data For example: Lowercase[Culture:en-us] uses the culture information of "English - United States" to convert lowercase to uppercase characters.

To change the default scope of an action, add a Scope parameter:

To apply the action to the header row only, add the parameter [Scope:Header]
To apply the action to all data rows but not the header, add the parameter [Scope:Data]
To apply the action to the header and all data rows, add the parameter [Scope:HeaderAndData]

We want to replace all instances of the string CustomerDataAustria with the string CustomerDataUSA. This change should be applied to the header and all data rows.

To do so, we define the following action:

Replace[<CustomerDataAustria>][<CustomerDataUSA>][Scope:HeaderAndData]

Specify which File Load into caching database Module to use in your tests

If you have more than one File Load into caching database Module in your workspace, you need to specify which Module should be used in your tests. Otherwise, the system uses the first Module it finds.

This is also true if the File Load into caching database Modules have different names, since the system uses the configuration parameter SpecialExecutionTask to check for applicable Modules.

To specify a Module, you need to create a configuration parameter for it:

Right-click the respective Module and select Create Configuration Param from the mini toolbar.

Create a configuration parameter via the mini toolbar

Name the newly created configuration parameter BIWizardUseInAutocreate.
Set its Value to True.

You can switch Modules between tests by deleting the configuration parameter from one Module and adding it to another.

Make sure that only one File Load into caching database Module at a time has the BIWizardUseInAutocreate value set to True. Otherwise, the system uses the first Module it finds.

Defined File Tests against caching database

The Defined File Tests against caching database Module allows you to perform predefined tests on tables you created with the File Load into caching database Module.

The following table shows a mapping of the test types in the wizard and the respective test specified in the Module:

Wizard Test type	Module Test
Has No Empty Values	NoEmptyValue
Field Type	IsNumeric
Min Value	MinValue
Max Value	MaxValue
Sum	Sum
Value Range	ValueRange
Min Length	MinLength
Max Length	MaxLength
Exact Length	Length
Is Unique	IsUnique
Row Count	RowCount

In addition, the following tests are available in the Module only:

GreaterThan
GreaterThanOrEqualsTo
LessThan
LessThanOrEqualsTo
Between
BetweenOrEquals
Occurance
DoesColumnNameExist

The Defined File Tests against caching database Module contains the following attributes:

ModuleAttribute	Description
Table Name	Name of the table which will be tested.
Column	Name of the column to analyze. All rows within that column are tested.
Test	The test used.
Param	Use this parameter when the test type requires an additional input, e.g. when doing a RowCount test, enter the number for the expected amount of rows.
Value	You can verify the overall result. If the test passes, the message is OK. This ModuleAttribute requires the ActionMode Verify.

In this example, Tosca BI verifies the amount of rows of a column.

The AddressID column in the Address table holds 450 rows.

Checking the row count

You can find further examples in the Tosca BI Modules And Samples.tsu subset.

The examples are located in the folder Sample tests->Sorted by use case->Pre Screening Tests->Against files.