Text Files

All operations in this section describe how to create different tests for text files.

  1. Load your text files.

  2. Run your tests on the files. Upon testing, Tosca loads text files into an SQLite database.

  3. Run Pre-Screening tests on your text files to prevent loading the wrong data.

The following Modules are included in the Tosca BI Modules And Samples.tsu subset and located in the folder Modules->BI - DWH/BigData:

  • File Load into Caching Database

  • Defined File Tests against Caching Database

  • Complete Row by Row Comparison

  • Metadata Comparison

  • JSON/XML File Load into Caching Database

File Load into caching database

Use the File Load into caching database Module to load the content of a text file into an SQLite database. You can load multiple files with the File Load into caching database Module.

Configure the path of the SQLite caching database in the Settings dialog in Tosca Commander (see chapter "Settings - Tosca BI").

This example loads the file SalesLT.Address.SemiColon.txt into the table Address.

The columns of the file are separated by a semicolon.

The first row contains the header names.

Loading character-separated file

The File Load into caching database Module contains the following attributes:

ModuleAttribute

Description

Optional

Table Name

The name of the SQLite table into which Tosca BI loads the data.

 

Keep Existing Table Entries

If set to True, the data of the text file is added to the existing table. Otherwise, the table is dropped before loading new data.

X

File Name

Full file path and file name of the text file you want to load.

You can use wildcards, if needed. For instance, you could enter C:\temp\myExcelFile*.csv.

 

File Mode

Defines whether the columns have a fixed length or are separated by commas.

 

Column Delimiter/Length Info

Indicates the starting point for each column in a comma-separated list or the separator symbol in character-separated files.

 

Columns To Process

Allows you to specify a semicolon-separated list of columns which should be loaded. By default, Tosca BI loads all columns.

 

Column Renaming - <Current Name>

Allows you to rename columns. The value in the Name column indicates the column to rename, the value in the Value column indicates the new name.

 

Header Row

Indicates with True whether the file has a header row. By default, there is no header row.

In case the header is on a different row, specify the row number using #<n> instead of using True.

Only single-row headers are supported.

 

Skip Lines Starting With

Allows you to specify a semicolon-separated list of values that are skipped.

X

Only Lines Starting With

Allows you to define a semicolon-separated list of characters that indicate a valid row.

E.g.: process lines that start with _,- or < are specified as _;-;<

X

Cell Settings - All Columns - Option

Applies one of the actions below to all columns.

X

Cell Settings - Single Columns - <Name>

Applies one of the actions below to a specified column. To specify a column, replace <Name> with the column name.

X

Load Error Behavior - Ignore Load Errors

Defines whether load errors are ignored. The default value is False.

X

Load Error Behavior - Max Errors

Maximum number of errors before Tosca aborts execution. The default value is 100.

X

Load Error Behavior - File Name

Errors are logged to this file. A full path and file name are required. An existing file of the same name is overridden. By default, load errors are not logged.

X

Available actions for Cell Settings

The following actions are available for Cell Settings - All Columns and Cell Settings - Single Columns:

Action

Description

Trim

Removes all leading and trailing whitespace characters.

Default scope: HeaderAndData

Trim[<character>]

Removes all leading and trailing occurrences of the specified character.

Default scope: HeaderAndData

Replace <character> with the character you want to remove.

To remove a ", enter the " four times, e.g. Trim[""""].

TrimStart

Removes all leading whitespace characters.

Default scope: HeaderAndData

TrimStart[<character>]

Removes all leading occurrences of the specified character.

Default scope: HeaderAndData

Replace <character> with the character you want to remove.

To remove a ", enter the " four times, e.g. Trim[""""].

TrimEnd

Removes all trailing whitespace characters.

Default scope: HeaderAndData

TrimEnd[<character>]

Removes all trailing occurrences of the specified character.

Default scope: HeaderAndData

Replace <character> with the character you want to remove.

To remove a ", enter the " four times, e.g. Trim[""""].

Replace[<search string>][<replace string>]

Replaces all occurrences of the first string with the second string.

Default scope: Data

Substring[<start index>]

Extracts a part of a longer string.

The extract starts at the defined start index position and runs until the end of the string.

Default scope: Data

For example:

Substring[9] with input Project Manager returns Manager

Substring[<start index>][<length>]

Extracts a part of a longer string.

The extract starts at the defined start index position and contains the number of characters specified in length.

Default scope: Data

For example:

Substring[9][3] with input Project Manager returns Man

Right[<length>]

Extracts a part of a longer string.

The extract runs from the end of the string towards the beginning and contains the number of characters specified in length.

Default scope: Data

For example:

Right[7] with input Project Manager returns Manager

Lowercase

Converts the string to lowercase using the currently active locale.

Default scope: Data

For example:

Lowercase with input Project Manager returns project manager

Lowercase[Culture:<culture name>]

Converts the string to all lowercase. Lowercase[Culture:<culture name>] uses the given culture name to create a new locale.

Default scope: Data

For example:

Lowercase[Culture:zh-Hans] uses the culture information of "Chinese(simplified)" to convert uppercase to lowercase characters.

Uppercase

Converts the string to uppercase using the currently active locale.

Default scope: Data

For example:

Uppercase with input Project Manager returns PROJECT MANAGER

Uppercase[Culture:<culture name>]

Converts the string to all uppercase. Uppercase[Culture:<culture name>] uses the given culture name to create a new locale.

Default scope: Data

For example:

Lowercase[Culture:en-us] uses the culture information of "English - United States" to convert lowercase to uppercase characters.

To change the default scope of an action, add a Scope parameter:

  • To apply the action to the header row only, add the parameter [Scope:Header]

  • To apply the action to all data rows but not the header, add the parameter [Scope:Data]

  • To apply the action to the header and all data rows, add the parameter [Scope:HeaderAndData]

We want to replace all instances of the string CustomerDataAustria with the string CustomerDataUSA. This change should be applied to the header and all data rows.

To do so, we define the following action:

Replace[<CustomerDataAustria>][<CustomerDataUSA>][Scope:HeaderAndData]

Specify which File Load into caching database Module to use in your tests

If you have more than one File Load into caching database Module in your workspace, you need to specify which Module should be used in your tests. Otherwise, the system uses the first Module it finds.

This is also true if the File Load into caching database Modules have different names, since the system uses the configuration parameter SpecialExecutionTask to check for applicable Modules.

To specify a Module, you need to create a configuration parameter for it: 

  1. Right-click the respective Module and select Create Configuration Param from the mini toolbar.

Create a configuration parameter via the mini toolbar

  1. Name the newly created configuration parameter BIWizardUseInAutocreate.

  2. Set its Value to True.

You can switch Modules between tests by deleting the configuration parameter from one Module and adding it to another.

Make sure that only one File Load into caching database Module at a time has the BIWizardUseInAutocreate value set to True. Otherwise, the system uses the first Module it finds.

Defined File Tests against caching database

The Defined File Tests against caching database Module allows you to perform predefined tests on tables you created with the File Load into caching database Module.

The following table shows a mapping of the test types in the wizard and the respective test specified in the Module:

Wizard Test type

Module Test

Has No Empty Values

NoEmptyValue

Field Type

IsNumeric

Min Value

MinValue

Max Value

MaxValue

Sum

Sum

Value Range

ValueRange

Min Length

MinLength

Max Length

MaxLength

Exact Length

Length

Is Unique

IsUnique

Row Count

RowCount

In addition, the following tests are available in the Module only:

  • GreaterThan

  • GreaterThanOrEqualsTo

  • LessThan

  • LessThanOrEqualsTo

  • Between

  • BetweenOrEquals

  • Occurance

  • DoesColumnNameExist

The Defined File Tests against caching database Module contains the following attributes:

ModuleAttribute

Description

Table Name

Name of the table which will be tested.

Column

Name of the column to analyze. All rows within that column are tested.

Test

The test used.

Param

Use this parameter when the test type requires an additional input, e.g. when doing a RowCount test, enter the number for the expected amount of rows.

Value

You can verify the overall result. If the test passes, the message is OK.

This ModuleAttribute requires the ActionMode Verify.

In this example, Tosca BI verifies the amount of rows of a column.

The AddressID column in the Address table holds 450 rows.

Checking the row count

You can find further examples in the Tosca BI Modules And Samples.tsu subset.

The examples are located in the folder Sample tests->Sorted by use case->Pre Screening Tests->Against files.