Lysine User Guide

Overview

Lysine is an SSIS (SQL Server Integration Services) custom component that enables the inline conversion from EBCDIC (including redefines and packed decimal fields) into an ASCII byte stream to use within the pipeline in your SSIS package. It does this from reading a from an EBCDIC file on the file system.

System Requirements

Lysine requires SQL Server Integration Services 2005/2008

Connection Manager

The Connection Manager works like other Connection Managers in SSIS and can be added by using the SSIS menu option while editing a package.

Choose EBCDIC from the list of Connection Manager types. A new EBCDICConnectionManager will be created in your Connection Manager window and will automatically open to the edit window for your EBCDIC Connection.

There are 3 main steps to configuring this new EBCDIC Connection:

  • General
  • Layout
  • Preview

General

General is used to select the EBCDIC source file and set attributes such as Header Rows To Skip. Simply click 'Browse' and choose the EBCDIC file that you wish to use. After you have set any file attributes you can then click 'Layout' on the left hand side to begin your file definition. The following screen should appear:

Layout

Layout is where you will define the structure of your file using the add column button on the far left of the toolbar.

Layout will also validate the definition and provide warnings regarding invalid columns. Clicking on a validation warning will select the Column in the Definition related to the warning.

There are 2 major types of columns in Lysine:

  • Parent Columns
  • Child Columns

Parent Columns can contain Child columns but that is currently limited as illustrated in the following chart:

Column Types

Regular

The Regular Column is used for data that only needs to be converted to EBCDIC without any special processing.

The Name and Length are required in order to identify the column and to calculate its size. Length is used to determine the number of bytes to read from the source file while processing the column. In addition, you can specify the NumberOfDecimalPlaces if you would like Lysine to insert a period at a given point. For example, if NumberOfDecimalPlaces = 2 then a simple column with a length of 4 and a column value of 1234 would be output as 12.34.

Zoned

The Zoned Column is used for signed or unsigned numeric fields. The "zone" refers to the first 4 bits of the last byte in the column which is used to determine the sign of the numeric value. For example, an EBCDIC column value of (hex) C5 would be output as +5 and a column value of (hex) D5 would be output as -5. For larger numbers the other digits in the number should be preceded by F as in (hex) F5C5 which would be output as +55 or (hex) F5D5 which would be output as -55. Note a column value of (hex) F5F5 will also be output as +55 since in the absence of a sign a positive is assumed.

Lysine also allows you to specify the NumberOfDecimals which will be applied after conversion.

The following screen shot shows a simple definition for a Zoned Column with a total length of 7 bytes and 2 Decimals

When Previewed the output looks like the following.

Packed Decimal (Comp 3)

The Packed (Comp 3) Column represents a packed decimal column. Packed Decimal is a technique used in EBCDIC to encode 2 digits in 1 byte. The sign indicator is located in the last 4 bits of the last byte. For example, an EBCDIC column value of (hex) 03 1C would be output as +031 and a column value of (hex) 12 3D would be output as -123.

The Name and Length are required in order to identify the column and to calculate its size. Length is used to determine the number of bytes to read from the source file while processing the column. In addition, you can specify the NumberOfDecimalPlaces if you would like Lysine to insert a period at a given point. For example, if NumberOfDecimalPlaces = 2 then a simple column with a length of 4 and a column value of 1234 would be output as 12.34.

The following screen shot shows a simple definition for a Packed Column with a total length of 2 bytes and 1 Decimal

When Previewed the output looks like the following.

Occurs

The Occurs Column is used to hold 1 or more columns that will be repeated.

The Name and OccursCount properties are required in order to identify the column and to calculate its size. OccursCount is used to determine the number of bytes to read from the source file while processing the column. The calculated number of bytes to read is reflected in the Length property. To add child columns to the occurs simply select the node in the treeview and use the "Add Column" button as you normally would.

NOTE: At this time only Regular, Zoned, and Packed columns can be added as children.

The preview for an Occurs Column is displayed so that the child columns can be viewed individually per occurrence. In the diagram row 1 is highlighted and shows 2 occurrences

Occurs (Depending On)

The Occurs (Depending On) Column works exactly like the Occurs Column with 1 exception, the OccursCount value is read from a previous column in the current row.

The Name and DependingOnColumn properties are required in order to identify the column and to calculate its size at runtime. DependingOnColumn is a reference to a previous column and is used to determine the number of bytes to read from the source file while processing the column. To add child columns to the occurs simply select the node in the treeview and use the "Add Column" button as you normally would.

NOTE: At this time only Regular, Zoned, and Packed columns can be added as children.

The preview for an Occurs (Depending On) Column is displayed so that the child columns can be viewed individually per occurrence. In the diagram row 1 is highlighted and shows 2 occurrences based on a FlagColumn value of 2. However, in the following row the FlagColumn value is 1 and there is only 1 occurrence.
Redefine & Redefine Set

The Redefine Column allows a certain number of bytes to be read and then output as a RedefineSet which contains 1 or more columns. The RedefineSet is chosen based on the value of the FlagColumn specified. Each RedefineSet has a property called FlagColumnValue which can be specified in various ways.

The following is an example of a simple Redefine containing 2 RedefineSets.

The following shows the Comparison Builder screen used to specify the FlagColumn value. The only option that requires explanation is the "List" option which allows you to specify a list of possible values for the FlagColumn.

When we Preview the file we see 2 types of rows based on the FlagColumn:

  • flag = 1 is a numeric value +055
  • flag = 2 is a character value NY

NOTE: At this time only Regular, Zoned, and Packed columns can be added as children of the RedefineSet.

Other supported column types
  • Binary Fields

Additionally, you can delete a column, move it up or move it down using the toolbar as well as import or export your current definition in XML format for a number of uses. The red button at the end of the toolbar is used to show and hide the validation message box that Lysine provides to guide you through layout creation.

Preview

At any time during the creation of your layout you may click 'Preview' on the left hand side and Lysine will attempt to process 10 rows of your source file and display the result.

The following Preview was generated from a simple test file containing 4 rows of 2 columns:

1 byte for "RowType" and 50 bytes for "Company Name"

Note that you may specify the number of rows to skip while previewing so that you may see other areas of the source file by incrementing the Data rows to skip counter located above the Preview grid.

Source Component

The EBCDIC Source Component works like other source components in SSIS and must first be added to the toolbox in order to be added to a Data Flow.

To add the Source Component to your toolbox do the following:

  • Make sure that the toolbox is visible, use View->Toolbox from the File Menu if it is not.
  • Right click the toolbox and click 'Choose Items...'
  • Once the 'Choose Toolbox Items' dialog is open click on the 'SSIS Data Flow Items' tab.
  • Check the 'EBCDIC File Source' item as shown below

Once added to the Toolbox the EBCDIC File Source can just be dragged onto the Data Flow design surface. You will then need to select the corresponding EBCDIC File Connection Manager as shown below.

Once connected, the EBCDIC File Source will create an output for each Redefine Set that contains the columns defined. The outputs can then be used like the output from any other Data Source. NOTE: Currently, Lysine requires that you connect the Error output in order to work properly, however, this can simply be connected to a dummy Recordset Destination if needed.