Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Seven Stars

Hi Team,

I need to create a data map for an EBCDIC data file, but we do not have the copybook for it and we are not sure of the exact encoding. We do have the column metadata from a different DI tool (DataStage). Is it possible to create the structure and do the mapping without the copybook?

Employee

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

There is a way to do it using what is called an Importer structure. However, it's not easy. I'm checking to see if I have it documented somewhere. In the meantime, can you copy/paste the DataStage metadata and attach it?

Seven Stars

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Thanks @thoff. I just received screenshots of the DataStage columns and have started working on it. I will try the IBM037 encoding, as it seems to be the default. Please let me know if you find more information on the Importer structure. I saw it in the user guide, but the details there are minimal, as shown below.

 

Using a Map to Import Definitions

You can use a map to create a structure definition from any input. For example, suppose you have a positional structure that is described by a spreadsheet. The spreadsheet contains a list of element names, sizes, start columns, and further description. You can export this spreadsheet as a CSV file and create a map that maps the contents of the spreadsheet to the Importer structure.

The Importer structure is a predefined structure in the Builtin project (Builtin/Structures/Executable Structures/Importer). When it is executed in a map, the Importer structure can create one or more structures whose elements are defined by the mappings. To use it, create a standard map, and specify the Importer structure as the output structure. You can specify any input structure you like and map the elements to produce the desired elements in the structure. You can build and test your map in the studio as usual, producing the output of the map in XML for testing purposes. The Test Run menu will provide an additional option called Test Run to Importer Structure, which will actually create the structures. You can re-run the map as many times as you like, which creates the structures each time and replaces their contents.
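As a rough illustration of the kind of CSV the guide describes (element name, size, start column), here is a small Python sketch. The field names and sizes are made up for the example; real values would have to come from the DataStage metadata, and the column headings TDM expects in your own import map are up to how you define the input structure.

```python
import csv
import io

# Hypothetical fields as (name, size in bytes); not taken from the real file.
FIELDS = [("ACCT", 5), ("NAME", 30), ("BALANCE", 6)]


def layout_rows(fields):
    """Return (name, size, start_column) rows using 1-based start columns."""
    rows = []
    start = 1
    for name, size in fields:
        rows.append((name, size, start))
        start += size  # the next field begins right after this one
    return rows


def to_csv(rows):
    """Render the layout rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["name", "size", "start"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(layout_rows(FIELDS)))
```

A file like this could then serve as the input to a map whose output is the Importer structure.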

Employee

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

How many columns are there? The thing about copybooks is that they can contain binary and packed fields (COMP, COMP-3), which will need to be set up correctly.

Seven Stars

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

64 columns. I am wondering how packed decimal columns are marked in DataStage. In the screenshots, those columns are just marked as decimal.
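For anyone unsure what a packed decimal (COMP-3) field looks like on disk, the sketch below decodes one in plain Python. It assumes the usual layout of two digits per byte with the sign in the last nibble (C or F for positive, D for negative); the function name and the `scale` parameter are illustrative and have nothing to do with TDM or DataStage.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed decimal field; `scale` is the number of implied decimals."""
    if not raw:
        raise ValueError("empty field")
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)    # high nibble is one digit
        digits.append(byte & 0x0F)  # low nibble is the next digit
    digits.append(raw[-1] >> 4)     # last byte: one digit ...
    sign = raw[-1] & 0x0F           # ... followed by the sign nibble
    if any(d > 9 for d in digits) or sign < 0x0A:
        raise ValueError("not a valid packed decimal field")
    value = int("".join(str(d) for d in digits))
    if sign in (0x0B, 0x0D):        # B and D mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)


if __name__ == "__main__":
    print(unpack_comp3(bytes.fromhex("123456789C")))
    print(unpack_comp3(bytes.fromhex("12345D"), scale=2))
```

A field that raises `ValueError` here is probably not packed, which can help when the only metadata available says "decimal".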

Employee

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Try this link.  It's a webex recording showing how to use the importer.  It has not been edited. 

 

https://urldefense.proofpoint.com/v2/url?u=https-3A__meettalend.webex.com_meettalend_ldr.php-3FRCID-...

 

 

Seven Stars

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Thanks for sharing the recording, @thoff. To even take a look at the EBCDIC data file without the copybook, I may have to convert the data file to ASCII, right? After that I should at least be able to see the record initiators, and from there we can dig deeper following the process shown in the video, right?
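For a quick look at the raw bytes outside of TDM, a small Python sketch like the one below can print a hex dump next to a cp037 (IBM037) decoding without converting the file itself. The choice of cp037 is an assumption based on the encoding mentioned earlier in the thread; binary and packed fields will show up as dots or odd characters in the text column.

```python
def peek(data: bytes, width: int = 32, limit: int = 4):
    """Return up to `limit` dump lines: offset, hex bytes, cp037 text."""
    lines = []
    for offset in range(0, min(len(data), width * limit), width):
        chunk = data[offset:offset + width]
        text = chunk.decode("cp037")  # cp037 maps every byte value
        shown = "".join(c if c.isprintable() else "." for c in text)
        lines.append(f"{offset:06x}  {chunk.hex()}  {shown}")
    return lines


if __name__ == "__main__":
    # Demo data only; for a real file, read it in binary mode first, e.g.
    # data = open("sample.dat", "rb").read()   (hypothetical file name)
    demo = "HELLO".encode("cp037") + bytes.fromhex("123456789C")
    for line in peek(demo):
        print(line)
```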

Seven Stars

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Hi Terry,

 

How did you create the CSV file with the columns? Which utility did you use? From the video, it looks like you created it manually. In our situation, we only have some screenshots of the DataStage Columns tab, and we cannot view the contents of the data file. I am tempted to convert the file to ASCII.

Employee

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Can you ask them to copy/paste the DataStage info to you? It can probably be parsed by TDM, and then you can create your CSV file from that. I've done a similar thing before. And no, you shouldn't have to convert the file to ASCII first. That's what TDM is for.

Seven Stars

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Hi Terry,

 

I finally received the copybook details from them, but the email contained something like the listing below. If I copy the content between the top and bottom markers, save it as a UTF-8 Windows text file, and import the copybook, Data Mapper throws many errors. However, if I copy the contents without the line numbers and import that, it imports fine. Is the second method the correct way to import the copybook? After importing it, I am still having trouble viewing the data from the sample file I received, so I am doubting my approach. Once you confirm the approach, I will post the details of the other errors.

PSTREC

EDIT                      XXX.ACC.COPYLIB(PSTREC) - 0.05                         Columns 00001 00072

Command ===>                                                                                           Scroll =====>CSR

******     ********************************************Top of Data *******************************************

000001      01 PST-DETAIL_RECORD.

000002               03 P-KEY.

000003                       05 P-ACCT                                        PIC S9(9)     COMP-3.

....Many lines like above

******     ********************************************Bottom of Data *******************************************
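If the listing has to be cleaned up by hand each time, a small script can do it. The sketch below keeps only the lines that start with a six-digit editor line number and blanks that number out, which also drops the EDIT and Command lines and the Top/Bottom markers. Replacing the digits with spaces rather than deleting them is a choice made here to keep the column positions unchanged; whether the copybook importer cares about exact columns is an assumption worth checking.

```python
import re

# A listing line starts with a six-digit editor line number and whitespace.
NUMBERED = re.compile(r"^\d{6}\s")


def clean_listing(text: str) -> str:
    """Keep numbered lines only and blank out their six-digit prefix."""
    kept = []
    for line in text.splitlines():
        if NUMBERED.match(line):
            kept.append((" " * 6 + line[6:]).rstrip())
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = (
        "EDIT      XXX.ACC.COPYLIB(PSTREC)\n"
        "******  *** Top of Data ***\n"
        "000001      01 REC.\n"
        "000002          03 KEY.\n"
        "******  *** Bottom of Data ***\n"
    )
    print(clean_listing(sample), end="")
```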

Seven Stars

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Hi Terry,

 

They finally sent the copybook and I was able to import it. However, when I point it at the data file to display the data in XML format, it throws the error shown below. I have not modified the default structure that the import created. Any thoughts?

 

Overall: Error
1: Error - An end-of-file was encountered during the processing of this element. (802)
Structure: Structures/NOPOST-DETAIL-RECORD/NOPOST-DETAIL-RECORD.xml - Element: /NOPOST-DETAIL-RECORD/Record/NP-KEY/NP-ACCT
Starting Line/Column: 1/1001 Byte offset: 0x3e8 (1000)
Line: 2 column: 0
Byte offset: 0x3ea (1002)
SystemId: file:/C:/Program%20Files/Talend/Talend-Studio-V6.2.1/workspace/FLEXDEMO/Sample%20Data/5504_F1nopost.DAT_1_20180723004343

 

Employee

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Try this.  Go to Windows->Settings->Preferences.  Find Mapping and click on it.  To the right, change "Maximum output characters to show" and "Maximum characters to read for sample docs/highlighting" to -1.

Seven Stars

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

Thanks for the reply, Terry! I tried that option and restarted the designer, but the problem persists. I noticed two things.

1. The metadata for the problematic field is PIC S9(9). However, the actual data shown in the preview has a length of 8.

2. Towards the end of the sample document I see the text below.

 

 <Flat:Record>
<Flat:NP-KEY>
<Flat:NP-ACCT></Flat:NP-ACCT>
</Flat:NP-KEY>
</Flat:Record>

Could either of these be the reason for the error?

Is there any way for me to make it skip the first byte while reading the data?

Also, I would like to know whether there is a way to suppress the error when using this structure in a job.

Employee

Re: Talend Data Mapper - Finding the structure of a EBCDIC file without the copy book

You don't want to skip any bytes.  Sounds like your copybook doesn't match up with your data.  Can you take a screenshot of your structure with the associated data?  I will be gone next Mon - Thurs so will try to look into this when I get back.
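One way to test whether a copybook lines up with a fixed-length file is to add up the storage size of every field and check that the file size is a whole multiple of that record length. The sketch below assumes the common IBM mainframe sizes (COMP-3 takes digits // 2 + 1 bytes; COMP takes 2, 4, or 8 bytes depending on the digit count) and uses made-up fields and a made-up file size, not the real PSTREC layout.

```python
def storage_bytes(usage: str, digits: int) -> int:
    """Bytes occupied by one field, assuming common IBM mainframe sizes."""
    if usage == "DISPLAY":
        return digits            # one byte per character or zoned digit
    if usage == "COMP-3":
        return digits // 2 + 1   # two digits per byte plus a sign nibble
    if usage == "COMP":
        if digits <= 4:
            return 2
        if digits <= 9:
            return 4
        return 8
    raise ValueError(f"unsupported usage: {usage}")


def record_length(fields) -> int:
    """Sum the storage size of (name, usage, digits) tuples."""
    return sum(storage_bytes(usage, digits) for _, usage, digits in fields)


def check_file(file_size: int, rec_len: int):
    """Return (whole_records, leftover_bytes) for a fixed-length file."""
    return divmod(file_size, rec_len)


if __name__ == "__main__":
    # Hypothetical layout and file size, for illustration only.
    fields = [("P-ACCT", "COMP-3", 9), ("P-NAME", "DISPLAY", 30)]
    rec_len = record_length(fields)
    print(rec_len, check_file(1002, rec_len))
```

A non-zero leftover suggests that the layout, the record length, or the way the file was transferred does not match the copybook.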
