Unable to proceed further with tFileInputMSDelimited

Six Stars

Hello Community,

I have a scenario where I need to read a multi-schema file and write each schema to its own output file, so the number of output files depends on the number of schemas in the input file. I tried this with tFileInputMSDelimited, but I could not find a way to write the output to multiple files. The input file is attached for reference.

Is there any way to handle this situation?

Thanks in advance!

Best Regards,
Dipanjan

Eleven Stars

Re: Unable to proceed further with tFileInputMSDelimited

I am sending you a sample job. Change the paths to match your environment.

Idea
- Get the start and end line numbers of each schema.
- Iterate over them and read the input file with a different header and Limit for each schema.

There may be other ways to do it.
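
Outside Talend, the idea above can be sketched in plain Java. This is only an illustration, not the attached job: it assumes every schema block begins with a recognizable header line, and the class name `SchemaBlockFinder`, the `Block` type, and the `headerPrefix` parameter are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class SchemaBlockFinder {
    /** One schema block: 0-based index of its header line, and how many lines to read (-1 = to end of file). */
    static final class Block {
        final int head;
        final int limit;

        Block(int head, int limit) {
            this.head = head;
            this.limit = limit;
        }
    }

    /**
     * Assumption: a line that starts with headerPrefix opens a new schema block.
     * Returns one (head, limit) pair per block, in file order.
     */
    static List<Block> findBlocks(List<String> lines, String headerPrefix) {
        // First pass: remember the line number of every header line.
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).startsWith(headerPrefix)) {
                starts.add(i);
            }
        }
        // Second pass: a block ends where the next one starts; the last block runs to end of file.
        List<Block> blocks = new ArrayList<>();
        for (int k = 0; k < starts.size(); k++) {
            int head = starts.get(k);
            boolean isLast = (k == starts.size() - 1);
            int limit = isLast ? -1 : starts.get(k + 1) - head;
            blocks.add(new Block(head, limit));
        }
        return blocks;
    }
}
```

Each resulting pair could then drive one read of the input file, with `head` as the number of lines to skip and `limit` as the number of lines to read.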

Regards,
Abhishek KUMAR

Six Stars

Re: Unable to proceed further with tFileInputMSDelimited

@akumar2301,

Many thanks for the prompt response!

I was able to get the column names with tFileInputMSDelimited as well. I tried your approach, but the output files contain only the column names and not the column values for each schema. Also, could you please explain a bit more about your thinking behind the second tMap, specifically how Lastlinenumber is used?

tMap_2.png

Below is a screenshot of the output files:

Output Files.PNG

P.S. I have removed the non-alphanumeric characters from the input file (attached for reference). I had originally added those characters only to separate the three schemas for easier reading.

Eleven Stars

Re: Unable to proceed further with tFileInputMSDelimited

Please open it in WordPad, Notepad++, or Excel. The problem may be caused by the OutputPositional component (not sure).

The code in that tMap gets the start line and the number of lines to read for each schema:
Startlinenumber|NumberofLinetoread
8|-1
4|4
0|4

For the 1st schema, head is 0 and Limit is 4 (to read only 4 records).
For the 2nd schema, head should be 4 and Limit is again 4 (the number of records in that schema).
For the last schema, head should be 8 and Limit is -1 (to read the rest of the file).
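
To make the head/Limit pairs concrete, here is a rough Java sketch of what "skip `head` lines, then read `limit` lines" means. It is an illustration under my own assumptions, not the code Talend generates; `HeadLimitReader` and `readSlice` are hypothetical names, and a negative limit is taken to mean "read to the end of the file".

```java
import java.util.ArrayList;
import java.util.List;

class HeadLimitReader {
    /**
     * Skips the first `head` lines, then returns at most `limit` lines.
     * A negative limit means "read everything that is left".
     */
    static List<String> readSlice(List<String> lines, int head, int limit) {
        // Clamp the start so an out-of-range head yields an empty result instead of an exception.
        int from = Math.max(0, Math.min(head, lines.size()));
        int to = (limit < 0) ? lines.size() : Math.min(from + limit, lines.size());
        return new ArrayList<>(lines.subList(from, to));
    }
}
```

With a 10-line file, the pairs (0, 4), (4, 4) and (8, -1) would return lines 0 to 3, lines 4 to 7, and lines 8 to 9 respectively, so the three reads together cover the file exactly once.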

How did you implement it using tFileInputMSDelimited?

Regards,
Abhishek KUMAR

Six Stars

Re: Unable to proceed further with tFileInputMSDelimited

@akumar2301,

In tFileInputMSDelimited, I had to define the column names for each schema manually in the "Fetch Codes" section, but I was unable to get the column values.

I'm still not clear about the explanation below:

"For the 1st schema, head is 0 and Limit is 4 (to read only 4 records).
For the 2nd schema, head should be 4 and Limit is again 4 (the number of records in that schema).
For the last schema, head should be 8 and Limit is -1 (to read the rest of the file)."

Furthermore, in this case each schema has a fixed number of rows. Suppose I need to handle a scenario like the one below:

Schema 1 has 3 rows
Schema 2 has 5 rows
Schema 3 has 9 rows

Would the same approach still work?

Eleven Stars

Re: Unable to proceed further with tFileInputMSDelimited

Yes, it should work. The only condition is that each schema must start with the string "Name".
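
As a rough illustration of why the row counts do not matter, here is a plain Java sketch that starts a new group every time a line begins with "Name" and writes each group to its own file. It is not the attached job; `SchemaSplitter`, `groupBySchema`, `writeGroups`, and the `schema_N.txt` file names are hypothetical, and a data row that itself starts with "Name" would be mistaken for a header.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class SchemaSplitter {
    /** Starts a new group at every line beginning with headerPrefix; block sizes may differ freely. */
    static List<List<String>> groupBySchema(List<String> lines, String headerPrefix) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = null;
        for (String line : lines) {
            if (line.startsWith(headerPrefix)) {
                current = new ArrayList<>();
                groups.add(current);
            }
            // Lines that appear before the first header are ignored.
            if (current != null) {
                current.add(line);
            }
        }
        return groups;
    }

    /** Writes group i to outDir/schema_<i+1>.txt (hypothetical naming), header line included. */
    static void writeGroups(List<List<String>> groups, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        for (int i = 0; i < groups.size(); i++) {
            Path target = outDir.resolve("schema_" + (i + 1) + ".txt");
            Files.write(target, groups.get(i), StandardCharsets.UTF_8);
        }
    }
}
```

Because the split is driven only by the header lines, blocks of 3, 5, and 9 lines are handled the same way as three blocks of 4.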

 

I am attaching another job with a simpler approach. Change the input and output directories.

Regards,
Abhishek KUMAR

Six Stars

Re: Unable to proceed further with tFileInputMSDelimited

@akumar2301, why did you set the Field Separator to "XYZ"?

Eleven Stars

Re: Unable to proceed further with tFileInputMSDelimited

The delimiter must not occur anywhere in your file, so that the whole content of each line ends up in a single column. You could use any string that is not part of the file.

I could have used tFileInputFullRow instead of the delimited components.
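
The effect is easy to see in plain Java. This is only a sketch of the principle, not what the component does internally; `SingleColumnSplit` and `splitFields` are hypothetical names, and the sample line with ";" as its real delimiter is made up.

```java
import java.util.regex.Pattern;

class SingleColumnSplit {
    /** Splits a line on a literal separator; Pattern.quote keeps regex metacharacters from being interpreted. */
    static String[] splitFields(String line, String separator) {
        return line.split(Pattern.quote(separator), -1);
    }
}
```

Splitting the made-up line `Name;Age;City` on "XYZ" gives one field holding the whole line, while splitting it on ";" gives three fields.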

Again, there may be better ways to implement it.

Regards,
Abhishek KUMAR
