One Star

Processing different record types in one file

Hi folks,
I'm fairly new to TOS and am putting together my first 'real' job. I know what I'm trying to achieve but I'm not sure if I'm going about it the right way.
I have an input file which contains multiple record types. Different record types have different numbers of columns.
What I'm trying to do is to write out a single file with the columns of each record type moved such that each column in the output file has a single data type.
For example:
- input file may contain the following 3 records
RT1,20100101,fred
RT2,12345,david
RT3,abc,1234
My output file needs to be in the following format:
- column#1 should be string (is my record type)
- column#2 should be a date
- column#3 should be a string
- column#4 should be numeric.
Using my sample data from above my output file will be as follows:
RT1,20100101,fred
RT2,,david,12345
RT3,,abc,1234
I am using tMap to split the input file based on the record type (RT1, RT2 etc) and for testing have written those to tLogRow so I know I'm doing the correct processing.
How do I merge them back into one file ? I realise that I could write them all out to different files and then iterate through a filelist, but that seems like overkill to me. I'm typically processing less than 10 rows on each execution of the job.
I tried to use tUnite directly from the output of tMap but couldn't get it to accept multiple inputs. Seems strange but I could not get it to work. The first output from tMap dropped onto the tUnite component ok, but subsequent ones would not connect to it. The icon being shown was the "not allowed" one.
I then considered using a tBufferOutput for each row type coming out of tMap. That works, but when I try to use tUnite to merge them back together I hit the same problem.
What's the best way to achieve this ? All ideas and suggestions gratefully received.
Cheers,
Dave
9 REPLIES
Four Stars

Re: Processing different record types in one file

Unfortunately, you cannot unite an already separated stream. However, you can output your split flows into files, and then input them in the same job to join them together.
One Star

Re: Processing different record types in one file

Unfortunately, you cannot unite an already separated stream. However, you can output your split flows into files, and then input them in the same job to join them together.

I've just gotta ask, then - what is the tUnite object intended to do? The tooltip says it "Merges inputs into the same output".
Do your inputs all have the same schema?
Four Stars

Re: Processing different record types in one file

Let me rephrase: you can't unite a stream/flow that was separated INSIDE the Talend job. If you have two separate sources coming in, you can use tUnite. It's a bit of a pain, but the workaround works fine. I had to do it for a reconciliation project and it worked flawlessly.
One Star

Re: Processing different record types in one file

Hi Andrew & jkrfs,
Thanks for the info and thoughts.
Andrew, yes my schemas are all the same.
At least I now know that I'm not doing anything wrong / stupid. The tUnite component won't do what I'm trying to do in the way that I'm trying to do it. I don't understand why that is, but "that's what is". (time for an enhancement request).
I'll either write back to disk or try another way to approach this.
Cheers,
Dave
One Star

Re: Processing different record types in one file

An alternative approach would be to have the tMap emit only a single output. You would just need to write an expression for each output column that looks at the record type and picks the correct source data.
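To illustrate, here is a rough plain-Java sketch of that per-column logic, using the sample rows from the first post. The class and method names are made up for the example, and it assumes every row has exactly three comma-separated fields. In a tMap the same conditions would be written as one expression per output column rather than as a method.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a Talend class: it mimics what a single tMap
// output would compute for each row.
class RecordTypeMapper {

    // Output layout from the first post: recordType, date, text, number.
    static String[] mapRow(String[] in) {
        String type = in[0];
        String date = "";
        String text = "";
        String number = "";
        if ("RT1".equals(type)) {          // RT1: date, text
            date = in[1];
            text = in[2];
        } else if ("RT2".equals(type)) {   // RT2: number, text
            number = in[1];
            text = in[2];
        } else if ("RT3".equals(type)) {   // RT3: text, number
            text = in[1];
            number = in[2];
        }
        // Unknown record types keep only the type column (an assumption).
        return new String[] { type, date, text, number };
    }

    static String toOutputLine(String inputLine) {
        // split with -1 keeps trailing empty fields if there are any
        return String.join(",", mapRow(inputLine.split(",", -1)));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "RT1,20100101,fred",
                "RT2,12345,david",
                "RT3,abc,1234");
        for (String line : input) {
            System.out.println(toOutputLine(line));
        }
    }
}
```

Because every row is written with all four columns, the RT1 row comes out with an empty trailing numeric field (`RT1,20100101,fred,`), while the RT2 and RT3 rows match the expected output in the first post.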
One Star

Re: Processing different record types in one file

Hi Andrew,
On the face of it that doesn't seem to make sense.
Why use tMap if you're then going to write the code to handle the different record types ? Certainly in this instance the main reason for choosing tMap was to allow it to recognise the different record types and process them accordingly.
Am I missing something here ? (very possible !)
Cheers,
Dave
One Star

Re: Processing different record types in one file

Why use tMap if you're then going to write the code to handle the different record types ? Certainly in this instance the main reason for choosing tMap was to allow it to recognise the different record types and process them accordingly.

It all seems to come down to whether the record type is an indicator of the format of the input row and you want a single output (what I understood to be the case) or if you want to split out the different record types for various processing efforts. If you want a single output with the parsed data, using a tMap with a single output would seem the best way to handle it. Otherwise the current setup would seem to work correctly.
One Star

Re: Processing different record types in one file

Hi Andrew,
I think we're now getting to the nub of the problem.
I want a single output.
What I need to do is something like:
- for record type#1, I need to move input-column-1 to output-column-1
- for record type#2, I need to move input-column-1 to output-column-2
Both record types #1 and #2 should be written to the same output.
When using the tMap I think that the 'move' has to be done by using different output records, which cannot be written to the same output component.
Am I wrong ?
Is there another way of doing this using tMap ?
Cheers,
Dave
One Star

Re: Processing different record types in one file

Hi, I'm new to JasperETL and I'm facing a similar problem.
I need to read a file like this, for example:
@@@08123456BUILD 03/11/06POSITION
"ABC ","12345678","C","L"," 50.0000"
"CASH07 ","78901234","C","L"," 33655.3900"
"912827Z69","67890123","M","L"," 25000.0000"
@@@08123456BUILD 03/11/06SECDESC
"ABC ","cs","ALPHA BETA CO INC "," "," "," "," 48.5000","031106"," 48.5000"
"ZZZ ","cs","TESTACCT INC "," "," "," "," 40.5000","031106"," 40.5000"
"912827WQ1","gm","GNMA PL #422452 ","DUE 11/15/15 ","FACTOR = 0.055141710000","AMRT. AMT = 27,570.86 "," 113.3740","031106"," 0.0625"
@@@08123456BUILD 03/11/06CUSTOMER
"27471111","AGNES D BE MINEERA ","12 SPRINGLOSS RD NE "," ","ATLANTA GA 30306"," ","404-977-1111","404-873-1111","030383","AN","000-32-8697","SGS","4L","IR"," "," "
"33840000","T M FORESTA IIV ","4256 S DAHBIM ","ROUTE 8 BOX 645 ","KESWICN VA 22947"," ","804-979-1111","804-979-1111","062891","AN","200-00-1004","AMM","4L","4L"," "," "
"37930000","REBENA J MERDON ","2000 LITTLE BRAIKE LANE "," ","DUNWOODY GA 30338"," ","404-396-1111","404-111-1111","021683","AN","000-50-4754","ZMM","4L","IR"," "," "

This file has three sections: position, security and customer. The number of rows under each section is dynamic. Here each section has only 3 rows, but it may be 4 or 5 or anything. I need to read the customer section, both its header and its detail rows. Please help.
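For reference, the section logic described above can be sketched in plain Java. This is only an illustration under assumptions: a section header is taken to be any line starting with `@@@`, the section name is taken to be the text at the end of that header line, and `SectionReader` and `readSection` are made-up names, not JasperETL or Talend components. The sample rows in `main` are shortened versions of the rows above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: pulls one named section (header + detail rows)
// out of a file whose sections each start with an "@@@" header line.
class SectionReader {

    static List<String> readSection(List<String> lines, String sectionName) {
        List<String> result = new ArrayList<>();
        boolean inSection = false;
        for (String line : lines) {
            if (line.startsWith("@@@")) {
                if (inSection) {
                    break; // the next header ends the wanted section
                }
                inSection = line.trim().endsWith(sectionName);
                if (inSection) {
                    result.add(line); // keep the header line itself
                }
            } else if (inSection && !line.trim().isEmpty()) {
                result.add(line); // detail row, however many there are
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "@@@08123456BUILD 03/11/06POSITION",
                "\"ABC \",\"12345678\",\"C\",\"L\"",
                "@@@08123456BUILD 03/11/06SECDESC",
                "\"ZZZ \",\"cs\",\"TESTACCT INC \"",
                "@@@08123456BUILD 03/11/06CUSTOMER",
                "\"27471111\",\"AGNES D BE MINEERA \"",
                "\"33840000\",\"T M FORESTA IIV \"");
        for (String line : readSection(lines, "CUSTOMER")) {
            System.out.println(line);
        }
    }
}
```

The sketch returns the detail rows as whole lines. Splitting them into fields needs a quote-aware CSV parser, since some quoted values in the sample contain commas (for example `"AMRT. AMT = 27,570.86 "`).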