One output record from multiple input records - What's the best way?


Hi,
I have a really tricky issue.
I have a delimited input file where the contents of one output record can be spread over several input records. The input records are related, so you know which ones belong together.
No field will have data in more than one of the related input records (i.e. field 10 could exist on input record 1, 2, or 3, but only once within those lines).
Previously this was handled by a program that used iterative reading and array indexing to format the output record.
You always know when to start a new output record.
How can I achieve this multiple-input, array-indexing, one-output-record functionality within Talend?
I have been racking my brains but cannot think of a way.

Re: One output record from multiple input records - What's the best way?

Hi Volker,
That case will not work for me, as my file does not carry the column names the way the example does. It's a pure comma-delimited input, but I know the maximum number of columns. It's just a matter of 'compressing' several related input lines into one output line.
Thanks for looking into it. If you have any more thoughts, I would be happy to hear them.

Re: One output record from multiple input records - What's the best way?

Would loading the file into a database and then selecting out with a GROUP BY work for you?
You can achieve the same behavior with a tAggregateRow.
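
To illustrate what the GROUP BY idea amounts to, here is a minimal sketch in plain Java. It assumes every input line carries a shared key in one column (the thread does not say this is the case), and the names `GroupCoalesce` and `coalesceByKey` are made up for the illustration. It is not how tAggregateRow is implemented, only the same effect: one merged row per key, keeping the first non-empty value seen in each column.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupCoalesce {
    // Collapses all rows that share the same key into a single row.
    // Assumption: row[keyIndex] identifies which rows belong together.
    static Map<String, String[]> coalesceByKey(List<String[]> rows, int keyIndex) {
        Map<String, String[]> groups = new LinkedHashMap<>(); // keeps input order
        for (String[] row : rows) {
            String[] merged = groups.computeIfAbsent(row[keyIndex], k -> {
                String[] empty = new String[row.length];
                Arrays.fill(empty, "");
                return empty;
            });
            // Keep the first non-empty value per column, similar to MAX() in SQL
            // when each field is filled on only one line of the group.
            for (int i = 0; i < row.length && i < merged.length; i++) {
                if (merged[i].isEmpty() && !row[i].isEmpty()) {
                    merged[i] = row[i];
                }
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"k1", "a", ""},
            new String[] {"k1", "", "b"},
            new String[] {"k2", "c", ""});
        for (String[] merged : coalesceByKey(rows, 0).values()) {
            System.out.println(String.join(",", merged));
        }
    }
}
```

If the lines do not carry such a key, a group number would have to be assigned first, for example by counting the lines that start a new record.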

Re: One output record from multiple input records - What's the best way?

Hi,
Could you please give us a short example (input and output)? With this it would be easier to find a good solution.
Bye
Volker

Re: One output record from multiple input records - What's the best way?

Hi,
Here's an example:
Record1 - field1,field2,,,,field6
Record2 - ,,field3,,,
Record3 - ,,,field4,field5,
I always know where a 'new' record starts. Records 1, 2 and 3 carry different fields that need to be processed to form one output record (fields 1-6).
There are actually 300+ fields that make up the record, but the above is representative.
Hope this helps clarify my requirements.

Re: One output record from multiple input records - What's the best way?

If you have a variable structure you could:
a) use technical attribute names (like row1 to rowN) and later, after processing, use the correct names.
b) If your file is line based but with a different structure on each line, you could read your file line by line and process it in a tJavaRow, for example.
But in the end I think it would be possible to use the previously mentioned use case to create correct rows.
One simple approach:
1) read the file line by line (tFileInputFullRow) or, for example, with tFileInputDelimited (with a maximum number of rows)
2) in your tJavaRow, use your correct attributes plus one additional attribute (lineCompleted, for example)
3) after processing a full row, set the flag lineCompleted and filter out all rows that are not completed (tFilterRow)
The limitation of this is that you could only create one row as output for one input row.
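
The merging logic behind these steps can be sketched in plain Java, outside of Talend. This is only an illustration under assumptions: the thread never says how the start of a new record is detected, so the sketch assumes a line starts a new record when its first field is non-empty, and the names `RecordMerger`, `startsNewRecord` and `merge` are hypothetical. Inside a job, the same logic would have to keep its state between rows in the tJavaRow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RecordMerger {
    // Assumption (not stated in the thread): a line opens a new output
    // record when its first field is non-empty. Replace with the real rule.
    static boolean startsNewRecord(String[] fields) {
        return fields.length > 0 && !fields[0].trim().isEmpty();
    }

    // Merges consecutive related lines into one record of columnCount fields.
    static List<String[]> merge(List<String> lines, int columnCount) {
        List<String[]> out = new ArrayList<>();
        String[] current = null;
        for (String line : lines) {
            // Limit -1 keeps trailing empty fields such as "a,b,,".
            String[] fields = line.split(",", -1);
            if (current == null || startsNewRecord(fields)) {
                if (current != null) {
                    out.add(current); // previous record is complete
                }
                current = new String[columnCount];
                Arrays.fill(current, "");
            }
            // Copy every non-empty field into its column of the open record.
            for (int i = 0; i < fields.length && i < columnCount; i++) {
                String value = fields[i].trim();
                if (!value.isEmpty()) {
                    current[i] = value;
                }
            }
        }
        if (current != null) {
            out.add(current); // flush the last record
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "field1,field2,,,,field6",
            ",,field3,,,",
            ",,,field4,field5,");
        for (String[] record : merge(lines, 6)) {
            System.out.println(String.join(",", record));
        }
    }
}
```

Because each field is filled on only one of the related lines, copying non-empty values by column position never overwrites data within a record.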
Bye
Volker

Re: One output record from multiple input records - What's the best way?

Volker,
I don't understand how it works, but it does. Thanks for all your help.

Re: One output record from multiple input records - What's the best way?

If you have any questions, just ask (even if it works)... ;-)
Bye
Volker

Re: One output record from multiple input records - What's the best way?

If you have a variable structure you could:
.........
The limitation of this is that you could only create one row as output for one input row.
Bye
Volker

Hi Volker,
If one input row can create one or more output rows, what is the workaround?
Here is the description.
I have ID, Name1, Name2, Name3 in a CSV file (input).
I want to map this to a DB table (output) with ID and Name as the fields, without a primary key.
So if the CSV contains a line like 01, John, Ann, Mark,
the output I expect in the DB is:
ID Name
== ====
01 John
01 Ann
01 Mark

Re: One output record from multiple input records - What's the best way?

Hi,
I see two possible solutions. The first one is to concatenate your names into one string and split it into rows with tNormalize.
The second one is to take a look at Talend Exchange. I think there is a component there to transpose your data.
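
To show the effect the first option aims for, here is a minimal plain-Java sketch of turning one row with several name columns into one row per name. It is not tNormalize's implementation; the class and method names (`NameNormalizer`, `normalize`) are hypothetical, and it assumes the first field is the ID and every following non-empty field is a name.

```java
import java.util.ArrayList;
import java.util.List;

class NameNormalizer {
    // Turns "ID, Name1, Name2, ..." into one {ID, Name} pair per name.
    // Assumption: field 0 is the ID, all following fields are names.
    static List<String[]> normalize(String csvLine) {
        String[] fields = csvLine.split(",", -1);
        List<String[]> rows = new ArrayList<>();
        String id = fields[0].trim();
        for (int i = 1; i < fields.length; i++) {
            String name = fields[i].trim();
            if (!name.isEmpty()) { // skip empty name columns
                rows.add(new String[] {id, name});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : normalize("01, John, Ann, Mark")) {
            System.out.println(row[0] + " " + row[1]);
        }
    }
}
```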
Bye
Volker