Four Stars

Add row ID for Header and Line in a CSV File

Hello,

I'm asking for help with the following problem.

Example input file:

Header;id_1;neg
Detail;25
Detail;20
Header;id_2;neg
Detail;10
Detail;7

 

Wanted output file:

id_1;Header;id_1;neg
id_1;Detail;25
id_1;Detail;20
id_2;Header;id_2;neg
id_2;Detail;10
id_2;Detail;7

 

As you can see, the idea is to prefix every line with a row id that links each Detail line to its Header line.

I've tried a tFileInputFullRow-->tMap-->tFileOutputCsv job.

My approach is to keep the Header id value in a global variable, read it in an additional column in the tMap, and change the value of this global variable each time a new Header line appears in the file.

But I have not found a working way to do that.

1 ACCEPTED SOLUTION

Forteen Stars TRF
Forteen Stars

Re: Add row ID for Header and Line in a CSV File

First, use a tFileInputDelimited instead of a tFileInputFullRow and untick the option "check each row structure against schema".
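To illustrate why the row-structure check gets in the way here: Header lines carry three fields and Detail lines only two, so a reader that insists on a fixed field count would not accept both. The plain-Java sketch below shows the lenient alternative of fitting every line to the widest schema and leaving missing trailing columns empty. It only illustrates the idea and is not Talend's generated code; the names LenientRow and parse are hypothetical.

```java
import java.util.Arrays;

// Illustration only: fit lines with different field counts to one schema.
// LenientRow and parse are hypothetical names, not part of Talend.
class LenientRow {

    // Split on ';' and pad (or truncate) to the requested number of columns.
    // Missing trailing columns are left as null instead of rejecting the row.
    static String[] parse(String line, int columns) {
        String[] fields = line.split(";", -1); // -1 keeps trailing empty fields
        return Arrays.copyOf(fields, columns);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("Header;id_1;neg", 3)));
        System.out.println(Arrays.toString(parse("Detail;25", 3)));
    }
}
```

With a three-column schema, the Detail line keeps its two values and the third column stays empty, which matches the blank "neg" cells in the tLogRow output of the solution.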

Then you need to remember the "id" each time a new "header" line arrives from the input flow.

You can achieve this with a pair of tMap variables:

  • the first one (called "id" in my example) takes its value from the input row when the "header" field contains the keyword "header"; otherwise it takes its value from the second variable, called "lastId"
  • after that, the variable "lastId" is overwritten with the value of the variable "id"
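The logic of the two variables can be sketched in plain Java. In the tMap itself, the expressions would look roughly like `"Header".equals(row1.header) ? row1.id2 : Var.lastId` for "id" and `Var.id` for "lastId". The flow name row1 and the column names are assumptions, not taken from the original job. The class below (HeaderIdCarry and tag are hypothetical names) simulates the same carry-forward on the sample lines from the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation of the two tMap variables "id" and "lastId".
// Class, method and column layout are illustrative assumptions.
class HeaderIdCarry {

    static List<String> tag(List<String> lines) {
        List<String> out = new ArrayList<>();
        String lastId = null; // plays the role of Var.lastId; survives from row to row

        for (String line : lines) {
            String[] f = line.split(";", -1);
            String type = f[0];
            String second = f.length > 1 ? f[1] : null;

            // "id": taken from the row on a Header line, else the remembered value.
            String id = "Header".equalsIgnoreCase(type) ? second : lastId;
            // "lastId": evaluated after "id", so it stores the value for the next row.
            lastId = id;

            out.add(id + ";" + line);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "Header;id_1;neg", "Detail;25", "Detail;20",
            "Header;id_2;neg", "Detail;10", "Detail;7");
        tag(input).forEach(System.out::println);
    }
}
```

The order of the two assignments matters: "lastId" must be updated after "id" has been computed, which is why it comes second in the variable list. In this sketch, a Detail line that appears before any Header line would get a null id.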

Got it?

Here is how it looks in the tMap:

[Attached image: Capture.PNG]

And the result is as expected:

[statistics] connecting to socket on port 3565
[statistics] connected
.----+------+----+---.
|     tLogRow_42     |
|=---+------+----+--=|
|id1 |header|id2 |neg|
|=---+------+----+--=|
|id_1|Header|id_1|neg|
|id_1|Detail|25  |   |
|id_1|Detail|20  |   |
|id_2|Header|id_2|neg|
|id_2|Detail|10  |   |
|id_2|Detail|7   |   |
'----+------+----+---'

[statistics] disconnected

Finally, you just have to write the result to a tFileOutputDelimited without the header line, and that's all.


TRF
4 REPLIES
Forteen Stars TRF
Forteen Stars

Re: Add row ID for Header and Line in a CSV File

Did this help you?
If so, please mark your case as solved (kudos also accepted).

TRF
Four Stars

Re: Add row ID for Header and Line in a CSV File

Sorry, I was offline. I've just tried your solution and it works perfectly!

I spent a long time adding tJavas, tGlobalVars... to manage the id variables, because I couldn't figure out how to use the tMap Var part correctly.

Thanks a lot TRF, now I understand the tMap Var part better.

Your solution is very clean.
Forteen Stars TRF
Forteen Stars

Re: Add row ID for Header and Line in a CSV File

You're welcome


TRF