How to clean and transform 3 columns of a csv file

Two Stars

Hello!

I'm a beginner with Talend, and I'm so stuck that this is the first time I'm posting on a forum!
Here is the deal: I want this CSV:


Host Extract_date Size Used Avail Use% Mounted_on
host1 4112017 1008M 299M 659M 32% /MIDDLE/blabla
host1 4112017 6.0G 188M 5.5G 4% /MIDDLEDATA/blabla
host2 4112017 12G 71M 12G 1% /MIDDLELOGS/blabla

to become this:



Host ExtractDate Size Used Avail UsePercent MountedOn
host1 4112017 0.98 0.29 0.64 32 /MIDDLE/blabla
host1 4112017 6.0 0.18 5.5 4 /MIDDLEDATA/blabla
host2 4112017 12 0.06 12 1 /MIDDLELOGS/blabla

1) Change the headers:

I can manage this in the metadata.

2) Apply these rules:

- If "Size", "Used" or "Avail" ends with the letter "M":

  a) Get rid of the M (I'm using tReplace with a regexp for this)

  b) Cast from String to Float (I'm using tMap for this)

  c) Divide by 1024 to get a size in G (tMap too)

  d) Round the result to 2 decimal places (tMap again)

- If "Size", "Used" or "Avail" ends with the letter "K":

  Apply the same steps, except that step c) needs to be done twice

- If "Size", "Used" or "Avail" ends with the letter "G":

  Apply steps a) and b) only
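The rules above can be sketched as one Java method, since Talend expressions are Java. This is only a sketch under assumptions: `SizeConverter` and `toGigabytes` are hypothetical names, not Talend components, and every value is assumed to end in K, M or G.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper: the class and method names are illustrative only.
final class SizeConverter {

    // Converts a value such as "1008M", "6.0G" or "512K" to gigabytes,
    // rounded half-up to 2 decimal places.
    // Assumption: the last character is always the unit letter K, M or G.
    static double toGigabytes(String raw) {
        String value = raw.trim();
        char unit = Character.toUpperCase(value.charAt(value.length() - 1));

        // Steps a) and b): drop the unit letter and parse the number.
        BigDecimal number = new BigDecimal(value.substring(0, value.length() - 1));

        BigDecimal divisor;
        switch (unit) {
            case 'K': divisor = BigDecimal.valueOf(1024L * 1024L); break; // step c) twice
            case 'M': divisor = BigDecimal.valueOf(1024L); break;         // step c) once
            case 'G': divisor = BigDecimal.ONE; break;                    // already in G
            default:
                throw new IllegalArgumentException("Unexpected unit in: " + raw);
        }

        // Steps c) and d): divide, then keep 2 digits after the decimal point.
        return number.divide(divisor, 2, RoundingMode.HALF_UP).doubleValue();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1008M", "299M", "659M", "6.0G", "12G"}) {
            System.out.println(s + " -> " + toGigabytes(s));
        }
    }
}
```

Because the unit is decided per value inside the method, each row can go through a single tMap and there is nothing to split by unit and merge back. In Talend this could probably live in a user routine and be called from a tMap expression, for example `SizeConverter.toGigabytes(row1.Size)`, assuming the input flow is named `row1`. Note that half-up rounding gives 0.07 for 71M, while the expected output above shows 0.06. If truncation is what is wanted, `RoundingMode.DOWN` would reproduce the sample values.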

How can I merge all of these outputs? I will end up with duplicate lines like this!

[Screenshot: Talend Open Studio for Data Integration]

This works but ...

I need some help! :'(



Eight Stars

Re: How to clean and transform 3 columns of a csv file

Do you want to fuse all these processing steps together?

