Replace CSV Values with Database Values passing by transformation File

Highlighted
Five Stars

Replace CSV Values with Database Values passing by transformation File

From a CSV File, using a tFileInputdelimited, I want to transform a column’s elements to then send them to a database. The data has to be transformed before reaching the database by using a “dictionnary” file containing the terms to replace.

Context :

My initial CSV file looks like this :

1.PNG

I want to change the “State” and “Criticity” value to an id (integer) defined like this in my dictionnary :

2.PNG

 

The id set is a reference to a parameter table in my database that will indicate me what it means:

3.PNG

In my mapping, the term « faible » should take id “2” so that it is translated as « Low » in my database. I simplified the case for the example but I got more or less 20 ids to define.

How could I succeed creating this job in Talend ? 

Thx for your help ! 


Accepted Solutions
Six Stars

Re: Replace CSV Values with Database Values passing by transformation File

Hello !

Okay so I checked with more details all the examples you provided. Very helpful for me to understand your problem.


@Antoine01 wrote:

So we need 3 columns in output but we don't know how to do this. 

 

Is it possible to do this with only one Dictionary or should we create a Dictionary for every Column that we want to translate and in this is the case, how can we do it properly ? 

 


It is possible with only one dictionary : I don't think so. For me Talend cannot do that natively, maybe with some custom java's class you can achieve this point, but I don't see, or I don't know, how to do that with only one dictionary.

With your examples I made a job which uses 3 lookup, one for each column. Here's the screenshot below.csv_dictionnary_OK.png

How does it works :

Reading your input csv, lookup the 3 columns from your dictionary. In the tMap I'm just using the same as you did before, inner join with the value and return the ID.

The only added a few java code to "match" if your Priority or Criticite in the input file, is present in the "Valeur" column of your dictionary.

Here's the second screenshot :csv_dictionnary_tmap_OK.png

Details of the "Var" column :

varPriority : row2.VALEUR.contains(row3.Priority)?row2.ID:null

varCriticte : row8.VALEUR.contains(row3.Criticite)?row8.ID:null

 

I didn't use a var for "IssueType" because it matched perfectly between the input and your dictionary.

 

The result of that is not optimized, and honestly I don't know how to do it better.

I'm using a UniqRow, because in the tMap there is no key on the left, so I have a big cartesian product between the 2 columns and the input file. (lines_nb_dictionary_priority * lines_nb_dictionary_criticite * lines_nb_nput_files)

Then I've added a tFilter, many lines contained "null" values, because of the cartesian product, so I squizzed them.

The final screenshot with the result :

csv_dictionnary_result.png

4 - 11 - 2 : High - Improvement - Low (CHAMP -> PRIORITY)

4 - 11 - 7 : High - Improvement - Low (CHAMP -> CRITICITY)

It might be a bug, I don't know if I had to deal with the column "CHAMP".

Find in attachment the job, so you can import it into your Talend, It may help, keep my in touch

 

~Regards, jeoste
Kudos are always appreciated ; mark the topic as resolved if query is answered for the community

View solution in original post


All Replies
Six Stars

Re: Replace CSV Values with Database Values passing by transformation File

Hey there,

How you can succeed to do this ? It's not hard, here's the solution I made with your example :

 

csv_dictionnary.png

 

First tFileInputDelimited is your input with the following data.


@Antoine01

My initial CSV file looks like this :

1.PNG

I'm adding a first Log just to see the result of reading this file. Then you put a tMap, and add a second tFileOutDelimited.

csv_dictionnary_tmap.png

Above this is the tMap. Row3 corresponding to the first input, row2 is your dictionnary. You drag&drop the column with Priority of your input, to the column "Criticity" which is the "key" between theses two files ("faible" in your example). Then you just want to keep this id, so drag&drop the ID to your output 'out_mapping'.

 

The result is 2 when Talend sees 2 times "faible" or 5 when "haute" etc... Else null (look at the console).

And that's all !

 

Check the attachments, the two files I used are available

 

EDIT : syntax error

~Regards, jeoste
Kudos are always appreciated ; mark the topic as resolved if query is answered for the community
Five Stars

Re: Replace CSV Values with Database Values passing by transformation File

Hi,

Thanks for your answer ! I apologize because our situation is more complex than the example that I created and your answer doesn't work with our case. 

 

In fact we want to do that with many column at the same time.

 

My CSV file contains many columns with values that I want to translate into an ID:

 

a4.PNG

 

My real dictionary look like :

 

a2.png

 

I want to translate many column at the same time with only one dictionary.

 

My tMap look like (State is not use in this example) :

a3.png

 

My output has to look like :

 

a5.PNG

So we need 3 columns in output but we don't know how to do this. 

 

Is it possible to do this with only one Dictionary or should we create a Dictionary for every Column that we want to translate and in this is the case, how can we do it properly ? 

 

We found a subject who look like our situation :

https://community.talend.com/t5/Design-and-Development/One-Lookup-table-data-read-multiple-times-tp-...

 

Thanks for your help !

 

Regards,

 

Six Stars

Re: Replace CSV Values with Database Values passing by transformation File

Hello !

Okay so I checked with more details all the examples you provided. Very helpful for me to understand your problem.


@Antoine01 wrote:

So we need 3 columns in output but we don't know how to do this. 

 

Is it possible to do this with only one Dictionary or should we create a Dictionary for every Column that we want to translate and in this is the case, how can we do it properly ? 

 


It is possible with only one dictionary : I don't think so. For me Talend cannot do that natively, maybe with some custom java's class you can achieve this point, but I don't see, or I don't know, how to do that with only one dictionary.

With your examples I made a job which uses 3 lookup, one for each column. Here's the screenshot below.csv_dictionnary_OK.png

How does it works :

Reading your input csv, lookup the 3 columns from your dictionary. In the tMap I'm just using the same as you did before, inner join with the value and return the ID.

The only added a few java code to "match" if your Priority or Criticite in the input file, is present in the "Valeur" column of your dictionary.

Here's the second screenshot :csv_dictionnary_tmap_OK.png

Details of the "Var" column :

varPriority : row2.VALEUR.contains(row3.Priority)?row2.ID:null

varCriticte : row8.VALEUR.contains(row3.Criticite)?row8.ID:null

 

I didn't use a var for "IssueType" because it matched perfectly between the input and your dictionary.

 

The result of that is not optimized, and honestly I don't know how to do it better.

I'm using a UniqRow, because in the tMap there is no key on the left, so I have a big cartesian product between the 2 columns and the input file. (lines_nb_dictionary_priority * lines_nb_dictionary_criticite * lines_nb_nput_files)

Then I've added a tFilter, many lines contained "null" values, because of the cartesian product, so I squizzed them.

The final screenshot with the result :

csv_dictionnary_result.png

4 - 11 - 2 : High - Improvement - Low (CHAMP -> PRIORITY)

4 - 11 - 7 : High - Improvement - Low (CHAMP -> CRITICITY)

It might be a bug, I don't know if I had to deal with the column "CHAMP".

Find in attachment the job, so you can import it into your Talend, It may help, keep my in touch

 

~Regards, jeoste
Kudos are always appreciated ; mark the topic as resolved if query is answered for the community

View solution in original post

Five Stars

Re: Replace CSV Values with Database Values passing by transformation File

Hello !
Thank you very much for your fast and precise answer, I think it's the best solutions and we decided to follow this ways.

Regards,

2019 GARNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

Best Practices for Using Context Variables with Talend – Part 1

Learn how to do cool things with Context Variables

Blog

Migrate Data from one Database to another with one Job using the Dynamic Schema

Find out how to migrate from one database to another using the Dynamic schema

Blog

Best Practices for Using Context Variables with Talend – Part 4

Pick up some tips and tricks with Context Variables

Blog