Six Stars

Adding a new derived column based on database function

Hi All,

 

I searched quite a lot bit unable to get any information. I have a requirement where within Talend Job I need to add a new column with an built database function

 

Input is for e.g.

FirstName

LastName

Address

output

FirstName

FirstName_Soundex ....this is a function Soundex(FirstName)

LastName

LastName_Soundex ....this is a function Soundex(LastName)

Address

 

 

I am unable to find a component which enables me to do this. TMAP allows adding new column but the syntax is based on Java...it cannot compute db functions.

 

Many Thanks

Ashish

  • Data Integration
Tags (1)
1 ACCEPTED SOLUTION

Accepted Solutions
Ten Stars

Re: Adding a new derived column based on database function

OK, I see I have been answering the wrong the question. If you want to use your function in a DB input component, you simply use it as you would use it in a normal SQL query. Then assign the function a column name with the "AS" keyword and ensure your schema has a column for it in the right place. The screenshot below shows my example which works....

query.jpg

 

My function is called "TestFunction" and I provide it with the account.code column. It returns the data as the "name" column. The schema has 4 columns; id, code, name and value. 

 

Just so that you know, the question I was answering previously was to do with writing to the database and not reading from it. 

Rilhia Solutions
17 REPLIES
Ten Stars

Re: Adding a new derived column based on database function

Take a look at step 8 of this tutorial (https://www.rilhia.com/tutorials/using-auto-generated-primary-key-update-row-just-inserted-mysql-dat...). I think it *may* be what you need.

Rilhia Solutions
Seven Stars TRF
Seven Stars

Re: Adding a new derived column based on database function

I'm afraid what you want to do is impossible.

If you want to add a column associated to a database function, you have to call this function using the "t<db>SP" component and put the result into the desired field.


TRF
Seven Stars TRF
Seven Stars

Re: Adding a new derived column based on database function

Arrrgh! @rhall_2_0, experts do not agree with each other  Smiley Wink


TRF
Nine Stars

Re: Adding a new derived column based on database function


rhall_2_0 wrote:

Take a look at step 8 of this tutorial (https://www.rilhia.com/tutorials/using-auto-generated-primary-key-update-row-just-inserted-mysql-dat...). I think it *may* be what you need.


Thank you, this was a very helpful example.

Six Stars

Re: Adding a new derived column based on database function

Thanks for the detailed instructions. I am afraid but does not work fully. 

tMSSqlOutput component has been modified.

Advanced Settings -> Additional Columns -> + --> 

SQL Expression if I use "Soundex(first_name)"

first_name is my column

 

Invalid column name 'first_name'.
[ERROR]:

tMSSqlOutput_1 - Invalid column name 'first_name'.

 

if I add instead a standard SQL Server function "DATEADD(day, DATEDIFF(day, 0, GETDATE()), -1)"

this works fine

I have tried all options

"Soundex([first_name])"

"Soundex(schema.table_name.first_name)"

"Soundex(table_name.first_name)"

 

Unfortunately none work...how can I reference the column in the function?

 

Thanks

 

 

Seven Stars

Re: Adding a new derived column based on database function

Can you try something like "Soundex("+input_row.first_name+")" ?

And also remember to mention it in the reference column.

Six Stars

Re: Adding a new derived column based on database function

Thanks

 

I had tried that but does not work. Reference column is used for position i.e. before/after or replace.

 

"Soundex("+ row1.first_name +")" 

 

returns NULL for all columns as its not able to interpret it as column name

 

So sorry not able to achieve the end result.

Ten Stars

Re: Adding a new derived column based on database function

A subtle difference, but have you tried....

 

"Soundex('"+ row1.first_name +"')"

 

I'm adding a ' before the double quote and after the double quote. This is done so that the value of row1.first_name (lets assume it is Richard) is added as a literal value at runtime. So the actual method passed to SQL would be...

 

Soundex('Richard')

 

That *might* work

Rilhia Solutions
Six Stars

Re: Adding a new derived column based on database function

Thanks

Had tried that, but it interprets "Soundex('"+ row1.first_name +"')"

As "Soundex('row1.first_name')"
row1.first_name is a string and it returns N400 for all rows.
Ten Stars

Re: Adding a new derived column based on database function

It seems odd that it would interpret it as you say because it seems to be ignoring the quotes and the + (if it assumes the whole value as a string) and doesn't represent what Java would generate if it indeed treats it as Java String manipulation. However, it is clear this is not working for you. Have you tried using a t{DB}Row component to use an bespoke insert statement? It would arguably be slower, but would work.

Rilhia Solutions
Ten Stars

Re: Adding a new derived column based on database function

OK, so this one intrigued me. I never like to be beaten so I have tried this out and now have a method that works. First of all I *think* what you have seen *could* be classed as a Talend bug. Everything you said regarding trying this and it failing holds true. However I don't believe that it should. The way I managed to get around this was to do the following.....

 

1) I added a tMap between my source and target. I used this tMap to create the function/parameter String making use of a literal value. So for example....

"Test.MyFunction('"+row1.myValue+"')"

This creates a String which if you pasted into a SQL developer tool, would return the value you want.
2) I put a tFlowToIterate component and a tFixedFlowInput component after the tMap. This is used to create a globalMap variable for all values in the flow and to convert them back to a flow that is iterated through. It is not quite as quick as it could be, but still pretty speedy. You may not be able to put up with the loss in performance though. The tFixedFlowInput is used to return the values stored globalMap variables. The tFlowToIterate component creates globalMap variables with the name {row}.{column_name}. So for a row called "row1" and a String column called "myValue" feeding into the tFlowToIterate, you would get a globalMap which could be accessed via....

 

((String)globalMap.get("row1.myValue"))

3) The function/procedure you want to use is added in the way I suggested earlier, but the globalMap is used instead of a String that you build. So in the location where the String was previously  added, you would add (assuming your column used to create the SQL function call is called "myFunction").....

((String)globalMap.get("row1.myFunction"))

When I ran this, I saw the function return a value (and add it to the insert) which represented what had been supplied to the function as a literal value.

 

I hope this helps.

Rilhia Solutions
Six Stars

Re: Adding a new derived column based on database function

Thanks a lot, struggling a bit on global map variable...is it possible to export the job and share please ?

Many Thanks

Ten Stars

Re: Adding a new derived column based on database function

I'm afraid I do not have my machine with me at the moment. What is it you are stuck with? GlobalMap variables are not too difficult to help out with. Maybe you can post a screenshot and we can help

Rilhia Solutions
Six Stars

Re: Adding a new derived column based on database function

Thanks,

Screen print below:

[cid:image001.png@01D2D56F.4CB8EEC0]


[cid:image002.png@01D2D56F.4CB8EEC0]

Above new column in tmap

Option for tFlowToIterate

[cid:image003.png@01D2D56F.4CB8EEC0]

Schema updated for tFixedFlowInput and added displaysoundex with ((String)globalMap.get("out1.firstnamesoundex"))

[cid:image004.png@01D2D56F.713DB000]
When I run the tlogrow shows

------------------.
| tLogRow_1 |
|=----------------=|
|displaysoundex |
|=----------------=|
|Soundex('Laquita')|
'------------------'

.---------------.
| tLogRow_1 |
|=-------------=|
|displaysoundex |
|=-------------=|
|Soundex('Lura')|
'---------------'

.-----------------.
| tLogRow_1 |
|=---------------=|
|displaysoundex |
|=---------------=|
|Soundex('Yuette')|
'-----------------'
.--------------.
| tLogRow_1 |
|=------------=|
|displaysoundex|
|=------------=|
|Soundex('') |
'--------------'



i.e. all rows...however the output delimited file at the end captures only the last record
displaysoundex

Soundex('')





Six Stars

Re: Adding a new derived column based on database function

Sorry images did not come through previous message. Please see the attached document.

 

Thanks

Ten Stars

Re: Adding a new derived column based on database function

OK, I see I have been answering the wrong the question. If you want to use your function in a DB input component, you simply use it as you would use it in a normal SQL query. Then assign the function a column name with the "AS" keyword and ensure your schema has a column for it in the right place. The screenshot below shows my example which works....

query.jpg

 

My function is called "TestFunction" and I provide it with the account.code column. It returns the data as the "name" column. The schema has 4 columns; id, code, name and value. 

 

Just so that you know, the question I was answering previously was to do with writing to the database and not reading from it. 

Rilhia Solutions
Six Stars

Re: Adding a new derived column based on database function

Many Thanks and you are a star. I agree with you...you never give up and are very helpful indeed.

 

Thanks