One Star

Retrieve dynamically schema columns from CSV

Hi all,
I have this problem: I need to create a job that gets data from a CSV file and loads it into an Excel file. The job will run many times, but the structure of the CSV data will not always be the same. Here are 3 scenarios:
SCENARIO 1) The data inside the CSV
colA;colB;colC;colTemp1;colTemp2
"...";"...";"...";"...";"..."
"...";"...";"...";"...";"..."
SCENARIO 2) The data inside the CSV
colA;colB;colC;colTemp1;colTemp2;colTemp3;colTemp4
"...";"...";"...";"...";"...";"...";"..."
"...";"...";"...";"...";"...";"...";"..."
SCENARIO 3) The data inside the CSV
colA;colB;colC;colTemp1
"...";"...";"...";"..."
"...";"...";"...";"..."
So the problem is that the first 3 columns (colA, colB, colC) are fixed, while the following columns are dynamic (there could be two of them as in SCENARIO 1, four as in SCENARIO 2, one as in SCENARIO 3, and so on).
How can I make the schema dynamic so that I read the data from the CSV correctly and build the right table in the Excel file? I have read a lot of posts about dynamic schemas. Is there a way to do this in TOS DI?
I thought of reading the first line of the CSV using tFileInputFullRow -> tNormalize and then building an XML schema with the right number of columns, and afterwards using this XML schema in tFileInputDelimited. It's an idea, but I don't know how to build the XML schema dynamically.
Thanks in advance,
Tommaso
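
Editor's note: the first step of the idea above (read the header line, then separate the fixed columns from the dynamic ones) can be sketched in plain Java, for example inside a tJava component or a routine. This is only a minimal sketch, not a Talend API: the class name HeaderSchema and its methods are hypothetical, and it assumes a ';' delimiter, unquoted header names, and exactly three fixed leading columns as in the scenarios.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Talend: splits a CSV header line into column names.
final class HeaderSchema {
    // Assumption taken from the scenarios: colA, colB, colC always come first.
    static final int FIXED_COLUMNS = 3;

    // Split the header on ';' and trim stray spaces around each name.
    static List<String> parseHeader(String headerLine) {
        List<String> names = new ArrayList<>();
        for (String raw : headerLine.split(";", -1)) {
            names.add(raw.trim());
        }
        return names;
    }

    // Return only the columns that follow the fixed ones.
    static List<String> dynamicColumns(List<String> header) {
        if (header.size() < FIXED_COLUMNS) {
            throw new IllegalArgumentException(
                "Expected at least " + FIXED_COLUMNS + " columns, got " + header.size());
        }
        return new ArrayList<>(header.subList(FIXED_COLUMNS, header.size()));
    }
}
```

For the header of SCENARIO 1, dynamicColumns(parseHeader(...)) returns [colTemp1, colTemp2]. How the resulting names are turned into a schema for tFileInputDelimited is left open here, as in the post.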
17 REPLIES
Moderator

Re: Retrieve dynamically schema columns from CSV

Hi,
For your requirement, I think dynamic schemas can meet your needs. They allow you to design Jobs with an unknown column structure (unknown names and number of columns). If necessary, dynamic columns can be mapped directly to the target using pass-through mode.
However, the dynamic schema feature is only available in the Enterprise version (on subscription) of Talend.

Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Seventeen Stars

Re: Retrieve dynamically schema columns from CSV

For this purpose I created the tFileInputTextFlat component. You can specify a schema and check the option "Use column header to find position". If your file has a column header, the component adjusts the column positions according to the header line.
Check out this component on Talend Exchange.
It is also possible to define an alternative name for a header, in case the header contains names that break the rules for Talend schema column names (e.g. names containing spaces or other characters that are not allowed in Java identifiers).
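
Editor's note: the general idea behind locating columns by header name, with optional alternative header names, can be illustrated as follows. This is a hedged sketch of the technique only and not the actual implementation of tFileInputTextFlat; the class HeaderPositions, its method and its parameters are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration of header-based column lookup.
final class HeaderPositions {
    // Maps each schema column to its index in the header line, or -1 if the header lacks it.
    // alternativeHeaderNames maps a schema column to the header text it should match,
    // for headers that are not valid Java identifiers (e.g. "First Name").
    static Map<String, Integer> locate(List<String> schemaColumns,
                                       Map<String, String> alternativeHeaderNames,
                                       String headerLine,
                                       String delimiter) {
        String[] headers = headerLine.split(Pattern.quote(delimiter), -1);
        Map<String, Integer> indexByHeader = new HashMap<>();
        for (int i = 0; i < headers.length; i++) {
            // Keep the first occurrence if a header name is repeated.
            indexByHeader.putIfAbsent(headers[i].trim(), i);
        }
        Map<String, Integer> positions = new LinkedHashMap<>();
        for (String column : schemaColumns) {
            String headerName = alternativeHeaderNames.getOrDefault(column, column);
            positions.put(column, indexByHeader.getOrDefault(headerName, -1));
        }
        return positions;
    }
}
```

With this kind of lookup the schema can list all columns that might occur, and a column that is absent from a given file simply resolves to -1 and can be skipped or filled with null.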
One Star

Re: Retrieve dynamically schema columns from CSV

Hi jlolling,
I have tried to use your tFileInputTextFlat component in my job, but I always get the error message shown in the attachment.
I have checked that the file "cimt.talendcomp.flatfileimport-1.3.jar" is inside "D:\Talend_Erjan\Talend-Studio-r78327-V5.0.2\plugins\org.talend.designer.components.localprovider_5.0.2.r78327\components\tFileInputTextFlat".
Can you help me resolve this?
Thanks
Erjan
Moderator

Re: Retrieve dynamically schema columns from CSV

Hi Erjan,
Because tFileInputTextFlat is a custom component, make sure that you have installed it successfully.
Here is the reference on the Talend Help Center: "Installing a custom component".
What's more, did you put the custom component tFileInputTextFlat in this directory?
D:\Talend_Erjan\Talend-Studio-r78327-V5.0.2\plugins\org.talend.designer.components.localprovider_5.0.2.r78327\components\tFileInputTextFlat

I think that is not the right place.
Best regards
Sabrina
Seventeen Stars

Re: Retrieve dynamically schema columns from CSV

Hi Sabrina,
That's true, but very often the normal install process for user components fails. Believe me, I have written a lot of components and I get a lot of problem reports from users, especially about the missing jar problem!
I have suggested a change to the help page you mentioned.
Moderator

Re: Retrieve dynamically schema columns from CSV

Hi jlolling,
That is fantastic. Thanks for your dedication; it is very useful and helpful to us.
Best regards
Sabrina
One Star

Re: Retrieve dynamically schema columns from CSV

Hi All,
Now it's working as expected!
Thanks
Erjan
Moderator

Re: Retrieve dynamically schema columns from CSV

Hi cyberjan,
How did you resolve your problem? Did you put the custom component in the right place, or did you do as @jlolling said? Would you mind sharing your experience with us?
Best regards
Sabrina
One Star

Re: Retrieve dynamically schema columns from CSV

Hi Sabrina,
I created the custom components folder, pointed Talend to it, and did as jlolling said: "delete the configuration/ComponentCache.javacache and restart TOS".
That worked and the missing jar message is gone, but on the next login to TOS a new error message appeared (attached).
I then deleted the component cache file again and could log in, but the next login showed the same error message.
What is happening here?
Thanks
Erjan
Moderator

Re: Retrieve dynamically schema columns from CSV

Hi,
I created the custom components folder, pointed Talend to it, and did as jlolling said: "delete the configuration/ComponentCache.javacache and restart TOS".
That worked and the missing jar message is gone, but on the next login to TOS a new error message appeared (attached).
I then deleted the component cache file again and could log in, but the next login showed the same error message.

If you don't delete the component cache file, does this error pop up?
Best regards
Sabrina
One Star

Re: Retrieve dynamically schema columns from CSV

Yes, it pops up again. That's why I need to delete the component cache file to be able to log in.
Thanks
Erjan
Moderator

Re: Retrieve dynamically schema columns from CSV

Hi,
I think this is the problem jlolling described: "if you install an update of a component, the files of the old component will be overwritten and new files will be added BUT files which does not exists anymore in the new component (like old versions of jar or not used javajet files) will NOT removed. That caused often problems and the user currently has only the chance to fix that locally in his plugins directory".
Our colleague will report an issue on the Talend bug tracker.
Best regards
Sabrina
Community Manager

Re: Retrieve dynamically schema columns from CSV

Hi cyberjan,
I had no trouble installing this custom component and getting it to work just by following the guide. I tried this on two versions: 5.0.2 and 5.1.1. There may be something wrong with your studio. I suggest you install the custom component in a fresh studio and see whether the problem still exists.
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
Employee

Re: Retrieve dynamically schema columns from CSV

Note that to clean the component cache automatically, if needed, you can simply start the studio with the additional parameter: --clean_component_cache
(On Windows you can create a new shortcut with this parameter added.)
This might be helpful in some cases with custom components, even though here it seems more like an installation problem.
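
Editor's note: the manual alternative mentioned earlier in the thread, deleting configuration/ComponentCache.javacache before restarting TOS, could also be scripted. The following is a minimal sketch under assumptions: the class and method names are hypothetical, the studio installation directory is passed in by the caller, and the studio is assumed to be closed while the file is deleted.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not shipped with Talend.
final class ComponentCacheCleaner {
    // Deletes <studioHome>/configuration/ComponentCache.javacache if it exists.
    // Returns true if a file was deleted, false if there was nothing to delete.
    static boolean deleteComponentCache(Path studioHome) throws IOException {
        Path cache = studioHome.resolve("configuration").resolve("ComponentCache.javacache");
        return Files.deleteIfExists(cache);
    }
}
```

The relative path is the one quoted in this thread; whether it applies to other studio versions is not confirmed here.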
One Star

Re: Retrieve dynamically schema columns from CSV

Hi all,
Thanks for the help. I think there is some problem with my Studio installation.
I will try nrousseau's suggestion.
Thanks
Erjan
One Star

Re: Retrieve dynamically schema columns from CSV

I am trying a similar scenario: using the Dynamic data type to integrate a list of tables with a single job.
It's really helpful, thanks Talend.
That said, I need a little help to get this working well. The date values from my source are inconsistent: they are 'mm/dd/rr', 'mm/dd/yyyy' or 'yyyy-mm-dd'. Since they are written to the target as string values in the same form, this is not working for me.
My source is Teradata, so I tried setting the session DATEFORM to ANSIDATE, which may help here, but I am not able to configure the session using tTeradataInput.
Please help me set session values at runtime, as we do in Teradata SQLA. I guess I could use that in other scenarios as well.
I would appreciate a quick reply, or any generic job available for a similar purpose.
Thanks,
Rahul
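
Editor's note: independent of the Teradata session setting asked about above, mixed date strings can also be normalized on the Java side by trying each expected pattern in turn. This is a minimal sketch, not a Talend routine: the class DateNormalizer is hypothetical, and "MM/dd/yy" is used as a stand-in for 'mm/dd/rr', which is an assumption because java.time resolves two-digit years into 2000-2099 rather than applying RR-style century rules.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper that accepts the three formats named in the post.
final class DateNormalizer {
    // Patterns are tried in this order; the first one that matches the whole string wins.
    private static final DateTimeFormatter[] FORMATS = {
        DateTimeFormatter.ofPattern("yyyy-MM-dd"),
        DateTimeFormatter.ofPattern("MM/dd/yyyy"),
        DateTimeFormatter.ofPattern("MM/dd/yy") // assumption: two-digit years mean 20xx
    };

    static LocalDate normalize(String raw) {
        String text = raw.trim();
        for (DateTimeFormatter format : FORMATS) {
            try {
                return LocalDate.parse(text, format);
            } catch (DateTimeParseException ignored) {
                // Not this pattern; try the next one.
            }
        }
        throw new IllegalArgumentException("Unrecognized date format: " + raw);
    }
}
```

Calling toString() on the result yields the ISO form yyyy-MM-dd, so all three inputs end up written to the target in one consistent format.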
Community Manager

Re: Retrieve dynamically schema columns from CSV

Hi Rahul,
This topic is a little old. Could you please open a new topic for your question? That makes it easier for us to follow up and manage the replies.
Thanks!
Shong