One Star

Passing Schema as a Parameter to talend job

Hi,
I'm using TOS for Big Data 5.3.1. My requirement is to pass a schema file as a parameter to a Talend job.
Description: Suppose I have a structured data file (CSV) whose columns have the following data types:
int
string
string
int
int
That is 5 columns in total: the 1st is int, the 2nd is string, and so on, as listed above. I want to define this schema (only the data types) in a text file and reference that text file in a Talend job, so that the next time I receive another file, it is validated against the schema specified in the text file. For example, if the 2nd field in the new file is of type int, the job should discard that field and move on to the next column for the validation check.
Is this possible?
Also, is what I described above possible in Talend without defining an external schema file? Is there a component in Talend where we can specify a data type schema to meet this requirement?
Right now I'm using TOS for Big Data, where I do not have a Metadata tab under the Repository manager. If I install TOS for Data Integration, will I be able to achieve my requirement? Does the Metadata tab under the Repository serve the purpose I described? Also, is the metadata management option available in TOS for DI, or should I go for the Enterprise edition?
Looking forward to your suggestions.
Thanks,
ShreeCS
12 REPLIES
Community Manager

Re: Passing Schema as a Parameter to talend job

What about tSchemaComplianceCheck (https://help.talend.com/search/all?query=tSchemaComplianceCheck&content-lang=en)?
This component is available in TOS for DI.
Metadata management is available to a certain extent in TOS for DI.
One Star

Re: Passing Schema as a Parameter to talend job

Hi,
I checked the tSchemaComplianceCheck component in TOS for Big Data. But that component is used to validate the data in a file or table whose schema is already defined somewhere. I'm looking for a way to define the schema, either externally or in Talend, by specifying only the data types of the fields. I don't understand how to define the schema externally and use it in Talend, or how to define it in Talend itself.
Please suggest what can be done.
Thanks,
ShreeCS
Four Stars

Re: Passing Schema as a Parameter to talend job

You can't do it using TOS.
One Star

Re: Passing Schema as a Parameter to talend job

Hi,
So it is confirmed that I cannot define the schema of a file in Talend without specifying the column names.
Suppose I have a text file that contains the column type, nullability, and field length information.
For example:
int notNull 5
String Null 20
int null 100
Can I reference this text file to define the schema in Talend? Is Talend able to treat the 1st column as int, the 2nd as string, and the 3rd as int by referring to the text file? Is this possible using context variables?
I want to achieve this using Talend. How can it be done?
Thanks,
ShreeCS
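A minimal sketch of how a schema file in the format shown above might be read in plain Java, for example from a tJava component or a custom routine. The class and method names (SchemaFileParser, ColumnSpec, parseLine, parse) are hypothetical and not part of Talend. The path to the file could be supplied through a context variable, which Talend Java code reads as context.<name>. The sketch only parses the rules; it does not change the schema of any Talend component.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Talend API: parses lines such as "int notNull 5".
class SchemaFileParser {

    // One column rule: data type, whether empty values are allowed, maximum length.
    static class ColumnSpec {
        final String type;
        final boolean nullable;
        final int maxLength;

        ColumnSpec(String type, boolean nullable, int maxLength) {
            this.type = type;
            this.nullable = nullable;
            this.maxLength = maxLength;
        }
    }

    // Parses one "<type> <null|notNull> <length>" line, ignoring letter case.
    static ColumnSpec parseLine(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 tokens: " + line);
        }
        boolean nullable;
        if (parts[1].equalsIgnoreCase("null")) {
            nullable = true;
        } else if (parts[1].equalsIgnoreCase("notNull")) {
            nullable = false;
        } else {
            throw new IllegalArgumentException("Unknown nullability: " + parts[1]);
        }
        return new ColumnSpec(parts[0].toLowerCase(), nullable, Integer.parseInt(parts[2]));
    }

    // Parses every non-blank line of the schema file, in column order.
    static List<ColumnSpec> parse(List<String> lines) {
        List<ColumnSpec> specs = new ArrayList<ColumnSpec>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                specs.add(parseLine(line));
            }
        }
        return specs;
    }
}
```

Because the sample mixes "Null" and "null", the nullability token is compared case-insensitively, and the type name is lower-cased so that "String" and "string" are treated alike.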
Four Stars

Re: Passing Schema as a Parameter to talend job

Hi Shree,
If you are determined to do it in TOS only, then go the Java way; you can do anything there. Ask a Java developer how to do it.
Using TOS you can't do it. Talend is a code generator: it generates the Java code based on what you define in the workspace.
How do you expect the code to change automatically based on your inputs? Think of it that way, and you will get an idea of how to do it using only Java.
Vaibhav
One Star

Re: Passing Schema as a Parameter to talend job

What do you mean by writing Java code? Do you want me to go for the tJava component, where I can specify the file schema?
But can I use the tJava component as the 1st component in the job flow?
I'm not getting a clear picture here.
Thanks,
ShreeCS
Four Stars

Re: Passing Schema as a Parameter to talend job

You can't define or change metadata in TOS.
- Use a Java program to read a file or any source data, using tJava or tJavaFlex
- Define or convert data types the way you want, or do any other type of operation.
But there is no way it can be done using TOS
Vaibhav
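The Java route suggested above could look roughly like this. It is a sketch under assumptions, not a Talend API: RowTypeChecker and its methods are hypothetical names, fields are assumed to be comma-separated with no quoted commas, and only the int and string types from the original question are handled.

```java
// Hypothetical helper: checks one CSV line against an ordered list of type names.
class RowTypeChecker {

    // Returns true when the raw text can be read as the named type.
    static boolean matchesType(String value, String type) {
        if (type.equalsIgnoreCase("int")) {
            try {
                Integer.parseInt(value.trim());
                return true;
            } catch (NumberFormatException e) {
                return false;
            }
        }
        if (type.equalsIgnoreCase("string")) {
            return true; // any text is accepted as a string in this sketch
        }
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    // Returns true when the line has the expected number of fields
    // and every field matches the type at the same position.
    static boolean rowMatches(String csvLine, String[] types) {
        String[] fields = csvLine.split(",", -1); // -1 keeps trailing empty fields
        if (fields.length != types.length) {
            return false;
        }
        for (int i = 0; i < fields.length; i++) {
            if (!matchesType(fields[i], types[i])) {
                return false;
            }
        }
        return true;
    }
}
```

Note that a numeric value in a string column still passes here, because any text is a valid string. Rejecting such values would need a stricter rule than the one sketched.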
One Star

Re: Passing Schema as a Parameter to talend job

Hi,
To be clear, I don't want to change the metadata. Let me explain what I want to achieve in TOS.
Description:
I will define the schema (only the data types) in a text file outside Talend. Then I will pass that text file as a parameter to some component in Talend (I don't know how). Whenever a new file (say, a CSV) arrives with the same number of fields and the same data types as defined in the text file, the job should read the text file as a parameter and use it to read the new CSV file. Every subsequent file should then be checked against the same text schema file.
If a new file arrives with different column types, the job should reject that file after checking it against the text schema file. If the file matches, the job should read it.
Is this possible?
Thanks,
ShreeCS
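The accept-or-reject behaviour described above can be sketched in plain Java as a file-level gate. This is an illustration only: FileSchemaGate, firstBadLine, and accept are hypothetical names, the type list is assumed to have been read from the external text file already, fields are assumed to be comma-separated with no quoted commas, and the int pattern checks digits only, not the numeric range.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: decides whether a whole CSV file conforms to a list of types.
class FileSchemaGate {

    // One pattern per supported type name (assumption: only int and string).
    private static final Map<String, Pattern> PATTERNS = new HashMap<String, Pattern>();
    static {
        PATTERNS.put("int", Pattern.compile("-?\\d+"));
        PATTERNS.put("string", Pattern.compile(".*"));
    }

    // Returns the 1-based number of the first line that breaks the schema,
    // or 0 when every line conforms.
    static int firstBadLine(List<String> csvLines, List<String> types) {
        for (int n = 0; n < csvLines.size(); n++) {
            String[] fields = csvLines.get(n).split(",", -1);
            if (fields.length != types.size()) {
                return n + 1;
            }
            for (int i = 0; i < fields.length; i++) {
                Pattern p = PATTERNS.get(types.get(i).toLowerCase());
                if (p == null) {
                    throw new IllegalArgumentException("Unsupported type: " + types.get(i));
                }
                if (!p.matcher(fields[i].trim()).matches()) {
                    return n + 1;
                }
            }
        }
        return 0;
    }

    // The file is accepted only when no line breaks the schema.
    static boolean accept(List<String> csvLines, List<String> types) {
        return firstBadLine(csvLines, types) == 0;
    }
}
```

Returning the offending line number rather than a bare boolean makes it easier to log why a file was rejected before the job decides whether to continue.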
Community Manager

Re: Passing Schema as a Parameter to talend job

One Star

Re: Passing Schema as a Parameter to talend job

Hi,
Thanks for the information provided so far.
In TOS for Data Integration there is a concept called metadata, where you can create your schema once and reuse it later to read different files with the same structure.
I have TOS for Big Data installed, which doesn't have a Metadata tab under the Repository manager. So I thought of installing TOS for Data Integration. It will provide me with the metadata feature, right?
When I checked this link, http://www.talend.com/products/data-integration/matrix, it says the repository manager is not available for Talend Open Studio (TOS).
If the Metadata tab is not available in TOS for Data Integration, how can I achieve this (defining the schema once and using it many times) in Open Studio?
Thanks,
ShreeCS
Community Manager

Re: Passing Schema as a Parameter to talend job

I have TOS for Big Data installed, which doesn't have a Metadata tab under the Repository manager. So I thought of installing TOS for Data Integration. It will provide me with the metadata feature, right?

Yes indeed.
When I checked this link, http://www.talend.com/products/data-integration/matrix, it says the repository manager is not available for Talend Open Studio (TOS).

True. It's a Platform tool because it assumes that you have several Talend repositories to manage (it has nothing to do with the metadata manager).
If the Metadata tab is not available in TOS for Data Integration, how can I achieve this (defining the schema once and using it many times) in Open Studio?

The metadata manager is available in DI, so no problem. Otherwise, I don't think it is possible.
One Star

Re: Passing Schema as a Parameter to talend job

Yes, I have installed TOS for Data Integration and am able to find the Metadata tab under Repository.