One Star

Reading a directory of csv file with different schema

I am trying to read a list of files from FTP. Each file is a CSV, but each has a different schema. How do I achieve this?
Currently I have:
tFTPFileList --> OnSubjobOk --> tFileInputDelimited
I don't have a fixed schema for tFileInputDelimited.
11 REPLIES
One Star

Re: Reading a directory of csv file with different schema

Hi,
I had the same scenario and solved it another way:
- I converted all the CSV files to xlsx
- tFileFetch to read the xlsx files from the directory
- Iterate over each file into the tFileExcelWorkbookOpen component
- Then define the schema you are looking for using the tFileExcelSheetInput component, which lets you map the columns dynamically. After that you can do whatever transformations you like.
Seventeen Stars

Re: Reading a directory of csv file with different schema

Oh no, there is a slightly easier way. You could use the component tFileInputTextFlat, which also has the dynamic column positioning feature (actually, I first introduced it in this component ;-)
http://www.talendforge.org/exchange/index.php?eid=745&product=tos&action=view&nav=1,1,1
One Star

Re: Reading a directory of csv file with different schema

@jlolling I have no predefined schema. I need to load the CSV file as is. There is a header line whose fields will be the column labels. But tFileInputTextFlat has a setting called Field Extraction, and I don't know the column names in advance.
@rathinasamyy I can't manually convert CSV to Excel, as the files are system generated (the system supports only CSV).
One Star

Re: Reading a directory of csv file with different schema

Jan - thanks for pointing this out... Great component!
Sugandha - you might want to try it out. The documentation (Jan's :-)) states "It is not necessary to map all fields in the file to a schema column" and "For delimited fields, the position of the field can be automatically configured with the content of a header line"
Seventeen Stars

Re: Reading a directory of csv file with different schema

@Sugandha: The only way to work without any predefined schema is in the Enterprise release, and it's called Dynamic Schema. This feature does let you avoid knowing the schema of the file, but if you want to write the data into a table, then at that point, at the latest, you have to know a schema.
You could also write each whole line of the file into a CLOB column in the database, but what would be the use case for handling the data that way?
The approaches we have described so far depend on a predefined schema in your job. You have to know which values from which columns you need.
The tFileInputTextFlat component mentioned above lets you match the columns of your schema (the schema of the flow in your job) against the columns in the header line using regular expressions.
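To illustrate the idea of matching schema columns against a header line, here is a minimal sketch in plain Java. It is not the component's actual code: the class name `HeaderColumnMatcher`, the method `resolvePositions`, and the sample column names and patterns are all made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class HeaderColumnMatcher {

    /**
     * Maps each wanted schema column to the index of the first header field
     * whose name fully matches the column's regular expression (ignoring
     * case), or to -1 if no header field matches.
     */
    public static Map<String, Integer> resolvePositions(String headerLine, String delimiter,
                                                        Map<String, String> columnPatterns) {
        // Pattern.quote keeps delimiters such as "|" from being read as regex syntax.
        String[] headerFields = headerLine.split(Pattern.quote(delimiter), -1);
        Map<String, Integer> positions = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : columnPatterns.entrySet()) {
            Pattern pattern = Pattern.compile(entry.getValue(), Pattern.CASE_INSENSITIVE);
            int found = -1;
            for (int i = 0; i < headerFields.length; i++) {
                if (pattern.matcher(headerFields[i].trim()).matches()) {
                    found = i;
                    break;
                }
            }
            positions.put(entry.getKey(), found);
        }
        return positions;
    }

    public static void main(String[] args) {
        // Hypothetical schema: column name -> regex that the header field must match.
        Map<String, String> wanted = new LinkedHashMap<>();
        wanted.put("id", "(customer_)?id");
        wanted.put("name", "name|full_name");
        System.out.println(resolvePositions("full_name;city;customer_id", ";", wanted));
    }
}
```

Once the positions are known, each data line can be split with the same delimiter and the wanted values picked by index, so files that order their columns differently still feed the same flow schema.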
One Star

Re: Reading a directory of csv file with different schema

Since I don't have the Enterprise version, I have made the files share a single schema. What I have done is:
tFTPFileList -> tFTPGet -> tFileList -> tFileCopy
but I would like to do:
tFTPFileList -> tFTPGet -> tFileInputDelimited -> tPostgresqlOutput
i.e. write the input to a Postgres database.
How do I achieve this?
Four Stars

Re: Reading a directory of csv file with different schema

Sugandha,
After tFileList, use tFileInputDelimited, then tPostgresqlOutput.
Vaibhav
One Star

Re: Reading a directory of csv file with different schema

sanvaibhav, my files don't share one schema. There are many files, and they have different schemas.
Four Stars

Re: Reading a directory of csv file with different schema

Hi Sugandha,
If you have multiple files and a fixed set of schemas, you can analyse each file and, based on that analysis, use control flow to route the incoming file to the matching output.
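A rough sketch of what such an analysis could look like, written in plain Java rather than as a Talend job. It assumes the set of possible schemas is known in advance and that the header line identifies the schema; the class `SchemaRouter`, the method `classify`, and the two sample schemas are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

public class SchemaRouter {

    // Hypothetical known schemas: schema name -> expected header columns, in order.
    static final Map<String, List<String>> KNOWN_SCHEMAS = new LinkedHashMap<>();
    static {
        KNOWN_SCHEMAS.put("customers", Arrays.asList("id", "name", "email"));
        KNOWN_SCHEMAS.put("orders", Arrays.asList("order_id", "customer_id", "amount"));
    }

    /**
     * Returns the name of the known schema whose column list equals the
     * header line (case-insensitive, surrounding spaces ignored), or
     * "unknown" if no schema matches.
     */
    public static String classify(String headerLine, String delimiter) {
        List<String> header = new ArrayList<>();
        for (String field : headerLine.split(Pattern.quote(delimiter), -1)) {
            header.add(field.trim().toLowerCase(Locale.ROOT));
        }
        for (Map.Entry<String, List<String>> schema : KNOWN_SCHEMAS.entrySet()) {
            if (schema.getValue().equals(header)) {
                return schema.getKey();
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify("Order_ID,Customer_ID,Amount", ","));
        System.out.println(classify("foo,bar", ","));
    }
}
```

In a job, the returned name could then decide which branch handles the file, for example one subjob per schema, each with its own fixed schema and output table. The same idea works with the file name instead of the header if the names are reliable.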
Thanks
Vaibhav
One Star

Re: Reading a directory of csv file with different schema

Could you please post an example? Do you mean analyse = filter by filename?
One Star

Re: Reading a directory of csv file with different schema

Hi,
Can anyone help with this?
My requirement is:
I have two folders, input and Output.
I need to read all the files in the input folder, join them, and generate one final output file, which has to be stored in the Output folder.
Kind Regards,
Ram
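
For reference, the requirement above can be sketched in plain Java as follows. This is only an illustration under several assumptions that the post does not state: the files are comma-separated without quoted fields, the join is an inner join on the first column, every file has a header line whose first field is the same, and only files ending in `.csv` are read. The class and method names are made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CsvFolderJoin {

    /** Inner-joins two lists of CSV lines on their first column. */
    public static List<String> joinOnFirstColumn(List<String> left, List<String> right) {
        // Index the right side by key; a later duplicate key overwrites an earlier one.
        Map<String, String> rightByKey = new HashMap<>();
        for (String line : right) {
            if (line.isEmpty()) continue;
            int comma = line.indexOf(',');
            String key = comma < 0 ? line : line.substring(0, comma);
            rightByKey.put(key, comma < 0 ? "" : line.substring(comma + 1));
        }
        List<String> joined = new ArrayList<>();
        for (String line : left) {
            if (line.isEmpty()) continue;
            int comma = line.indexOf(',');
            String key = comma < 0 ? line : line.substring(0, comma);
            String rest = rightByKey.get(key);
            if (rest != null) {
                joined.add(rest.isEmpty() ? line : line + "," + rest);
            }
        }
        return joined;
    }

    /** Joins all .csv files of inputDir (in name order) and writes one output file. */
    public static void joinFolder(Path inputDir, Path outputFile) throws IOException {
        List<Path> files;
        try (Stream<Path> stream = Files.list(inputDir)) {
            files = stream.filter(p -> p.toString().endsWith(".csv"))
                          .sorted()
                          .collect(Collectors.toList());
        }
        if (files.isEmpty()) {
            throw new IOException("No CSV files found in " + inputDir);
        }
        List<String> result = Files.readAllLines(files.get(0));
        for (Path next : files.subList(1, files.size())) {
            result = joinOnFirstColumn(result, Files.readAllLines(next));
        }
        Files.createDirectories(outputFile.toAbsolutePath().getParent());
        Files.write(outputFile, result);
    }
}
```

Because the header lines share their first field, they are joined like any other row and end up as the header of the output file. Rows whose key is missing from any one of the files are dropped, which is what an inner join means here; a different join type would need a different merge step.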