Load multiple delimited files into temp table and then merging into main redshift output table

Eight Stars

Load multiple delimited files into temp table and then merging into main redshift output table

Hi @rhall_2_0 and other Experts

 

I have got delimited .tsv files generated from one of the job. I need a way where I can read files one by one and output into Redshift DB table.

 

My job initial looks like this.

 

Screen Shot 2018-12-13 at 1.13.52 pm.png

 

Is there a way I can create multiple temp tables depending on my iteration and then later merge all these temp tables to output into main redshift table?

 

Thanks

Harshal.

Thirteen Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@Parikhharshal,if you were using the enterprise edition.

 

1) take the current file name from tFileList and remove the .csv or .tsv and use that name as temp take and action on table is create. the first flow is your design as shown.

 

2) in the second flow take tDBlist and list the table and iterate and use tOracleInput and toracleoutput .

 

Manohar B
Eight Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@manodwhb: Thanks for your reply. Sorry I didn’t understand your second step. What do you mean by using oracle input and oracle output with dblist? Have you got any design which you can share?
Thirteen Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@Parikhharshal,what was the DB type in tDBoutput?

Manohar B
Eight Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@manodwhb: it’s redshift.
Thirteen Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@Parikhharshal,ok..for merging you can use tRedshiftRow to insert from temp table to main table by using query.Untitled.png

Manohar B
Eight Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@manodwhb: Are you considering only one file? I have many files which I want to insert into temp tables and then merge all temp table into main table.

The job you showed will only do for one file and create one temp table?
Thirteen Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@Parikhharshal,with multiple files also that will work in tDBOuput the action on Table should be create and take the filename and remove .csv and then use that as a table name to create.

 

tDBrow ..you need to use the above table name as temp table to insert data from temp table to main table by writing query. 

Manohar B
Eight Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@manodwhb: Thanks a lot for your reply. I understand now.

According to your response, I’m assuming tdboutput will create multiple tables for multiple files.

How would I have to write the query in order to get all dynamic table names to read from it?

Is there anyway I can parallelise this whole thing so that simultaneously it writes to multiple tables?
Thirteen Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@Parikhharshal,

we the below type of query in tDBrow,for the stage table you need to take current filename .

 

insert into category (select * from category_stage);

Manohar B
Eight Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@manodwhb: Thanks for your reply. However, how would you write for multiple temp files and multiple temp files?
Thirteen Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@Parikhharshal,when you use on component ok to tDBRow from tDBOutput,tDBrow also execute with different table names to insert data into main table from temp/stage tables. 

Manohar B
Eight Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@manodwhb: I tried changing my job flow little bit and now everything comes straight into table and then I am doing other things.

 

Screen Shot 2018-12-17 at 12.05.21 pm.png

 

My tDBoutput component config looks like below:

 

Screen Shot 2018-12-17 at 12.06.12 pm.png

 

It keeps giving me an error here on this component and says: 

 

java.sql.SQLException: [Amazon][JDBC](11220) Parameters cannot be used with normal Statement objects, use PreparedStatements instead.

at com.amazon.exceptions.ExceptionConverter.toSQLException(Unknown Source)

 

Is it because I am using table name based on context?

Highlighted
Thirteen Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@Parikhharshal,it looks like table name context,as part of testing just use the one file and hard code the table name instead of context?

Manohar B
Eight Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@manodwhb: The context value will create multiple tables for me in a loop.

 

globalMap.put(

"tempTable",context.course_id+"-"+context.student_id

);

 

Course_id  Sutdent_id

          820   14345

          820   43555

          623    34664

          445    35656

and so on.... The table will be 820-14345, 820-43555, 623-34664, 445-35656. There is no issue with context.

 

 

Thirteen Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@Parikhharshal,instead of tDBOutput,use tDBRow and write a query to insert data.

 

 

Manohar B
Eight Stars

Re: Load multiple delimited files into temp table and then merging into main redshift output table

@manodwhb: If you see my job flow, tMap is connected to tDBoutput. So what query I can write using tDBrow? I do not think that's really possible. What are your thoughts?

Tutorial

Introduction to Talend Open Studio for Data Integration.

Definitive Guide to Data Integration

Practical steps to developing your data integration strategy.

Definitive Guide to Data Quality

Create systems and workflow to manage clean data ingestion and data transformation.