Vertica Loading Very Slow

One Star

Vertica Loading Very Slow

Hi all,
I'm using TOS DI 5.2.2 in Red Hat. I have 25,000 files to load. my jobs as follows:
tfileList----> tFileInputDelimited---->tMap---->tVerticaOutput
Action on table is "Insert on table"
it's taking 35min. to load 50files!!!! and each file size is less then 5MB. But same job if i write it to file (instead of vertica) it's taking less time!!
What i'm missing here..!! i need to change any settings!!
Please suggest solution
Regards,
Akki
Four Stars

Re: Vertica Loading Very Slow

Hi Akki,
What is the need of tMap? if there are no joins, you can replace it by tJavaRow to add new columns or do some operations...
What is the commit size and batch size you have given? is it set to auto commit?
vaibhav
One Star

Re: Vertica Loading Very Slow

Inserting may be slow because referential or other constraints are being evaluated by the database prior to committing these records...
You can also try increasing your java heap size to make more memory available for processing.
As an alternative, use the tVerticaOutputBulk and tVerticaBulkExec components for faster loading. They're designed to load larger datasets faster, and typically leverage other database utilities on the back-end. For more info, see
https://help.talend.com/search/all?query=tVerticaOutputBulkExec&content-lang=en
One Star

Re: Vertica Loading Very Slow

hi sanvaibhav,
I'm using tMap because there are two flows from tMap. one flow insert all fileds from input files to vertica table.
other flow just takes one required field from file and insert that field to other file.
and i have given commit every 100rows in tverticaoutput other then that no settings i'm using in that component.
i don't tthink tMap is effecting my job speed in this case. because as i already mentioned if i replace vertica component with file output component, then its processing in high speed.
Please let me know any solution.
Hi willm,
As Vertica told if you have large file then go for Bulk insert. but in this case each file size is less then 5MB.
Regards,
Akki
Four Stars

Re: Vertica Loading Very Slow

You can increase number of records to 1000 and check the performance, every commit cycle takes time... we can reduce the commit cycle by increasing the number of rows...
- Also try giving auto commit option, it would take its default batch size and can observe the difference...
you can use tChronometerstart and tchronometerstop component to check exact execution time...
Whether single job is loading all the files or you have multiple subjobs to do this?
vaibhav
One Star

Re: Vertica Loading Very Slow

hi sanvaibhav,
           Ya single job is loading all files to vertica. and where is autocommit option in tVerticaoutput Component?? 
Regards,
Akki
Moderator

Re: Vertica Loading Very Slow

Hi,

Have you tried to use
tVerticalConnection in your job design. The auto commit option is available in tVerticalConnection component.
BTW, What's your job row rate?
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: Vertica Loading Very Slow

hi sabrina,
  Thank you for reply. I have tried with tverticaConnection. but problem still same. Now i think we got solution for this.
Our vertica table have sequence number generation, but before this was not there. so we removed sequence for that table and checked loading, loading got fast!!
now still i don understand why sequence number will effect loading to table from Talend tool!! Is there any possibilities to use vertica sequence in tverticaoutput component.
Advice : If we cant use sequences of vertica in our talend job, please try to add this feature in upcoming release.
Regards,
Akki
One Star

Re: Vertica Loading Very Slow

Hi i am facing an issue in Talend Insert  i am using Oracle 11g to insert data and inserting data in 5 tables, out of 5 table Talent Insert batch able to insert but for 1 table it is taking 10 times more than all table inserts even records to be Inserts are same in all table.
Can any one help me and find out solution for the same.
Seventeen Stars

Re: Vertica Loading Very Slow

Vertica is a column based database and it is typically behaviour to be very slow in single dataset statements. 
To improve the performance I suggest you load at first the input data without any checks into the vertica database and use pure SQL statements to fill the (staging-) data into your target table. 
Column based database will usually works best with bulk loads instead of insert/update row by row. 
One Star

Re: Vertica Loading Very Slow

hi,
   thank u for reply Jlolling. Are you suggesting me to write a script to load files instead of talend job??
Regards,
Akki
Four Stars

Re: Vertica Loading Very Slow

Hi Akki,
Do you have indexes on your target vertica table? Because index would slow down the performance of inserts...
Vaibhav
One Star

Re: Vertica Loading Very Slow

hi all,
   Using script we loaded around 13 crore records data to 45columns table in 6min. Can we achieve this using talend jobs?? This what our concern now. because we have designed job to load table so if its not giving results as expected then we have to load uisng script only right?
Vibhav,
   we don have index on table, it columnar database so we are not creating any index for table.
Regards,
Akki
Four Stars

Re: Vertica Loading Very Slow

Hi Akki,
Loading data through database scripts is lightning fast and don't have much limitations... Similar performance you can't expect from ETL tool specifically based on Java... I would recommend to go for the data loading through script only, but you can enhance script to perform some auditing and logging using sp..
Vaibhav