Four Stars

How to optimise tUniqRow and tSortRow

Is it better to put tSortRow before tUniqRow, or vice versa, for the best performance? Or should I use tAggregateSortedRow instead of tUniqRow? If not, how can I optimize tUniqRow? Note that I use the "disk option", and the job crashes. I am working on a file of 3 million lines.

1 ACCEPTED SOLUTION

Moderator

Re: How to optimise tUniqRow and tSortRow

Usually you would want to put a tSortRow before a tUniqRow. It makes more sense logically to sort your data before finding the unique values, especially if you are using tAggregateSortedRow, which expects its input to be sorted already. As for choosing between tAggregateSortedRow and tUniqRow, it depends on what you are trying to do. tAggregateSortedRow is meant for computing metrics (aggregate values and calculations) per group, while tUniqRow is meant for finding and removing duplicates.
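To illustrate why sorting first can help when looking for unique values, here is a minimal Java sketch. It is not the code Talend generates for tUniqRow or tSortRow; the class and method names are hypothetical, and rows are simplified to non-null strings. It only shows that once rows are sorted, duplicates sit next to each other, so each row needs to be compared with the previously kept row only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not Talend's generated code.
class SortedDedupSketch {

    // Removes duplicates from an already sorted list of non-null rows.
    // Each row is compared only with the last row that was kept, so no
    // collection of previously seen keys has to be held in memory.
    static List<String> dedupSorted(List<String> sortedRows) {
        List<String> unique = new ArrayList<>();
        String previous = null;
        for (String row : sortedRows) {
            if (previous == null || !row.equals(previous)) {
                unique.add(row);
                previous = row;
            }
        }
        return unique;
    }

    // Sorts a copy of the input (the "tSortRow" step in this analogy),
    // then drops adjacent duplicates (the "unique" step).
    static List<String> sortThenDedup(List<String> rows) {
        List<String> sorted = new ArrayList<>(rows);
        Collections.sort(sorted);
        return dedupSorted(sorted);
    }

    public static void main(String[] args) {
        // Prints [a, b, c]
        System.out.println(sortThenDedup(List.of("b", "a", "b", "c", "a")));
    }
}
```

In this sketch the output list is kept in memory for simplicity; in a streaming job each unique row would be written out as soon as it is found.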

4 REPLIES
Moderator

Re: How to optimise tUniqRow and tSortRow

Hello,

Performance issues are usually caused by the DB connection or the job design. Could you please upload some screenshots of your current job design?

Best regards

Sabrina

 


Four Stars

Re: How to optimise tUniqRow and tSortRow

 

Below is the job design. I used sorting on disk in tSortRow, and the problem comes from tUniqRow.

                      tOracleInput
                           |
                           |
tFileInputPositional ---- tMap
                           |
                           |
                         tUnite ----> tSortRow ----> tUniqRow ----> tFileOutputPositional
                           |
                           |
tOracleInput ------------ tMap

Moderator

Re: How to optimise tUniqRow and tSortRow

If you are running into a Java heap space error, you could increase the JVM heap size (the -Xmx argument) so that the job can hold more records in memory, or you could enable the "Use of Disk" setting in the tUniqRow advanced settings. With that setting the data is processed through temporary files, so the job needs less system memory.
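If you change the heap size, it can be useful to confirm which limit the JVM actually picked up. The following is a small standalone Java sketch for that check; the class name is hypothetical and it is not part of any Talend job. It uses only the standard Runtime.maxMemory() call.

```java
// Hypothetical helper; reports the maximum heap the JVM will try to use.
class HeapCheckSketch {

    // Returns the JVM's maximum heap size in mebibytes (MiB).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

For example, running it as `java -Xmx2g HeapCheckSketch` should report a value close to 2048; the 2g figure is only an illustration, not a recommendation for a 3-million-line file.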