I have built a process flow to extract data from a CSV file. All fields are brought in as strings, then run through a tConvertType where certain fields are converted to Integers. Records that fail the conversion are filtered through a tMap and inserted into the Reject table. If there are any rejected records, an email is sent.
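To make the convert-or-reject step concrete, here is a minimal plain-Java sketch of the same logic outside Talend. It is not the code tConvertType generates; the class name `ConvertOrReject`, the `split` method, and the use of `String[]` rows are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits rows into converted rows and rejects.
final class ConvertOrReject {
    final List<Object[]> valid = new ArrayList<>();
    final List<String[]> rejected = new ArrayList<>();

    // intColumns lists the column indexes that must parse as integers.
    // Assumes every row has at least as many columns as the largest index.
    static ConvertOrReject split(List<String[]> rows, int... intColumns) {
        ConvertOrReject out = new ConvertOrReject();
        for (String[] row : rows) {
            // Copy into an Object[] so Integer values can replace strings.
            Object[] converted = new Object[row.length];
            System.arraycopy(row, 0, converted, 0, row.length);
            boolean ok = true;
            for (int col : intColumns) {
                try {
                    // Integer.valueOf throws NumberFormatException for
                    // null, empty, or non-numeric input.
                    converted[col] = Integer.valueOf(row[col]);
                } catch (NumberFormatException e) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                out.valid.add(converted);
            } else {
                out.rejected.add(row); // keep the original strings for the Reject table
            }
        }
        return out;
    }
}
```

A caller would check `rejected.isEmpty()` to decide whether to send the notification email, which mirrors the "email only if there are rejects" condition in the job.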
Duplicates are then caught using tUniqRow, and any duplicate records are filtered through a tMap and inserted into the Reject table. If there are any duplicate records, an email is sent.
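The duplicate check can be pictured in plain Java as below. This is only a sketch of the idea, assuming a single key column and keeping the first occurrence of each key; `DedupeSketch`, `split`, and `keyColumn` are hypothetical names, not Talend APIs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: first occurrence of a key is unique, later ones are duplicates.
final class DedupeSketch {
    final List<String[]> uniques = new ArrayList<>();
    final List<String[]> duplicates = new ArrayList<>();

    static DedupeSketch split(List<String[]> rows, int keyColumn) {
        DedupeSketch out = new DedupeSketch();
        Set<String> seen = new HashSet<>();
        for (String[] row : rows) {
            // Set.add returns false when the key was already present.
            if (seen.add(row[keyColumn])) {
                out.uniques.add(row);
            } else {
                out.duplicates.add(row);
            }
        }
        return out;
    }
}
```

Because the rows already arrive on two separate lists, nothing further is needed to filter them before writing the duplicates to the Reject table.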
All valid records are then inserted into the output table.
Although this process works, I am sure it is not the most efficient way to process this data. Can anyone suggest how to clean this up, and perhaps remove some unnecessary components or steps?
tMap should not be the default choice for tasks that can be achieved in other ways: it is a complex component that carries a lot of options, and that comes at a cost in performance.
You can use tJavaFlex where tMap is used if the only requirement is to change the data flow for the DB output component.
You can avoid using tMap here, since you are not doing any filtering or expression checks; instead, you can use a tJavaRow component to process the data.