Hi all,
I am reading zip files from a folder and unzipping them with tFileUnarchive. After unzipping, I read only the XML files using a second tFileList placed after tFileUnarchive. I am iterating over 2 files in parallel, but when I do so the whole process runs as many times as there are files.
My workflow looks like this:
tFileList --> tFileUnarchive --> tFileList --> tFileInputXML --> tLogRow
I have 2 zip archives: one contains 2 files (1 XML and 1 JPEG) and the other contains 1 XML file. I only want to process the XML files after unzipping. When I set the number of parallel executions on the iterate link to 2, the workflow runs twice for the 3 files.
The output I am getting from this workflow is:
Please find attached a screenshot of the workflow.
A parallelization-enabled Iterate connection allows the component that receives threads from the connection to read those threads in parallel.
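To make that statement concrete, here is a rough plain-Java sketch of what iterating with 2 parallel executions amounts to conceptually. It is not the code Talend generates; the class name, `processFile`, and `iterateInParallel` are hypothetical names made up for this illustration, and the file names are placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelIterateSketch {

    // Hypothetical stand-in for whatever sits downstream of the Iterate link
    // (for example tFileInputXML --> tLogRow). It only records which thread ran it.
    static String processFile(String fileName) {
        return Thread.currentThread().getName() + " processed " + fileName;
    }

    // Runs one downstream execution per file, with at most
    // `parallelExecutions` of them running at the same time.
    static List<String> iterateInParallel(List<String> fileNames, int parallelExecutions)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelExecutions);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String name : fileNames) {
                pending.add(pool.submit(() -> processFile(name)));
            }
            // Collect results in submission order, waiting for each task to finish.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder file names, assumed for the example only.
        List<String> xmlFiles = List.of("first.xml", "second.xml");
        for (String line : iterateInParallel(xmlFiles, 2)) {
            System.out.println(line);
        }
    }
}
```

In this sketch each file is handed to the downstream step exactly once; parallelism only changes how many of those steps run at the same time, not how many times each file is processed.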