Running same job multiple times for reading file from same folder

One Star

Running same job multiple times for reading file from same folder

Hi,
I have a job that reads files from a folder, processes them, and loads them into a database:
tFileList --iterate--> tFileInputDelimited --main--> tMap --main--> tMysqlOutputBulkExec
If I run this job from two tRunJob components, each tFileList lists the same files, so both jobs read the same files. I want the first job to read the first file, the second job to read the second file, and so on. How can I achieve this?
Moderator

Re: Running same job multiple times for reading file from same folder

Hi,
If I run this job from two tRunJob components, each tFileList lists the same files, so both jobs read the same files. I want the first job to read the first file, the second job to read the second file, and so on.

Could you please elaborate your scenario with an example with input and expected output values?
Best regards
Sabrina
One Star

Re: Running same job multiple times for reading file from same folder

Hi,
Suppose I have one month of data in a folder named January, around 20,000 files. I have a job that reads a file, transforms some fields, and loads the result into the database. I want the same job to run multiple times, with each run reading different files and loading them into the database. But the job I have created reads all the files, so if I run it multiple times, the same files are loaded multiple times. I don't want that to happen: each run of the job should read different files, so that loading is faster.
One Star

Re: Running same job multiple times for reading file from same folder

Hi,

If the January folder has:
file1
file2
file3
file4
file5
file6
then my job should run, say, twice: the first job should read file1, the second job should read file2, and so on.
The problem is that I am using tFileList in each job, so both jobs list the same files and each file is processed twice. How can I avoid that?
Community Manager

Re: Running same job multiple times for reading file from same folder

Hi banu
tFileList lists all the files in the specified folder and iterates over them one by one every time you run the job. To avoid reading a file again once it has been read and processed, you can write the name of every processed file to a flag file, check the current file against that flag file, and trigger the next action based on the result. For example:
tFileList --iterate--> tFileInputDelimited_1 --main--> tJavaRow --RunIf--> tFileInputDelimited_2 --main--> tMap --main--> tMysqlOutputBulkExec --OnComponentOk--> tFixedFlowInput --main--> tFileOutputDelimited
tFileInputDelimited_1: reads all file names from the flag file.
tJavaRow: compares the current file name with each file name read from the flag file.
// The cast needs its own parentheses; otherwise it is applied to the boolean returned by equals().
if (((String) globalMap.get("tFileList_1_CURRENT_FILE")).equals(input_row.filename)) {
    globalMap.put("hasBeenProcessed", true);
    break;
} else {
    globalMap.put("hasBeenProcessed", false);
}

Set the RunIf condition to:
!((Boolean) globalMap.get("hasBeenProcessed"))
tFixedFlowInput: generates the current file name, which tFileOutputDelimited appends to the flag file.
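Outside Talend, the same flag-file check can be sketched in plain Java. This is only an illustration of the idea: the class name FlagFileCheck, the helper names loadProcessed and markProcessed, and the temporary flag file are all assumptions, not part of the job above. Note also that the check and the append are two separate steps, so this sketch alone does not stop two jobs running at the same moment from picking up the same file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class FlagFileCheck {

    // Reads every file name already recorded in the flag file (one name per line).
    static Set<String> loadProcessed(Path flagFile) throws IOException {
        if (!Files.exists(flagFile)) {
            return new HashSet<>();
        }
        return new HashSet<>(Files.readAllLines(flagFile, StandardCharsets.UTF_8));
    }

    // Appends one file name to the flag file, creating the file if it does not exist.
    static void markProcessed(Path flagFile, String fileName) throws IOException {
        Files.write(flagFile, Collections.singletonList(fileName), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical flag file; the thread uses "/opt/etl/fileflag.txt".
        Path flagFile = Files.createTempFile("fileflag", ".txt");
        for (String name : Arrays.asList("file1", "file2", "file1")) {
            if (loadProcessed(flagFile).contains(name)) {
                System.out.println("skip " + name);
            } else {
                System.out.println("process " + name);
                markProcessed(flagFile, name);
            }
        }
    }
}
```

Running main processes file1 and file2 and skips the second file1, because its name is already in the flag file by then.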
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Running same job multiple times for reading file from same folder

Hi shong,
Thank you for the reply. My doubt is: if two jobs are working on the same folder, how can job1 know that job2 has already processed file2, and vice versa?
Community Manager

Re: Running same job multiple times for reading file from same folder

Hi
The example job in my previous post checks whether the current file has already been processed before processing it.
Shong
One Star

Re: Running same job multiple times for reading file from same folder

Hi shong,

I am trying the solution you mentioned, but I am getting an error. Can you show the settings of the tFileInputDelimited_1 component?
I have set File name/Stream to "/opt/etl/fileflag.txt", and the schema has just one column: filename, String, length 500.

The error is: cannot convert from boolean to String
Four Stars

Re: Running same job multiple times for reading file from same folder

Hi,

I also tried this scenario, but I got this error:

cannot convert string to boolean

I made a change, but the same error still shows in tJavaRow.

How can I resolve it?

 

Six Stars

Re: Running same job multiple times for reading file from same folder

Hi,

If this is about parallel processing to make the load faster, a simple workaround is to divide the files into two parts and rename them with a distinctive string, so that you can tell the files of part one from those of part two.

Then use those strings as the file mask in the two jobs respectively.

That way you get parallel processing.
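One way to decide which file goes into which part is a round-robin split over the sorted file names. The sketch below is plain Java and only an assumption about how the split could be computed: the class FileSplit and the method filesForPart are hypothetical names, and the renaming or file-mask step described above would still have to be applied to the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class FileSplit {

    // Returns the files that job number `part` (0-based) should handle out of `parts` jobs.
    static List<String> filesForPart(List<String> allFiles, int part, int parts) {
        List<String> sorted = new ArrayList<>(allFiles);
        Collections.sort(sorted); // same order in every job, so the split is consistent
        List<String> mine = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            if (i % parts == part) {
                mine.add(sorted.get(i));
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("file1", "file2", "file3", "file4", "file5", "file6");
        System.out.println(filesForPart(files, 0, 2)); // [file1, file3, file5]
        System.out.println(filesForPart(files, 1, 2)); // [file2, file4, file6]
    }
}
```

With the six files from the example earlier in the thread, the first job gets file1, file3, and file5, and the second job gets file2, file4, and file6, so no file is read twice.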