how to check the flag file from S3 folder

Six Stars

how to check the flag file from S3 folder

Hi Everyone,

 

I have a scenario in which I am running a job from Talend that pulls all the files from an AWS S3 folder. Before doing so, I need to check whether the flag file is present; only then can I proceed.

Suppose I have files like this:

 

File_trigger.Flag

abc.csv

bcd.csv

efg.csv

 

Only if File_trigger.Flag is present on AWS S3 should my job start pulling the files abc.csv, bcd.csv, etc.

 

I have already built a job that pulls all the files from S3, but this is a new scenario. Could you please let me know how to proceed?

 

Regards,

Mohit

Fourteen Stars

Re: how to check the flag file from S3 folder

You could install the AWS CLI command-line client and set it up.

 

Then use tSystem with the command

 aws s3 ls s3://bucket/File_trigger.Flag

 

tSystem will return exit code 0 if the file exists and 1 if it does not. Use that value in a RunIf trigger; to continue only when the flag file exists, the condition is:

((Integer)globalMap.get("tSystem_1_EXIT_VALUE")) == 0
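For readers who want to see the idea outside of Talend, here is a minimal Java sketch of what the tSystem plus RunIf combination amounts to: run the command above, read the process exit code, and proceed only on 0. The class and method names (`FlagFileCheck`, `listCommand`, `exitCodeOf`, `flagPresent`) are made up for illustration, and `bucket` is a placeholder as in the command above. It assumes the `aws` executable is on the PATH and already configured.

```java
import java.io.IOException;
import java.util.List;

public class FlagFileCheck {

    // Builds the same command as above; bucket and key are placeholders.
    static List<String> listCommand(String bucket, String key) {
        return List.of("aws", "s3", "ls", "s3://" + bucket + "/" + key);
    }

    // Runs a command and returns its exit code, which is roughly
    // what tSystem exposes as tSystem_1_EXIT_VALUE.
    static int exitCodeOf(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        return process.waitFor();
    }

    // Exit code 0 means the listing found something; anything else means it did not.
    static boolean flagPresent(int exitCode) {
        return exitCode == 0;
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            int exit = exitCodeOf(listCommand("bucket", "File_trigger.Flag"));
            System.out.println(flagPresent(exit) ? "flag found, pull the files" : "flag missing, stop");
        } catch (IOException e) {
            // Thrown when the aws executable cannot be started at all.
            System.out.println("could not run the aws CLI: " + e.getMessage());
        }
    }
}
```

Note that `aws s3 ls` matches by prefix, so a key such as File_trigger.Flag.bak would also make the command return 0.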
-----------
Six Stars

Re: how to check the flag file from S3 folder

Hi Vapukov,

 

AWS CLI command line is installed already.

I still need to connect to AWS before using the tSystem command. I know the connection details and so on.

But how will tSystem get the AWS connection details and run the above command from Talend?

 

regards,

Mohit

Fourteen Stars

Re: how to check the flag file from S3 folder

If you have installed aws-cli, you know that it is self-contained: it does not need anything from Talend to work, but you must configure it separately (for example with aws configure) so that it has its own credentials.

 

So in your job:

preJob - connection

Job:

- tSystem

- then, after the RunIf trigger, everything else

-----------
Six Stars

Re: how to check the flag file from S3 folder

Thanks, I will try it :-)

Employee

Re: how to check the flag file from S3 folder

Hi,

 

Below is another solution for your query. I have set up two sample files in S3 for testing.

image.png

Below is the output of the flow, which returns only the flag file.

image.png

 

 

Below are the details of each component. I have added tLogRow components for printing; you can remove them in the actual job.

image.png

 

image.png

 

image.png

 

 

Both approaches give the same result; one uses the command line and the other uses the GUI components.

 

If the reply has helped you, could you please mark the topic as resolved? Kudos are also welcome :-)

 

Warm Regards,

 

Nikhil Thampi

 

Fourteen Stars

Re: how to check the flag file from S3 folder

I ruled out this variant because, as it stands, it requires reading all the files twice (as in the picture), and with 1000 files that could take a long time.

 

 

However, using the full flag file name as the Key prefix in tS3List could reduce the number of iterations to 1.

Unfortunately, I do not have S3 access right now and cannot test it, but it could be the best choice if it works.
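To illustrate why the Key prefix helps, here is a small Java sketch that mimics a prefix-filtered listing on an in-memory list of keys. It does not call S3 or tS3List; the class and method names (`KeyPrefixSketch`, `listWithPrefix`, `flagPresent`) are hypothetical, and the keys are the sample file names from the question.

```java
import java.util.List;
import java.util.stream.Collectors;

public class KeyPrefixSketch {

    // Mimics a prefix-filtered listing: only keys that start with the prefix come back.
    static List<String> listWithPrefix(List<String> allKeys, String prefix) {
        return allKeys.stream()
                .filter(key -> key.startsWith(prefix))
                .collect(Collectors.toList());
    }

    // A prefix can match more than one key, so confirm an exact match as well.
    static boolean flagPresent(List<String> allKeys, String flagKey) {
        return listWithPrefix(allKeys, flagKey).contains(flagKey);
    }

    public static void main(String[] args) {
        List<String> keys = List.of("File_trigger.Flag", "abc.csv", "bcd.csv", "efg.csv");
        System.out.println(listWithPrefix(keys, "").size());                  // 4 (no prefix: every key is iterated)
        System.out.println(listWithPrefix(keys, "File_trigger.Flag").size()); // 1 (full name as prefix)
        System.out.println(flagPresent(keys, "File_trigger.Flag"));           // true
    }
}
```

With an empty prefix every key is iterated, while the full flag file name as prefix leaves a single candidate, so the cost no longer grows with the number of data files.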

-----------
Employee

Re: how to check the flag file from S3 folder

@vapukov - Perfect idea !

 

I added the flag file name in the Key prefix and it reduced the number of iterations to 1.

 

image.png

 

 

The modified job flow is as below.

 

image.png (Job Output)

 

 

Warm Regards,

 

Nikhil Thampi
