AWS S3 file properties

Five Stars

Hello,

 

I would like to know if there is a way to get the properties (such as size and timestamp) of a file placed in an S3 bucket.

I have tried the following job but was unable to get the file size and timestamps. Also, does this differ if the file in the S3 bucket is a .gz file?

tS3Connection

tS3List --iterate--> tFileProperties --iterate--> tIterateToFlow --main--> tLogRow

Thanks,

A

Moderator

Re: AWS S3 file properties

Hello,

So far, you have to retrieve your files from S3 and store them locally first.

The workflow should be: tS3Connection-->tS3Get-->tFileProperties

Best regards

Sabrina

 
Five Stars

Re: AWS S3 file properties

Thanks, Sabrina, for your reply. If that's the only way, then I'll take it.

 

Six Stars

Re: AWS S3 file properties

We have a client restriction against downloading the file locally. Could you please let us know if there is a way to do this other than downloading? Alternatively, is there a way to download the file, read its properties, and delete it within a single component, so that the flow cannot be interrupted partway through? Ultimately, we should not have any way of seeing the file content.

Seven Stars

Re: AWS S3 file properties

Hi Sabrina,

tS3Connection-->tS3Get-->tFileProperties

With this approach, once the file has been downloaded (using tS3Get), it carries the current timestamp. We do not get the actual timestamp that it has in the S3 bucket.

How can we get the timestamp of a file without downloading it from the server?

Any inputs?

Thanks,
Sid
Seven Stars

Re: AWS S3 file properties

Hi Folks,

 

Any suggestions here?

 

Thanks,

Sid

Seven Stars

Re: AWS S3 file properties

Hi Guys,

 

I found a workaround for this issue: using the S3 command line.

Please find attached images of the job, the component settings, and the console output.
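
The attached images are not reproduced here. As a rough sketch of how a command-line approach could feed file properties back into a job, and not a record of the attached job itself, the example below parses one object line of `aws s3 ls` output into a timestamp, a size in bytes, and a key. The class name `S3LsLine` and its fields are hypothetical, and the sample line in the usage note is made up. The listing output carries no time zone, so the timestamp is kept as a `LocalDateTime`.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical holder for one object line of `aws s3 ls` output. */
final class S3LsLine {
    // The listing prints "yyyy-MM-dd HH:mm:ss" followed by the size and the key.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    final LocalDateTime lastModified;
    final long sizeBytes;
    final String key;

    private S3LsLine(LocalDateTime lastModified, long sizeBytes, String key) {
        this.lastModified = lastModified;
        this.sizeBytes = sizeBytes;
        this.key = key;
    }

    /** Parses a line such as "2019-03-01 08:15:30    1048576 data/file.gz". */
    static S3LsLine parse(String line) {
        // Split into date, time, size, and the remainder; the limit of 4
        // keeps any spaces inside the key intact.
        String[] parts = line.trim().split("\\s+", 4);
        if (parts.length < 4) {
            // Prefix lines ("PRE folder/") and blank lines end up here.
            throw new IllegalArgumentException("Not an object line: " + line);
        }
        LocalDateTime stamp = LocalDateTime.parse(parts[0] + " " + parts[1], STAMP);
        long size = Long.parseLong(parts[2]);
        return new S3LsLine(stamp, size, parts[3]);
    }
}
```

For example, `S3LsLine.parse("2019-03-01 08:15:30    1048576 data/file.gz")` yields a size of 1048576 bytes and the key `data/file.gz`. Nothing is downloaded, since the listing only reads object metadata.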

 

Thanks,

Sid

 

 

Employee

Re: AWS S3 file properties

Hi Sid

I tried to follow your approach, but I get an error. See this thread: https://community.talend.com/t5/Design-and-Development/unable-to-execute-command-using-tSystem-compo...

 

 

Four Stars

Re: AWS S3 file properties

Hi,

 

I have the same constraint: I cannot download the file locally.

We could use the AWS CLI, but that means moving away from the components. As a suggestion, could something be added to tS3Get, or a separate component be provided, to read file properties on S3?

Eight Stars

Re: AWS S3 file properties

Bumping this thread - has anything changed in the last 2 years? I can't see anything in the v7.0 docs.

 

I want to check to see if the files all have timestamps that are close together, i.e. there are no files left over from a previous day. There are hundreds of files and they are all many gigabytes so I can't download them!
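
Once the last-modified timestamps are available by some means, the "close together" check itself is small. A minimal sketch, assuming the timestamps have already been collected as `java.time.Instant` values; the class name `TimestampSpread` and the tolerance parameter are hypothetical:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collection;
import java.util.Collections;

/** Hypothetical helper: checks that a set of timestamps falls inside one window. */
final class TimestampSpread {
    /** Returns true when the oldest and newest timestamps are at most maxSpread apart. */
    static boolean withinSpread(Collection<Instant> lastModified, Duration maxSpread) {
        if (lastModified.isEmpty()) {
            return true; // nothing to compare
        }
        Instant oldest = Collections.min(lastModified);
        Instant newest = Collections.max(lastModified);
        return Duration.between(oldest, newest).compareTo(maxSpread) <= 0;
    }
}
```

A file left over from a previous day widens the gap between oldest and newest beyond the chosen tolerance, so the check returns false without any file content being read.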

Eight Stars

Re: AWS S3 file properties

Additionally, all of the suggestions to "download the file and check the timestamp" are of no use, because the tS3Get component stamps the local file with the time the download was done. It does not replicate the file's S3 timestamp onto the local file.

Eight Stars

Re: AWS S3 file properties

SOLVED!

 

If your tS3List component is called, for example, tS3List_1, simply add the following entry to a Date column in the subsequent tIterateToFlow:

 

objectSummary_tS3List_1.getLastModified()

 

Note that this relies on undocumented internal variables. I have tested it in Talend v6.3 and v7.1. Another drawback is that if you copy and paste the pair of components, the globalMap.get("tS3List_1_CURRENT_KEY") call is updated with the new component name, but the internal variable objectSummary_tS3List_1 is not, so it has to be renamed by hand.
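
With the value landing in a Date column (a `java.util.Date`), a later step in the job can test whether each file belongs to the expected day. The sketch below is one way to do that, on the assumption that the comparison should be made in an explicitly chosen time zone; the class name `LastModifiedDay` and the method `isFromDay` are hypothetical and not part of Talend.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Date;

/** Hypothetical helper for checking which calendar day a last-modified Date falls on. */
final class LastModifiedDay {
    /** Returns true when lastModified falls on expectedDay, as seen in the given zone. */
    static boolean isFromDay(Date lastModified, LocalDate expectedDay, ZoneId zone) {
        // A Date is an instant in time; the calendar day depends on the zone it is viewed in.
        LocalDate actualDay = lastModified.toInstant().atZone(zone).toLocalDate();
        return actualDay.equals(expectedDay);
    }
}
```

Passing the zone explicitly avoids depending on the default time zone of the machine running the job, which matters for files modified close to midnight.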
