Issue with tS3List component

Not applicable

I'm currently using Talend Open Studio Data Integration 5.5.1. I've created a job that first opens an S3 connection with tS3Connection and then uses tS3List to list all of the objects in a specific bucket. The tS3List component returns exactly 43,000 objects, but based on the list returned, I believe the results are being cut off. The objects come back in alphabetical order, and the list ends at objects starting with W. Although I can't get an exact count, I believe this bucket holds between 43,000 and 44,000 objects. I know that Amazon's S3 API returns at most 1,000 objects per request and that the tS3List component is able to page through each set of results. However, I think it's not returning the last page of objects, which contains fewer than 1,000 objects.
Is this a known bug, or is there a configuration setting that I'm missing to deal with this?
Thanks!
Moderator

Re: Issue with tS3List component

Hi,
Could you please upload screenshots of your job settings to the forum? That would help us address your issue.
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Not applicable

Re: Issue with tS3List component

Sure, here are the screenshots:
Moderator

Re: Issue with tS3List component

Hi,
Why did you use the tRowGenerator component in your job design?
Have you checked the related scenario in the component reference? https://help.talend.com/search/all?query=tS3List&content-lang=en
Best regards
Sabrina
Not applicable

Re: Issue with tS3List component

This is just a test job to see how the S3-related components work. I had previously been using some components from Talend Exchange, so I wasn't too concerned about using tRowGenerator vs. tIterateToFlow at this time.
However, here's a screenshot of the same job using tIterateToFlow. I have the same issue: the tS3List component returns exactly 43,000 objects. I can also say that we've added files to this bucket in the last few days, yet tS3List has consistently returned exactly 43,000 rows. So, that goes back to my original question: is the tS3List component failing to return the last page of results when that page contains fewer than 1,000 objects?
Not applicable

Re: Issue with tS3List component

So, does anyone have any updates or thoughts? Is this a real bug in the component that requires a JIRA ticket to resolve?
Moderator

Re: Issue with tS3List component

Hi,
Could you please open an issue in the DI project on the Talend Bug Tracker? Our developers will check whether it is a bug.
Best regards
Sabrina
Not applicable

Re: Issue with tS3List component

I've created TDI-30918 in JIRA to track this issue.
One Star

Re: Issue with tS3List component

Has anyone resolved this? It appears that Amazon S3 limits a "GET" operation on a bucket to 1,000 objects per request. It's unclear (at least in the link below) whether the "max-keys" parameter can be set to a value greater than 1,000. Another option would be to set a "marker", which would start the next request at the 1,001st object in alphabetical order.
Amazon List Size Limit
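
For anyone who wants to see the marker approach spelled out, here is a minimal sketch of the paging loop in plain Java. It does not call AWS and is not the tS3List source: the bucket is simulated with a sorted in-memory set, and `S3PagingSketch`, `Page`, `listPage` and `listAll` are hypothetical names made up for illustration. The assumption is that each request returns up to `maxKeys` keys sorting after the marker, plus a flag saying whether more keys remain, and that the caller keeps requesting until that flag is false.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class S3PagingSketch {

    /** One page of a simulated listing: the keys plus a "more results remain" flag. */
    static final class Page {
        final List<String> keys;
        final boolean truncated;

        Page(List<String> keys, boolean truncated) {
            this.keys = keys;
            this.truncated = truncated;
        }
    }

    /**
     * Hypothetical stand-in for a single listing request. Returns up to maxKeys keys
     * that sort strictly after marker (a null marker means "start at the beginning").
     */
    static Page listPage(TreeSet<String> bucket, String marker, int maxKeys) {
        NavigableSet<String> candidates =
                (marker == null) ? bucket : bucket.tailSet(marker, false);
        List<String> keys = new ArrayList<>();
        boolean truncated = false;
        for (String key : candidates) {
            if (keys.size() == maxKeys) {
                truncated = true; // at least one more key exists beyond this page
                break;
            }
            keys.add(key);
        }
        return new Page(keys, truncated);
    }

    /** Pages through the whole simulated bucket, including a short final page. */
    static List<String> listAll(TreeSet<String> bucket, int maxKeys) {
        List<String> all = new ArrayList<>();
        String marker = null;
        Page page;
        do {
            page = listPage(bucket, marker, maxKeys);
            all.addAll(page.keys); // collect before checking the flag, so the last page is kept
            if (!page.keys.isEmpty()) {
                marker = page.keys.get(page.keys.size() - 1); // resume after the last key seen
            }
        } while (page.truncated);
        return all;
    }

    public static void main(String[] args) {
        // Simulated bucket size chosen to leave a partial last page (43 full pages + 250 keys).
        TreeSet<String> bucket = new TreeSet<>();
        for (int i = 0; i < 43250; i++) {
            bucket.add(String.format("object-%05d", i));
        }
        System.out.println(listAll(bucket, 1000).size());
    }
}
```

With the simulated 43,250 keys, `listAll` makes 44 requests and returns every key, because each page is added to the result before the "truncated" flag is checked. A loop that stopped as soon as a page was not truncated, without collecting that page first, would return exactly 43,000, which matches the symptom described above. Whether that is what tS3List actually does is only a guess.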