One Star

tRedShiftOutputBulkExec - Bulk records into redshift via S3

Hi All,
I'm currently trying to load all of my Salesforce data into my Redshift cluster. Going from a Salesforce input for an object like campaign members directly to a Redshift output only got me about 2-5 rows/second, which would take way too long.
I tried using the Redshift S3 bulk exec loader, but it asks for additional fields such as the S3 key and key password, as well as the bucket and object key. I created a bucket but had no objects in it, so I made a blank .csv file which I figured could be used as the object.
When I ran without the key information, I got an error message about a missing key AFTER the upload took place. When I put in a valid key and secret key, I get an error message beforehand saying that the keys, salesforce, redshift, etc. cannot be resolved to a variable.
Any advice? It looks like, if it worked, it would run at around 700 rows/s instead of 2-5 rows/s.
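For context, my understanding is that the bulk component just writes the rows to a local CSV, uploads that file to S3, and then runs a Redshift COPY against it, which is why it is so much faster than row-by-row inserts. Here is a minimal sketch of that flow, assuming the AWS SDK for Java v1 and the PostgreSQL JDBC driver; every credential, bucket, key, and table name below is a placeholder, not a value from my job:

import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;
import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class BulkLoadSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder credentials and names -- substitute your own.
        String accessKey = "YOUR_ACCESS_KEY";
        String secretKey = "YOUR_SECRET_KEY";
        String bucket    = "my-staging-bucket";
        String key       = "exports/campaign_members.csv";

        // Step 1: upload the locally generated CSV to S3
        // (the component does this part for you).
        AmazonS3Client s3 = new AmazonS3Client(new BasicAWSCredentials(accessKey, secretKey));
        s3.putObject(bucket, key, new File("/tmp/campaign_members.csv"));

        // Step 2: one COPY statement bulk-loads the whole file into Redshift,
        // instead of inserting row by row.
        String jdbcUrl = "jdbc:postgresql://my-cluster.example.redshift.amazonaws.com:5439/mydb";
        try (Connection conn = DriverManager.getConnection(jdbcUrl, "dbuser", "dbpassword");
             Statement stmt = conn.createStatement()) {
            stmt.execute("COPY campaign_members FROM 's3://" + bucket + "/" + key + "'"
                    + " CREDENTIALS 'aws_access_key_id=" + accessKey
                    + ";aws_secret_access_key=" + secretKey + "'"
                    + " CSV");
        }
    }
}

If that is right, the component creates the S3 object itself during the run, so the bucket should be able to start out empty.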
3 REPLIES
Moderator

Re: tRedShiftOutputBulkExec - Bulk records into redshift via S3

Hi,
Which official version did you get that on? What does your job design look like? Could you please upload screenshots of the whole job design to the forum? That will be helpful for us to locate your issue.


Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: tRedShiftOutputBulkExec - Bulk records into redshift via S3

Hi xdshi,
I have now tried using "tRedshiftOutputBulkExec", which runs much faster (~200 rows/s).
The job looks like this: 
Salesforce Input -> tMap -> tRedshiftOutputBulkExec
Unfortunately, when I do the bulk update (which does run fast), I get the following error:
Exception in component tRedshiftOutputBulkExec_1_tROB
com.amazonaws.services.s3.model.AmazonS3Exception: AWS authentication requires a valid Date or x-amz-date header (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: D15A0006E7D7848C), S3 Extended Request ID: hHQpY80ire08FL1gbcLhxLFgWFi29xuw9foE9HRxYROmmFTLPvvZYck54NOQxlswCGcKzHGp7uw=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1077)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:725)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3710)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1445)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1305)
at local_project.leads_0_1.leads.tSalesforceInput_1Process(leads.java:4212)
at local_project.leads_0_1.leads.runJobInTOS(leads.java:4736)
at local_project.leads_0_1.leads.main(leads.java:4581)
disconnected
Job leads ended at 16:37 29/10/2015.


The error asks for a date/time, but there is no field in the component where I can add one.
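In case it helps to isolate this, here is a minimal sketch (again AWS SDK for Java v1, with a placeholder bucket and credentials) that performs the same kind of signed PUT outside of Talend. The Date header is added by the SDK itself while signing the request, not by anything configured in the component, so if S3 still rejects it here the problem is in the runtime environment rather than the job settings:

import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.AmazonS3Exception;

public class S3AuthCheck {
    public static void main(String[] args) {
        // Placeholder values -- substitute your own credentials and bucket.
        AmazonS3Client s3 = new AmazonS3Client(
                new BasicAWSCredentials("YOUR_ACCESS_KEY", "YOUR_SECRET_KEY"));
        try {
            // A trivial write issues the same signed PUT the component uses.
            s3.putObject("my-staging-bucket", "talend-auth-check.txt", "hello");
            System.out.println("PUT succeeded; credentials and request signing are fine.");
        } catch (AmazonS3Exception e) {
            // The SDK generates the Date header automatically, so a rejection
            // here means the header it produced was missing or malformed --
            // an environment or library problem, not a component setting.
            System.err.println("S3 rejected the request: " + e.getErrorCode());
        }
    }
}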
One Star

Re: tRedShiftOutputBulkExec - Bulk records into redshift via S3

I got the same error message:
objc: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home/bin/java and /Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home/jre/lib/libinstrument.dylib. One of the two will be used. Which one is undefined.
Exception in component tRedshiftOutputBulkExec_1_tROB
com.amazonaws.services.s3.model.AmazonS3Exception: AWS authentication requires a valid Date or x-amz-date header (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 7EFDC5F788DFF980), S3 Extended Request ID: D2U16vK/bEfm9/e0WOMIJPB9gSxWnQQrDvCYiuTStjVzE6OMqpP634oJl9h4fQ5o80hVGl0rojc=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1077)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:725)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3710)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1445)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1305)
at local_project.om_report_bulk_job_0_1.om_report_bulk_job.tPostgresqlInput_1Process(om_report_bulk_job.java:2314)
at local_project.om_report_bulk_job_0_1.om_report_bulk_job.runJobInTOS(om_report_bulk_job.java:2776)
at local_project.om_report_bulk_job_0_1.om_report_bulk_job.main(om_report_bulk_job.java:2633)

Any ideas how I can fix the issue? The load performance is OK, though: 10721 rows/second.
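One thing that may be worth checking: the JavaLaunchHelper line above shows this is running on JDK 1.8.0_60, and there is a documented incompatibility between that JDK update and Joda-Time versions earlier than 2.8.1 that makes the AWS SDK produce a malformed Date header during request signing, with exactly this "valid Date or x-amz-date header" rejection as the symptom. A quick sketch to see which joda-time jar the job actually loads (the class name is only illustrative):

public class JodaVersionCheck {
    public static void main(String[] args) {
        // Print the location of the joda-time jar on the classpath; versions
        // before 2.8.1 are the ones reported to clash with JDK 1.8.0_60.
        Class<?> c = org.joda.time.DateTime.class;
        System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
    }
}

If the jar is older than 2.8.1, swapping in a newer joda-time (or a newer AWS SDK) might be the fix, but that is an assumption based on the JDK version shown, not something I have verified on this job.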