How to enable server-side encryption in S3 when using Spark Local?

Question

I want to use Spark Local with Amazon S3, and I need server-side encryption enabled in S3 when the target file is created. With EMR or another cluster that uses tHDFSConfiguration, I can set the fs.s3n.server-side-encryption-algorithm property in the tHDFSConfiguration component, but Spark Local has no tHDFSConfiguration. How can I set this property?

I've tried writing a local file and then uploading it with tS3Put, but this fails with a compile error, apparently because of a JAR collision when a Spark Local Job and a standard Job with tS3* components are coupled together.

Answer

You can set Hadoop configuration properties in the Spark Advanced Properties by adding the spark.hadoop. prefix to the property name. In this case, the property is spark.hadoop.fs.s3n.server-side-encryption-algorithm. Spark should then inject it automatically, with the prefix removed, into any Hadoop configuration that the Spark Local Job creates.
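The prefix mechanism can be illustrated with a minimal Python sketch. It does not call Spark; it only mimics how entries that start with spark.hadoop. are copied into the Hadoop configuration with the prefix stripped. The helper names (to_spark_property, extract_hadoop_conf) are hypothetical, and the value "AES256" is an assumption, since the answer above does not state which value to use.

```python
# Illustrative only: mimics how Spark copies "spark.hadoop.*" entries
# into the Hadoop configuration. No Spark installation is needed to run it.

SPARK_HADOOP_PREFIX = "spark.hadoop."


def to_spark_property(hadoop_key: str) -> str:
    """Return the Spark property name that carries a Hadoop setting (hypothetical helper)."""
    return SPARK_HADOOP_PREFIX + hadoop_key


def extract_hadoop_conf(spark_props: dict) -> dict:
    """Strip the prefix from every spark.hadoop.* entry and ignore all other keys."""
    return {
        key[len(SPARK_HADOOP_PREFIX):]: value
        for key, value in spark_props.items()
        if key.startswith(SPARK_HADOOP_PREFIX)
    }


# Properties as they might be entered under the Spark Advanced Properties.
# "AES256" is an assumed value; the original answer does not specify one.
spark_props = {
    "spark.master": "local[*]",
    to_spark_property("fs.s3n.server-side-encryption-algorithm"): "AES256",
}

# Only the prefixed entry ends up in the simulated Hadoop configuration.
hadoop_conf = extract_hadoop_conf(spark_props)
print(hadoop_conf)
```

In the Job itself, this corresponds to a single key/value row in the Advanced Properties table: the prefixed property name as the key and the chosen algorithm as the value.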

Version history: revision 2 of 2, last updated 11-15-2017 07:05 PM