Breaking up Hive query extract into multiple files?
I am using a Hive query as my source, and the result is written to a single file on HDFS (7.7 GB). My aim is to move this file to S3, but S3 has a 5 GB file size limit. Is there a way to break the file up into multiple chunks? My job is:
tHDFSConnection --> tHiveConnection --> tHiveInput --> tMap --> tHDFSOutput
Re: Breaking up Hive query extract into multiple files?
Hello, Is the output of tHDFSOutput a single 7.7 GB file? Are you executing the job on a Hadoop cluster? You can take a look at the tELTHive components (tELTHiveInput, tELTHiveMap, tELTHiveOutput): the output will be written to a Hive table, but the whole job will be executed on the cluster. If you're using a cluster with multiple machines, this would generate separate partition files that you can then move to S3.
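If the extract does end up as one large file anyway, the chunking itself can be sketched outside Talend. The snippet below is only a minimal illustration, assuming the extract is a line-oriented text file that has been copied somewhere a Python script can read it. The function name split_file_by_lines and the paths in the usage note are hypothetical, not part of Talend or Hive.

```python
import os


def split_file_by_lines(src_path, out_dir, max_bytes):
    """Split src_path into part files of at most max_bytes each.

    Cuts happen only at line ends, so no row is split across two parts.
    A single line longer than max_bytes still gets a part of its own,
    which will then exceed the limit.
    """
    os.makedirs(out_dir, exist_ok=True)
    part_paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Open a new part if none exists yet, or if adding this line
            # would push the current (non-empty) part over the limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                part_path = os.path.join(out_dir, "part-%05d" % len(part_paths))
                part_paths.append(part_path)
                out = open(part_path, "wb")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return part_paths
```

For a 5 GB limit, a call such as split_file_by_lines("extract.txt", "parts", 4 * 1024**3) (hypothetical paths) would leave some headroom under the limit. Each resulting part file can then be uploaded to S3 separately.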