
Multiple empty files are created when loading data into HDFS using Spark

Task:

I have a group of messages in a queue. A consumer reads them, a Spark Streaming job picks the latest record among them, and the result is loaded into HDFS.

 

Issue:

1. I want to save the data to a .csv file, but a numeric suffix is appended to the file name given in the tfileOutput component.

 


  

Example: I wanted to save the data in maindata.csv, but instead a folder named maindata.csv-1522775132000 is created and the data is saved inside that folder.


2. The job creates 14 empty partition files and writes the data only into the 15th partition file.

 

Expected Output:

1. Can I write the data directly into maindata.csv?

2. Can I determine the number of partitions according to the data?
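
To make the expected result concrete, here is a minimal sketch in plain Python of the desired end state: the non-empty partition files from the generated folder combined into a single maindata.csv. It is only an illustration, not Talend or Spark code. It uses the local filesystem as a stand-in for HDFS, and the function name `merge_part_files` and the `part-*` file naming are assumptions, since the actual file names are not shown in the post.

```python
import shutil
from pathlib import Path


def merge_part_files(output_dir, target_file):
    """Concatenate the non-empty part files in output_dir into target_file.

    Assumes part files are named 'part-*' (hypothetical naming) and that
    each one ends with a newline, so rows are not glued together.
    Returns the number of part files that actually contained data.
    """
    # Sort by name so rows keep partition order; skip zero-byte files.
    parts = sorted(
        p for p in Path(output_dir).glob("part-*")
        if p.is_file() and p.stat().st_size > 0
    )
    with open(target_file, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return len(parts)
```

With a folder such as maindata.csv-1522775132000 holding 14 empty part files and one with data, calling `merge_part_files(folder, "maindata.csv")` would return 1 and leave only the real rows in maindata.csv. In hand-written Spark code the usual way to get a single part file is to reduce the partition count before the write (for example with `coalesce(1)`); whether the Talend component exposes such a setting is not assumed here.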

 

Thanks in advance!!