I have a group of messages in a queue. A Spark Streaming job consumes them, picks the latest record among them, and loads it into HDFS.
I am facing two issues:
1. I want to save the data to a .csv file, but a numeric suffix is appended to the file name given in the tFileOutput component.
Example: I wanted to save the data in maindata.csv, but it creates a folder named maindata.csv-1522775132000 and saves the data in that folder.
2. It creates 14 empty partition files and writes the data into the 15th partition file.
My questions:
1. Can I write the data directly into maindata.csv?
2. Can I determine the number of partitions according to the data?
Thanks in advance!!
One option for issue 1 is to check the 'Merge result to single file' option in the tFileOutputDelimited component properties, then set the 'Merge File Path' property to the file path for maindata.csv.
This creates a file with the name of your choice, in the path you define, with the data from all the part- files merged into it. Optionally, you can remove the source directory and/or override the target file.
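To illustrate what such a merge amounts to, here is a minimal sketch in Python. It is not Talend's implementation and it works on a local directory rather than HDFS. The function name `merge_part_files` and its parameters are hypothetical, chosen only to mirror the options described above.

```python
import shutil
from pathlib import Path


def merge_part_files(source_dir, target_file, remove_source=False, overwrite=True):
    """Concatenate every part-* file in source_dir into target_file.

    Hypothetical helper for illustration only; it assumes a local
    directory, not an HDFS path.
    """
    source = Path(source_dir)
    target = Path(target_file)
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists")
    # Sort by name so part-00000, part-00001, ... keep their order.
    parts = sorted(p for p in source.glob("part-*") if p.is_file())
    with target.open("wb") as out:
        for part in parts:
            # Empty part files simply contribute nothing to the output.
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)
    if remove_source:
        # Mirrors the optional "remove the source directory" step.
        shutil.rmtree(source)
    return target
```

Calling `merge_part_files("maindata.csv-1522775132000", "maindata.csv", remove_source=True)` would then leave a single maindata.csv next to where the folder used to be. Markers such as `_SUCCESS` are skipped because only names starting with `part-` are matched.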
Hope this helps.