One Star

Loading CSV to Hive when file has " enclosures

Hello,


I am new to Talend. I am trying to create a simple job that uses the tHiveCreateTable component to create a Hive table in text file format. The table points to a CSV file. This works fine when the CSV file contains plain text values separated by commas, but I run into issues when the columns are enclosed in double quotes: the quotes are loaded into each column as part of the data. It gets more complicated when a value itself contains a comma, for example "$12,436.69". That value gets split into two columns.

 


I tried setting the SerDe row format in the component, but it gives me this error:

"Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: com.bizo.hive.serde.csv.CSVSerde"

 

Any ideas on how to resolve this? Or any other ideas on how to handle this?

 

Thanks

 

1 REPLY
Seven Stars

Re: Loading CSV to Hive when file has " enclosures

Hi,

 

Make sure the CSV options are checked in your file component properties, as shown below, so that the text enclosure is not treated as data and is not loaded into the target.
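
To illustrate what a text-enclosure setting changes, here is a minimal Python sketch run outside Talend. The sample row is made up for illustration; it is not taken from the original file, and this is not how Talend or Hive implement the option internally. It only shows the difference between splitting on every comma and parsing with a quote character.

```python
import csv
import io

# Hypothetical sample row: the middle field contains a comma
# and is enclosed in double quotes.
line = 'Acme,"$12,436.69",2019\n'

# Plain delimiter split: quotes stay in the data and the amount
# is broken into two fields.
naive = line.strip().split(",")

# Quote-aware parsing: the enclosure is dropped and the comma
# inside it is kept as part of the value.
parsed = next(csv.reader(io.StringIO(line), delimiter=",", quotechar='"'))

print(naive)   # ['Acme', '"$12', '436.69"', '2019']
print(parsed)  # ['Acme', '$12,436.69', '2019']
```

The first result matches the behaviour described in the question (quotes loaded as data, value split in two); the second is what a reader configured with a double quote as text enclosure should produce.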

 

Thanks,
Sid
Please give kudos to the post if it is useful.
Please mark it as resolved if it solves your issue.

 

 

11.JPG
