One Star

In Talend BigDataBatch job Parquet DataType Issues

Hi All
I'm new to Talend. I'm trying to read data from a Parquet file in HDFS, apply transformations to it, and store the result as a separate Parquet file in HDFS.
Error: java.lang.ClassCastException: parquet.example.data.simple.LongValue cannot be cast to parquet.example.data.simple.BinaryValue
at parquet.example.data.simple.SimpleGroup.getString(SimpleGroup.java:121)
at parquet.example.data.GroupValueSource.getString(GroupValueSource.java:32)
at local_project.hadoop_1_0_1.hadoop_1$TalendParquetInputMapper_tFileInputParquet_1.map(hadoop_1.java:450)
at local_project.hadoop_1_0_1.hadoop_1$TalendParquetInputMapper_tFileInputParquet_1.map(hadoop_1.java:1)
at org.talend.hadoop.mapred.lib.ChainMapper.map(ChainMapper.java:63)
at org.talend.hadoop.mapred.lib.DelegatingMapper.map(DelegatingMapper.java:44)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
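The top frame of the trace (SimpleGroup.getString) suggests that a string accessor was called on a field whose stored value is a long. Below is a minimal sketch of that kind of mismatch. The classes are stand-ins written for illustration only; they borrow the names from the trace but are not the real parquet.example.data.simple classes, and the cast inside getString is an assumption about how such an accessor behaves.

```java
// Stand-in types only: they mimic the names in the stack trace and are
// NOT the real parquet.example.data.simple classes.
class CastMismatchSketch {
    static class Value {}

    static class LongValue extends Value {
        final long value;
        LongValue(long value) { this.value = value; }
    }

    static class BinaryValue extends Value {
        final String text;
        BinaryValue(String text) { this.text = text; }
    }

    // Assumed shape of a string accessor: it casts whatever is stored to BinaryValue.
    static String getString(Value stored) {
        return ((BinaryValue) stored).text;
    }

    // Returns true when reading the stored value as a string throws ClassCastException.
    static boolean stringReadFails(Value stored) {
        try {
            getString(stored);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

If this is what happens in the job, declaring the column as String in the input schema would not convert the data; the declared type would have to match the type actually stored in the file.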
My Big Data Batch job uses tFileInputParquet to read from a Hive Parquet table in which some columns are BIGINT. When I retrieve the schema in Talend, BIGINT is changed to BigDecimal. BigDecimal is not supported by Talend, so I changed those columns to String or Long and then cast them to Long in tMap. The reason for the conversion is that the output Parquet table also has BIGINT columns, so loading the data directly throws an error. I therefore cast String to Long in the tMap output schema columns, but when I run the job on the server it fails with the error shown above (java.lang.ClassCastException: parquet.example.data.simple.LongValue cannot be cast to parquet.example.data.simple.BinaryValue). I get this error when loading data from tFileInputParquet to tFileOutputParquet, i.e. Parquet to Parquet.
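For reference, the String-to-Long conversion described above could be written as a small null-safe Java helper. This is only a sketch: the class and method names are hypothetical and not part of Talend, and it helps only when the column really arrives as text.

```java
// Hypothetical helper; the class and method names are illustrative and not part of Talend.
final class BigIntConversionSketch {
    private BigIntConversionSketch() {}

    // Converts text from a String-typed column to Long; null or blank input stays null.
    static Long toLongOrNull(String raw) {
        if (raw == null) {
            return null;
        }
        String trimmed = raw.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        // Throws NumberFormatException if the text is not a valid long.
        return Long.valueOf(trimmed);
    }
}
```

Returning null for empty input avoids a NumberFormatException on missing values, while malformed text still fails loudly.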
Also, when the tFileInputParquet and tFileOutputParquet schemas contain FLOAT, DOUBLE, DATE, or LONG columns, Talend throws an error; only STRING and INT seem to be supported. With FLOAT, DOUBLE, DATE, or LONG the error looks like this: The method getLong(String, int) is undefined for the type Group. I have added some screenshots of the errors below.

Where am I going wrong? Any help will be appreciated.
Thanks in advance.
Here are the screenshots:
2 REPLIES
Moderator

Re: In Talend BigDataBatch job Parquet DataType Issues

Hi,
Thank you for your post! We can't see the screenshot on our side. Could you attach it on the forum, please? That would be great.
Best regards
Sabrina
One Star

Re: In Talend BigDataBatch job Parquet DataType Issues

Hi sabrinaa,
I couldn't attach the screenshots at all. Please tell me how to attach them.
My screenshots are PNG files of about 10 KB to 23 KB, but they would not upload.