Why is tHiveCreateTable not a Big Data batch job component?
I find this component in Standard job design.
What can I do to see it in my Palette for a big data batch job?
We have the tHiveRow and tHiveCreateTable components in a Standard Job. Do you want support for Hive partitioning and bucketing in a Big Data Batch Spark job?
Starting from v6.4.1, we already support writing partitioned tables.
@xdshi thanks for your reply. I cannot create a Hive table in a Big Data Batch job in Talend Big Data Platform 7.x.
Yes, I would like to create a partitioned table, but right now, I cannot create ANY table in Big Data.
I am concerned with Big Data only.
Since Hive is a Big Data framework, my question is: why is it available in Standard Jobs but not in Big Data jobs?
Can somebody tell me what I can do to make this available in Big Data?
The bucketing feature is not yet supported by Spark 2.1 itself.
The Apache Spark jira (https://issues.apache.org/jira/browse/SPARK-19256) is in progress with some pull requests opened but not yet targeted for 2.2.
On the Talend side, we won't be able to work on this until it's integrated into the Spark core.
If you use Parquet components (a format I strongly advise using behind Hive), you can already partition with Spark and easily combine that with the Hive DI tHiveCreateTable (declaring it as an external table).
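To make the suggested workaround concrete, here is a hedged sketch of the Hive side: after a Spark batch job has written partitioned Parquet data, declare an external Hive table over that location. The table name, column names, and HDFS path below are illustrative assumptions, not values from this thread.

```sql
-- Assumes a Spark job has already written partitioned Parquet, e.g. with
-- df.write.partitionBy("sale_date").parquet("/data/warehouse/sales").
-- All identifiers and the path are hypothetical examples.
CREATE EXTERNAL TABLE IF NOT EXISTS sales (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (sale_date STRING)
STORED AS PARQUET
LOCATION '/data/warehouse/sales';

-- Register the partitions Spark wrote so Hive's metastore can see them:
MSCK REPAIR TABLE sales;
```

DDL like this can be run from a tHiveCreateTable or tHiveRow component in a Standard Job, while the Spark Big Data Batch job only handles writing the Parquet files.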
@xdshi could you please tell me how to use tHiveCreateTable? I can think about advanced features later.
I am looking for a tHiveCreateTable in Talend Big Data Platform, Big Data Batch job.
Are you able to use the tHiveCreateTable component to create the Hive external table in a DI Standard Job? You can then call a child Spark job with the tRunJob component.