I'm trying to connect to an existing Hive database. I am able to do that using both the tHive and tJDBC components. My question is, what are the limitations of using tJDBC instead of tHive? Would there be any performance differences when dealing with huge data?
The tJDBC components are generic: they let you connect to any database through its specific JDBC connector.
tHiveInput is the component dedicated to the Hive database (the Hive data warehouse system). It executes a given HiveQL query to extract data from Hive.
Are you trying to create a spark job?
Yes, I'm trying to build a big data batch job running on Spark. The tHive components work fine in standard jobs, but they don't work in big data jobs, so as an alternative I'm using the tJDBC components. I'd like to know whether connecting to Hive through tJDBC instead of the tHive components would cause any issues in big data jobs, particularly in terms of performance.
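For context, a tJDBC connection to Hive normally goes through the HiveServer2 JDBC driver, so whatever Talend generates boils down to standard `java.sql` calls. Here is a minimal standalone Java sketch of that path; the driver class and URL format are the standard Apache Hive ones, while the host, port, database, user, table, and the `hiveUrl`/`queryHive` helper names are placeholders I chose for illustration:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcSketch {

    // Standard HiveServer2 JDBC URL format: jdbc:hive2://<host>:<port>/<database>
    static String hiveUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // What a tJDBC-style read from Hive boils down to. Requires the
    // hive-jdbc jar on the classpath and a reachable HiveServer2,
    // so it is shown here but not executed.
    static void queryHive(String url) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver"); // driver shipped with Apache Hive
        try (Connection conn = DriverManager.getConnection(url, "user", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM my_table LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString(1)); // print first column of each row
            }
        }
    }

    public static void main(String[] args) {
        // Placeholder connection details; adjust to your cluster.
        System.out.println(hiveUrl("hive-host.example.com", 10000, "default"));
    }
}
```

The practical difference in a Spark batch job is that this JDBC path pulls rows through a single HiveServer2 connection, whereas the dedicated Hive components can let Spark read the underlying warehouse data in parallel, which generally matters for large volumes.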