First of all, I haven't seen a main topic for Big Data discussion like in your old forum; only the BD sandbox is currently available, so I decided to ask here.
Can any of you suggest which approach we should choose for data ingestion from an RDBMS to HDFS/Hive?
I've been thinking of these two ways; please let me know which one is better (or whether there is a better alternative):
1. In a Standard Job: tSqoopImport --OnComponentOk--> tHiveLoad
2. In a Big Data Batch Job (Spark): tXXXInput (RDBMS, e.g. Oracle/MSSQL) --main--> tFileOutputDelimited (to write to HDFS), then load into Hive from HDFS
Or does anyone have a better solution?
You can import data from an RDBMS into Hadoop using Sqoop without tHiveLoad; tSqoopImport can load directly into Hive.
Please take a look at the related scenario in the component reference: TalendHelpCenter:tSqoopImport
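For reference, here is a minimal sketch of the kind of Sqoop command such a job issues, using the `--hive-import` option so no separate tHiveLoad step is needed. The JDBC URL, credentials, and table names below are placeholders, not values from the original question:

```shell
# Hypothetical example: import one Oracle table straight into a Hive table.
# --hive-import creates/loads the Hive table in one step, so the intermediate
# "write to HDFS, then load into Hive" stage is handled for you.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username myuser -P \
  --table CUSTOMERS \
  --hive-import \
  --hive-table default.customers \
  --num-mappers 4
```

Run this on an edge node where Sqoop is configured against your cluster; `-P` prompts for the password interactively so it doesn't end up in your shell history.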