I'm trying to read a SQL Server table and load it into a Hive table. I can see that the tSqoopImport component can handle the ingestion phase, but I cannot get it to load the data directly into a Hive table.
I can accomplish the task outside Talend using the sqoop tool with this statement:
sqoop import --connect "jdbc:sqlserver://126.96.36.199:2012;database_name=STG" --username myuser -password mypassword --query "select * from STG.dbo.MYTABLE where \$CONDITIONS" --target-dir /tmp/loadSqoop --hive-import --create-hive-table --hive-table testbigdata.myhivetable --split-by myid
But I cannot find anything similar in Talend.
I've searched the community forum extensively and googled around too, with no luck.
Is this possible, or do I have to change how my job works?
You can import data directly into a Hive table: use the Advanced settings of the Sqoop component to pass the additional arguments. However, you cannot import into any format other than text. So if you want to import into Parquet tables, you must:
- import into a text-format table
- create the Parquet table (tHiveRow)
- insert into new_table select * from old_table
Screenshots are not available because it is the weekend and I have no access to Hive until Monday, but it works.
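The last two steps can be sketched outside Talend as a small shell script, a rough illustration rather than the exact job. It assumes the text table is the one created by the Sqoop import in the question (testbigdata.myhivetable); the Parquet table name testbigdata.myhivetable_parquet and the RUN_HIVE switch are made up for the example. Note that it uses a single CREATE TABLE ... AS SELECT statement, which folds "create parquet table" and "insert into ... select" into one step; the same statement could be pasted into a tHiveRow component.

```shell
#!/bin/sh
# Text-format table filled by the Sqoop import (name taken from the question).
TEXT_TABLE="testbigdata.myhivetable"
# Hypothetical name for the Parquet copy; pick whatever fits your schema.
PARQUET_TABLE="testbigdata.myhivetable_parquet"

# CREATE TABLE ... AS SELECT creates the Parquet table and copies the rows
# in one statement (an alternative to separate CREATE and INSERT steps).
HQL="CREATE TABLE ${PARQUET_TABLE} STORED AS PARQUET AS SELECT * FROM ${TEXT_TABLE};"

# RUN_HIVE is an illustrative switch, not a Hive or Talend setting:
# by default the statement is only printed so it can be reviewed first.
if [ "${RUN_HIVE:-0}" = "1" ]; then
  hive -e "$HQL"
else
  printf '%s\n' "$HQL"
fi
```

Run it once as is to check the generated statement, then with RUN_HIVE=1 on a machine where the hive client is installed.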