THiveInput throws an exception:java.io.IOException

One Star

Hello,
I am trying to run a simple Hive SELECT query from Talend. The Hive connection succeeds, but the job fails when it tries to execute the query.
ENVIRONMENT:
Talend Big Data version: 5.2.0 Windows XP
Connecting to Apache 1.0.0 (Hive 0.9.0), Connection embedded
Thanks in advance for your help!
Exception:
java.io.IOException: Cannot run program "null/bin/hadoop" (in directory "C:\Talend\BigData\TOS_BD-r92826-V5.2.0"): CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessBuilder.start(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:267)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187)
at lc_d_a.hivepreprod_0_1.HivePreProd.tHiveInput_1Process(HivePreProd.java:702)
at lc_d_a.hivepreprod_0_1.HivePreProd.tHiveConnection_1Process(HivePreProd.java:447)
at lc_d_a.hivepreprod_0_1.HivePreProd.runJobInTOS(HivePreProd.java:1759)
at lc_d_a.hivepreprod_0_1.HivePreProd.main(HivePreProd.java:1624)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessImpl.create(Native Method)
at java.lang.ProcessImpl.<init>(Unknown Source)
at java.lang.ProcessImpl.start(Unknown Source)
... 15 more
Moderator

Re: THiveInput throws an exception:java.io.IOException

Hi,
Cannot run program "null/bin/hadoop"

Judging from the error message, please make sure the environment variables are set correctly.
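For illustration, here is a minimal sketch of how a path like "null/bin/hadoop" can come about. It assumes the Hadoop launcher path is built by appending "/bin/hadoop" to the value of the HADOOP_HOME environment variable; the class and method names are hypothetical and this is not the actual Hive source.

```java
// Hypothetical sketch: shows why an unset variable yields "null/bin/hadoop".
class NullPathSketch {

    // Assumption: the launcher path is <HADOOP_HOME> + "/bin/hadoop".
    static String hadoopBin(String hadoopHome) {
        // Java string concatenation turns a null reference into the text "null".
        return hadoopHome + "/bin/hadoop";
    }

    public static void main(String[] args) {
        // System.getenv returns null when the variable is not defined.
        String home = System.getenv("HADOOP_HOME");
        System.out.println(hadoopBin(home));
    }
}
```

If this assumption holds, the message means the job is trying to launch Hadoop on the local machine, where no Hadoop installation is configured.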
Best regards
Sabrina
One Star

Re: THiveInput throws an exception:java.io.IOException

Thanks for your reply.
Actually, the problem is not with the environment variables.
It seems that neither tHiveConnection nor tHiveInput connects to the remote server, even though I specify the host and port for the remote connection. Instead, the job tries to execute the query locally on my Windows workstation.
How can I tell Talend that it needs to connect to a remote Hive Thrift server?
Cheers,
Agnieszka
One Star

Re: THiveInput throws an exception:java.io.IOException

I have identified the problem.
I was using the "embedded" connection mode. The Talend job reported that the connection was fine, but in reality the generated Java code contained a connection string that omitted the host and port I had specified in the settings. As a result, Talend was trying to execute the Hive query locally on my Windows machine.
Why is "embedded" wrong? Why does the tool pretend that the remote Hive connection worked? Why does it try to run the Hive query locally even though I have specified the remote host and port? I would consider this a bug...
Community Manager

Re: THiveInput throws an exception:java.io.IOException

Please report a bug on our bugtracker, export your job and attach it!
Thank you!
Employee

Re: THiveInput throws an exception:java.io.IOException

Hello,
Here is the explanation:
There are two different ways to connect to Hive: standalone mode and embedded mode.
Standalone mode is a direct JDBC connection to the Hive server, which usually listens on port 10000.
Embedded mode is a kind of indirect connection, since a Hive server is embedded in your client job. The job then connects to the Hive metastore through Thrift. The metastore's Thrift server does not run on the same port as the Hive server.
Finally, to fix the issue you ran into above, you have to specify the JobTracker (there is an option for it in the components).
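To make the difference between the two modes concrete, here is a minimal sketch of the JDBC URLs involved. It assumes the Hive 0.9 JDBC driver (the org.apache.hadoop.hive.jdbc package visible in the stack trace), which uses the "jdbc:hive://" scheme: an empty URL starts Hive inside the client JVM, while a URL with host and port goes to a remote server. The class name, method name, and host name are hypothetical, and the code Talend actually generates may look different.

```java
// Hypothetical sketch: builds the connection URL for each mode.
class HiveUrlSketch {

    // Returns the embedded-mode URL when no host is given,
    // otherwise the standalone-mode URL for a remote Hive server.
    static String hiveJdbcUrl(String host, int port, String database) {
        if (host == null || host.isEmpty()) {
            // Embedded mode: no host or port, Hive runs in the client JVM.
            return "jdbc:hive://";
        }
        // Standalone mode: direct JDBC connection to the remote server.
        return "jdbc:hive://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        System.out.println(hiveJdbcUrl(null, 0, "default"));
        // "hive-server.example.com" is a placeholder host name.
        System.out.println(hiveJdbcUrl("hive-server.example.com", 10000, "default"));
    }
}
```

With the embedded URL, the query is compiled and launched from the machine running the job, which would explain why the host and port from the settings do not appear in the connection string.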
HTH,
Rémy.