I am using the tSystem component to run my Python script in my Talend job. I would like to store the output of the script in a context variable I defined in the job, on every iteration that tSystem runs. How would I go about doing this?
Thank you for your reply. I am planning to store a string as the output from my Python script. The string does not have many characters. I tried creating an environment variable in my tSystem component and setting it to my Python output, but I kept getting a null value. I then tried connecting a tJava component to my tSystem component and, in the Java code, setting my context variable = (String)globalMap.get("tSystem_1_OUTPUT").
Just as a quick clarification, my Python script takes each file from the tFileList, parses its filename, and returns the result through tSystem. By setting the context variable with the above code, I do get the desired output, but only after my script has parsed the second file in the file list. That is, there is a lag before I get the output I need. Would you know how I can fix this? Is it something to do with the return time of my Python script?
Thank you for your response once again. My Python script does write to standard output using sys.stdout.write. I do not have any sleep commands in the script that would cause a lag before the next iteration of the tFileList, yet the lag still exists. Is there a way to suspend the tFileList component's next iteration until my tSystem component has returned the output to the global variable?
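For reference, here is a minimal sketch of the kind of script I mean, assuming the file path arrives as the first command-line argument. The names parse_filename and emit are only illustrative, not my actual code. It writes a single line to standard output and flushes explicitly; I am not certain that output buffering is what causes the lag, so the flush is just a precaution.

```python
import os
import sys


def parse_filename(path):
    """Return the file name without its directory or extension."""
    base = os.path.basename(path)
    stem, _ext = os.path.splitext(base)
    return stem


def emit(value, stream=sys.stdout):
    """Write one line and flush it so the caller can read it right away."""
    stream.write(value + "\n")
    stream.flush()


if __name__ == "__main__":
    # Assumption: the calling job passes the current file path as argv[1].
    if len(sys.argv) > 1:
        emit(parse_filename(sys.argv[1]))
```

Run by hand as, for example, python parse_name.py /some/dir/file.csv (the script name is hypothetical), this prints the bare file name on one line.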
Just as a quick note, my script does have loops and runs in O(n) time, with the constant "c" being quite small regardless of the number of files, since the script runs once per file on each iteration. Could the time complexity of the script be the issue, such that only linear-time scripts would not produce any lag? Other than that, it is an ordinary Python script with only one separate function in it.
Sorry for misunderstanding your previous response. The time.sleep() function in Python did not work; however, the tSleep component did the job. From this, I can only assume that the kernel waits just for the receive status and dumps the output immediately, but I could also be wrong. The tSleep method works, but I would say it is not efficient, because the minimum amount of time I can sleep with the tSleep component is 1 second. When pushing thousands of files from my local file system to HBase, this could take quite a while for each file to pass, let alone the time added by the size of the file itself.
If there is any alternative to the two methods you mentioned below, please do advise me. Otherwise, thank you so much for your help!