API URL null

Nine Stars

Hi Talend experts,

I have the job below, which keeps reading the next API URL for as long as it finds one, and which also iterates over different course_ids. I have tweaked the design a bit to use the parallel execution option of the iterate link, so that multiple calls execute at the same time.

Screen Shot 2018-11-14 at 4.04.35 pm.png

 

StoreCourseID:
//globalMap.put("canvas_id", row1.canvas_id);
globalMap.put("V_API_URL" + row1.canvas_id, "https://swinburneonline.instructure.com/api/v1/courses/"+row1.canvas_id +"/analytics/student_summaries?per_page=100");
globalMap.put("V_LOOP"+ row1.canvas_id, true);

tLoop:
((Boolean) globalMap.get("V_LOOP"+ row1.canvas_id))

tRestClient:

((String) globalMap.get("V_API_URL" + row1.canvas_id))

GetNextUrl:
System.out.println("Current URL IS: " + globalMap.get("V_API_URL" + row1.canvas_id)); // prints correctly
System.out.println("Rest URL" + globalMap.get("tRESTClient_1_HEADERS")); // doesn't print; errors out with null
java.util.List<String> strList = ((java.util.Map<String, java.util.List<String>>) globalMap.get("tRESTClient_1_HEADERS")).get("Link");

 

SetNextURlToCurrUrl:

if ((Boolean) globalMap.get("V_LOOP" + row1.canvas_id)) {
    System.out.println("URL1 IS: " + globalMap.get("V_API_URL" + row1.canvas_id));
    globalMap.put("V_API_URL" + row1.canvas_id, globalMap.get("next_url" + row1.canvas_id));
    System.out.println("URL IS: " + globalMap.get("V_API_URL" + row1.canvas_id));
}

 

In doing so, I have run into a problem: the "Rest URL" value (tRESTClient_1_HEADERS) is always null when I read it in the GetNextUrl (tJava) component. I am not sure what is wrong. Any help is really appreciated!
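
For reference, here is a minimal sketch of how a GetNextUrl-style step could pull the next page URL out of a Link header value, assuming the API marks the next page in the usual `<url>; rel="next"` form. The class name, the method name, and the sample header are hypothetical, and the sketch is plain Java that does not touch the tRESTClient globalMap entries.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LinkHeaderSketch {
    // Matches one <url>; rel="next" entry inside a Link header value.
    private static final Pattern NEXT_LINK =
            Pattern.compile("<([^>]*)>\\s*;\\s*rel=\"next\"");

    // Returns the URL tagged rel="next", or null when there is no further page.
    static String nextUrl(String linkHeader) {
        if (linkHeader == null) {
            return null;
        }
        Matcher m = NEXT_LINK.matcher(linkHeader);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical header value, not taken from the thread.
        String header = "<https://example.invalid/api/v1/items?page=1>; rel=\"current\","
                + "<https://example.invalid/api/v1/items?page=2>; rel=\"next\"";
        System.out.println(nextUrl(header));
        System.out.println(nextUrl("<https://example.invalid/api/v1/items?page=2>; rel=\"last\""));
    }
}
```

A null result would be the point at which the loop flag (V_LOOP plus the course id in the job above) is set to false.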

 

@rhall_2_0 and @gr44: your input is really appreciated!!


Thanks
Harshal.


All Replies
Community Manager

Re: API URL null

If you intend to use parallel execution, you need to REALLY understand what you are doing. It won't gain you much here, and I suggest you don't use it. Your problem is likely caused by the use of:

globalMap.get("tRESTClient_1_HEADERS")

This is a single instance of an object that you are using in parallel. As such, you have no idea which parallel flow the result will correspond to.
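
To illustrate the point about a single shared entry, here is a small sketch in plain Java. It assumes a plain HashMap standing in for globalMap and replays one possible interleaving of two flows sequentially; the key names and values are made up. It shows only the overwrite effect, not the thread-safety side of real parallel runs.

```java
import java.util.HashMap;
import java.util.Map;

final class SharedKeyDemo {
    // Two flows write under the same fixed key before either reads it back.
    static String[] fixedKey() {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("HEADERS", "headers for course 101"); // flow A writes
        globalMap.put("HEADERS", "headers for course 202"); // flow B overwrites A
        return new String[] {
            (String) globalMap.get("HEADERS"), // what flow A reads
            (String) globalMap.get("HEADERS")  // what flow B reads
        };
    }

    // Same interleaving, but each flow uses a key that includes its course id.
    static String[] perCourseKey() {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("HEADERS" + "101", "headers for course 101");
        globalMap.put("HEADERS" + "202", "headers for course 202");
        return new String[] {
            (String) globalMap.get("HEADERS" + "101"),
            (String) globalMap.get("HEADERS" + "202")
        };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", fixedKey()));
        System.out.println(String.join(" | ", perCourseKey()));
    }
}
```

With the fixed key, flow A reads back the value that flow B wrote. With per-course keys, each flow reads its own value, which is why the V_API_URL and V_LOOP entries keyed by canvas_id behave while the component's single HEADERS entry does not.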

Nine Stars

Re: API URL null

@rhall_2_0: How do I use parallel execution to make the entire flow faster? Imagine I have 5k course ids read from fb, each one iterated into a REST API URL, with pagination on top of that. This makes the flow really slow: I can process only about 2k course ids in almost 2.5 hours, because a call happens for each course id and for each page within it, let alone the remaining ~3k records.

There must be some way of improving the performance of the current flow. How do I parallelise it so that multiple course ids are processed together?

Community Manager

Re: API URL null

OK. Here is a way that might help. Put the part of the job that iterates over the web service calls into a separate job. That job should receive a context variable that holds the course_id. Keep in your current job the part of the process that iterates over the course_ids. Then have the iterate link call your new child job, supplying the course_id. You can then try executing that child job in parallel.

By doing this, you keep the individual course queries in the same process, so your "next_url" functionality will carry on working the way it does now.

(Accepted solution)
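
The parent/child split described above can be sketched in plain Java. This is only an analogy under stated assumptions, not Talend-generated code: the class and method names are hypothetical, a thread pool stands in for the parallel iterate link, a method argument stands in for the context variable, and a page count replaces the real REST responses.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ChildJobSketch {
    // Stand-in for the child job: one course_id in, all of its pages walked.
    // pagesForCourse is a hypothetical substitute for the real REST responses.
    static int runChildJob(String courseId, int pagesForCourse) {
        Map<String, Object> localMap = new HashMap<>(); // private to this run
        localMap.put("V_API_URL", "courses/" + courseId + "?page=1");
        localMap.put("V_LOOP", true);
        int pagesRead = 0;
        while ((Boolean) localMap.get("V_LOOP")) {
            pagesRead++; // a real job would call the URL here
            if (pagesRead < pagesForCourse) {
                localMap.put("V_API_URL", "courses/" + courseId + "?page=" + (pagesRead + 1));
            } else {
                localMap.put("V_LOOP", false); // no next URL: stop looping
            }
        }
        return pagesRead;
    }

    // Stand-in for the parent job: iterates over course ids, runs children in parallel.
    static Map<String, Integer> runParent(List<String> courseIds, int parallelism, int pagesForCourse) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            Map<String, Future<Integer>> futures = new LinkedHashMap<>();
            for (String id : courseIds) {
                Callable<Integer> child = () -> runChildJob(id, pagesForCourse);
                futures.put(id, pool.submit(child));
            }
            Map<String, Integer> pagesPerCourse = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Integer>> e : futures.entrySet()) {
                pagesPerCourse.put(e.getKey(), e.getValue().get());
            }
            return pagesPerCourse;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runParent(List.of("101", "202", "303"), 2, 3));
    }
}
```

Because each child run keeps its URL and loop flag in its own map, no suffix such as the course id is needed on the keys, and runs cannot read each other's state.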

Nine Stars

Re: API URL null

@rhall_2_0: Thanks for your reply. I am not sure I understood your suggestion; if you could show the design steps here, that would be fantastic. I tried the context approach, but it didn’t work: a context variable can store only one value at a time, and I want parallelism (many ids being passed to many flows at the same time).

Nine Stars

Re: API URL null

@rhall_2_0: Sorry for the earlier post; I had not quite understood. I have now implemented it the way you described and it is working.

 

Screen Shot 2018-11-16 at 10.36.28 am.png

 

It works fine.

 

I am not sure what the best degree of parallelism is. I am going to test with 50 and see how it goes.

Community Manager

Re: API URL null

Make sure you test this thoroughly. You *may* find some timing issues, but this is a better way of attempting this.

Nine Stars

Re: API URL null

@rhall_2_0: I found out that there is a throttle value set at the source, so I can’t use parallelism freely: each token has a certain resource limit. This is going to slow everything down 😕. However, I have asked the application team whether I can use parallelism at all and, if so, what the best value would be.

Yes, you are right: I had timing issues when I was testing with 50 or 100 parallel executions. How can I tackle them?

Community Manager

Re: API URL null

I've just carried out a quick test and I don't see this issue in v6.5.1. What version are you using?

Nine Stars

Re: API URL null

@rhall_2_0: I’m using 6.4.1. How can I avoid that type of situation?

Community Manager

Re: API URL null

I'm not sure. This *might* be a bug. I built a quick job like this:

tRowGenerator ---> tFlowToIterate ---> tJava

tRowGenerator: generating a single integer sequence
tFlowToIterate: running 10 in parallel
tJava: printing the numbers

 

This produced the following output:

 

3

9

1

4

10

7

8

2

6

5

 

I extended the test to 100 rows with 100 in parallel and it still looked OK, although I didn't spend too much time checking.

But I did this in v6.5.1. Try something similar and see if you get problems. If you do, it sounds like a bug in v6.4.
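
A comparable check can be sketched outside Talend in plain Java. It assumes a thread pool as a stand-in for the parallel iterate link and a thread-safe queue in place of console output, and all names are hypothetical. Rather than eyeballing the printed numbers, it verifies that every number arrives exactly once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ParallelIterateCheck {
    // Hands the numbers 1..count to a pool of the given size and returns them sorted.
    static List<Integer> runInParallel(int count, int parallelism) {
        ConcurrentLinkedQueue<Integer> seen = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        for (int i = 1; i <= count; i++) {
            final int n = i;
            pool.execute(() -> seen.add(n)); // arrival order is not guaranteed
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        List<Integer> sorted = new ArrayList<>(seen);
        Collections.sort(sorted);
        return sorted;
    }

    // True when the sorted list is exactly 1, 2, ..., count with no gaps or repeats.
    static boolean isCompleteSequence(List<Integer> sorted, int count) {
        if (sorted.size() != count) {
            return false;
        }
        for (int i = 0; i < count; i++) {
            if (sorted.get(i) != i + 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isCompleteSequence(runInParallel(100, 100), 100));
    }
}
```

In a Talend job the same idea would mean collecting the iterated values somewhere thread-safe and comparing them against the input, since scrambled ordering alone (as in the output above) is expected and not a sign of a problem.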
