Alternative to globalmap in talend bigdata spark

Five Stars

Hello,

I have a requirement to use globalMap in a Spark job in the Big Data suite.

I understand this is not supported in version 6.x. Are there any alternatives? I need to populate the globalMap with some static values and reference them in a tMap transformation.

There are about 200 entries in the map. I can get the job working in DI but not in the Big Data environment.

My flow is:

file --> tJavaFlex (populate globalMap) --> in another subjob, access it in tMap.

 

Thanks!

 

Moderator

Re: Alternative to globalmap in talend bigdata spark

Hello,

In fact, globalMap is not accessible in Spark Batch mode, because Spark Batch jobs are implemented differently from DI jobs.
The reason is that it is difficult to keep a synchronized "global" variable in distributed mode, and in addition the globalMap is not fully serializable by default.

Here are some articles about using context in Spark jobs.
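To illustrate the serializability point: Spark generally has to serialize whatever it ships to executors, and a map typed as `Map<String, Object>` (as globalMap is) can hold values that cannot be serialized. The following is only a plain-Java sketch of that idea, not Talend or Spark code; the class name `SerializableCheck` and its method are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class SerializableCheck {

    // Returns true if the map and every value in it can be Java-serialized.
    static boolean canSerialize(Map<String, Object> map) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(map);
            return true;
        } catch (NotSerializableException e) {
            // At least one value does not implement java.io.Serializable.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> onlyStrings = new HashMap<>();
        onlyStrings.put("country", "FR");

        Map<String, Object> mixed = new HashMap<>(onlyStrings);
        mixed.put("handle", new Object()); // java.lang.Object is not Serializable

        System.out.println(canSerialize(onlyStrings));
        System.out.println(canSerialize(mixed));
    }
}
```

A map that holds only strings passes the check, while a single non-serializable value makes the whole map fail.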

https://community.talend.com/t5/Design-and-Development/Using-the-Implicit-Context-Load-Feature-for-a...

https://community.talend.com/t5/Architecture-Best-Practices-and/Spark-Dynamic-Context/ta-p/33038

Hope it will shed some light on your requirement.

Best regards

Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Five Stars

Re: Alternative to globalmap in talend bigdata spark

Hello,

I ran into that article too, but it doesn't change my requirement.

I need to pass a HashMap to the Spark job, if not through globalMap then some other way.

I tried using a HashMap as a context variable, but in the Spark job I couldn't cast the context value back from String to HashMap.

My flow was:

Input file --> tJava (populate the HashMap and assign it to a context variable) --> pass context to Spark job --> tMap accesses the HashMap (error: can't cast String to HashMap)

any suggestions?
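One possible workaround, assuming the context value really does arrive in the Spark job as a plain String (as the cast error suggests), is to encode the map into a String explicitly before passing it and to parse it back on the other side. The sketch below is plain Java; the class name `ContextMapCodec`, the `;` and `=` delimiters, and the assumption that keys and values are non-null strings are all illustrative choices, not part of any Talend API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns a String-to-String map into one String and back.
final class ContextMapCodec {

    // Encodes entries as key=value pairs joined by ';'.
    // Keys and values are URL-encoded so they cannot contain the delimiters.
    static String encode(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(esc(e.getKey())).append('=').append(esc(e.getValue()));
        }
        return sb.toString();
    }

    // Parses a String produced by encode() back into a map.
    static Map<String, String> decode(String text) {
        Map<String, String> map = new LinkedHashMap<>();
        if (text == null || text.isEmpty()) {
            return map;
        }
        for (String pair : text.split(";")) {
            int eq = pair.indexOf('=');
            map.put(unesc(pair.substring(0, eq)), unesc(pair.substring(eq + 1)));
        }
        return map;
    }

    private static String esc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    private static String unesc(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the standard job, the tJava would assign `ContextMapCodec.encode(map)` to a String context variable; in the Spark job, `ContextMapCodec.decode(...)` would rebuild the map from that variable. Whether the decoded map can then be reached from inside a tMap expression depends on the job design and is not verified here.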

Moderator

Re: Alternative to globalmap in talend bigdata spark

Hello,

Would you mind posting screenshots of your current job settings on the forum? That will help us get more information.

Please mask your sensitive data.

Best regards

Sabrina

Five Stars

Re: Alternative to globalmap in talend bigdata spark

My use case was to load a static business file of the form k,v.

The problem is that I cannot use it as a map lookup, because different rows pass different attributes (some static, some not) to find a value. To calculate x I pass col1 from the main flow, to calculate y I pass col2, and so on, so I cannot use a single fixed set of columns as the map key.

I resolved this by writing a procedure that reads the file and stores it in a string buffer, then parses that string in a loop using the record separator, tries to match each record, and returns the matched result.
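The original procedure was not posted, so the following is only a minimal sketch of what such a scan might look like. The class name `BufferLookup`, the method signature, and the idea that each record is `key<fieldSep>value` are assumptions.

```java
// Hypothetical routine: linear scan over a buffered "key,value" file.
final class BufferLookup {

    // Walks the buffer record by record and returns the value of the first
    // record whose key equals the given key, or null if nothing matches.
    static String find(String buffer, String key, String recordSep, String fieldSep) {
        if (recordSep.isEmpty()) {
            throw new IllegalArgumentException("recordSep must not be empty");
        }
        int start = 0;
        while (start <= buffer.length()) {
            int end = buffer.indexOf(recordSep, start);
            if (end < 0) {
                end = buffer.length(); // last record has no trailing separator
            }
            String record = buffer.substring(start, end);
            int sep = record.indexOf(fieldSep);
            if (sep >= 0 && record.substring(0, sep).equals(key)) {
                return record.substring(sep + fieldSep.length());
            }
            start = end + recordSep.length();
        }
        return null;
    }
}
```

For example, `BufferLookup.find("a,1\nb,2", "b", "\n", ",")` returns `"2"`. Each call is a full scan, which is acceptable for the roughly 200 entries mentioned earlier in the thread but would not scale to large lookup files.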

 

 

 

Four Stars

Re: Alternative to globalmap in talend bigdata spark

I have exactly the same issue.

I have a CSV file that I need to load into an org.apache.commons.collections.map.MultiKeyMap and pass to a Big Data job through the context. So I am using a Standard job to read the CSV file -> tJavaRow to build a MultiKeyMap and store it in a context variable called lookupMap, of type Object.

After successfully loading the context, for testing I wrote a tJava that reads lookupMap and casts it to a MultiKeyMap, but I get an error saying that a String can't be cast to MultiKeyMap.

lookupMap is defined in the context as Object, not String. So why do I get the cast error?

Attached are the screenshots: the job, the context, the tJavaRow (sets the context), and the tJava (reads the context).

[Screenshots: job; context; tJavaRow; tJava]

It looks like the first line of the tJava is the one getting the cast error.

Here is the error.

[Screenshot: error message]

Looks like a bug to me.

 

 

Four Stars

Re: Alternative to globalmap in talend bigdata spark

As an FYI, if it is Standard job to Standard job, I already have a solution: I can use the globalMap to save my MultiKeyMap.

But I have to pass this to a Big Data job and, as you mentioned, globalMap doesn't work for Big Data jobs. So it looks like context is the solution, but I get this error.
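If the context value can only travel as a String, one option that might work for an arbitrary map object is to Java-serialize it and Base64-encode the bytes, then reverse both steps in the receiving job. The sketch below uses only the JDK (`java.util.Base64` requires Java 8) and a `HashMap` as a stand-in. Whether MultiKeyMap is serializable in the commons-collections version in use, and whether that jar is on the Spark job's classpath, are assumptions that would need checking. The class name `ObjectStringCodec` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

// Hypothetical helper: carries a Serializable object inside a String.
final class ObjectStringCodec {

    // Java-serializes the value and returns the bytes as Base64 text.
    static String toBase64(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("value could not be serialized", e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Decodes Base64 text produced by toBase64() and rebuilds the object.
    static Object fromBase64(String text) {
        byte[] raw = Base64.getDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("value could not be deserialized", e);
        }
    }
}
```

The sending side would store `ObjectStringCodec.toBase64(map)` in a String context variable, and the receiving side would cast the result of `ObjectStringCodec.fromBase64(...)` back to the map type. Java deserialization should only be applied to data the job itself produced.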
