I have a requirement to use globalMap in a Spark Big Data job.
I understand this is not supported in the 6.x version. Are there any alternatives? I need to populate the globalMap with some static values and reference them in the tMap transformation.
There are about 200 entries for the map. I can get the job working in DI, but not in the Big Data environment.
My flow is
file --> tJavaFlex (populate globalMap) --> in another subjob, access it in tMap.
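For reference, the DI-side pattern above boils down to something like this plain-Java sketch (the `CODE_A` key and the method names are made up for illustration; in the real job the put happens in the tJavaFlex main part and the get in a tMap expression):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for Talend's globalMap (a shared Map<String, Object>).
class GlobalMapPattern {
    static final Map<String, Object> globalMap = new HashMap<>();

    // tJavaFlex main part: one put per input row.
    static void populate(String key, String value) {
        globalMap.put(key, value);
    }

    // tMap expression in the next subjob: read the value back with a cast.
    static String lookup(String key) {
        return (String) globalMap.get(key);
    }
}
```

In DI this works because both subjobs run in the same JVM; in Spark Batch the tMap lookup may execute on a different executor, which is why the shared map is unavailable there.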
In fact, we have no access to "globalMap" in Spark Batch mode (because the implementation on Spark Batch differs from DI).
The reason is that it is difficult to maintain a synchronized "global" variable in distributed mode, and in addition the globalMap is not fully serializable by default.
Here are some articles about using context in Spark jobs.
Hope they shed some light on your requirement.
I ran into that article too, but my requirement doesn't change.
I need to pass a HashMap to the Spark job; if not via globalMap, then some other way.
I tried using the HashMap as a context variable, but I couldn't cast it back to a HashMap from the String context value in the Spark job.
My flow was
input file --> tJava (populate the HashMap and store it in a context variable) --> pass the context to the Spark job --> tMap accesses the HashMap (error: can't cast String to HashMap).
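One workaround worth trying, given that the context value arrives in the Spark job as a plain String: flatten the map into a single delimited String context value in the Standard job, then rebuild the HashMap inside the Spark job. A minimal sketch (the `ContextMapCodec` name and the `k=v;k=v` format are my own assumptions; it assumes keys and values contain neither `;` nor `=`):

```java
import java.util.HashMap;
import java.util.Map;

class ContextMapCodec {
    // Standard job side: flatten the map to "k1=v1;k2=v2"
    // so it fits in a String context variable.
    static String encode(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (sb.length() > 0) sb.append(';');
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Spark job side (e.g. in a tJava or a routine called from tMap):
    // rebuild the HashMap from the String context value.
    static Map<String, String> decode(String s) {
        Map<String, String> map = new HashMap<>();
        if (s == null || s.isEmpty()) return map;
        for (String pair : s.split(";")) {
            int i = pair.indexOf('=');
            map.put(pair.substring(0, i), pair.substring(i + 1));
        }
        return map;
    }
}
```

With ~200 static entries the encoded string stays small, so re-decoding it once per job run should be cheap.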
Would you mind posting screenshots of your current job settings on the forum? That will help us gather more information.
Please mask your sensitive data.
My use case was to load a static business file of the form k,v.
The problem is that I cannot use it as a tMap lookup, because different rows pass different attributes (some static, some not) to find a value. To calculate x I pass col1 from the main flow; to calculate y I pass col2, and so on. So I cannot use any fixed set of columns as the lookup key.
I resolved this by writing a routine that reads the file into a StringBuffer, then parses that string in a loop using a record separator, trying to match and return the matched result.
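The scan-and-match routine described above could look roughly like this (a sketch with hypothetical names; it assumes newline-separated `k,v` records):

```java
// Sketch of the lookup-by-scan workaround: the whole k,v file is kept
// in one String, and each query re-scans it record by record.
class ScanLookup {
    private final String data;   // e.g. "k1,v1\nk2,v2\n..."

    ScanLookup(String data) {
        this.data = data;
    }

    // Scan every record until the key matches; return its value, else null.
    String find(String key) {
        for (String record : data.split("\n")) {   // record separator
            String[] kv = record.split(",", 2);
            if (kv.length == 2 && kv[0].equals(key)) {
                return kv[1];                      // matched result
            }
        }
        return null;
    }
}
```

A linear scan per lookup is fine at ~200 entries; for larger files the records could be parsed once into a map keyed on whichever column the caller supplies.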
I have exactly the same issue.
I have a CSV file that I need to load into an org.apache.commons.collections.map.MultiKeyMap and pass to a Big Data job's context. So I am using a Standard job to read the CSV file -> tJavaRow to build the MultiKeyMap and store it in a context variable called lookupMap, of type Object.
After successfully loading the context, for testing I wrote a tJava to read lookupMap and cast it to a MultiKeyMap, but I get an error saying it can't cast String to MultiKeyMap.
lookupMap is defined in the Context as Object, not String. So why do I get the cast error?
Attached are the screenshots:
tJavaRow (sets the context),
and tJava (reads the context).
It looks like the first line is the one getting the cast error.
Here is the error.
Looks like a bug to me.
As an FYI, for a Standard-to-Standard job I already have a solution: I can use the globalMap to save my MultiKeyMap.
But I have to pass this to a Big Data job, and as you mentioned, globalMap doesn't work for Big Data jobs. So context looks like the solution, but I get this error.
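One possible explanation (an assumption, not confirmed in this thread): when context is handed from one job to another, the values travel as strings, so an Object context arrives in the child job as the map's string form, which would account for the String-to-MultiKeyMap cast error. If that is the case, a workaround is to pass the raw CSV text itself as a String context value and rebuild the lookup inside the Big Data job, for example with a plain map keyed on the concatenated keys (sketch; the class name and the `key1,key2,value` column layout are my own assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for MultiKeyMap with two keys: rebuild the lookup
// in the child job from a CSV String context value, keying on "key1|key2".
class TwoKeyLookup {
    private final Map<String, String> map = new HashMap<>();

    TwoKeyLookup(String csv) {
        for (String line : csv.split("\n")) {
            String[] cols = line.split(",", 3);   // key1,key2,value
            if (cols.length == 3) {
                map.put(cols[0] + "|" + cols[1], cols[2]);
            }
        }
    }

    String get(String k1, String k2) {
        return map.get(k1 + "|" + k2);
    }
}
```

The "|" separator assumes it never appears inside a key; pick any character that cannot occur in your key columns.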