Five Stars

JSON Array values to Rows … is there a better way?

 

Using: Talend Open Studio v6.5.

 

I have a JSON data structure like this inside a larger data structure:

{
   "dataSet": [{
        "_item": {
           "user": "bob",
           "list_of_colors": ["Red", "Gray","Silver","Black","White","Green"]
        }
   }]
}

What I want from this part of the data structure are 6 rows like this:
bob,"Red"
bob,"Gray"
bob,"Silver"
bob,"Black"
bob,"White"
bob,"Green"

 

However, I spent far too long getting to a “hack of a way” to do this. Maybe it is the only way to do this. If so, then use this as an example. But I hope there is a better way and someone can correct the sample project and let us all know. Smiley Happy (AKA: What is the better way?)

 

Attached is a sample project that shows multiple attempts I tried to get this to work.
      Please_fix: a definition of the problem
      try1: using tExractPositionalFields
      try2_take1: using tNormalize
      try2_take2: using tMap
      try2_take3_a_hack_that_works: tReplace and tNormalize

 

Hope that helps. And thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Fifteen Stars

Re: JSON Array values to Rows … is there a better way?

You just need to use two tExtractJSONField components to do this. I've done it very quickly to give you an idea.

The first tExtractJSONField is just used to get the user. The Loop is set to the first array and the Mappings simply get the user and the list_of_colorss array.

Screen Shot 2018-03-01 at 13.20.08.png

 

The second tExtractJSONField is uses the "list_of_colours" data extracted by the first tExtractJSONFields component. The Loop is just over the array. In the Mapping the "user" field is just a pass-through. Leave it blank. The colour is just an "@" as it is the element being looped on.

 

Screen Shot 2018-03-01 at 13.20.20.png

 

This will produce the data in the way you want.

Rilhia Solutions
2 REPLIES
Fifteen Stars

Re: JSON Array values to Rows … is there a better way?

You just need to use two tExtractJSONField components to do this. I've done it very quickly to give you an idea.

The first tExtractJSONField is just used to get the user. The Loop is set to the first array and the Mappings simply get the user and the list_of_colorss array.

Screen Shot 2018-03-01 at 13.20.08.png

 

The second tExtractJSONField is uses the "list_of_colours" data extracted by the first tExtractJSONFields component. The Loop is just over the array. In the Mapping the "user" field is just a pass-through. Leave it blank. The colour is just an "@" as it is the element being looped on.

 

Screen Shot 2018-03-01 at 13.20.20.png

 

This will produce the data in the way you want.

Rilhia Solutions
Five Stars

Re: JSON Array values to Rows … is there a better way?

@rhall_2_0  <-- THANK YOU!

 

I could not figure out the syntax/combinations to make this happen.

 

Your solution not only demonstrated the answer, but you also explained it very clearly as well.

I had tried to create this kind of solution too. But I failed to figure out several of the details and moved on to trying other variations before I posted my question.

 

 

What I learned from your reply:

 

1) Having an output defined as a column without a "Json query" will allow an existing data element to "pass through" the tExtractJSONField component.

             I would have expect that condition to replace it with a null value (or BLOCK the element) instead of allowing it to pass through. But I guess my expectations are just not "Talend'ized" enough yet.

             I really think it would make more sense if there was a special "passthrough from input" place holder value in the column table instead of a null column value. It would be so much more visual/clear if it said something instead of nothing. ( But maybe that is just me?  The text could even be displayed as "grayed out text" if needed. )

 

2) JSON query to select the "current node" is an AT sign ( @ ).

       I "speak" XPATH more than "Json query". So I would have guessed "." would have done that.

 

    If it helps someone else (who speaks XPATH) ...

   REF: https://help.talend.com/reader/hm5FaPiiOP31nUYHph0JwQ/LcNrhqdlyheCLpwyUwgJQw

      There is a linked ref to http://goessner.net/articles/JsonPath/  ( But it looks like a "draft" more than a "standard".)

 

3) I really hate the way the configuration of components in Talend sometimes need double quote, null values, or not those things. Depending on what you are trying to do.

        It almost seems like a "secret handshake" that I did not get a copy of the handbook that explains all of those rules. It is very frustrating and is easily the most confusing part of the Talend UI IMHO.

 

Thanks again.